Open http://namenode:50070/ to see the HDFS status page.
http://namenode:50030/ shows the JobTracker status page.
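The two status pages can also be probed from the command line. The following is a minimal sketch, assuming curl is installed; the helper names http_status and ui_up are hypothetical, and treating any 2xx or 3xx response as "up" is an assumption rather than documented Hadoop behaviour.

```shell
# Prints the HTTP status code returned by URL ("000" if unreachable).
# Usage: http_status URL
http_status() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Succeeds if URL answers with a 2xx or 3xx status code.
# Usage: ui_up URL
ui_up() {
  case "$(http_status "$1")" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

# Example:
#   for url in http://namenode:50070/ http://namenode:50030/; do
#     if ui_up "$url"; then echo "$url is up"; else echo "$url is DOWN"; fi
#   done
```

This only shows that the web server answers; the pages themselves still need a look to confirm that the expected DataNodes and TaskTrackers are listed.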
You can also use a few common commands to put a file onto HDFS, for example:
hadoop fs -put test.txt /user/hadoop/test.text
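To confirm that the upload really landed intact, the file can be read back and compared with the local copy. This is a small sketch built on hadoop fs -cat and cmp; the function name hdfs_matches_local is hypothetical, and the paths in the example are the ones from the command above.

```shell
# Succeeds if the bytes stored at HDFS_PATH match the bytes of LOCAL_FILE.
# Usage: hdfs_matches_local LOCAL_FILE HDFS_PATH
hdfs_matches_local() {
  # cmp reads the HDFS copy from stdin ("-") and compares it to the local file.
  hadoop fs -cat "$2" | cmp -s - "$1"
}

# Example (needs a running cluster):
#   hdfs_matches_local test.txt /user/hadoop/test.text && echo "upload OK"
```

Note that an empty local file would compare equal to a failed read, so the check is only meaningful for non-empty files.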
The steps above show that HDFS is basically working. Next, verify that the JobTracker and the TaskTrackers work by running the wordcount program from the Hadoop examples.
cd /usr/local/hadoop/hadoop/hadoop-0.20.203.0
hadoop fs -put conf input
The command above copies the conf directory to HDFS under the name input.
hadoop jar hadoop-examples-0.20.203.0.jar wordcount input output
Output roughly like the following means the job succeeded: map climbs to 100%, and reduce also climbs to 100%.
12/03/05 07:52:09 INFO input.FileInputFormat: Total input paths to process : 15
12/03/05 07:52:09 INFO mapred.JobClient: Running job: job_201203050735_0001
12/03/05 07:52:10 INFO mapred.JobClient: map 0% reduce 0%
12/03/05 07:52:24 INFO mapred.JobClient: map 13% reduce 0%
12/03/05 07:52:25 INFO mapred.JobClient: map 26% reduce 0%
12/03/05 07:52:30 INFO mapred.JobClient: map 40% reduce 0%
12/03/05 07:52:31 INFO mapred.JobClient: map 53% reduce 0%
12/03/05 07:52:36 INFO mapred.JobClient: map 66% reduce 13%
12/03/05 07:52:37 INFO mapred.JobClient: map 80% reduce 13%
12/03/05 07:52:39 INFO mapred.JobClient: map 80% reduce 17%
12/03/05 07:52:42 INFO mapred.JobClient: map 100% reduce 17%
12/03/05 07:52:51 INFO mapred.JobClient: map 100% reduce 100%
12/03/05 07:52:56 INFO mapred.JobClient: Job complete: job_201203050735_0001
12/03/05 07:52:56 INFO mapred.JobClient: Counters: 26
12/03/05 07:52:56 INFO mapred.JobClient: Job Counters
12/03/05 07:52:56 INFO mapred.JobClient: Launched reduce tasks=1
12/03/05 07:52:56 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=68532
12/03/05 07:52:56 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/03/05 07:52:56 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/03/05 07:52:56 INFO mapred.JobClient: Rack-local map tasks=7
12/03/05 07:52:56 INFO mapred.JobClient: Launched map tasks=15
12/03/05 07:52:56 INFO mapred.JobClient: Data-local map tasks=8
12/03/05 07:52:56 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=25151
12/03/05 07:52:56 INFO mapred.JobClient: File Output Format Counters
12/03/05 07:52:56 INFO mapred.JobClient: Bytes Written=14249
12/03/05 07:52:56 INFO mapred.JobClient: FileSystemCounters
12/03/05 07:52:56 INFO mapred.JobClient: FILE_BYTES_READ=21493
12/03/05 07:52:56 INFO mapred.JobClient: HDFS_BYTES_READ=27707
12/03/05 07:52:56 INFO mapred.JobClient: FILE_BYTES_WRITTEN=384596
12/03/05 07:52:56 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=14249
12/03/05 07:52:56 INFO mapred.JobClient: File Input Format Counters
12/03/05 07:52:56 INFO mapred.JobClient: Bytes Read=25869
12/03/05 07:52:56 INFO mapred.JobClient: Map-Reduce Framework
12/03/05 07:52:56 INFO mapred.JobClient: Reduce input groups=754
12/03/05 07:52:56 INFO mapred.JobClient: Map output materialized bytes=21577
12/03/05 07:52:56 INFO mapred.JobClient: Combine output records=1047
12/03/05 07:52:56 INFO mapred.JobClient: Map input records=734
12/03/05 07:52:56 INFO mapred.JobClient: Reduce shuffle bytes=21577
12/03/05 07:52:56 INFO mapred.JobClient: Reduce output records=754
12/03/05 07:52:56 INFO mapred.JobClient: Spilled Records=2094
12/03/05 07:52:56 INFO mapred.JobClient: Map output bytes=34601
12/03/05 07:52:56 INFO mapred.JobClient: Combine input records=2526
12/03/05 07:52:56 INFO mapred.JobClient: Map output records=2526
12/03/05 07:52:56 INFO mapred.JobClient: SPLIT_RAW_BYTES=1838
12/03/05 07:52:56 INFO mapred.JobClient: Reduce input records=1047
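Instead of watching the log by eye, the same success criterion can be checked mechanically. The sketch below reads JobClient output on stdin; the function names job_finished and last_progress are hypothetical, and the patterns assume log lines shaped like the ones shown above.

```shell
# Succeeds only if the log reached "map 100% reduce 100%" and also
# contains a "Job complete:" line.
# Usage: job_finished < job.log
job_finished() {
  awk '
    /map 100% +reduce 100%/ { full = 1 }
    /Job complete:/         { done = 1 }
    END { exit !(full && done) }
  '
}

# Prints the last "map N% reduce M%" progress report found in the log.
# Usage: last_progress < job.log
last_progress() {
  awk '
    match($0, /map [0-9]+% +reduce [0-9]+%/) { p = substr($0, RSTART, RLENGTH) }
    END { print p }
  '
}

# Example (2>&1 assumes the JobClient messages go to stderr):
#   hadoop jar hadoop-examples-0.20.203.0.jar wordcount input output 2>&1 | tee job.log
#   job_finished < job.log && echo "job OK" || last_progress < job.log
```

When a job hangs, last_progress shows how far it got, for example a reduce phase that never leaves 0%.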
Finally, run hadoop fs -get output /home/hadoop to fetch the output directory to the local file system and look at the results.
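Once the output directory is local, a quick way to sanity-check it is to list the most frequent words. This sketch assumes the result files are named part-* and hold one tab-separated word and count per line; the function name top_words is hypothetical.

```shell
# Prints the N most frequent words from a local wordcount output directory.
# Usage: top_words OUTPUT_DIR N
top_words() {
  tab=$(printf '\t')
  # Sort by the count column (numeric, descending), then by word to break ties.
  cat "$1"/part-* | sort -t "$tab" -k2,2nr -k1,1 | head -n "$2"
}

# Example:
#   top_words /home/hadoop/output 10
```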
1.8 Stopping the single-node setup
stop-all.sh
1.9 Problems encountered
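The "Too many fetch-failures" problem
When running the wordcount example, the reduce tasks never reached 100%; they always stayed stuck at 0%.
The logs contained "Too many fetch-failures" messages. A search online turned up the advice that both the IP address and the hostname must be written into /etc/hosts.
I did that, and it still did not work. After a good deal of frustration and digging, I finally found the root cause: Ubuntu Linux had given the machine the hostname 192, and that hostname had already been recorded as part of the HDFS setup. Renaming the host to something meaningful afterwards came too late. The per-job configuration XML files under the logs directory still used the old hostname, and the new hostname was never picked up. So where was the old hostname kept? It turned out to be stored in the files that make up the HDFS file system itself. Steps 1.4 through 1.8 therefore had to be redone from scratch. After redoing them, the wordcount example ran successfully.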
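For the /etc/hosts advice mentioned above, it helps to check what the hosts file actually maps a name to before and after renaming the machine. The following is a minimal sketch; the function name hosts_ip is hypothetical, and it only inspects the hosts file, not the hostnames already stored in HDFS.

```shell
# Prints the IP address that HOSTS_FILE maps NAME to; fails if there is none.
# Usage: hosts_ip NAME [HOSTS_FILE]     (HOSTS_FILE defaults to /etc/hosts)
hosts_ip() {
  awk -v name="$1" '
    { sub(/#.*/, "") }                       # drop comments
    {
      for (i = 2; i <= NF; i++)
        # Appending "" forces a string comparison, so a numeric-looking
        # hostname such as 192 is matched literally.
        if (($i "") == (name "")) { print $1; found = 1; exit }
    }
    END { exit !found }
  ' "${2:-/etc/hosts}"
}

# Example:
#   hosts_ip "$(hostname)" || echo "current hostname is missing from /etc/hosts"
```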