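<property>
  <name>mapred.output.compression.type</name>
  <value>BLOCK</value>
  <description>If the job outputs are to be compressed as SequenceFiles, how should
  they be compressed? Should be one of NONE, RECORD or BLOCK.</description>
</property>
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
  <description>Should the job outputs be compressed?</description>
</property>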
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
After asking 大脸 for help, this error was resolved. The fix was to restore the relevant entry in conf/nutch-default.xml to its original value:
<property>
  <name>plugin.folders</name>
  <value>plugins</value>
  <description>Directories where nutch plugins are located. Each
  element may be a relative or absolute path. If absolute, it is used
  as is. If relative, it is searched for on the classpath.</description>
</property>
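To confirm which value of plugin.folders is actually in effect after editing the config files, a small script can help. The following is a minimal sketch, assuming the usual Hadoop-style layout in which a site file (such as conf/nutch-site.xml) overrides the defaults file. The function names and the inline sample documents are hypothetical and only for illustration; in practice you would pass in the contents of your own config files.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def effective_value(name, *xml_texts):
    """Look up a property; documents passed later override earlier ones."""
    result = None
    for text in xml_texts:
        props = read_properties(text)
        if name in props:
            result = props[name]
    return result


# Sample documents (hypothetical contents, for illustration only).
DEFAULT_XML = """<configuration>
  <property><name>plugin.folders</name><value>plugins</value></property>
</configuration>"""

SITE_XML = """<configuration>
  <property><name>plugin.folders</name><value>/opt/custom/plugins</value></property>
</configuration>"""

print(effective_value("plugin.folders", DEFAULT_XML))
print(effective_value("plugin.folders", DEFAULT_XML, SITE_XML))
```

If the second call prints something other than the directory you expect, an override in the site file is a likely place to look before touching the defaults.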
(3) Error
Error: org.apache.nutch.scoring.ScoringFilters.injectedScore(Lorg/apache/hadoop/io/Text;Lorg/apache/nutch/crawl/CrawlDatum;)V
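Solution:
I checked the tasktracker log on s6 and found an entry like this. I looked up "fatal", and it really does mean "deadly"... but the log gave no specific cause.
Digging further, I opened m2:50030 in a browser, found the task that had just run (Hadoop has to be restarted in between), and located the error: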
2013-04-13 07:45:38,046 ERROR org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.NoSuchMethodError: org.apache.nutch.scoring.ScoringFilters.injectedScore(Lorg/apache/hadoop/io/Text;Lorg/apache/nutch/crawl/CrawlDatum;)V
at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:141)
at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:59)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
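I read online that NoSuchMethodError usually occurs when two packages providing the same functionality are both in use but their versions differ. While reading bin/nutch, I noticed that it also loads the plugins under nutch-1.2/build, which conflicts with nutch-1.2/plugins. Renaming the build folder solved the problem.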
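A quick way to look for this kind of conflict is to check whether the same class is packaged in more than one jar. The following is a minimal sketch, not part of Nutch or Hadoop: the function name is hypothetical, and it only inspects .jar files, so loose .class directories are not covered.

```python
import os
import zipfile
from collections import defaultdict


def find_duplicate_classes(roots):
    """Map each .class entry to the jars containing it; keep entries found in 2+ jars."""
    owners = defaultdict(set)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for fname in files:
                if not fname.endswith(".jar"):
                    continue
                jar_path = os.path.join(dirpath, fname)
                try:
                    with zipfile.ZipFile(jar_path) as jar:
                        for entry in jar.namelist():
                            if entry.endswith(".class"):
                                owners[entry].add(jar_path)
                except zipfile.BadZipFile:
                    continue  # skip archives that cannot be read
    return {cls: sorted(jars) for cls, jars in owners.items() if len(jars) > 1}
```

For the case described above, one would call find_duplicate_classes(["nutch-1.2/build", "nutch-1.2/plugins"]) and check whether org/apache/nutch/scoring/ScoringFilters.class shows up in the result.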
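(4) Distributed search: content on HDFS cannot be searched
I finally noticed this WARN; it turned out that a file could not be found.
Solution: upload this folder to HDFS.

7. Main references: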
1. "nutch入门学习.pdf" (Nutch beginner's guide)
2. http://hi.baidu.com/erliang20088
3. http://lendfating.blog.163.com/blog/static/1820743672012111311532359/
4. http://blog.iyunv.com/witsmakemen/article/details/8256369