2014-05-12 07:17:39,447 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = DC.aws/127.0.0.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.205.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf ... h-0.20-security-205 -r 1179940; compiled by 'hortonfo' on Fri Oct 7 06:20:32 UTC 2011
************************************************************/
2014-05-12 07:17:39,600 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-05-12 07:17:39,613 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2014-05-12 07:17:39,614 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-05-12 07:17:39,614 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2014-05-12 07:17:39,764 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2014-05-12 07:17:39,773 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2014-05-12 07:17:39,774 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source NameNode registered.
2014-05-12 07:17:39,800 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 64-bit
2014-05-12 07:17:39,800 INFO org.apache.hadoop.hdfs.util.GSet: 2% max memory = 17.77875 MB
2014-05-12 07:17:39,800 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^21 = 2097152 entries
2014-05-12 07:17:39,800 INFO org.apache.hadoop.hdfs.util.GSet: recommended=2097152, actual=2097152
2014-05-12 07:17:39,823 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=root
2014-05-12 07:17:39,823 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2014-05-12 07:17:39,823 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2014-05-12 07:17:39,829 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100
2014-05-12 07:17:39,829 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2014-05-12 07:17:40,045 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean
2014-05-12 07:17:40,065 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2014-05-12 07:17:40,078 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 3349287
2014-05-12 07:18:01,677 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:902)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:817)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:362)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:384)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:358)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:497)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1268)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1277)
2014-05-12 07:18:01,678 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:902)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:817)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:362)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:384)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:358)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:497)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1268)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1277)
2014-05-12 07:18:01,679 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at DC.aws/127.0.0.1
************************************************************/
I searched for ages and couldn't find a solution. Ours is a pseudo-distributed setup; what on earth am I supposed to do? Reformatting the NameNode is the nuclear option, and I just can't bring myself to do it...
Update:
The fsimage file on my NameNode is 445 MB.
The fsimage file on my SecondaryNameNode is 281 MB.
Clearly the two differ. I have a bit of a lead now and am in the middle of rescuing the server.
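A quick way to spot this kind of divergence is to compare the byte counts of the two fsimage files. The sketch below simulates it with throwaway temp files standing in for the 445 MB and 281 MB images; on a real cluster you would point the two paths at `<dfs.name.dir>/current/fsimage` on the NameNode and `<fs.checkpoint.dir>/current/fsimage` on the SecondaryNameNode (both paths here are placeholders, not taken from the post).

```shell
#!/bin/sh
set -eu
# Demo: flag a size mismatch between the NN and SNN fsimage files.
# The temp files below stand in for the real images; substitute the
# actual dfs.name.dir / fs.checkpoint.dir paths on a live cluster.
NN_IMG=$(mktemp)    # stands in for the NameNode's 445M fsimage
SNN_IMG=$(mktemp)   # stands in for the SecondaryNameNode's 281M fsimage
head -c 445 /dev/zero > "$NN_IMG"
head -c 281 /dev/zero > "$SNN_IMG"

nn=$(wc -c < "$NN_IMG")
snn=$(wc -c < "$SNN_IMG")
if [ "$nn" -ne "$snn" ]; then
  # A mismatch means the SNN checkpoint has fallen out of step with
  # the NN image -- exactly the symptom described above.
  echo "fsimage size mismatch: NN=$nn SNN=$snn bytes"
fi
```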
……………………………………………………………………………………………………
After more than two hours of battling, it's finally fixed... (most of the time went into tracking down and talking to the developer who had since left the company).
I compared the sizes of the current and image directories under the SNN and the NN and found they had diverged, which strongly suggests data has already been lost. In that situation, the only option is the approach below, to minimize the data loss and get the service back to normal as quickly as possible.
Core of the fix:
Overwrite the NameNode's current and image directories with the SecondaryNameNode's current and image directories. Of course, taking a backup before overwriting anything is a must for ops! And be sure to talk it through with the developers and the boss, and confirm the risks, before you proceed.
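The overwrite-with-backup step above can be sketched as the script below. To keep it runnable as a demo it creates its own temp directories standing in for the NN and SNN storage dirs; on a real cluster, `NN_DIR` and `SNN_DIR` would be the directories configured in `dfs.name.dir` and `fs.checkpoint.dir` (names assumed here, not taken from the post), and you would stop HDFS first and restart it afterwards.

```shell
#!/bin/sh
set -eu
# Demo stand-ins for the real storage dirs; substitute the values of
# dfs.name.dir (on the NN) and fs.checkpoint.dir (on the SNN).
NN_DIR=$(mktemp -d)/name
SNN_DIR=$(mktemp -d)/namesecondary
mkdir -p "$NN_DIR/current" "$NN_DIR/image" "$SNN_DIR/current" "$SNN_DIR/image"
echo corrupt > "$NN_DIR/current/fsimage"   # simulate the damaged NN image
echo good    > "$SNN_DIR/current/fsimage"  # simulate the SNN checkpoint

# 1. Stop HDFS first (stop-dfs.sh) -- not shown in this demo.

# 2. Back up the NameNode dirs before touching anything.
BACKUP="$NN_DIR.bak.$(date +%Y%m%d%H%M%S)"
cp -a "$NN_DIR" "$BACKUP"

# 3. Overwrite the NN's current/ and image/ with the SNN's copies.
cp -a "$SNN_DIR/current/." "$NN_DIR/current/"
cp -a "$SNN_DIR/image/."   "$NN_DIR/image/"

# 4. Restart HDFS (start-dfs.sh) and sanity-check with hadoop fsck --
#    also not shown in this demo.
```

Accepting the SNN checkpoint means losing any namespace changes made after the last checkpoint, which is why the post stresses agreeing on the risk with the developers and management first.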