[Experience Sharing] Common Hadoop/Hive problems and solutions (continuously updated)

Posted 2018-10-28 12:29:33
  During installation, a network interruption caused the problems below.
  Problem 1: installation hangs at "Acquiring installation lock"
  /tmp/scm_prepare_node.tYlmPfrT
  using SSH_CLIENT to get the SCM hostname: 172.16.77.20 33950 22
  opening logging file descriptor
  Starting installation script... Acquiring installation lock... BEGIN flock 4
  This stage took about half an hour the first time; after one uninstall it took almost an hour before it finally completed.
  Problem 2: cannot select hosts
  After a failed installation, the hosts could not be selected again.
  (Figure 1)
  Solution: clean up the files left behind by the failed installation.
  Uninstall Cloudera Manager 5.1.x and the related software (see the official uninstall documentation).
  Problem 3: DNS reverse lookup (PTR) resolves to localhost
  Description:
  A DNS reverse-lookup error prevents the Cloudera Manager Server hostname from being resolved correctly.
  Log:
  Detecting Cloudera Manager Server...
  BEGIN host -t PTR 192.168.1.198
  198.1.168.192.in-addr.arpa domain name pointer localhost.
  END (0)
  using localhost as scm server hostname
  BEGIN which python
  /usr/bin/python
  END (0)
  BEGIN python -c 'import socket; import sys; s = socket.socket(socket.AF_INET); s.settimeout(5.0); s.connect((sys.argv[1], int(sys.argv[2]))); s.close();' localhost 7182
  Traceback (most recent call last):
  File "", line 1, in
  File "", line 1, in connect
  socket.error: [Errno 111] Connection refused
  END (1)
  could not contact scm server at localhost:7182, giving up
  waiting for rollback request
  Solution:
  On the machines that cannot connect, move the /usr/bin/host binary out of the way:


  • sudo mv /usr/bin/host /usr/bin/host.bak
  Note:
  Cloudera's rationale here is unclear: the installer already has the Cloudera Manager Server IP, yet it still resolves the IP back to a hostname before connecting.
  Because reverse DNS was not configured, resolving the Cloudera Manager Server IP returned localhost, which caused the subsequent connection failure.
  The workaround is simply to remove /usr/bin/host; Cloudera Manager then connects by IP directly and the error disappears.
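  The failing installer step above is nothing more than a TCP connect to the SCM port. A minimal self-contained sketch of the same probe (the host and port in the usage comment are illustrative, not values confirmed for this cluster):

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: probe the Cloudera Manager Server port directly by IP,
# bypassing any reverse-DNS lookup:
#   can_connect("172.16.77.20", 7182)
```

  Running this by IP from the failing host tells you whether the problem is DNS (as in this case) or a genuinely unreachable port.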
  Problem 4: NTP
  Description:
  Bad Health -- Clock Offset
  The host's NTP service did not respond to a request for the clock offset.
  Solution:
  Configure the NTP service.
  Reference guides:
  Configuring an NTP server on CentOS:
  http://www.hailiangchen.com/centos-ntp/
  Commonly used NTP servers in China (addresses and IPs):
  http://www.douban.com/note/171309770/
  Edit the configuration file:
  [root@work03 ~]# vim /etc/ntp.conf
  # Use public servers from the pool.ntp.org project.
  # Please consider joining the pool (http://www.pool.ntp.org/join.html).
  server s1a.time.edu.cn prefer
  server s1b.time.edu.cn
  server s1c.time.edu.cn
  restrict 172.16.1.0 mask 255.255.255.0 nomodify
  Problem 2.2
  Description:
  Clock Offset
  ·  Ensure that the host's hostname is configured properly.
  ·  Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
  ·  Ensure that ports 9000 and 9001 are free on the host being added.
  ·  Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).
  Diagnosis:
  Run 'ntpdc -c loopinfo' on the affected hosts (work02, work03):
  [root@work03 work]# ntpdc -c loopinfo
  ntpdc: read: Connection refused
  Solution:
  Start the NTP service on all three machines and enable it at boot:
  service ntpd start
  chkconfig ntpd on
  Problem 5: heartbeat
  Error message:
  Installation failed. Failed to receive heartbeat from agent.
  Solution: disable the firewall.
  Problem 6: Unknown Health
  Unknown Health
  After a reboot: Request to the Host Monitor failed.
  service --status-all | grep clo
  Checking the agent status on the machine shows: cloudera-scm-agent dead but pid file exists
  Solution: restart the services
  service cloudera-scm-agent restart
  service cloudera-scm-server restart
  Problem 7: canonical name / hostname consistency
  Bad Health
  The hostname and canonical name for this host are not consistent when checked from a Java process.
  canonical name:
  4092 Monitor-HostMonitor throttling_logger WARNING (29 skipped) hostname work02 differs from the canonical name work02.xinzhitang.com
  Solution: edit /etc/hosts so that the FQDN and the hostname match.
  PS: this clears the check, but it is not obvious why the hostname and its canonical name must be identical.
  /etc/hosts
  192.168.1.185 work01 work01
  192.168.1.141 work02 work02
  192.168.1.198 work03 work03
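  The health check above compares the host's short name with its canonical (fully qualified) name. A rough stand-in for that comparison using the standard library resolver (an approximation; the actual check runs inside a Java process):

```python
import socket

def hostname_consistency():
    """Return (hostname, canonical_name, consistent?) for the local host."""
    hostname = socket.gethostname()
    canonical = socket.getfqdn()
    # After the /etc/hosts fix, the short names should agree
    # (e.g. work02 vs work02 rather than work02 vs work02.xinzhitang.com).
    consistent = hostname.split(".")[0] == canonical.split(".")[0]
    return hostname, canonical, consistent
```

  Running this on each host before adding it to the cluster catches the mismatch without waiting for the Host Monitor warning.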
  Problem 8: Concerning Health
  Concerning Health Issue
  -- Network Interface Speed --
  Description: The host has 2 network interface(s) that appear to be operating at less than full speed. Warning threshold: any.
  Details:
  This is a host health test that checks for network interfaces that appear to be operating at less than full speed.
  A failure of this health test may indicate that network interface(s) may be configured incorrectly and may be causing performance problems. Use the ethtool command to check and configure the host's network interfaces to use the fastest available link speed and duplex mode.
  Workaround:
  In this case the Cloudera Manager threshold was changed in the configuration, which silences the warning rather than truly fixing it.
  Problem 10: IOException thrown while collecting data from host: No route to host
  Cause: the firewall is enabled on the agent host.
  Solution: service iptables stop
  Problem 11
  Cloudera recommends setting /proc/sys/vm/swappiness to 0. Current setting is 60. Use the sysctl command to change this setting at runtime and edit /etc/sysctl.conf for this setting to be saved after a reboot. You may continue with installation, but you may run into issues with Cloudera Manager reporting that your hosts are unhealthy because they are swapping. The following hosts are affected:
  Solution:
  echo 0 > /proc/sys/vm/swappiness            (applies immediately)
  sysctl -w vm.swappiness=0                   (also applies at runtime; add "vm.swappiness = 0" to /etc/sysctl.conf to persist across reboots)
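  A quick way to verify the current value from a script (a sketch; /proc/sys/vm/swappiness only exists on Linux, so the helper returns None elsewhere):

```python
def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness as an int, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

# After the fix above, read_swappiness() should report 0 on every host.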
  Problem 12: clocks out of sync (sync against the USTC time server 202.141.176.110)
  echo "0 3 * * * /usr/sbin/ntpdate 202.141.176.110; /sbin/hwclock -w" >> /var/spool/cron/root
  service crond restart
  ntpdate 202.141.176.110
  Problem 13: The host's NTP service did not respond to a request for the clock offset.
  # service ntpd start
  ntpdc -c loopinfo   (the health will be good if this command executes successfully)
  Problem 14: The Cloudera Manager Agent is not able to communicate with this role's web server.
  One possible cause is that the metastore database cannot be reached; check the database configuration.
  Problem 15: Hive Metastore Server fails to start; update the Hive metastore database configuration (required whenever the hostname is changed).
  Troubleshooting approach
  For ordinary errors, read the error output and search for the key phrases.
  For unexplained failures (e.g. a namenode or datanode dying for no obvious reason), check the Hadoop logs ($HADOOP_HOME/logs) or the Hive logs.
  Hadoop errors
  Problem 16: datanode fails to start
  After adding a datanode, it would not start and the process kept dying. The namenode log showed:
  2013-06-21 18:53:39,182 FATAL org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getDatanode: Data node x.x.x.x:50010 is attempting to report storage ...
  Cause:
  The hadoop installation directory was copied to the new node including the data and tmp folders (see my earlier hadoop installation write-up), so the datanode was never formatted cleanly.
  Fix: delete the stale directories, then start the datanode again:
  rm -rf /data/hadoop/hadoop-1.1.2/data
  rm -rf /data/hadoop/hadoop-1.1.2/tmp
  hadoop-daemon.sh start datanode
  Problem 17: safe mode
  2013-06-20 10:35:43,758 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot renew lease for DFSClient_hb_rs_wdev1.corp.qihoo.net,60020,1371631589073. Name node is in safe mode.
  Solution:
  hadoop dfsadmin -safemode leave
  Problem 18: connection exception
  2013-06-21 19:55:05,801 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to homename/x.x.x.x:9000 failed on local exception: java.io.EOFException
  Possible causes:
  the namenode is listening on 127.0.0.1:9000 rather than 0.0.0.0:9000 or the external IP
  iptables is blocking the port
  Solutions:
  check /etc/hosts so that the hostname is bound to an IP other than 127.0.0.1
  open the port in iptables
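  The first cause above is easy to detect mechanically: the hostname appears on a loopback line in /etc/hosts. A small sketch that flags this (the file path and hostname in the usage comment are examples):

```python
def loopback_bound(hosts_text, hostname):
    """True if hostname maps to a 127.x.x.x address in /etc/hosts content."""
    for line in hosts_text.splitlines():
        line = line.split("#")[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname in names and ip.startswith("127."):
            return True
    return False

# Usage: loopback_bound(open("/etc/hosts").read(), "work03")
```

  If this returns True, the namenode will bind 9000 on loopback only, and datanodes on other machines get exactly the EOFException shown above.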

  Problem 19: namenode namespaceID mismatch
  ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /var/lib/hadoop-0.20/cache/hdfs/dfs/data: namenode namespaceID = 240012870; datanode namespaceID = 1462711424
  Problem: the namespaceID on the namenode does not match the namespaceID on the datanode.
  Cause: every namenode format creates a new namespaceID, while tmp/dfs/data still holds the ID from the previous format. Formatting clears the namenode's data but not the datanode's, so the two IDs diverge and startup fails.
  Fix: http://blog.csdn.net/wh62592855/archive/2010/07/21/5752199.aspx describes two methods; we used the first:
  (1) Stop the cluster services.
  (2) On the affected datanodes, delete the data directory, i.e. the dfs.data.dir configured in hdfs-site.xml (on this machine /var/lib/hadoop-0.20/cache/hdfs/dfs/data/). (Note: we ran this step on all datanodes and the namenode. In case the deletion does not help, keep a copy of the data directory first.)
  (3) Format the namenode.
  (4) Restart the cluster.
  That solved the problem.
  One side effect of this method is that all data on HDFS is lost. If HDFS holds important data, do not use it; try the second method from the link above instead.
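  The two IDs in the error come from the VERSION files under the namenode's and datanode's storage directories, so they can be compared before resorting to a destructive reformat. A sketch (the paths in the usage comment are examples):

```python
def read_namespace_id(version_text):
    """Extract namespaceID from the content of a dfs VERSION file."""
    for line in version_text.splitlines():
        if line.startswith("namespaceID="):
            return int(line.split("=", 1)[1])
    return None  # no namespaceID line found

# Usage (example paths):
#   nn = read_namespace_id(open("/dfs/name/current/VERSION").read())
#   dn = read_namespace_id(open("/var/lib/hadoop-0.20/cache/hdfs/dfs/data/current/VERSION").read())
#   print("match" if nn == dn else "mismatch")
```

  If the IDs already match, the datanode failure has some other cause and deleting the data directory would be pointless data loss.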
  Problem 20: directory permissions
  start-dfs.sh runs without errors and reports the datanode as started, but afterwards no datanode process exists. The log on the datanode machine shows the dfs.data.dir permissions are wrong:
  expected: drwxr-xr-x, current: drwxrwxr-x
  Fix:
  Check the dfs.data.dir configuration and correct the directory permissions (chmod 755).
  Hive errors
  Problem 21: NoClassDefFoundError
  Could not initialize class ...
  Fix: add protobuf-*.jar to the auxiliary jars path:
  // $HIVE_HOME/conf/hive-site.xml
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///data/hadoop/hive-0.10.0/lib/hive-hbase-handler-0.10.0.jar,file:///data/hadoop/hive-0.10.0/lib/hbase-0.94.8.jar,file:///data/hadoop/hive-0.10.0/lib/zookeeper-3.4.5.jar,file:///data/hadoop/hive-0.10.0/lib/guava-r09.jar,file:///data/hadoop/hive-0.10.0/lib/hive-contrib-0.10.0.jar,file:///data/hadoop/hive-0.10.0/lib/protobuf-java-2.4.0a.jar</value>
  </property>
  Problem 22: hive dynamic partition error
  [Fatal Error] Operator FS_2 (id=2): Number of dynamic partitions exceeded hive.exec.max.dynamic.partitions.pernode
  Fix:
  hive> set hive.exec.max.dynamic.partitions.pernode=10000;
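  Before simply raising the limit, it can help to know how many distinct partition values an insert will actually create, since each distinct value becomes one dynamic partition per node. A sketch that counts them from sample rows (the column name "dt" is an example):

```python
def count_dynamic_partitions(rows, key="dt"):
    """Count distinct values of the partition column among the sampled rows."""
    return len({row[key] for row in rows})

rows = [{"dt": "2013-06-01"}, {"dt": "2013-06-02"}, {"dt": "2013-06-01"}]
# Two distinct dates -> at most two dynamic partitions created per node.
```

  If the count is far above the limit, the partitioning scheme itself (e.g. partitioning on a high-cardinality column) is usually the real problem.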
  

  Problem 23: mapreduce process exceeds the memory limit -- hadoop Java heap space
  Add to mapred-site.xml:
  // mapred-site.xml
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
  </property>
  Also raise the daemon heap size:
  # $HADOOP_HOME/conf/hadoop-env.sh
  export HADOOP_HEAPSIZE=5000
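  A tiny helper to sanity-check JVM heap flags like the -Xmx2048m above by converting them to bytes, e.g. when auditing several nodes' configs (a sketch; only the k/m/g suffixes are handled):

```python
def xmx_bytes(opt):
    """Convert a -Xmx JVM option (e.g. '-Xmx2048m') to bytes."""
    value = opt.lower()[len("-xmx"):]
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    if value[-1] in units:
        return int(value[:-1]) * units[value[-1]]
    return int(value)  # no suffix: plain bytes
```

  For example, xmx_bytes("-Xmx2048m") equals 2 GiB, which is what each map/reduce child JVM may claim under the setting above.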
  Problem 24: hive created-files limit
  [Fatal Error] total number of created files now is 100086, which exceeds 100000
  Fix:
  hive> set hive.exec.max.created.files=655350;
  Problem 25: hive metastore connection timeout
  FAILED: SemanticException org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
  Fix:
  hive> set hive.metastore.client.socket.timeout=500;
  Problem 26: hive java.io.IOException: error=7, Argument list too long
  Task with the most failures(5):
  Task ID: task_201306241630_0189_r_000009
  URL:
  http://namenode.godlovesdog.com:50030/taskdetails.jsp?jobid=job_201306241630_0189&tipid=task_201306241630_0189_r_000009
  Diagnostic Messages for this Task:
  java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"164058872","reducesinkkey1":"djh,S1","reducesinkkey2":"20130117170703","reducesinkkey3":"xxx"},"value":{"_col0":"1","_col1":"xxx","_col2":"20130117170703","_col3":"164058872","_col4":"xxx,S1"},"alias":0}
  at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:270)
  at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:520)
  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
  at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
  at org.apache.hadoop.mapred.Child.main(Child.java:249)
  Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"164058872","reducesinkkey1":"xxx,S1","reducesinkkey2":"20130117170703","reducesinkkey3":"xxx"},"value":{"_col0":"1","_col1":"xxx","_col2":"20130117170703","_col3":"164058872","_col4":"djh,S1"},"alias":0}
  at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:258)
  ... 7 more
  Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20000]: Unable to initialize custom script.
  at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:354)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
  at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249)
  ... 7 more
  Caused by: java.io.IOException: Cannot run program "/usr/bin/python2.7": error=7, Argument list too long
  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
  at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:313)
  ... 15 more
  Caused by: java.io.IOException: error=7, Argument list too long
  at java.lang.UNIXProcess.forkAndExec(Native Method)
  at java.lang.UNIXProcess.(UNIXProcess.java:135)
  at java.lang.ProcessImpl.start(ProcessImpl.java:130)
  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
  ... 16 more
  FAILED: Execution Error, return code 20000 from org.apache.hadoop.hive.ql.exec.MapRedTask. Unable to initialize custom script.
  Solution:
  Upgrade the kernel or reduce the number of partitions: https://issues.apache.org/jira/browse/HIVE-2372
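  error=7 is E2BIG: the combined size of the argument list and environment passed to the transform script exceeded the kernel's ARG_MAX, which is why the linked JIRA suggests fewer partitions (Hive exports job and partition settings into the child environment). A rough way to gauge the headroom on a host (a sketch; os.sysconf is POSIX-only):

```python
import os

def env_bytes(environ=None):
    """Approximate bytes the environment contributes toward ARG_MAX."""
    environ = os.environ if environ is None else environ
    # Each entry costs len("KEY=VALUE") plus a trailing NUL byte.
    return sum(len(k) + len(v) + 2 for k, v in environ.items())

def arg_max():
    """Kernel limit on argv+envp size, or None if unavailable."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (ValueError, OSError):
        return None
```

  When env_bytes() approaches arg_max(), forking any child process from that JVM will fail with E2BIG regardless of the script being launched.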
  Problem 27: hive runtime error
  hive> show tables;
  FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
  FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
  Diagnosis:
  hive -hiveconf hive.root.logger=DEBUG,console
  13/07/15 16:29:24 INFO hive.metastore: Trying to connect to metastore with URI thrift://xxx.xxx.xxx.xxx:9083
  13/07/15 16:29:24 WARN hive.metastore: Failed to connect to the MetaStore Server...
  org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
  ...
  MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
  The client tries port 9083, but netstat shows nothing listening there. The first suspicion was that hiveserver had not started; the hiveserver process did exist, however, listening on port 10000.
  hive-site.xml configures the client to connect on 9083, while hiveserver listens on 10000 by default -- that is the root cause.
  Fix:
  hive --service hiveserver -p 9083
  // or edit the hive.metastore.uris entry in $HIVE_HOME/conf/hive-site.xml
  // and change the port to 10000
  using /usr/lib/hive as HIVE_HOME
  using /var/run/cloudera-scm-agent/process/193-hive-HIVEMETASTORE as HIVE_CONF_DIR
  using /usr/lib/hadoop as HADOOP_HOME
  using /var/run/cloudera-scm-agent/process/193-hive-HIVEMETASTORE/yarn-conf as HADOOP_CONF_DIR
  ERROR: Failed to find hive-hbase storage handler jars to add in hive-site.xml. Hive queries that use Hbase storage handler may not work until this is fixed.
  Wed Oct 22 18:48:53 CST 2014
  JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
  using /usr/java/jdk1.7.0_45-cloudera as JAVA_HOME
  using 5 as CDH_VERSION
  (the same configuration lines and the same hive-hbase storage handler error repeat several more times for process 193-hive-HIVEMETASTORE, and again for process 212-hive-metastore-create-tables)
  Checked whether /usr/lib/hive was intact; it was.
  3:21:09.801 PM  FATAL  org.apache.hadoop.hbase.master.HMaster
  Unhandled exception. Starting shutdown.
  java.io.IOException: error or interrupted while splitting logs in [hdfs://master:8020/hbase/WALs/slave2,60020,1414202360923-splitting] Task = installed = 2 done = 1 error = 1
  at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:362)
  at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:409)
  at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:301)
  at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:292)
  at org.apache.hadoop.hbase.master.HMaster.splitMetaLogBeforeAssignment(HMaster.java:1070)
  at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:854)
  at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:606)
  at java.lang.Thread.run(Thread.java:744)
  3:46:12.903 PM  FATAL  org.apache.hadoop.hbase.master.HMaster
  Unhandled exception. Starting shutdown.
  java.io.IOException: error or interrupted while splitting logs in [hdfs://master:8020/hbase/WALs/slave2,60020,1414202360923-splitting] Task = installed = 1 done = 0 error = 1
  (same stack trace as above)
  Solution:
  Add the following to hbase-site.xml so the HBase cluster skips hlog splitting at startup:
  <property>
    <name>hbase.master.distributed.log.splitting</name>
    <value>false</value>
  </property>
  Then move the stale splitting directory out of the way:
  [root@master ~]# hadoop fs -mv /hbase/WALs/slave2,60020,1414202360923-splitting/ /test
  [root@master ~]# hadoop fs -ls /test
  2014-10-28 14:31:32,879 INFO [hconnection-0xd18e8a7-shared--pool2-t224] (AsyncProcess.java:673) - #3, table=session_service_201410210000_201410312359, attempt=14/35 failed 1383 ops, last exception: org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, regionName=session_service_201410210000_201410312359,7499999991,1414203068872.08ee7bb71161cb24e18ddba4c14da0f2., server=slave1,60020,1414380404290, memstoreSize=271430320, blockingMemStoreSize=268435456
  at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2561)
  at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:1963)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4050)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3361)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3265)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26935)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
  at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)
  HBase exception reference (Exception: Description)
  ClockOutOfSyncException: thrown by the master when a RegionServer's clock offset is too large.
  DoNotRetryIOException: subclass marking exceptions that should not be retried, e.g. UnknownScannerException.
  DroppedSnapshotException: thrown when the snapshot taken during a flush is not correctly persisted to a file.
  HBaseIOException: all HBase-specific IOExceptions are subclasses of HBaseIOException.
  InvalidFamilyOperationException: a schema-modification request named a column family that does not exist.
  MasterNotRunningException: the master node is not running.
  NamespaceExistException: the namespace already exists.
  NamespaceNotFoundException: the namespace cannot be found.
  NotAllMetaRegionsOnlineException: an operation requires all root and meta regions to be online, and they are not.
  NotServingRegionException: a request was sent to a RegionServer that is not responding or does not serve that region.
  PleaseHoldException: thrown when a RegionServer dies and restarts so quickly that the master has not finished handling the dead instance, when an admin operation arrives while the master is still initializing, or when operating on a RegionServer that is still starting up.
  RegionException: an error occurred while accessing a region.
  RegionTooBusyException: the RegionServer is busy and the request is blocked waiting for service.
  TableExistsException: the table already exists.
  TableInfoMissingException: no .tableinfo file could be found under the table directory.
  TableNotDisabledException: the table is not in the disabled state.
  TableNotEnabledException: the table is not in the enabled state.
  TableNotFoundException: the table cannot be found.
  UnknownRegionException: a request referenced a region the server does not recognize.
  UnknownScannerException: an unrecognized scanner ID was passed to the RegionServer.
  YouAreDeadException: thrown by the master when a RegionServer reports in after having been marked dead.
  ZooKeeperConnectionException: the client could not connect to ZooKeeper.
  INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher
  Waited 90779ms on a compaction to clean up 'too many store files'; waited long enough... proceeding with flush of session_service_201410210000_201410312359,7656249951,1414481868315.bbf0a49fb8a9b650a584769ddd1fdd89.
  When a MemStoreFlusher instance is created it starts MemStoreFlusher.FlushHandler threads; the thread count is set by hbase.hstore.flusher.count and defaults to 1.
  With one machine's disk full and another's not:
  There are 26,632 under-replicated blocks in the cluster, out of 84,822 blocks in total. Percentage of under-replicated blocks: 31.40%. Warning threshold: 10.00%.
  There are 27,278 under-replicated blocks in the cluster, out of 85,476 blocks in total. Percentage of under-replicated blocks: 31.91%. Warning threshold: 10.00%.
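  The warning above is a simple ratio check against a threshold. A sketch reproducing the arithmetic for the first alert:

```python
def under_replicated_pct(under, total):
    """Percentage of under-replicated blocks in the cluster."""
    return 100.0 * under / total

def breaches_threshold(under, total, threshold_pct=10.0):
    """True when the under-replication percentage exceeds the warning threshold."""
    return under_replicated_pct(under, total) > threshold_pct

# First alert from the log: 26,632 of 84,822 blocks -> about 31.40%,
# well above the 10.00% warning threshold.
```

  The ratio only drops once the full disk is freed (or the node decommissioned) and the namenode re-replicates the affected blocks.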
  4:08:53.847 PM  INFO  org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher
  Flushed, sequenceid=45525, memsize=124.2 M, hasBloomFilter=true, into tmp file hdfs://master:8020/hbase/data/default/session_service_201410260000_201410312359/a3b64675b0069b8323665274e2f95cdc/.tmp/b7fa4f5f85354ecc96aa48a09081f786
  4:08:53.862 PM  INFO  org.apache.hadoop.hbase.regionserver.HStore
  Added hdfs://master:8020/hbase/data/default/session_service_201410260000_201410312359/a3b64675b0069b8323665274e2f95cdc/f/b7fa4f5f85354ecc96aa48a09081f786, entries=194552, sequenceid=45525, filesize=47.4 M
  4:09:00.378 PM  WARN  org.apache.hadoop.ipc.RpcServer
  (responseTooSlow): {"processingtimems":39279,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"192.168.5.9:41284","starttimems":1414656501099,"queuetimems":0,"class":"HRegionServer","responsesize":16,"method":"Scan"}
  4:09:00.379 PM  WARN  org.apache.hadoop.ipc.RpcServer
  RpcServer.responder callId: 33398 service: ClientService methodName: Scan ...
  4:09:00.380 PM  WARN  org.apache.hadoop.ipc.RpcServer
  RpcServer.handler=79,port=60020: caught a ClosedChannelException, this means that the server was processing a request but the client went away. The error message was: null
  4:09:00.381 PM  INFO  org.apache.hadoop.hbase.regionserver.HRegion
  Finished memstore flush of ~128.1 M/134326016, currentsize=2.4 M/2559256 for region session_service_201410260000_201410312359,6406249959,1414571385831.a3b64675b0069b8323665274e2f95cdc. in 8133ms, sequenceid=45525, compaction requested=false


