
[Experience Sharing] Hadoop Practice Diary from Scratch (Adjusting Memory Settings)


Posted on 2015-7-13 09:19:38
  Today I tried running a Hive SQL query like the one below, which computes each user's average steps and calories over the past 30 days.



#!/bin/bash
# Build a comma-separated list of the past 30 days (YYYYMMDD), then query Hive.
cur_date=$(date +%Y%m%d)
pasts=""
for i in $(seq 30)
do
    iday=$(date -d "$i days ago" +%Y%m%d)
    if [ "$i" -eq 1 ]
    then
        pasts=$iday
    else
        pasts="$pasts,$iday"
    fi
done
# echo "$pasts"
# Run the query as the hdfs user and save the result under today's date.
sudo -su hdfs hive -e "select uid,avg(steps),avg(calories) from dailystats where day in ($pasts) group by uid" > "/ad/tongji/output/getAvgStats/$cur_date"
  
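  Watching the job in the web tracker (the service on default port 8088), I saw that Hive launched 2 map tasks. The map tasks failed, were retried, and in the end all of them failed.
  The result returned by the web tracker was: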



AttemptID:attempt_1395208369821_0011_m_000004_0 Timed out after 600 secs
cleanup failed for container container_1395208369821_0011_01_000006 : java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.stopContainer(ContainerManagerPBClientImpl.java:122)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.kill(ContainerLauncherImpl.java:208)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:400)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From AY130105124528d0c2393/10.200.134.127 to AY130105124528d0c2393:59937 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
        at com.sun.proxy.$Proxy29.stopContainer(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.stopContainer(ContainerManagerPBClientImpl.java:119)
        ... 5 more
Caused by: java.net.ConnectException: Call From AY130105124528d0c2393/10.200.134.127 to AY130105124528d0c2393:59937 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:782)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:729)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        ... 7 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:528)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:492)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:510)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:604)
        at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:252)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1291)
        at org.apache.hadoop.ipc.Client.call(Client.java:1209)
        ... 8 more
  
  The Hive shell returned:


  Error during job, obtaining debugging information...
Job Tracking URL: http://AY130105124528d0c2393:8088/proxy/application_1395208369821_0011/
Examining task ID: task_1395208369821_0011_m_000003 (and more) from job job_1395208369821_0011
  Task with the most failures(1):
-----
Task ID:
  task_1395208369821_0011_m_000004
  URL:
  http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1395208369821_0011&tipid=task_1395208369821_0011_m_000004
-----
Diagnostic Messages for this Task:
AttemptID:attempt_1395208369821_0011_m_000004_0 Timed out after 600 secs
cleanup failed for container container_1395208369821_0011_01_000006 : java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.stopContainer(ContainerManagerPBClientImpl.java:122)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.kill(ContainerLauncherImpl.java:208)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:400)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From AY130105124528d0c2393/10.200.134.127 to AY130105124528d0c2393:59937 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
        at com.sun.proxy.$Proxy29.stopContainer(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.stopContainer(ContainerManagerPBClientImpl.java:119)
        ... 5 more
Caused by: java.net.ConnectException: Call From AY130105124528d0c2393/10.200.134.127 to AY130105124528d0c2393:59937 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:782)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:729)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        ... 7 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:528)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:492)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:510)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:604)
        at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:252)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1291)
        at org.apache.hadoop.ipc.Client.call(Client.java:1209)
        ... 8 more
  
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
  
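  None of this gave me any clue. In the end I went back to the Hive shell output and tracked down where the logs of the corresponding application are stored: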



/var/log/hadoop-yarn/containers/application_1395208369821_0011/container_1395208369821_0011_01_000006 # ll
total 16
drwx--x--- 2 yarn yarn 4096 Mar 19 14:44 ./
drwx--x--- 8 yarn yarn 4096 Mar 19 14:44 ../
-rw-rw-r-- 1 yarn yarn    0 Mar 19 14:44 stderr
-rw-rw-r-- 1 yarn yarn  544 Mar 19 14:44 stdout
-rw-rw-r-- 1 yarn yarn 3852 Mar 19 14:44 syslog
  
  Looking at stdout:



/var/log/hadoop-yarn/containers/application_1395208369821_0011/container_1395208369821_0011_01_000006 # more stdout
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000f5e80000, 99090432, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 99090432 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /ad/hadoop-yarn/cache/yarn/nm-local-dir/usercache/hdfs/appcache/application_1395208369821_0011/container_1395208369821_0011_01_000006/hs_err_pid2286.log
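  When an application has several containers, checking each stdout by hand gets tedious. The following is a minimal sketch of how the search could be scripted. The function name find_oom_logs is made up for illustration, and it assumes the container logs sit under a per-application directory like the one shown above.

```shell
#!/bin/bash
# Hypothetical helper (not part of Hadoop): list the container log files of
# one YARN application that mention a JVM native-memory allocation failure.
# $1: application log directory, e.g.
#     /var/log/hadoop-yarn/containers/application_1395208369821_0011
find_oom_logs() {
    local app_log_dir=$1
    # -r: recurse into the container_* subdirectories
    # -l: print only the names of files that contain a match
    # -F: treat the pattern as a fixed string, not a regular expression
    grep -rlF 'Cannot allocate memory' "$app_log_dir" | sort
}
```

  Example call: find_oom_logs /var/log/hadoop-yarn/containers/application_1395208369821_0011. The log directories above are owned by the yarn user, so the call may need to run as root or yarn.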
  
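  So the reason the job got stuck in the map phase was: Cannot allocate memory.
  The memory used by map tasks can be configured. In practice I only modified the mapred-site file (mapred-site.xml), adding this property: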
  



<property>
  <name>mapreduce.map.memory.mb</name>
  <value>800</value>
</property>
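  As an alternative that was not tried in this post, the same property can usually be supplied for a single run through the Hive CLI's --hiveconf option instead of editing the cluster-wide file. The sketch below is only an illustration: the wrapper name run_hive_with_map_mem and the HIVE_BIN variable are hypothetical, and it leaves out the sudo -su hdfs prefix used in the script above.

```shell
#!/bin/bash
# Sketch: run one Hive query with a per-invocation map memory setting.
# HIVE_BIN is a hypothetical override so the wrapper can be dry-run
# (for example HIVE_BIN=echo) on a machine without Hive installed.
# $1: value for mapreduce.map.memory.mb, in MB
# $2: the query text passed to hive -e
run_hive_with_map_mem() {
    local map_mem_mb=$1 query=$2
    "${HIVE_BIN:-hive}" --hiveconf "mapreduce.map.memory.mb=$map_mem_mb" -e "$query"
}
```

  Example call: run_hive_with_map_mem 800 "select count(*) from dailystats". Whether a per-job value takes effect can depend on how the cluster is configured, so treat this as something to verify rather than a guaranteed fix.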
  
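  The machine has 4 GB of memory, so I set the value to 800 MB. I also tried 900 and other values, but 800 is the one that worked. The 30 days of data come to about 4.5 million records, and the query finished in 5 minutes. Oh yeah~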
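  Instead of pure trial and error, a rough pre-check of how much memory the node has available can narrow down the candidate values. This is a sketch under assumptions: the function name has_room_for_container is made up, it only reads the Linux meminfo file, and it ignores YARN's own accounting and JVM overhead, so it is a sanity check rather than a sizing rule.

```shell
#!/bin/bash
# Hypothetical pre-check: does the node currently report enough available
# memory for one map container of the requested size?
# $1: requested container size in MB
# $2: meminfo file (defaults to /proc/meminfo; a parameter so it can be tested)
has_room_for_container() {
    local want_mb=$1 meminfo=${2:-/proc/meminfo}
    local avail_kb
    # Prefer MemAvailable; fall back to MemFree on kernels that lack it.
    avail_kb=$(awk '/^MemAvailable:/ {a=$2} /^MemFree:/ {f=$2}
                    END {print (a != "" ? a : f)}' "$meminfo")
    avail_kb=${avail_kb:-0}
    # Succeed (exit status 0) only if the available memory covers the request.
    [ $(( avail_kb / 1024 )) -ge "$want_mb" ]
}
```

  Example call: has_room_for_container 800 && echo "800 MB looks feasible right now".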
  
  References:
  http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
  http://woopisy.hatenablog.com/entry/2013/11/19/131033
