[Experience Sharing] Hadoop 2.2.0 pseudo-distributed setup (Mac OS)

[复制链接]

尚未签到

发表于 2015-12-30 09:31:56 | 显示全部楼层 |阅读模式
  
  While searching online for setup tutorials, I found the following blog post; the author wrote it very well and it works as written, so I copied the content here with some annotations of my own for future reference: http://www.micmiu.com/bigdata/hadoop/hadoop2x-single-node-setup/
  This article records in detail the steps to install, configure, and start a single-node Hadoop 2.2.0 on Mac OSX, and demonstrates running a simple job. Contents:


  • Basic environment setup
  • Hadoop installation and configuration
  • Startup and demo
  [1] Basic environment setup
  1. OS: Mac OSX 10.9.1
  2. JDK 1.6.0_65
  Either an installer package or a build from source is fine; I won't go into JDK installation since plenty of articles cover it. Just make sure the environment variables are set correctly. My JAVA_HOME is configured as follows:
  



micmiu-mbp:~ micmiu$ echo $JAVA_HOME
/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
micmiu-mbp:~ micmiu$ java -version
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
micmiu-mbp:~ micmiu$
    (Note: for the JDK install, I simply installed Eclipse 4.3 from the official site; if no JDK is present, Eclipse prompts you to download and install one at first launch. Just click through the prompts and it installs 1.6.0_65 by default.
    On Mac 10.9, JAVA_HOME can be configured in the .bash_profile in your home directory. For reference, here is my .bash_profile:




export CLICOLOR=1
export LSCOLORS=GxFxDxBxegedabagaced
export PS1="\[\e[0;31m\]\u@\h\[\e[0;33m\]:\[\e[1;34m\]\w \[\e[1;37m\]$ \[\e[m\]"
export HADOOP_HOME=~/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME=`/usr/libexec/java_home`
    The first three lines are my terminal color scheme and can be ignored. The last line is required. Setting HADOOP_HOME is for command-line convenience and is recommended. The fuller set of variables configured below can also be used as a reference.)
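
    After editing, reload the profile and confirm the settings took effect (a quick check, assuming the paths above match your machine):

source ~/.bash_profile
echo $JAVA_HOME    # should print the path reported by /usr/libexec/java_home
java -version      # expect 1.6.0_65 here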
  3. Passwordless SSH login
  Since this is a single-node setup, we only need passwordless ssh login to localhost, which is straightforward:



micmiu-mbp:~ micmiu$ cd ~
micmiu-mbp:~ micmiu$ ssh-keygen -t rsa -P ''
micmiu-mbp:~ micmiu$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  
  Verify that it works:



micmiu-mbp:~ micmiu$ ssh localhost
Last login: Sat Jan 18 10:17:19 2014
micmiu-mbp:~ micmiu$
  If you are logged in without a password prompt, passwordless SSH is working.
  For a detailed introduction to passwordless SSH login, see: Linux (CentOS) configuring OpenSSH passwordless login
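
  If ssh localhost still asks for a password, the usual culprit is permissions: sshd ignores an authorized_keys file that is group- or world-writable. A quick fix (my addition, not in the original post):

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys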
  [2] Hadoop installation and configuration
  1. Download the release
  Open the official download page http://hadoop.apache.org/releases.html#Download , download the 2.2.0 release, and extract it to your chosen path: micmiu$ tar -zxf hadoop-2.2.0.tar.gz -C /usr/local/share. In this article, then, HADOOP_HOME = /usr/local/share/hadoop-2.2.0/.
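
  If you prefer to stay in the terminal, the download and extraction can be scripted as well (a sketch: the archive.apache.org mirror path is my assumption; verify the URL before use):

curl -O https://archive.apache.org/dist/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
tar -zxf hadoop-2.2.0.tar.gz -C /usr/local/share
ls /usr/local/share/hadoop-2.2.0    # expect bin/ etc/ sbin/ share/ ...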
  2. Configure system environment variables: vi ~/.profile and add the following:



# Hadoop settings by Michael@micmiu.com
export HADOOP_HOME="/usr/local/share/hadoop-2.2.0"
export HADOOP_PREFIX=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop/"
export YARN_CONF_DIR=${HADOOP_CONF_DIR}
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
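
  Reload the profile and check that the hadoop command resolves (a quick sanity check, assuming the paths above):

source ~/.profile
hadoop version   # should report Hadoop 2.2.0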
  
  3. Edit <HADOOP_HOME>/etc/hadoop/hadoop-env.sh
  On Mac OSX, configure it as follows:



# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=$(/usr/libexec/java_home -d 64 -v 1.6)
# find the HADOOP_OPTS setting and add the parameters below
export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
  
  For more details, see: issues with setting the $JAVA_HOME environment variable on Mac OS X
  On Linux|Unix, configure it as follows:



# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=<actual JDK path on your system>
  
  4. Edit <HADOOP_HOME>/etc/hadoop/core-site.xml
  Add or update the following configuration inside the <configuration> element:



<!-- the new property fs.defaultFS replaces the deprecated fs.default.name | micmiu.com -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
  <description>The name of the default file system.</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/Users/micmiu/tmp/hadoop</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>io.native.lib.available</name>
  <value>false</value>
  <description>default value is true: should native hadoop libraries, if present, be used.</description>
</property>
  
  5. Edit <HADOOP_HOME>/etc/hadoop/hdfs-site.xml
  Add or update the following configuration inside the <configuration> element:



<property>
  <name>dfs.replication</name>
  <value>1</value>
  <!-- 1 for a single node; on a cluster, set this according to the actual number of nodes | micmiu.com -->
</property>
  
  6. Edit <HADOOP_HOME>/etc/hadoop/yarn-site.xml
  Add or update the following configuration inside the <configuration> element:



<!-- micmiu.com -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
  
  7. Edit <HADOOP_HOME>/etc/hadoop/mapred-site.xml
  There is no mapred-site.xml by default; simply copy mapred-site.xml.template to mapred-site.xml, as shown below.
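
  For example (a sketch, assuming the HADOOP_HOME path used throughout this article):

cd /usr/local/share/hadoop-2.2.0/etc/hadoop
cp mapred-site.xml.template mapred-site.xml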
  Then add or update the following configuration inside the <configuration> element:



<!-- micmiu.com -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <final>true</final>
</property>
  
  [3] Startup and demo
  1. Start Hadoop
  First run hdfs namenode -format:

micmiu-mbp:~ micmiu$ hdfs namenode -format
14/01/18 23:07:07 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = micmiu-mbp.local/192.168.1.103
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.2.0
.................
.................
.................
14/01/18 23:07:08 INFO util.GSet: VM type       = 64-bit
14/01/18 23:07:08 INFO util.GSet: 0.029999999329447746% max memory = 991.7 MB
14/01/18 23:07:08 INFO util.GSet: capacity      = 2^15 = 32768 entries
Re-format filesystem in Storage Directory /Users/micmiu/tmp/hadoop/dfs/name ? (Y or N) Y
14/01/18 23:07:26 INFO common.Storage: Storage directory /Users/micmiu/tmp/hadoop/dfs/name has been successfully formatted.
14/01/18 23:07:26 INFO namenode.FSImage: Saving image file /Users/micmiu/tmp/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
14/01/18 23:07:26 INFO namenode.FSImage: Image file /Users/micmiu/tmp/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 198 bytes saved in 0 seconds.
14/01/18 23:07:27 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
14/01/18 23:07:27 INFO util.ExitUtil: Exiting with status 0
14/01/18 23:07:27 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at micmiu-mbp.local/192.168.1.103
************************************************************/
  Then run start-dfs.sh:



micmiu-mbp:~ micmiu$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/share/hadoop-2.2.0/logs/hadoop-micmiu-namenode-micmiu-mbp.local.out
localhost: starting datanode, logging to /usr/local/share/hadoop-2.2.0/logs/hadoop-micmiu-datanode-micmiu-mbp.local.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/share/hadoop-2.2.0/logs/hadoop-micmiu-secondarynamenode-micmiu-mbp.local.out
micmiu-mbp:~ micmiu$ jps
1522 NameNode
1651 DataNode
1794 SecondaryNameNode
1863 Jps
micmiu-mbp:~ micmiu$
  
  Next run start-yarn.sh:



micmiu-mbp:~ micmiu$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/share/hadoop-2.2.0/logs/yarn-micmiu-resourcemanager-micmiu-mbp.local.out
localhost: starting nodemanager, logging to /usr/local/share/hadoop-2.2.0/logs/yarn-micmiu-nodemanager-micmiu-mbp.local.out
micmiu-mbp:~ micmiu$ jps
2033 NodeManager
1900 ResourceManager
1522 NameNode
1651 DataNode
2058 Jps
1794 SecondaryNameNode
micmiu-mbp:~ micmiu$
  
  If the startup logs show no errors and all the processes above are present, Hadoop has started successfully.
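
  As an extra check (my addition, not in the original post), the web UIs should now be reachable; by default the Hadoop 2.2.0 NameNode listens on port 50070 and the YARN ResourceManager on port 8088. On a Mac they can be opened straight from the terminal:

open http://localhost:50070   # HDFS NameNode status page
open http://localhost:8088    # YARN ResourceManager / application tracking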
  2. Demo
  First, a demonstration of some common hdfs commands, in preparation for the wordcount demo:



micmiu-mbp:~ micmiu$ hdfs dfs -ls /
micmiu-mbp:~ micmiu$ hdfs dfs -mkdir /user
micmiu-mbp:~ micmiu$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x   - micmiu supergroup          0 2014-01-18 23:20 /user
micmiu-mbp:~ micmiu$ hdfs dfs -mkdir -p /user/micmiu/wordcount/in
micmiu-mbp:~ micmiu$ hdfs dfs -ls /user/micmiu/wordcount
Found 1 items
drwxr-xr-x   - micmiu supergroup          0 2014-01-18 23:21 /user/micmiu/wordcount/in
  
  Create a local file micmiu-word.txt with the following content:

Hi Michael welcome to Hadoop
Hi Michael welcome to BigData
Hi Michael welcome to Spark
more see micmiu.com
  Upload micmiu-word.txt to HDFS:
hdfs dfs -put micmiu-word.txt /user/micmiu/wordcount/in
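
  To confirm the upload landed where expected (optional check, my addition):

hdfs dfs -ls /user/micmiu/wordcount/in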
  Then cd into the Hadoop root directory and run:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /user/micmiu/wordcount/in /user/micmiu/wordcount/out
  PS: the /user/micmiu/wordcount/out directory must not already exist, or the job will fail with an error.
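
  So when re-running the job, delete the previous output directory first; in Hadoop 2.x the recursive delete looks like this:

hdfs dfs -rm -r /user/micmiu/wordcount/out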
  You should see log output similar to the following:



micmiu-mbp:hadoop-2.2.0 micmiu$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount  /user/micmiu/wordcount/in /user/micmiu/wordcount/out
14/01/19 20:02:29 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/01/19 20:02:29 INFO input.FileInputFormat: Total input paths to process : 1
14/01/19 20:02:29 INFO mapreduce.JobSubmitter: number of splits:1
............
............
............
14/01/19 20:02:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1390131922557_0001
14/01/19 20:02:30 INFO impl.YarnClientImpl: Submitted application application_1390131922557_0001 to ResourceManager at /0.0.0.0:8032
14/01/19 20:02:30 INFO mapreduce.Job: The url to track the job: http://micmiu-mbp.local:8088/proxy/application_1390131922557_0001/
14/01/19 20:02:30 INFO mapreduce.Job: Running job: job_1390131922557_0001
14/01/19 20:02:38 INFO mapreduce.Job: Job job_1390131922557_0001 running in uber mode : false
14/01/19 20:02:38 INFO mapreduce.Job:  map 0% reduce 0%
14/01/19 20:02:43 INFO mapreduce.Job:  map 100% reduce 0%
14/01/19 20:02:50 INFO mapreduce.Job:  map 100% reduce 100%
14/01/19 20:02:50 INFO mapreduce.Job: Job job_1390131922557_0001 completed successfully
14/01/19 20:02:51 INFO mapreduce.Job: Counters: 43
	File System Counters
		FILE: Number of bytes read=129
		FILE: Number of bytes written=158647
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=228
		HDFS: Number of bytes written=83
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=3346
		Total time spent by all reduces in occupied slots (ms)=3799
	Map-Reduce Framework
		Map input records=4
		Map output records=18
		Map output bytes=179
		Map output materialized bytes=129
		Input split bytes=120
		Combine input records=18
		Combine output records=10
		Reduce input groups=10
		Reduce shuffle bytes=129
		Reduce input records=10
		Reduce output records=10
		Spilled Records=20
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=30
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=283127808
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=108
	File Output Format Counters
		Bytes Written=83
micmiu-mbp:hadoop-2.2.0 micmiu$
  
  The wordcount job has now finished; run the following commands to view its results:



micmiu-mbp:hadoop-2.2.0 micmiu$ hdfs dfs -ls /user/micmiu/wordcount/out
Found 2 items
-rw-r--r--   1 micmiu supergroup          0 2014-01-19 20:02 /user/micmiu/wordcount/out/_SUCCESS
-rw-r--r--   1 micmiu supergroup         83 2014-01-19 20:02 /user/micmiu/wordcount/out/part-r-00000
micmiu-mbp:hadoop-2.2.0 micmiu$ hdfs dfs -cat /user/micmiu/wordcount/out/part-r-00000
BigData    1
Hadoop    1
Hi    3
Michael    3
Spark    1
micmiu.com    1
more    1
see    1
to    3
welcome    3
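
  When you are done experimenting, the daemons can be stopped with the matching scripts in <HADOOP_HOME>/sbin (not covered in the original post):

stop-yarn.sh
stop-dfs.sh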
  
  
