[Experience Sharing] Installing and Running Hadoop (Pseudo-Distributed) on a CentOS Virtual Machine

  1. First, confirm whether you can ssh to localhost without entering a password:

  $ ssh localhost

  If ssh to localhost still asks for a password, run the following command:

  [iyunv@localhost ~]# ssh-keygen -t rsa    (note: ssh-keygen is one word, with no space before -keygen)

Then just press Enter at every prompt. The log looks like this:
  [iyunv@localhost ~]# ssh-keygen -t rsa
  Generating public/private rsa key pair.
  Enter file in which to save the key (/root/.ssh/id_rsa):
  Created directory '/root/.ssh'.
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /root/.ssh/id_rsa.
  Your public key has been saved in /root/.ssh/id_rsa.pub.
  The key fingerprint is:
  a8:7a:3e:f6:92:85:b8:c7:be:d9:0e:45:9c:d1:36:3b root@localhost.localdomain
  [iyunv@localhost ~]# cd ..
  [iyunv@localhost /]# cd root
  [iyunv@localhost ~]# ls
  anaconda-ks.cfg  Desktop  install.log  install.log.syslog
  [iyunv@localhost ~]# cd .ssh
  [iyunv@localhost .ssh]# cat id_rsa.pub > authorized_keys
  [iyunv@localhost .ssh]# ssh localhost
  The authenticity of host 'localhost (127.0.0.1)' can't be established.
  RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
  Last login: Tue Jun 21 22:40:31 2011
  [iyunv@localhost ~]#
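If ssh localhost still asks for a password after this, the usual culprit on CentOS is file permissions: sshd ignores keys whose directory or authorized_keys file is group- or world-writable. A minimal fix, assuming the keys live under /root/.ssh as in the log above:

  chmod 700 ~/.ssh                    # .ssh must be private to its owner
  chmod 600 ~/.ssh/authorized_keys    # so must the key list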


  
  
2. Extract Hadoop
Create a dedicated hadoop user, then unpack the tarball:
  [iyunv@localhost hadoop]# tar zxvf hadoop-0.20.2.tar.gz
  ......
  ......
  ......
  hadoop-0.20.203.0/src/contrib/ec2/bin/image/create-hadoop-image-remote
  hadoop-0.20.203.0/src/contrib/ec2/bin/image/ec2-run-user-data
  hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-cluster
  hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-master
  hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-slaves
  hadoop-0.20.203.0/src/contrib/ec2/bin/list-hadoop-clusters
  hadoop-0.20.203.0/src/contrib/ec2/bin/terminate-hadoop-cluster
  [iyunv@localhost hadoop]#
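Note that the logs later in this post show Hadoop living under /usr/local/hadoop, so you may prefer to unpack it there; a sketch under that assumption (the target path is taken from the later logs, adjust to your layout):

  mkdir -p /usr/local/hadoop                        # create the target directory
  tar zxvf hadoop-0.20.2.tar.gz -C /usr/local/hadoop
  chown -R hadoop:hadoop /usr/local/hadoop          # let the hadoop user own the tree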


  
3. Install JDK 1.6 and set JAVA_HOME/HADOOP_HOME
  # set java environment
  export JAVA_HOME=/home/yqf/jdk/jdk1.6.0_13
  export HADOOP_HOME=/home/hadoop/hadoop-0.20.2
  export JRE_HOME=$JAVA_HOME/jre
  export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib
  export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
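These exports need to live somewhere persistent to survive re-login; a common choice is the hadoop user's ~/.bashrc (or /etc/profile for all users). After appending the block above there, reload and verify:

  source ~/.bashrc
  java -version       # should report 1.6.0_13
  hadoop version      # should print the Hadoop release banner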

  


  
  4. Configure and start Hadoop
Edit the Hadoop configuration: go into the Hadoop directory, then into conf/:
  ####################################
  [iyunv@localhost conf]# vi hadoop-env.sh
  # set java environment
  export JAVA_HOME=/home/yqf/jdk/jdk1.6.0_13    (use your own JAVA_HOME)
  #####################################
  [iyunv@localhost conf]# vi core-site.xml
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode:9000/</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/usr/local/hadoop/hadooptmp</value>
    </property>
  </configuration>
  #######################################
  [iyunv@localhost conf]# vi hdfs-site.xml
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>dfs.name.dir</name>
      <value>/usr/local/hadoop/hdfs/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/usr/local/hadoop/hdfs/data</value>
    </property>
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
  </configuration>
  #########################################
  [iyunv@localhost conf]# vi mapred-site.xml
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>mapred.job.tracker</name>
      <value>namenode:9001</value>
    </property>
    <property>
      <name>mapred.local.dir</name>
      <value>/usr/local/hadoop/mapred/local</value>
    </property>
    <property>
      <name>mapred.system.dir</name>
      <value>/tmp/hadoop/mapred/system</value>
    </property>
  </configuration>
  #########################################
  [iyunv@localhost conf]# vi masters
  #localhost
  namenode
  #########################################
  [iyunv@localhost conf]# vi slaves
  #localhost
  datanode01
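One thing the configs above rely on implicitly: the hostnames namenode and datanode01 must resolve on this machine. A minimal /etc/hosts sketch for a single-VM pseudo-distributed setup (192.168.1.100 is a placeholder; substitute the VM's real address):

  127.0.0.1       localhost localhost.localdomain
  192.168.1.100   namenode datanode01    # placeholder IP for this VM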


  
  
  
Format the NameNode, then start Hadoop:
  
  [iyunv@localhost bin]# hadoop namenode -format
  11/06/23 00:43:54 INFO namenode.NameNode: STARTUP_MSG:
  /************************************************************
  STARTUP_MSG: Starting NameNode
  STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
  STARTUP_MSG:   args = [-format]
  STARTUP_MSG:   version = 0.20.203.0
  STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
  ************************************************************/
  11/06/23 00:43:55 INFO util.GSet: VM type       = 32-bit
  11/06/23 00:43:55 INFO util.GSet: 2% max memory = 19.33375 MB
  11/06/23 00:43:55 INFO util.GSet: capacity      = 2^22 = 4194304 entries
  11/06/23 00:43:55 INFO util.GSet: recommended=4194304, actual=4194304
  11/06/23 00:43:56 INFO namenode.FSNamesystem: fsOwner=root
  11/06/23 00:43:56 INFO namenode.FSNamesystem: supergroup=supergroup
  11/06/23 00:43:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
  11/06/23 00:43:56 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
  11/06/23 00:43:56 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
  11/06/23 00:43:56 INFO namenode.NameNode: Caching file names occuring more than 10 times
  11/06/23 00:43:57 INFO common.Storage: Image file of size 110 saved in 0 seconds.
  11/06/23 00:43:57 INFO common.Storage: Storage directory /usr/local/hadoop/hdfs/name has been successfully formatted.
  11/06/23 00:43:57 INFO namenode.NameNode: SHUTDOWN_MSG:
  /************************************************************
  SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
  ************************************************************/
  [iyunv@localhost bin]#
  ###########################################
  [iyunv@localhost bin]# ./start-all.sh
  starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
  datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
  namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
  starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
  datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
  [iyunv@localhost bin]# jps
  11971 TaskTracker
  11807 SecondaryNameNode
  11599 NameNode
  12022 Jps
  11710 DataNode
  11877 JobTracker
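If jps lists all five daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) as above, the cluster is up. You can also sanity-check it in a browser; these are the stock web UI ports for Hadoop 0.20.x:

  http://namenode:50070/    # NameNode / HDFS status page
  http://namenode:50030/    # JobTracker / MapReduce status page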


  
  
  5. Test Hadoop with the bundled example
Step 1: prepare the input data
  
    


  
  In the current directory (for example the Hadoop install directory), create a folder named input, and in it create two files, file01 and file02, with the following contents.
file01 contains:
  
  Hello World Bye World


  
file02 contains:
  Hello Hadoop Goodbye Hadoop
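A quick way to create both files from the shell, run from the Hadoop install directory:

  mkdir input
  echo "Hello World Bye World" > input/file01
  echo "Hello Hadoop Goodbye Hadoop" > input/file02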


  
  
  Step 2: upload the input folder to the distributed file system

  cd into the Hadoop install directory and run the following command:
  
  bin/hadoop fs -put input input01


  
  
This command uploads the input folder into the Hadoop file system, where it shows up as a new folder named input01; you can check with the command below:
  bin/hadoop fs -ls
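You can also spot-check that the contents made it across, for example:

  bin/hadoop fs -cat input01/file01    # should print: Hello World Bye World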


  
  
Step 3: run the Hadoop MapReduce example
    


  
  Run the command:
  
  bin/hadoop jar hadoop-*-examples.jar wordcount input01 output2
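One caveat before running it: the output directory must not already exist, or the job fails immediately. If you re-run the example, remove the stale output first (fs -rmr is the recursive delete in this Hadoop generation):

  bin/hadoop fs -rmr output2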


  
  The run log looks like this:
  
  [iyunv@localhost hadoop-0.20.2]# bin/hadoop jar hadoop-*-examples.jar wordcount input01 output2
  12/11/14 22:51:51 INFO input.FileInputFormat: Total input paths to process : 4
  12/11/14 22:51:52 INFO mapred.JobClient: Running job: job_201211141815_0003
  12/11/14 22:51:53 INFO mapred.JobClient: map 0% reduce 0%
  12/11/14 22:53:03 INFO mapred.JobClient: map 50% reduce 0%
  12/11/14 22:53:07 INFO mapred.JobClient: map 75% reduce 0%
  12/11/14 22:53:12 INFO mapred.JobClient: map 100% reduce 0%
  12/11/14 22:53:17 INFO mapred.JobClient: map 100% reduce 25%
  12/11/14 22:53:31 INFO mapred.JobClient: map 100% reduce 100%
  12/11/14 22:53:34 INFO mapred.JobClient: Job complete: job_201211141815_0003
  12/11/14 22:53:34 INFO mapred.JobClient: Counters: 17
  12/11/14 22:53:34 INFO mapred.JobClient:   Job Counters
  12/11/14 22:53:34 INFO mapred.JobClient:     Launched reduce tasks=1
  12/11/14 22:53:34 INFO mapred.JobClient:     Launched map tasks=4
  12/11/14 22:53:34 INFO mapred.JobClient:     Data-local map tasks=2
  12/11/14 22:53:34 INFO mapred.JobClient:   FileSystemCounters
  12/11/14 22:53:34 INFO mapred.JobClient:     FILE_BYTES_READ=79
  12/11/14 22:53:34 INFO mapred.JobClient:     HDFS_BYTES_READ=55
  12/11/14 22:53:34 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=304
  12/11/14 22:53:34 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=41
  12/11/14 22:53:34 INFO mapred.JobClient:   Map-Reduce Framework
  12/11/14 22:53:34 INFO mapred.JobClient:     Reduce input groups=5
  12/11/14 22:53:34 INFO mapred.JobClient:     Combine output records=6
  12/11/14 22:53:34 INFO mapred.JobClient:     Map input records=2
  12/11/14 22:53:34 INFO mapred.JobClient:     Reduce shuffle bytes=97
  12/11/14 22:53:34 INFO mapred.JobClient:     Reduce output records=5
  12/11/14 22:53:34 INFO mapred.JobClient:     Spilled Records=12
  12/11/14 22:53:34 INFO mapred.JobClient:     Map output bytes=82
  12/11/14 22:53:34 INFO mapred.JobClient:     Combine input records=8
  12/11/14 22:53:34 INFO mapred.JobClient:     Map output records=8
  12/11/14 22:53:34 INFO mapred.JobClient:     Reduce input records=6


  
  Listing the file system again shows an additional output2 directory:
[iyunv@localhost hadoop-0.20.2]# bin/hadoop fs -ls
Found 2 items
drwxr-xr-x   - root supergroup          0 2012-11-14 22:41 /user/root/input01
drwxr-xr-x   - root supergroup          0 2012-11-14 22:53 /user/root/output2
  
  
  
View the contents of output2/:
[iyunv@localhost hadoop-0.20.2]# bin/hadoop fs -cat output2/*
Bye     1
Goodbye 1
Hadoop  2
Hello   2
World   2
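To use the result outside HDFS, copy it to the local disk; a quick sketch:

  bin/hadoop fs -get output2 ./output2    # pull the whole output directory out of HDFS
  cat ./output2/part-*                    # the counts live in the part files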
  
  
  
  
  As its name suggests, wordcount counts the number of occurrences of each word in the input, which matches the output above.
  
  
  
