[Experience Share] 1 - Hadoop 1.0.x Single-Node Installation

Posted on 2016-12-9 07:07:23
  
1) Confirm that the JDK is installed; if it is not, install it as follows:
 
[iyunv@primary ~]# cd /home
[iyunv@primary home]# cp jdk-6u31-linux-i586-rpm.bin /usr/local/
[iyunv@primary home]# cd /usr/local/
[iyunv@primary local]# chmod +x jdk-6u31-linux-i586-rpm.bin
[iyunv@primary local]# ./jdk-6u31-linux-i586-rpm.bin
 
Press Enter to continue.....
[iyunv@primary local]# 

 

 
[iyunv@primary local]# chmod +x jdk-6u31-linux-i586.rpm
[iyunv@primary local]# rpm -ivh jdk-6u31-linux-i586.rpm
 
 
[iyunv@primary ~]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.6.0_31
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

 

source /etc/profile
 
 
                  
[iyunv@primary ~]# java -version
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) Client VM (build 20.6-b01, mixed mode, sharing)
 
 
The user I use for the installation is hadoop.
 
2) Set up passwordless SSH login
First delete all existing authentication files on the machine:
[hadoop@primary ~]$ rm ~/.ssh/*  
[hadoop@primary ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
8b:db:12:21:57:1c:25:32:d9:4d:4d:16:98:6b:66:88 hadoop@primary
[hadoop@primary ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
SSH is picky about the permissions on these files:
[hadoop@primary ~]$ chmod 644 ~/.ssh/authorized_keys
 
[hadoop@primary ~]$
[hadoop@primary ~]$ ssh localhost
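The key-generation steps above can be rehearsed in a throwaway directory to see the files they produce. A minimal sketch, with two assumptions: RSA is used instead of the post's `-t dsa` (newer OpenSSH builds refuse DSA keys), and the paths are scratch paths rather than the real ~/.ssh:

```shell
# Rehearse the keypair + authorized_keys steps in a scratch directory.
tmp=$(mktemp -d)
ssh-keygen -t rsa -N '' -f "$tmp/id_rsa" -q    # empty passphrase, like -P ''
cat "$tmp/id_rsa.pub" >> "$tmp/authorized_keys"
chmod 644 "$tmp/authorized_keys"               # same permission fix as above
ls "$tmp"   # private key, public key, and authorized_keys
```
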
 
 

3) rsync is installed by default:
[hadoop@primary ~]$ rsync --version
rsync  version 2.6.3  protocol version 28
Copyright (C) 1996-2004 by Andrew Tridgell and others
<http://rsync.samba.org/>
Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles,
              inplace, IPv6, 64-bit system inums, 64-bit internal inums
 
rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details
 
 
 
 
4) Install Hadoop
[iyunv@primary home]# chmod 777 hadoop-1.0.4-bin.tar.gz
 
 
[hadoop@primary home]$ tar xzvf hadoop-1.0.4-bin.tar.gz
Enter the extracted Hadoop directory (the tarball unpacks to hadoop-1.0.4):
[hadoop@primary home]$ cd hadoop-1.0.4
Check JAVA_HOME:
[hadoop@primary ~]$ env |grep JAVA     
JAVA_HOME=/usr/java/jdk1.6.0_31
 
I created a /hadoop directory at the filesystem root and set its ownership as follows:
chown -R  hadoop:root /hadoop
-bash-3.00$ id
uid=503(hadoop) gid=505(hadoop) groups=0(root),505(hadoop)
-bash-3.00$  tar xzvf hadoop-1.0.4-bin.tar.gz
-bash-3.00$ pwd
/hadoop/hadoop-1.0.4
-bash-3.00$ vi conf/hadoop-env.sh
Configure the Java environment in conf/hadoop-env.sh:

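The screenshot is not reproduced here, but presumably the only change needed in conf/hadoop-env.sh is to uncomment JAVA_HOME and point it at the JDK installed earlier:

```shell
# In conf/hadoop-env.sh (path taken from the JDK install step above):
export JAVA_HOME=/usr/java/jdk1.6.0_31
```
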
 

Configure the HDFS address and port in conf/core-site.xml. (Note: the dfs.data.dir and dfs.name.dir properties below are HDFS settings that really belong in hdfs-site.xml, where they are repeated; "dfs.tem.dir" is not a standard property name, and the intended one was probably hadoop.tmp.dir.)
 
[hadoop@station1 hadoop-1.0.3]$ cat conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 
<!-- Put site-specific property overrides in this file. -->
 
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://station1:9000</value>
</property>
 
<property>
<name>dfs.data.dir</name>
<value>/hadoop/web/data/</value>
</property>
 
<property>
<name>dfs.tem.dir</name>
<value>/hadoop/web/tem/</value>
</property>
 
<property>
<name>dfs.name.dir</name>
<value>/hadoop/web/name/</value>
</property>
</configuration>
 
 
Configure the replication factor in conf/hdfs-site.xml:
 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 
<!-- Put site-specific property overrides in this file. -->
 
<configuration>
<property>
         <name>dfs.replication</name>
         <value>1</value>
</property>
 
<property>
<name>dfs.data.dir</name>
<value>/hadoop/web/data/</value>
</property>
 
<property>
<name>dfs.tem.dir</name>
<value>/hadoop/web/tem/</value>
</property>
 
<property>
<name>dfs.name.dir</name>
<value>/hadoop/web/name/</value>
</property>
 
 
</configuration>
 
 
 
Configure conf/mapred-site.xml:
 
[hadoop@station1 hadoop-1.0.3]$ cat conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 
<!-- Put site-specific property overrides in this file. -->
 
<configuration>
<property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
     </property>
 
 
</configuration>
 
 
 
A second pass at mapred-site.xml, pointing the JobTracker at station1 instead of localhost:
[cloud@station1 conf]$ vi mapred-site.xml
 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 
<!-- Put site-specific property overrides in this file. -->
 
<configuration>
<property>
         <name>mapred.job.tracker</name>
         <value>station1:9001</value>
</property>
</configuration>
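As a scratch check, the same mapred-site.xml can be written out with a heredoc and the JobTracker address grepped back. /tmp/mapred-site.xml is just a throwaway path here, not the real conf directory:

```shell
# Write the config to a scratch file and read the JobTracker address back.
cat > /tmp/mapred-site.xml <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>station1:9001</value>
  </property>
</configuration>
EOF
grep -o 'station1:9001' /tmp/mapred-site.xml   # prints station1:9001
```
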
 
 
 
 
 
 
5) Run Hadoop (from the bin directory)
  1) Format the filesystem:
       hadoop namenode -format
 

-bash-3.00$ ./bin/hadoop - format
Unrecognized option: -
Could not create the Java virtual machine.
It errored out: the command was wrong.
[hadoop@primary hadoop-1.0.4]$ ./bin/hadoop  namenode  -format
12/10/25 16:53:38 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = primary/192.168.26.83
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
12/10/25 16:53:39 INFO util.GSet: VM type       = 32-bit
12/10/25 16:53:39 INFO util.GSet: 2% max memory = 19.33375 MB
12/10/25 16:53:39 INFO util.GSet: capacity      = 2^22 = 4194304 entries
12/10/25 16:53:39 INFO util.GSet: recommended=4194304, actual=4194304
12/10/25 16:53:39 INFO namenode.FSNamesystem: fsOwner=hadoop
12/10/25 16:53:39 INFO namenode.FSNamesystem: supergroup=supergroup
12/10/25 16:53:39 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/10/25 16:53:39 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/10/25 16:53:39 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/10/25 16:53:39 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/10/25 16:53:39 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/10/25 16:53:39 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/10/25 16:53:39 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at primary/192.168.26.83
************************************************************/
 
 
 
 
 
    2) Start Hadoop:
              start-all.sh
 
 
[hadoop@primary hadoop-1.0.4]$ cd bin
[hadoop@primary bin]$ pwd
/hadoop/hadoop-1.0.4/bin
[hadoop@primary bin]$ ./start-all.sh
namenode running as process 7387. Stop it first.
localhost: starting datanode, logging to /hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-datanode-primary.out
localhost: starting secondarynamenode, logging to /hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-secondarynamenode-primary.out
starting jobtracker, logging to /hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-jobtracker-primary.out
localhost: starting tasktracker, logging to /hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-tasktracker-primary.out
 
 
    3) Check the processes:
               jps
    4) Check the cluster status:
               hadoop dfsadmin -report
              Or via the web UIs: http://localhost:50070 (NameNode) and http://localhost:50030 (JobTracker)


Unexpectedly, after following the steps found online, the datanode simply would not start. Going through the startup logs revealed that the directories listed in dfs.data.dir must have permission 755; otherwise the datanode's permission check fails and it shuts itself down. The log messages were:
 
 

  • WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /data/hadoop/disk, expected: rwxr-xr-x, while actual: rwxrwxrwx
  • ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
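The 755 requirement is easy to rehearse on a scratch directory; the path below is a stand-in for the real dfs.data.dir entries such as /hadoop/web/data:

```shell
# dfs.data.dir directories must be rwxr-xr-x (755), not 777,
# or the datanode aborts during its startup permission check.
mkdir -p /tmp/demo-dfs-data
chmod 755 /tmp/demo-dfs-data
ls -ld /tmp/demo-dfs-data | cut -c1-10   # prints drwxr-xr-x
```
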
 
Source document: <http://gghandsome.blog.iyunv.com/398804/822983>

 

5) Test uploading a file to the DataNode (HDFS)
[hadoop@primary bin]$ pwd
/hadoop/hadoop-1.0.4/bin
[hadoop@primary bin]$ hadoop fs -put slaves.sh hdfs://localhost:9000/
Warning: $HADOOP_HOME is deprecated.
 
[hadoop@primary bin]$


 

6) Test MapReduce
Create a directory:
[hadoop@primary bin]$ hadoop fs -mkdir   hdfs://localhost:9000/input/


 

Put files into it:
 
[hadoop@primary bin]$ hadoop fs -put *.sh     /input/

 
 
 

Count the words in these .sh files (note: run this from the Hadoop home directory, not from bin):
 
[hadoop@station1 hadoop-1.0.3]$ bin/hadoop jar hadoop-examples-1.0.3.jar wordcount /input /out
Warning: $HADOOP_HOME is deprecated.
 
12/10/29 15:00:55 INFO input.FileInputFormat: Total input paths to process : 14
12/10/29 15:00:55 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/10/29 15:00:55 WARN snappy.LoadSnappy: Snappy native library not loaded
12/10/29 15:00:56 INFO mapred.JobClient: Running job: job_201210291430_0002
12/10/29 15:00:57 INFO mapred.JobClient:  map 0% reduce 0%
12/10/29 15:01:10 INFO mapred.JobClient:  map 14% reduce 0%
12/10/29 15:01:19 INFO mapred.JobClient:  map 28% reduce 0%
12/10/29 15:01:28 INFO mapred.JobClient:  map 42% reduce 9%
12/10/29 15:01:34 INFO mapred.JobClient:  map 57% reduce 9%
12/10/29 15:01:37 INFO mapred.JobClient:  map 57% reduce 14%
12/10/29 15:01:40 INFO mapred.JobClient:  map 71% reduce 14%
12/10/29 15:01:43 INFO mapred.JobClient:  map 71% reduce 19%
12/10/29 15:01:47 INFO mapred.JobClient:  map 85% reduce 19%
12/10/29 15:01:53 INFO mapred.JobClient:  map 100% reduce 23%
12/10/29 15:02:00 INFO mapred.JobClient:  map 100% reduce 28%
12/10/29 15:02:06 INFO mapred.JobClient:  map 100% reduce 100%
12/10/29 15:02:11 INFO mapred.JobClient: Job complete: job_201210291430_0002
12/10/29 15:02:11 INFO mapred.JobClient: Counters: 29
12/10/29 15:02:11 INFO mapred.JobClient:   Job Counters
12/10/29 15:02:11 INFO mapred.JobClient:     Launched reduce tasks=1
12/10/29 15:02:11 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=87869
12/10/29 15:02:11 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/10/29 15:02:11 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/10/29 15:02:11 INFO mapred.JobClient:     Launched map tasks=14
12/10/29 15:02:11 INFO mapred.JobClient:     Data-local map tasks=14
12/10/29 15:02:11 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=55240
12/10/29 15:02:11 INFO mapred.JobClient:   File Output Format Counters
12/10/29 15:02:11 INFO mapred.JobClient:     Bytes Written=6173
12/10/29 15:02:11 INFO mapred.JobClient:   FileSystemCounters
12/10/29 15:02:11 INFO mapred.JobClient:     FILE_BYTES_READ=28724
12/10/29 15:02:11 INFO mapred.JobClient:     HDFS_BYTES_READ=23858
12/10/29 15:02:11 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=381199
12/10/29 15:02:11 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=6173
12/10/29 15:02:11 INFO mapred.JobClient:   File Input Format Counters
12/10/29 15:02:11 INFO mapred.JobClient:     Bytes Read=22341
12/10/29 15:02:11 INFO mapred.JobClient:   Map-Reduce Framework
12/10/29 15:02:11 INFO mapred.JobClient:     Map output materialized bytes=28802
12/10/29 15:02:11 INFO mapred.JobClient:     Map input records=691
12/10/29 15:02:11 INFO mapred.JobClient:     Reduce shuffle bytes=28802
12/10/29 15:02:11 INFO mapred.JobClient:     Spilled Records=4018
12/10/29 15:02:11 INFO mapred.JobClient:     Map output bytes=34161
12/10/29 15:02:11 INFO mapred.JobClient:     Total committed heap usage (bytes)=2264064000
12/10/29 15:02:11 INFO mapred.JobClient:     CPU time spent (ms)=9340
12/10/29 15:02:11 INFO mapred.JobClient:     Combine input records=3137
12/10/29 15:02:11 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1517
12/10/29 15:02:11 INFO mapred.JobClient:     Reduce input records=2009
12/10/29 15:02:11 INFO mapred.JobClient:     Reduce input groups=497
12/10/29 15:02:11 INFO mapred.JobClient:     Combine output records=2009
12/10/29 15:02:11 INFO mapred.JobClient:     Physical memory (bytes) snapshot=1989103616
12/10/29 15:02:11 INFO mapred.JobClient:     Reduce output records=497
12/10/29 15:02:11 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=5140561920
12/10/29 15:02:11 INFO mapred.JobClient:     Map output records=3137
 
 
For comparison, here is a count over the uploaded files:
[hadoop@station1 bin]$ grep apache *.sh|wc
     14      28     915
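Note that `grep apache *.sh | wc` reports the lines, words, and characters of the matching lines, not a unique-word count. A tiny self-contained rerun of the same idea, on a made-up file rather than the real *.sh scripts:

```shell
# wc after grep reports lines/words/chars of the lines that matched.
printf 'apache line one\nno match here\napache line two\n' > /tmp/demo.sh
grep apache /tmp/demo.sh | wc   # 2 matching lines, 6 words, 32 characters
```
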
 

The Hadoop word-count results follow (web UI screenshots omitted): the output path listing (only part was shown because the page is too large), the result file, and the running tasks.
 
 

 
        For the cluster installation, see: http://pftzzg.iyunv.com/blog/1910171
  For hadoop-0.20, see: http://pftzzg.iyunv.com/admin/blogs/1911023
  For 1.03, see: http://pftzzg.iyunv.com/admin/blogs/1910153

  
 

Thread URL: https://www.yunweiku.com/thread-311556-1-1.html