[Experience Sharing] Hadoop fully distributed installation notes (partial)

Cluster nodes: 4 machines in total, 1 master and 3 slaves.
Node details (the Hadoop admin user on every node is jared, group hadoop; user/group setup steps omitted):
| hostname | internal IP (static) | external IP (DHCP) |
| master   | 192.168.255.25       | 192.168.1.10       |
| node1    | 192.168.255.26       | 192.168.1.11       |
| node2    | 192.168.255.27       | 192.168.1.12       |
| node3    | 192.168.255.28       | 192.168.1.13       |

System environment: CentOS
Hadoop version: Apache Hadoop 0.20.2
Java version: JDK 1.7 (1.6 is recommended)
Environment variable settings:
  export JAVA_HOME=/usr/java/jdk1.7.0_51
  export HADOOP_INSTALL=/home/jared/hadoop
  export PATH=$PATH:$HADOOP_INSTALL/bin
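To make these variables persist across logins, one option (the one used later in this post's command history) is to append them to /etc/profile as root and re-source it; a minimal sketch:
# Append the Hadoop/Java environment variables system-wide (requires root).
# ~jared/.bash_profile works equally well if only the jared user needs them.
cat >> /etc/profile <<'EOF'
export JAVA_HOME=/usr/java/jdk1.7.0_51
export HADOOP_INSTALL=/home/jared/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
EOF
source /etc/profile   # pick up the new variables in the current shell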
Configure passwordless SSH trust between all nodes.
Approach:
1. On each node, generate an RSA key pair as user jared (id_rsa and id_rsa.pub).
2. Concatenate the contents of every node's public key (id_rsa.pub) into a single file named authorized_keys.
3. Copy authorized_keys into /home/jared/.ssh/ on every node.
4. Test SSH logins between all nodes; the first login asks you to type "yes" to confirm the host key, after which no prompt appears. (A sketch of steps 1-3 appears right after this list.)
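A minimal sketch of steps 1-3, run from master as jared; it assumes password-based SSH still works at this point (hostnames as in the table above):
# On every node (as jared): generate an RSA key pair with an empty passphrase.
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# On master: collect every node's public key into one authorized_keys file.
cd ~/.ssh
cat id_rsa.pub > authorized_keys
for h in node1 node2 node3; do
    ssh jared@$h 'cat ~/.ssh/id_rsa.pub' >> authorized_keys
done
# Push the combined file back out; sshd requires strict permissions on it.
chmod 600 authorized_keys
for h in node1 node2 node3; do
    scp authorized_keys jared@$h:~/.ssh/
    ssh jared@$h 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'
done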
In this environment, authorized_keys contains the following:
vim authorized_keys
  ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA5u69aO7lSceNLuQFbZkRt/V4O6nxc4QNXQRNxiar+k15c3Fe+5pFMOBpQZFxgk6w4490Z/koM6HJ7Kg2s9jnSSkyhJk7YuzYvUkmQbZG0uEyxX1uor/lTlySXuwlokSzLwTaKnEk1Wkq/s7eR3zcItrX++fAnKas9IZcziZJ+fCWBH3c2BNql2/K0j3jT+oTUaNY4mPZwnYljPZr/eldQOQcM0dDtS5Q/UWHC8USXQrBtCzOTiRlIVyFC7KEMThkkfSfvPjG7bT5O2Rg9R5gzMgIsku6d0KQMQ1GKmTbV3OYStUx7ByhM8GmDN/FFZU94lW/pjcTLeqjE61FJHJ1HQ== jared@master
  ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAqtK8BTl3wLy4Oc9mg6Xj+APrjATDWd5vFdNzP6VKi2ZWV9YuN+8Snsj6Vay6d9w7CrVzO8lShSIG2PId9YwiwBnvFzPigF2Gk9ncsSNbLzOX+9OR3jGe1NNIdfBJQfMuD/l42X4sMwJKDjK+Wpp5bQSQ63qO4vtBJ1MbM7D8FyUTIse9GgPP7otdKWEEDMQHPKXmHoKWhhg26ht3wfICqrLzLyhQhFjpYCo32d6rhLfe844ICaqEfrLnlN4wfHb19pRXhuQpMCwdsnRarGKBkQmsRW2+LtvjDvARBdefpuAEtATWfcY/48nwibOp/xPkdYKbaNSceEbDWists5tXFw== jared@node1
  ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA6tSOqPq76+s8EU8qj5wtcHRan9MHHWJD9HmJkhtstcDyXnoBVU0sEJdJ5sAr/2B7pq8NMAloD54KcjxhRzbj0gKO3NDBwE4Yg69hoo+uD7rNRW6yqPoONVpKEr5ngMEwjh0xh6U4whWORHfhI8sqEJX+snTNxMed3Vv7OqJVno+MplyEpTrf+vlZa9nG9Woe1QONM8s5/lJMsZHY+lgT0e1u3jR+Kedc9RMch4hfOowc1BA4IQI/bhuYAgClYkTiFZzFlX/Crio4rq22XzpFFB5+QWiUKqMCrdo9ikPhlfw3MSnnEb+/GqP8LDGuuCuzrrLj7y1184QBydFOMZPLCQ== jared@node2
  ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAw7Y53YhJ5L9OANGmGE6bzCT82QIVXR+AJycIGk/O5NDMiuPYKOU+HUYfWyNiY/yPKYiQbFLb4o0rTIbUOvpTLGEz8Tz7pm5Dd8OJca4DlUN/PK8Cp1osXsZa2IeGyeL/yP+RplK/zDm1xrldDjSUhFyPTOGMcAzMQkB3N+hc6s6UleV+J78YJVBeaz5foGir/gR5MBr5bpZpiYH0KVxDw65rwsBHu7KlVy5Q4lKkMUmccnKLdyVO0gnwWWenpc71UHJ0yADOzdQSpZtDjgf0dyrfiVpWDzLbj49Ie34X1kzKKXtrOeLZfSYssjf7585Qra3L+TO52Sq7yHc7oVBVLQ== jared@node3
Configure the hosts file and copy it to the same path on every node:
vim /etc/hosts
Add the following:
  192.168.255.25 master
  192.168.255.26 node1
  192.168.255.27 node2
  192.168.255.28 node3
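Pushing /etc/hosts out to the other nodes needs root on each target. Assuming root logins over SSH are allowed, a loop such as this would do it (IPs are used because the slaves cannot resolve the new hostnames until the file is in place):
# Run as root on master; copies the updated hosts file to every slave.
for ip in 192.168.255.26 192.168.255.27 192.168.255.28; do
    scp /etc/hosts root@$ip:/etc/hosts
done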
Hadoop configuration files (all of the following live in the hadoop conf/ directory)
vim hadoop-env.sh
Add:
  export JAVA_HOME=/usr/java/jdk1.7.0_51
Core configuration file
vim core-site.xml
Add the following new properties:
<property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/jared/hadoop/tmp</value>
    <description>A base for other temporary directories</description>
</property>
  
HDFS configuration file
vim hdfs-site.xml
Add the following new properties:
<property>
    <name>dfs.name.dir</name>
    <value>/home/jared/hadoop/name</value>
    <final>true</final>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/home/jared/hadoop/data</value>
    <final>true</final>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
    <final>true</final>
</property>
  
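With dfs.replication set to 2 and three datanodes, each HDFS block is stored on two of the three nodes. Once the cluster is up, block health and replication can be checked with fsck, for example:
# Report per-file block counts, replication and locations for the whole filesystem.
hadoop fsck / -files -blocks -locations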
MapReduce configuration file
vim mapred-site.xml
Add the following new property:
<property>
    <name>mapred.job.tracker</name>
    <value>192.168.255.25:9001</value>
</property>
  
Specify the master node
vim masters
Add the following:
  master
Specify the slave nodes
vim slaves
Add the following:
  node1
  node2
  node3
Copy the hadoop directory to every node, into /home/jared/ in each case:
  [jared@master ~]$ scp -r ./hadoop/ node1:~
  [jared@master ~]$ scp -r ./hadoop/ node2:~
  [jared@master ~]$ scp -r ./hadoop/ node3:~
The first time Hadoop is started, the HDFS filesystem must be formatted:
  [jared@master conf]$ hadoop namenode -format
  14/02/20 23:36:55 INFO namenode.NameNode: STARTUP_MSG:
  /************************************************************
  STARTUP_MSG: Starting NameNode
  STARTUP_MSG:   host = master/192.168.255.25
  STARTUP_MSG:   args = [-format]
  STARTUP_MSG:   version = 0.20.2
  STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
  ************************************************************/
  14/02/20 23:36:55 INFO namenode.FSNamesystem: fsOwner=jared,hadoop,adm
  14/02/20 23:36:55 INFO namenode.FSNamesystem: supergroup=supergroup
  14/02/20 23:36:55 INFO namenode.FSNamesystem: isPermissionEnabled=true

14/02/20 23:36:56 INFO common.Storage: Image file of ...
14/02/20 23:36:56 INFO common.Storage: Storage directory /home/jared/hadoop/name has been successfully formatted.
  14/02/20 23:36:56 INFO namenode.NameNode: SHUTDOWN_MSG:
  /************************************************************
  SHUTDOWN_MSG: Shutting down NameNode at master/192.168.255.25
  ************************************************************/
Start Hadoop:
  [jared@master ~]$ start-all.sh
  starting namenode, logging to /home/jared/hadoop/bin/../logs/hadoop-jared-namenode-master.out
  node1: starting datanode, logging to /home/jared/hadoop/bin/../logs/hadoop-jared-datanode-node1.out
  node2: starting datanode, logging to /home/jared/hadoop/bin/../logs/hadoop-jared-datanode-node2.out
  master: starting secondarynamenode, logging to /home/jared/hadoop/bin/../logs/hadoop-jared-secondarynamenode-master.out
  starting jobtracker, logging to /home/jared/hadoop/bin/../logs/hadoop-jared-jobtracker-master.out
  node1: starting tasktracker, logging to /home/jared/hadoop/bin/../logs/hadoop-jared-tasktracker-node1.out
  node2: starting tasktracker, logging to /home/jared/hadoop/bin/../logs/hadoop-jared-tasktracker-node2.out
Use jps to verify that the daemons are running on each node:
  [jared@master ~]$ /usr/java/jdk1.7.0_51/bin/jps
  22642 SecondaryNameNode
  22503 NameNode
  22810 Jps
  22705 JobTracker
  [jared@node1 conf]$ /usr/java/jdk1.7.0_51/bin/jps
  22703 Jps
  22610 TaskTracker
  22542 DataNode
  [root@node2 conf]# /usr/java/jdk1.7.0_51/bin/jps
  22609 Jps
  22503 TaskTracker
  22445 DataNode
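Rather than logging in to each machine, a small loop (relying on the passwordless SSH configured earlier, and on jps living under the JDK path shown above) can check every node at once:
# Run jps on all four nodes in one go.
for h in master node1 node2 node3; do
    echo "== $h =="
    ssh jared@$h '/usr/java/jdk1.7.0_51/bin/jps'
done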
Test Hadoop with the wordcount example
  [jared@master ~]$ /usr/java/jdk1.7.0_51/bin/jps
  22642 SecondaryNameNode
  22503 NameNode
  23874 Jps
  22705 JobTracker
  [jared@master ~]$ pwd
  /home/jared
  [jared@master ~]$ mkdir input
  [jared@master ~]$ cd input/
  [jared@master input]$ ls
  [jared@master input]$ echo "hello world">test1.txt
  [jared@master input]$ echo "hello hadoop">test2.txt
  [jared@master input]$ ls
  test1.txt  test2.txt
  [jared@master input]$ cat test1.txt
  hello world
  [jared@master input]$ cat test2.txt
  hello hadoop
Upload the files to HDFS:
  [jared@master input]$ hadoop dfs -put ../input in
  [jared@master input]$ hadoop dfs -ls in
  Found 2 items
  -rw-r--r--   2 jared supergroup         12 2014-02-21 00:14 /user/jared/in/test1.txt
  -rw-r--r--   2 jared supergroup         13 2014-02-21 00:14 /user/jared/in/test2.txt
Run the wordcount example job to count the words in the uploaded files:
  [jared@master input]$ hadoop jar /home/jared/hadoop/hadoop-0.20.2-examples.jar wordcount in out
  14/02/21 00:17:01 INFO input.FileInputFormat: Total input paths to process : 2
  14/02/21 00:17:02 INFO mapred.JobClient: Running job: job_201402202338_0001
  14/02/21 00:17:03 INFO mapred.JobClient:  map 0% reduce 0%
  14/02/21 00:17:12 INFO mapred.JobClient:  map 50% reduce 0%
  14/02/21 00:17:13 INFO mapred.JobClient:  map 100% reduce 0%
  14/02/21 00:17:24 INFO mapred.JobClient:  map 100% reduce 100%
  14/02/21 00:17:26 INFO mapred.JobClient: Job complete: job_201402202338_0001
  14/02/21 00:17:26 INFO mapred.JobClient: Counters: 17
  14/02/21 00:17:26 INFO mapred.JobClient:   Job Counters
  14/02/21 00:17:26 INFO mapred.JobClient:     Launched reduce tasks=1
  14/02/21 00:17:26 INFO mapred.JobClient:     Launched map tasks=2
  14/02/21 00:17:26 INFO mapred.JobClient:     Data-local map tasks=2
  14/02/21 00:17:26 INFO mapred.JobClient:   FileSystemCounters
  14/02/21 00:17:26 INFO mapred.JobClient:     FILE_BYTES_READ=55
  14/02/21 00:17:26 INFO mapred.JobClient:     HDFS_BYTES_READ=25
  14/02/21 00:17:26 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=180
  14/02/21 00:17:26 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25
  14/02/21 00:17:26 INFO mapred.JobClient:   Map-Reduce Framework
  14/02/21 00:17:26 INFO mapred.JobClient:     Reduce input groups=3
  14/02/21 00:17:26 INFO mapred.JobClient:     Combine output records=4
  14/02/21 00:17:26 INFO mapred.JobClient:     Map input records=2
  14/02/21 00:17:26 INFO mapred.JobClient:     Reduce shuffle bytes=61
  14/02/21 00:17:26 INFO mapred.JobClient:     Reduce output records=3
  14/02/21 00:17:26 INFO mapred.JobClient:     Spilled Records=8
  14/02/21 00:17:26 INFO mapred.JobClient:     Map output bytes=41
  14/02/21 00:17:26 INFO mapred.JobClient:     Combine input records=4
  14/02/21 00:17:26 INFO mapred.JobClient:     Map output records=4
  14/02/21 00:17:26 INFO mapred.JobClient:     Reduce input records=4
List the files in HDFS:
  [jared@master input]$ hadoop dfs -ls
  Found 2 items
  drwxr-xr-x   - jared supergroup          0 2014-02-21 00:14 /user/jared/in
  drwxr-xr-x   - jared supergroup          0 2014-02-21 00:17 /user/jared/out
  [jared@master input]$ hadoop dfs -ls out
  Found 2 items
  drwxr-xr-x   - jared supergroup          0 2014-02-21 00:17 /user/jared/out/_logs
  -rw-r--r--   2 jared supergroup         25 2014-02-21 00:17 /user/jared/out/part-r-00000
View the contents of a file in HDFS:
  [jared@master input]$ hadoop dfs -cat out/part-r-00000
  hadoop  1
  hello   2
  world   1
  [jared@master input]$
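To bring the job output back to the local filesystem (useful when a job writes several part-r-* files), the output directory can simply be copied down; a quick example:
# Copy the HDFS output directory to local disk and inspect the result.
hadoop dfs -get out ./out-local
cat ./out-local/part-r-00000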
On a datanode, the blocks behind these files can be seen under the dfs.data.dir path:
[jared@node1 data]$ pwd
  /home/jared/hadoop/data
  [jared@node1 data]$ ls -lR
  .:
  total 16
  drwxr-xr-x. 2 jared hadoop 4096 Feb 20 23:50 current
  drwxr-xr-x. 2 jared hadoop 4096 Feb 20 22:53 detach
  -rw-r--r--. 1 jared hadoop    0 Feb 20 22:53 in_use.lock
  -rw-r--r--. 1 jared hadoop  157 Feb 20 22:53 storage
  drwxr-xr-x. 2 jared hadoop 4096 Feb 20 23:50 tmp
  ./current:
  total 368
  -rw-r--r--. 1 jared hadoop      4 Feb 20 22:54 blk_1488417308273842703
  -rw-r--r--. 1 jared hadoop     11 Feb 20 22:54 blk_1488417308273842703_1001.meta
  -rw-r--r--. 1 jared hadoop  16746 Feb 20 23:49 blk_1659744422027317455
  -rw-r--r--. 1 jared hadoop    139 Feb 20 23:49 blk_1659744422027317455_1033.meta
  -rw-r--r--. 1 jared hadoop   8690 Feb 20 23:50 blk_-3027154220961892181
  -rw-r--r--. 1 jared hadoop     75 Feb 20 23:50 blk_-3027154220961892181_1034.meta
  -rw-r--r--. 1 jared hadoop 142466 Feb 20 23:38 blk_3123495904277639429
  -rw-r--r--. 1 jared hadoop   1123 Feb 20 23:38 blk_3123495904277639429_1013.meta
  -rw-r--r--. 1 jared hadoop     12 Feb 20 23:49 blk_5040281988852807225
  -rw-r--r--. 1 jared hadoop     11 Feb 20 23:49 blk_5040281988852807225_1028.meta
  -rw-r--r--. 1 jared hadoop     25 Feb 20 23:50 blk_-538339897708158192
  -rw-r--r--. 1 jared hadoop     11 Feb 20 23:50 blk_-538339897708158192_1034.meta
  -rw-r--r--. 1 jared hadoop     13 Feb 20 23:49 blk_6041811899305324558
  -rw-r--r--. 1 jared hadoop     11 Feb 20 23:49 blk_6041811899305324558_1027.meta
  -rw-r--r--. 1 jared hadoop 142466 Feb 20 23:38 blk_-7701193131489368534
  -rw-r--r--. 1 jared hadoop   1123 Feb 20 23:38 blk_-7701193131489368534_1010.meta
  -rw-r--r--. 1 jared hadoop   1540 Feb 20 23:50 dncp_block_verification.log.curr
  -rw-r--r--. 1 jared hadoop    155 Feb 20 22:53 VERSION
  ./detach:
  total 0
  ./tmp:
  total 0
  [jared@node1 data]$
View basic HDFS statistics:
  [jared@node1 data]$ hadoop dfsadmin -report
  Configured Capacity: 103366975488 (96.27 GB)
  Present Capacity: 94912688128 (88.39 GB)
  DFS Remaining: 94911893504 (88.39 GB)
  DFS Used: 794624 (776 KB)
  DFS Used%: 0%
  Under replicated blocks: 2
  Blocks with corrupt replicas: 0
  Missing blocks: 0
  -------------------------------------------------
  Datanodes available: 2 (2 total, 0 dead)
  Name: 192.168.255.27:50010
  Decommission Status : Normal
  Configured Capacity: 51683487744 (48.13 GB)
  DFS Used: 397312 (388 KB)
  Non DFS Used: 4226998272 (3.94 GB)
  DFS Remaining: 47456092160(44.2 GB)
  DFS Used%: 0%
  DFS Remaining%: 91.82%
  Last contact: Fri Feb 21 08:32:47 EST 2014
  Name: 192.168.255.26:50010
  Decommission Status : Normal
  Configured Capacity: 51683487744 (48.13 GB)
  DFS Used: 397312 (388 KB)
  Non DFS Used: 4227289088 (3.94 GB)
  DFS Remaining: 47455801344(44.2 GB)
  DFS Used%: 0%
  DFS Remaining%: 91.82%
  Last contact: Fri Feb 21 08:32:46 EST 2014
  [jared@node1 data]$
Enter and leave safe mode:
  [jared@node1 data]$ hadoop dfsadmin -safemode enter
  Safe mode is ON
  [jared@node1 data]$ hadoop dfsadmin -safemode leave
  Safe mode is OFF
  [jared@node1 data]$
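The current state can also simply be queried; -safemode accepts get as well as enter and leave:
# Report whether safe mode is currently on or off.
hadoop dfsadmin -safemode get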
Steps to add a new node:
Install Hadoop on the new node.
Copy the relevant configuration files from the namenode over to it.
Modify the masters and slaves files to add the new node.
Set up passwordless SSH to and from the new node.
Start the datanode and tasktracker on that node individually (hadoop-daemon.sh start datanode/tasktracker).
Run start-balancer.sh to rebalance the data. Purpose: when a node fails or a new node is added, block placement can become uneven; rebalancing redistributes the blocks across the datanodes. (The per-node commands are consolidated in the sketch below.)
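A minimal sketch of the on-node part of the procedure, run as jared on the new node once the configuration files and SSH trust are in place (these are exactly the commands that appear in the history below):
# Start only this node's worker daemons; no cluster-wide restart needed.
hadoop-daemon.sh start datanode
hadoop-daemon.sh start tasktracker
# Rebalance block placement across all datanodes.
start-balancer.sh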
Command history from adding the new node:
As root:
  54  groupadd hadoop
  55  useradd -s /bin/bash -d /home/jared -m jared -g hadoop -G adm
  56  passwd jared
  57  rpm -ivh /home/jared/jdk-7u51-linux-x64.rpm
  58  vim /etc/profile
  59  source /etc/profile
  60  exit
As jared:
  1  ssh-keygen -t rsa
  2  cd .ssh/
  3  ls

4  vim ...
5  ls
  6  vim authorized_keys
  8  ssh node3
  9  ssh node1
  10  ssh node2
  11  ssh master
  12  ls
  13  cd
  14  ls
  15  cd /usr/java/
  16  ls
  17  cd
  18  ls
  19  vim /etc/profile
  20  su - root
  21  echo $JAVA_HOME
  22  source /etc/profile
  23  echo $JAVA_HOME
  24  cat /etc/hosts
  26  ls
  27  ll
  28  ls
  29  cd hadoop/
  30  ls
  31  cd
  32  vim /etc/profile
  33  echo $HADOOP_INSTALL
  34  hadoop-daemon.sh start datanode
  35  hadoop-daemon.sh start tasktracker
  36  /usr/java/jdk1.7.0_51/jps
  37  start-balancer.sh
  38  /usr/java/jdk1.7.0_51/jps
  39  source /etc/profile
  40  /usr/java/jdk1.7.0_51/bin/jps
  42  history

