1. Prepare the software environment:
hadoop-2.6.0.tar.gz
CentOS-5.11-i386
jdk-6u24-linux-i586
Master:hadoop02 192.168.20.129
Slave01:hadoop03 192.168.20.130
Slave02:hadoop04 192.168.20.131
2. Install the JDK, SSH, and Hadoop (on hadoop02 first)

For the JDK:

chmod u+x jdk-6u24-linux-i586.bin
./jdk-6u24-linux-i586.bin
mv jdk1.6.0_24 /home/jdk
Note: verify the JDK installation with: # java -version

For SSH:

ssh-keygen -t rsa
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
Note: verify passwordless SSH login with: # ssh localhost
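Note: if ssh localhost still prompts for a password, the usual culprit is file permissions, since sshd rejects keys that are group- or world-accessible; a quick fix:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys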
For Hadoop:

tar -zxvf hadoop-2.6.0.tar.gz
mv hadoop-2.6.0 /home/hadoop
# vim /etc/profile

export JAVA_HOME=/home/jdk
export HADOOP_HOME=/home/hadoop
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
# source /etc/profile
# vim /etc/hosts

192.168.20.129 hadoop02
192.168.20.130 hadoop03
192.168.20.131 hadoop04
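A quick sanity check that the environment variables and hostname resolution took effect (the expected versions follow from the packages listed in section 1):

source /etc/profile
hadoop version       # should report Hadoop 2.6.0
java -version        # should report 1.6.0_24
ping -c 1 hadoop03   # the new /etc/hosts entries should resolve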
3. Configure Hadoop (on hadoop02 first). All of the following files live under /home/hadoop/etc/hadoop/.
1) File 1: hadoop-env.sh

export JAVA_HOME=/home/jdk
2) File 2: yarn-env.sh

export JAVA_HOME=/home/jdk
3) File 3: slaves (one slave hostname per line)

hadoop03
hadoop04
4) File 4: core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop-${user.name}</value>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://hadoop02:9000</value>
    </property>
</configuration>
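Two notes on this file: fs.default.name still works in 2.6.0 but is the deprecated alias of fs.defaultFS; and since the daemons run as root here, hadoop.tmp.dir expands to /data/hadoop-root, so the parent directory must exist and be writable on every node:

mkdir -p /data   # run on hadoop02, hadoop03, and hadoop04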
5) File 5: hdfs-site.xml

<configuration>
    <property>
        <name>dfs.http.address</name>
        <value>hadoop02:50070</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop02:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
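dfs.replication=1 keeps a single copy of each block; with two datanodes, a value of 2 would add redundancy at the cost of disk space. The values Hadoop actually picks up from these files can be confirmed with hdfs getconf:

hdfs getconf -confKey dfs.replication
hdfs getconf -confKey dfs.namenode.secondary.http-address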
6) File 6: mapred-site.xml

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>hadoop02:9001</value>
    </property>
    <property>
        <name>mapred.map.tasks</name>
        <value>20</value>
    </property>
    <property>
        <name>mapred.reduce.tasks</name>
        <value>4</value>
    </property>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop02:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop02:19888</value>
    </property>
</configuration>
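Note: the 2.6.0 tarball ships only a template for this file, so create it before editing; also, mapred.job.tracker is an MRv1 setting that YARN ignores once mapreduce.framework.name is set to yarn, so it is kept here only for completeness:

cp /home/hadoop/etc/hadoop/mapred-site.xml.template /home/hadoop/etc/hadoop/mapred-site.xml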
7) File 7: yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop02:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoop02:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop02:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoop02:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hadoop02:8033</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
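These are the stock ResourceManager ports (8032 client RPC, 8030 scheduler, 8031 resource tracker, 8033 admin, 8088 web UI). Once YARN is running, a quick probe confirms the web UI is answering:

curl -s http://hadoop02:8088/cluster | head   # should return the ResourceManager cluster page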
4. Configure hadoop03 and hadoop04

Their configuration is identical to hadoop02's, so simply copy everything across:

scp -r /root/.ssh/ root@hadoop03:/root/.ssh/
scp -r /root/.ssh/ root@hadoop04:/root/.ssh/
scp /etc/profile root@hadoop03:/etc/
scp /etc/profile root@hadoop04:/etc/
scp /etc/hosts root@hadoop03:/etc/
scp /etc/hosts root@hadoop04:/etc/
scp -r /home/ root@hadoop03:/home/
scp -r /home/ root@hadoop04:/home/
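Copying /root/.ssh wholesale gives all three nodes the same key pair, which is what lets the start scripts below reach the slaves without passwords. A quick spot check that everything landed (illustrative checks, not from the original steps):

ssh hadoop03 'ls /home/jdk /home/hadoop'
ssh hadoop04 'source /etc/profile && hadoop version'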
5. Start the Hadoop cluster

1) Format the namenode:

/home/hadoop/bin/hdfs namenode -format
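On success, the format log ends with a common.Storage line reporting that the name directory has been successfully formatted; given the hadoop.tmp.dir set above and the daemons running as root, the directory can also be inspected directly (the path below is derived from the config, not taken from the original post):

ls /data/hadoop-root/dfs/name/current   # VERSION and fsimage files should be present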
2) Start HDFS:

/home/hadoop/sbin/start-dfs.sh
At this point the processes running on the master are: NameNode, SecondaryNameNode; on Slave1 and Slave2: DataNode.

3) Start YARN:

/home/hadoop/sbin/start-yarn.sh
Now the processes running on the master are: NameNode, SecondaryNameNode, ResourceManager; on Slave1 and Slave2: DataNode, NodeManager.

4) Check the results

View the cluster status: hdfs dfsadmin -report
View HDFS in a browser: http://hadoop02:50070
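A per-node jps check (jps ships with the JDK) ties the process lists above to real daemons; the full path is used remotely since /etc/profile is not sourced for non-interactive shells:

jps                              # on hadoop02: NameNode, SecondaryNameNode, ResourceManager
ssh hadoop03 /home/jdk/bin/jps   # expect DataNode and NodeManager
ssh hadoop04 /home/jdk/bin/jps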
6. Wrap-up: an error encountered during the experiment

15/05/11 13:41:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

This warning appears because /home/hadoop/lib/native/libhadoop.so.1.0.0 in the stock tarball is built for 64-bit systems while the system used here is 32-bit, so the native library cannot be loaded; Hadoop falls back to its built-in Java classes and the cluster works fine. The architecture mismatch can be confirmed with:

# file libhadoop.so.1.0.0
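For reference, the full check plus one common way to quiet the warning by raising the NativeCodeLoader log level (the log4j.properties path assumes the install location used above):

file /home/hadoop/lib/native/libhadoop.so.1.0.0   # reports a 64-bit ELF shared object for the stock tarball
echo "log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR" >> /home/hadoop/etc/hadoop/log4j.properties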