Hadoop Installation and Configuration
1. Install Java. Pick a JDK that suits your environment; I chose JDK 7, downloadable from: http://www.oracle.com/technetwork/java/javase/archive-139210.html
Unpack the archive, then configure the environment variables: vim /etc/profile
---------------------
JAVA_HOME=/usr/java/jdk1.7.0_45
JRE_HOME=/usr/java/jdk1.7.0_45/jre
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export JAVA_HOME JRE_HOME PATH CLASSPATH
---------------------
source /etc/profile
================ JDK online installation (yum) ================
1. List the available Java packages:
$ yum -y list java*
2. Install as the root user (the install insists on root; even sudo will not do):
$ yum -y install java-1.7.0-openjdk*
3. Verify the installation:
$ java -version
4. By default the JDK is installed under:
/usr/lib/jvm
==========================
2. Install Scala
Scala download: http://www.scala-lang.org/download/2.10.3.html
Unpack it into /root/software (the paths below assume version 2.10.4; adjust SCALA_HOME to match the release you actually downloaded).
Configure the environment variables:
vim /etc/profile
Add:
SCALA_HOME=/root/software/scala-2.10.4
PATH=$SCALA_HOME/bin:$PATH
export SCALA_HOME PATH
source /etc/profile
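After sourcing the profile, a quick check that Scala is on the PATH:
---------------------
scala -version    # should report the Scala version you installed
---------------------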
3. To install Spark, the following references are helpful:
http://blog.csdn.net/supingemail/article/details/46713851
http://my.oschina.net/hanzhankang/blog/204100
4. Install Hadoop
Reposted from: http://blog.csdn.net/stark_summer/article/details/43484545
Hadoop download URL:
wget http://apache.fayea.com/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
1) Change the hostname to master:
sudo vim /etc/sysconfig/network
Set the HOSTNAME entry in that file to master.
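For reference, a minimal /etc/sysconfig/network after the edit might look like this (the NETWORKING line is assumed to be present already):
---------------------
NETWORKING=yes
HOSTNAME=master
---------------------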
Reboot the machine. CentOS reboot commands:
1. reboot
2. shutdown -r now (reboot immediately; root only)
3. shutdown -r 10 (reboot after 10 minutes; root only)
4. shutdown -r 20:35 (reboot at 20:35; root only)
After the reboot, check the result with the hostname command; it should now print master, confirming the hostname change succeeded.
2) Update the hostname in /etc/hosts:
sudo vim /etc/hosts
Add an entry mapping the machine's IP address to master.
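A sketch of the resulting /etc/hosts; the IP address here is an assumption borrowed from the fs.defaultFS value in core-site.xml below, so substitute your machine's real address:
---------------------
127.0.0.1       localhost
10.118.46.22    master
---------------------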
3) Configure passwordless SSH.
Generate an SSH key pair, then enter the .ssh directory and create the authorized_keys file from the public key.
Set the permissions of the .ssh/ directory to 700 and of authorized_keys to 600 (or 644).
Finally, verify that ssh logs in without a password.
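A typical command sequence for these sub-steps, sketched here assuming an RSA key with the default file names:
---------------------
ssh-keygen -t rsa                     # press Enter at every prompt for an empty passphrase
cd ~/.ssh
cat id_rsa.pub >> authorized_keys     # authorize our own public key
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
ssh master                            # should log in without asking for a password
---------------------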
4) Hadoop installation:
Download the latest release from the official mirror: http://apache.fayea.com/hadoop/common/hadoop-2.6.0/
Place hadoop-2.6.0.tar.gz in /root/software, then unpack it:
tar zxvf hadoop-2.6.0.tar.gz
Configure the system environment:
vim /etc/profile
----------------------------
export HADOOP_INSTALL=/root/sherry/hadoop-2.6.0   # adjust to the directory where you actually unpacked Hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
-------------------------------
source /etc/profile
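Once the profile is sourced, a quick sanity check that the Hadoop binaries are on the PATH:
---------------------
hadoop version    # should print Hadoop 2.6.0 and build details
---------------------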
Create the working folders under the hadoop directory.
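A sketch of creating them, using the paths that the configuration files below refer to:
---------------------
mkdir -p /root/sherry/tmp    # hadoop.tmp.dir in core-site.xml
mkdir -p /hdfs/namenode      # dfs.namenode.name.dir in hdfs-site.xml
mkdir -p /hdfs/datanode      # dfs.datanode.data.dir in hdfs-site.xml
---------------------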
Next we edit Hadoop's configuration files. First enter the Hadoop 2.6 configuration folder:
cd $HADOOP_INSTALL/etc/hadoop
Step 1: edit hadoop-env.sh and point JAVA_HOME at the JDK we installed:
export JAVA_HOME=/usr/java/jdk1.7.0_45
Step 2: edit yarn-env.sh and set JAVA_HOME the same way:
export JAVA_HOME=/usr/java/jdk1.7.0_45
Step 3: edit mapred-env.sh and set JAVA_HOME the same way:
export JAVA_HOME=/usr/java/jdk1.7.0_45
Step 4: edit the slaves file. Because this is a pseudo-distributed setup, the slave node is master itself, so the file contains the single line master.
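One way to write it, assuming the current directory is the configuration folder:
---------------------
echo master > slaves
---------------------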
Step 5: edit core-site.xml. The following is the minimal core-site.xml configuration; the individual options are documented at http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/core-default.xml
My own configuration is as follows:
--------------------
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://10.118.46.22:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/root/sherry/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>slave4:2181</value>
  </property>
</configuration>
----------------------
Step 6: edit hdfs-site.xml. The following is the minimal hdfs-site.xml configuration; the individual options are documented at http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
My own configuration:
---------------------------
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/hdfs/datanode</value>
  </property>
  <property>
    <!-- on a single pseudo-distributed node, 1 would suffice; 3 only matters on a real cluster -->
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
---------------------------
Step 7: edit mapred-site.xml. Copy mapred-site.xml.template to mapred-site.xml, then open it:
cp mapred-site.xml.template mapred-site.xml
The result below is the minimal mapred-site.xml configuration; the individual options are documented at http://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
My own configuration:
-----------------
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
------------------
Step 8: edit yarn-site.xml. The following is the minimal yarn-site.xml configuration; the individual options are documented at http://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
You can also enable spark_shuffle, configured as follows:
------------------------------------
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
------------------------------------
PS: mapreduce_shuffle is used when a Hadoop MR job is submitted, and spark_shuffle when a Spark job is submitted. Personally I find spark_shuffle's efficiency mediocre, since shuffle is a major bottleneck. Also, if you use spark_shuffle you need to copy spark-yarn_2.10-1.4.1.jar into HADOOP_HOME/share/hadoop/lib, otherwise Hadoop will fail with an error at runtime.
My own configuration:
------------------------------------
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
-----------------------------------
5. Start and verify pseudo-distributed Hadoop
To start (newer releases), enter hadoop/sbin and run:
start-all.sh
To shut down:
stop-all.sh
To verify the daemons started:
jps
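For reference, on a single node with both HDFS and YARN started, jps should list roughly the following daemons (the leading process IDs, omitted here, differ on every run):
---------------------
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps
---------------------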
Step 1: format the HDFS file system:
hdfs namenode -format
Step 2: enter sbin and start HDFS:
./start-dfs.sh
At this point NameNode, DataNode, and SecondaryNameNode are all running on master.
You can now inspect HDFS through the web console at http://master:50070/
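Besides the web console, a quick command-line smoke test (the /tmp/smoke path is just an arbitrary example):
---------------------
hdfs dfs -mkdir -p /tmp/smoke          # create a directory in HDFS
hdfs dfs -put /etc/hosts /tmp/smoke    # upload a small file
hdfs dfs -ls /tmp/smoke                # the file should be listed
---------------------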