spark@master:~/spark$ java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
The Java environment is configured correctly.
spark@master:~/spark$ scala -version
Scala code runner version 2.11.6 -- Copyright 2002-2013, LAMP/EPFL
Scala is configured correctly.
Everything up to this point is carried out on all three nodes.
Next comes the configuration on the master node.
spark@master:~/spark$ hadoop version
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /home/spark/spark/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar
The Hadoop environment is working.
Next, configure Hadoop.
spark@master:~/spark$ cd hadoop/etc/hadoop/
spark@master:~/spark/hadoop/etc/hadoop$ vim slaves
Delete the existing contents of the file, add the hostnames of the worker nodes (one per line), and save the file.
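Edit the remaining files with the commands below.
spark@master:~/spark/hadoop/etc/hadoop$ vim hadoop-env.sh
Add or update the relevant variables in this file. I ran into quite a few problems at this step: if the variables are not set, errors show up later on.
When you have finished, remember to save the file.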
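A variable that commonly needs an explicit value in hadoop-env.sh is JAVA_HOME, because the Hadoop daemons are started over SSH and may not inherit what the login shell exports. The sketch below is one possible way to work out a value from the java binary on the PATH. The helper name derive_java_home and the JDK paths in the comments are assumptions for illustration, not taken from this setup.

```shell
# Derive a JAVA_HOME candidate from the full path of a java binary.
# Pure string handling: strips a trailing /bin/java, then a trailing /jre.
# Example (hypothetical path): /opt/jdk1.7.0_79/jre/bin/java -> /opt/jdk1.7.0_79
derive_java_home() {
  _home="${1%/bin/java}"
  printf '%s\n' "${_home%/jre}"
}

# If java is on the PATH, print the line that could go into hadoop-env.sh.
if command -v java >/dev/null 2>&1; then
  echo "export JAVA_HOME=$(derive_java_home "$(readlink -f "$(command -v java)")")"
fi
```

Check the printed path by hand before pasting it into hadoop-env.sh, since the directory layout differs between JDK packages.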
spark@master:~/spark/hadoop/etc/hadoop$ vim core-site.xml
Add the following inside the <configuration> ... </configuration> element:
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/spark/spark/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
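Because a typo in core-site.xml only surfaces later when the daemons start, it can help to read the values back right after saving. The function below is a minimal sketch of such a check. The name get_prop is hypothetical, and it assumes each <name> and <value> element fits on a single line. It is a plain text scan, not a real XML parser.

```shell
# Print the <value> of the property named $2 in the Hadoop XML file $1.
# Remembers the most recent <name>, then prints the first <value> that follows a match.
get_prop() {
  awk -v key="$2" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == key) { print v; exit } }
  ' "$1"
}

# Usage, from the etc/hadoop directory:
#   get_prop core-site.xml fs.default.name
#   get_prop core-site.xml hadoop.tmp.dir
```

On a node where Hadoop is already on the PATH, hdfs getconf -confKey fs.default.name asks Hadoop itself for the value it resolved. Hadoop 2.x also accepts fs.defaultFS, the newer name for the same setting.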
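spark@master:~/spark/hadoop/etc/hadoop$ vim hdfs-site.xml
Add the following inside the <configuration> ... </configuration> element:
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>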
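HDFS stores at most one replica of a block on each DataNode, so a dfs.replication value larger than the number of DataNodes leaves blocks under-replicated. With a value of 3, all three nodes of this cluster would need to run a DataNode. The sketch below compares the replication factor with the number of hosts in a slaves-style file. The function names are hypothetical, and it assumes the file lists every DataNode host, one per line.

```shell
# Count the hosts in a slaves-style file, skipping blank lines and # comments.
count_hosts() {
  grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | wc -l | tr -d ' '
}

# Succeed when replication factor $1 does not exceed the hosts listed in file $2.
replication_fits() {
  [ "$1" -le "$(count_hosts "$2")" ]
}

# Usage, from the etc/hadoop directory:
#   replication_fits 3 slaves || echo "dfs.replication exceeds the number of listed hosts"
```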
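spark@master:~/spark/hadoop/etc/hadoop$ vim yarn-site.xml
Add the required properties inside the <configuration> ... </configuration> element.
Then copy the configured Hadoop directory from master to the worker by running: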
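spark@worker1:~/spark$ scp -r spark@master:/home/spark/spark/hadoop ./hadoop
Note: ./hadoop means that /home/spark/spark/hadoop, which belongs to the spark user on master, is copied to a local directory named hadoop. This name must match the Hadoop path set earlier in the environment variables in /etc/profile.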
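The same copy is needed on every worker, so it can also be scripted from master instead of being typed on each node. The following is a sketch under assumptions: it pushes from master rather than pulling from the worker as above, it requires SSH access from master to the workers, and the function name push_hadoop is hypothetical. By default it only prints the commands so they can be reviewed first.

```shell
# Hostnames and paths as used in this post.
WORKERS="worker1 worker2"
SRC=/home/spark/spark/hadoop
DEST_PARENT=/home/spark/spark

# Print one recursive scp command per worker; pass "run" as $1 to execute them.
push_hadoop() {
  for host in $WORKERS; do
    cmd="scp -r $SRC spark@$host:$DEST_PARENT/"
    if [ "$1" = "run" ]; then $cmd; else echo "$cmd"; fi
  done
}

# Dry run: shows what would be executed without copying anything.
push_hadoop dry
```

Once the printed commands look right, push_hadoop run performs the copy.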
Run a quick test on worker1.
spark@worker1:~/spark$ hadoop version
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /home/spark/spark/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar
The command succeeds.
Run the same test on worker2.
spark@worker2:~/spark$ hadoop version
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /home/spark/spark/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar
The command succeeds.