bgey posted on 2018-10-30 08:31:19

Hadoop 2.7 Configuration, Deployment, and Testing

  1. Environment preparation:
  Install the CentOS 6.5 operating system.
  Download Hadoop 2.7:
  wget http://124.205.69.132/files/224400000162626A/mirrors.hust.edu.cn/apache/hadoop/common/stable/hadoop-2.7.1.tar.gz
  Download JDK 1.8:
  wget http://download.oracle.com/otn-pub/java/jdk/8u60-b27/jdk-8u60-linux-x64.tar.gz?AuthParam=1443446776_174368b9ab1a6a92468aba5cd4d092d0
  2. Edit /etc/hosts and set up SSH trust:
  Add the following entries to /etc/hosts:
  192.168.1.61 host61
  192.168.1.62 host62
  192.168.1.63 host63
  Set up passwordless SSH trust between all of the servers.
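The trust setup can be bootstrapped roughly like this (a sketch: the key path and the hadoop user are assumptions from this walkthrough, and the ssh-copy-id loop is left commented out because it needs the live hosts):

```shell
# Generate a passphrase-less key pair once, if one does not exist yet.
# KEY defaults to a demo path; on a real node you would use ~/.ssh/id_rsa.
KEY="${KEY:-./id_rsa_demo}"
[ -f "$KEY" ] || ssh-keygen -t rsa -N "" -f "$KEY" -q
# Push the public key to every node (prompts for the hadoop password once each):
# for h in host61 host62 host63; do ssh-copy-id -i "$KEY.pub" hadoop@$h; done
ls "$KEY" "$KEY.pub"
```

Repeat (or push keys) from each node so every host can reach every other one without a password, since start-dfs.sh/start-yarn.sh ssh out to the slaves.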
  3. Add the user, unpack the archives, and set environment variables:
  useradd hadoop
  passwd hadoop
  tar -zxvf hadoop-2.7.1.tar.gz
  mv hadoop-2.7.1 /usr/local
  ln -s /usr/local/hadoop-2.7.1 /usr/local/hadoop
  chown -R hadoop:hadoop /usr/local/hadoop-2.7.1
  tar -zxvf jdk-8u60-linux-x64.tar.gz
  mv jdk1.8.0_60 /usr/local
  ln -s /usr/local/jdk1.8.0_60 /usr/local/jdk
  chown -R root:root /usr/local/jdk1.8.0_60
  echo 'export JAVA_HOME=/usr/local/jdk' >>/etc/profile
  echo 'export PATH=/usr/local/jdk/bin:$PATH' >/etc/profile.d/java.sh
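The versioned-directory-plus-symlink layout above lets a later upgrade swap in a new tree without touching PATH or any config files. A scratch-prefix demo of the same pattern (./local-demo stands in for /usr/local, so nothing system-wide is touched):

```shell
# Demo of the versioned-dir + symlink layout in a scratch prefix.
PREFIX="${PREFIX:-./local-demo}"
mkdir -p "$PREFIX/hadoop-2.7.1" "$PREFIX/jdk1.8.0_60"
# -f makes the demo rerunnable; -n avoids descending into an existing link.
ln -sfn hadoop-2.7.1 "$PREFIX/hadoop"
ln -sfn jdk1.8.0_60  "$PREFIX/jdk"
ls -l "$PREFIX"
```

Upgrading later is then just unpacking e.g. hadoop-2.7.2 next to the old tree and repointing the one symlink.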
  4. Edit the Hadoop configuration files:
  1) Edit hadoop-env.sh:
  cd /usr/local/hadoop/etc/hadoop
  sed -i 's%^#\?export JAVA_HOME=.*%export JAVA_HOME=/usr/local/jdk%' hadoop-env.sh
  2) Edit core-site.xml and add the following properties inside the <configuration> element (fs.default.name still works in 2.7 but is deprecated in favor of fs.defaultFS):
  <property>
  <name>fs.default.name</name>
  <value>hdfs://host61:9000/</value>
  </property>
  <property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/temp</value>
  </property>
  3) Edit hdfs-site.xml:
  <property>
  <name>dfs.replication</name>
  <value>3</value>
  </property>
  4) Edit mapred-site.xml (copy it from mapred-site.xml.template first; note that mapred.job.tracker is a Hadoop 1.x property, so the example job below falls back to the LocalJobRunner; on a YARN cluster you would set mapreduce.framework.name to yarn instead):
  <property>
  <name>mapred.job.tracker</name>
  <value>host61:9001</value>
  </property>
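These XML fragments are easy to mistype by hand; a sketch that generates core-site.xml from a heredoc instead (written to ./conf-demo here so the live etc/hadoop directory is untouched):

```shell
# Generate core-site.xml into a demo directory; point CONF_DIR at
# /usr/local/hadoop/etc/hadoop to write the real file.
CONF_DIR="${CONF_DIR:-./conf-demo}"
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://host61:9000/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/temp</value>
  </property>
</configuration>
EOF
# The same pattern works for hdfs-site.xml and mapred-site.xml.
grep -c '<property>' "$CONF_DIR/core-site.xml"
```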
  5) Edit masters:
  host61
  6) Edit slaves:
  host62
  host63
  5. Configure host62 and host63 in the same way.
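Pushing the finished config to the other nodes can be scripted. A sketch that only prints the copy commands (a dry run, since host62/host63 are not reachable here; rsync over ssh is one way to do the real copy):

```shell
# Build one copy command per slave; echo them instead of executing.
: > sync_commands.txt
for h in host62 host63; do
    echo "rsync -a /usr/local/hadoop/etc/hadoop/ hadoop@$h:/usr/local/hadoop/etc/hadoop/" \
        >> sync_commands.txt
done
cat sync_commands.txt
```

Pipe the file through sh (or drop the echo) once SSH trust from step 2 is in place.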
  6. Format the distributed file system:
  /usr/local/hadoop/bin/hadoop namenode -format
  7. Replace the Hadoop native libraries:
  mv /usr/local/hadoop/lib/native /usr/local/hadoop/lib/native_old
  Then copy the lib/native directory from a locally compiled Hadoop build into /usr/local/hadoop/lib/.
  8. Start Hadoop:
  1)/usr/local/hadoop/sbin/start-dfs.sh
  2)/usr/local/hadoop/sbin/start-yarn.sh
  9. Check the daemons with jps:
  On the master (host61):
  # jps
  4532 ResourceManager
  4197 NameNode
  4793 Jps
  4364 SecondaryNameNode
  On host62:
  # jps
  32052 DataNode
  32133 NodeManager
  32265 Jps
  On host63:
  # jps
  6802 NodeManager
  6963 Jps
  6717 DataNode
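The jps checks can also be scripted. A small sketch that greps a jps-style listing for required daemon names (fed the master's sample output from above, since no cluster is running here):

```shell
# check_daemons LISTING NAME...  prints "<name> OK" or "<name> MISSING".
check_daemons() {
    listing="$1"; shift
    for d in "$@"; do
        if echo "$listing" | grep -q "$d"; then echo "$d OK"; else echo "$d MISSING"; fi
    done
}
# Sample master-node jps output from the step above:
check_daemons "4532 ResourceManager
4197 NameNode
4364 SecondaryNameNode" NameNode SecondaryNameNode ResourceManager > daemon_check.txt
cat daemon_check.txt
```

On a live node you would pass "$(jps)" as the listing, with DataNode and NodeManager as the required names on the slaves.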
  10. Inspect Hadoop through the web interfaces:
  NameNode:
  http://192.168.1.61:50070/
  SecondaryNameNode:
  http://192.168.1.61:50090/
  DataNode:
  http://192.168.1.62:50075/
  11. Test:
  echo "this is the first file" >/tmp/mytest1.txt
  echo "this is the second file" >/tmp/mytest2.txt
  cd /usr/local/hadoop
  $ ./bin/hadoop fs -mkdir /in
  $ ./bin/hadoop fs -put /tmp/mytest*.txt /in
  $ ./bin/hadoop fs -ls /in
  Found 2 items
  -rw-r--r--   3 hadoop supergroup         23 2015-10-02 18:45 /in/mytest1.txt
  -rw-r--r--   3 hadoop supergroup         24 2015-10-02 18:45 /in/mytest2.txt
  $ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /in /out
  15/10/02 18:53:30 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
  15/10/02 18:53:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
  15/10/02 18:53:34 INFO input.FileInputFormat: Total input paths to process : 2
  15/10/02 18:53:35 INFO mapreduce.JobSubmitter: number of splits:2
  15/10/02 18:53:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1954603964_0001
  15/10/02 18:53:40 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
  15/10/02 18:53:40 INFO mapreduce.Job: Running job: job_local1954603964_0001
  15/10/02 18:53:40 INFO mapred.LocalJobRunner: OutputCommitter set in config null
  15/10/02 18:53:40 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
  15/10/02 18:53:40 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
  15/10/02 18:53:41 INFO mapred.LocalJobRunner: Waiting for map tasks
  15/10/02 18:53:41 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_m_000000_0
  15/10/02 18:53:41 INFO mapreduce.Job: Job job_local1954603964_0001 running in uber mode : false
  15/10/02 18:53:41 INFO mapreduce.Job:map 0% reduce 0%
  15/10/02 18:53:41 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
  15/10/02 18:53:41 INFO mapred.Task:Using ResourceCalculatorProcessTree : [ ]
  15/10/02 18:53:41 INFO mapred.MapTask: Processing split: hdfs://host61:9000/in/mytest2.txt:0+24
  15/10/02 18:53:51 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
  15/10/02 18:53:51 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
  15/10/02 18:53:51 INFO mapred.MapTask: soft limit at 83886080
  15/10/02 18:53:51 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
  15/10/02 18:53:51 INFO mapred.MapTask: kvstart = 26214396; length = 6553600

  15/10/02 18:53:51 INFO mapred.MapTask: Map output collector
  15/10/02 18:53:52 INFO mapred.LocalJobRunner:
  15/10/02 18:53:52 INFO mapred.MapTask: Starting flush of map output
  15/10/02 18:53:52 INFO mapred.MapTask: Spilling map output
  15/10/02 18:53:52 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid = 104857600
  15/10/02 18:53:52 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600
  15/10/02 18:53:52 INFO mapred.MapTask: Finished spill 0
  15/10/02 18:53:52 INFO mapred.Task: Task:attempt_local1954603964_0001_m_000000_0 is done. And is in the process of committing
  15/10/02 18:53:53 INFO mapred.LocalJobRunner: map
  15/10/02 18:53:53 INFO mapred.Task: Task 'attempt_local1954603964_0001_m_000000_0' done.
  15/10/02 18:53:53 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_m_000000_0
  15/10/02 18:53:53 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_m_000001_0
  15/10/02 18:53:53 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
  15/10/02 18:53:53 INFO mapred.Task:Using ResourceCalculatorProcessTree : [ ]
  15/10/02 18:53:53 INFO mapred.MapTask: Processing split: hdfs://host61:9000/in/mytest1.txt:0+23
  15/10/02 18:53:53 INFO mapreduce.Job:map 100% reduce 0%
  15/10/02 18:53:53 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
  15/10/02 18:53:53 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
  15/10/02 18:53:53 INFO mapred.MapTask: soft limit at 83886080
  15/10/02 18:53:53 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
  15/10/02 18:53:53 INFO mapred.MapTask: kvstart = 26214396; length = 6553600

  15/10/02 18:53:53 INFO mapred.MapTask: Map output collector
  15/10/02 18:53:54 INFO mapred.LocalJobRunner:
  15/10/02 18:53:54 INFO mapred.MapTask: Starting flush of map output
  15/10/02 18:53:54 INFO mapred.MapTask: Spilling map output
  15/10/02 18:53:54 INFO mapred.MapTask: bufstart = 0; bufend = 43; bufvoid = 104857600
  15/10/02 18:53:54 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600
  15/10/02 18:53:54 INFO mapred.MapTask: Finished spill 0
  15/10/02 18:53:54 INFO mapred.Task: Task:attempt_local1954603964_0001_m_000001_0 is done. And is in the process of committing
  15/10/02 18:53:54 INFO mapreduce.Job:map 50% reduce 0%
  15/10/02 18:53:54 INFO mapred.LocalJobRunner: map
  15/10/02 18:53:54 INFO mapred.Task: Task 'attempt_local1954603964_0001_m_000001_0' done.
  15/10/02 18:53:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_m_000001_0
  15/10/02 18:53:54 INFO mapred.LocalJobRunner: map task executor complete.
  15/10/02 18:53:54 INFO mapred.LocalJobRunner: Waiting for reduce tasks
  15/10/02 18:53:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_r_000000_0
  15/10/02 18:53:54 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
  15/10/02 18:53:54 INFO mapred.Task:Using ResourceCalculatorProcessTree : [ ]
  15/10/02 18:53:54 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@5205a129
  15/10/02 18:53:55 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
  15/10/02 18:53:55 INFO reduce.EventFetcher: attempt_local1954603964_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
  15/10/02 18:53:55 INFO mapreduce.Job:map 100% reduce 0%
  15/10/02 18:53:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1954603964_0001_m_000001_0 decomp: 55 len: 59 to MEMORY
  15/10/02 18:53:56 INFO reduce.InMemoryMapOutput: Read 55 bytes from map-output for attempt_local1954603964_0001_m_000001_0

  15/10/02 18:53:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of
  15/10/02 18:53:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1954603964_0001_m_000000_0 decomp: 56 len: 60 to MEMORY
  15/10/02 18:53:56 INFO reduce.InMemoryMapOutput: Read 56 bytes from map-output for attempt_local1954603964_0001_m_000000_0

  15/10/02 18:53:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of
  15/10/02 18:53:56 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
  15/10/02 18:53:56 INFO mapred.LocalJobRunner: 2 / 2 copied.
  15/10/02 18:53:56 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs
  15/10/02 18:53:57 INFO mapred.Merger: Merging 2 sorted segments

  15/10/02 18:53:57 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total
  15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merged 2 segments, 111 bytes to disk to satisfy reduce memory limit
  15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merging 1 files, 113 bytes from disk
  15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
  15/10/02 18:53:57 INFO mapred.Merger: Merging 1 sorted segments

  15/10/02 18:53:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total
  15/10/02 18:53:57 INFO mapred.LocalJobRunner: 2 / 2 copied.
  15/10/02 18:53:57 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
  15/10/02 18:53:59 INFO mapred.Task: Task:attempt_local1954603964_0001_r_000000_0 is done. And is in the process of committing
  15/10/02 18:53:59 INFO mapred.LocalJobRunner: 2 / 2 copied.
  15/10/02 18:53:59 INFO mapred.Task: Task attempt_local1954603964_0001_r_000000_0 is allowed to commit now
  15/10/02 18:53:59 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1954603964_0001_r_000000_0' to hdfs://host61:9000/out/_temporary/0/task_local1954603964_0001_r_000000
  15/10/02 18:53:59 INFO mapred.LocalJobRunner: reduce > reduce
  15/10/02 18:53:59 INFO mapred.Task: Task 'attempt_local1954603964_0001_r_000000_0' done.
  15/10/02 18:53:59 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_r_000000_0
  15/10/02 18:53:59 INFO mapred.LocalJobRunner: reduce task executor complete.
  15/10/02 18:53:59 INFO mapreduce.Job:map 100% reduce 100%
  15/10/02 18:53:59 INFO mapreduce.Job: Job job_local1954603964_0001 completed successfully
  15/10/02 18:54:00 INFO mapreduce.Job: Counters: 35
  File System Counters
  FILE: Number of bytes read=821850
  FILE: Number of bytes written=1655956
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  HDFS: Number of bytes read=118
  HDFS: Number of bytes written=42
  HDFS: Number of read operations=22
  HDFS: Number of large read operations=0
  HDFS: Number of write operations=5
  Map-Reduce Framework
  Map input records=2
  Map output records=10
  Map output bytes=87
  Map output materialized bytes=119
  Input split bytes=196
  Combine input records=10
  Combine output records=10
  Reduce input groups=6
  Reduce shuffle bytes=119
  Reduce input records=10
  Reduce output records=6
  Spilled Records=20
  Shuffled Maps =2
  Failed Shuffles=0
  Merged Map outputs=2
  GC time elapsed (ms)=352
  Total committed heap usage (bytes)=457912320
  Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
  File Input Format Counters
  Bytes Read=47
  File Output Format Counters
  Bytes Written=42
  $
  $ ./bin/hadoop fs -ls /out
  Found 2 items
  -rw-r--r--   3 hadoop supergroup          0 2015-10-02 18:53 /out/_SUCCESS
  -rw-r--r--   3 hadoop supergroup         42 2015-10-02 18:53 /out/part-r-00000
  $ ./bin/hadoop fs -cat /out/_SUCCESS
  $ ./bin/hadoop fs -cat /out/part-r-00000
  file	2
  first	1
  is	2
  second	1
  the	2
  this	2
  $
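The result matches a plain coreutils word count over the same two lines (a local cross-check, not a Hadoop command):

```shell
# Split into words, count, and print as "word<TAB>count" like part-r-00000.
printf 'this is the first file\nthis is the second file\n' \
    | tr ' ' '\n' | sort | uniq -c \
    | awk '{printf "%s\t%s\n", $2, $1}'
```

This gives the same six word/count pairs as /out/part-r-00000 above.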
  12. With that, the Hadoop configuration and deployment is complete.
