[Experience Sharing] Yunfan Big Data Academy: Compiling Hadoop 2.2.0 from Source

Posted on 2018-10-30 10:49:21
2.1 Download locations
  1. Apache Hadoop (100% open source) download locations:
  - http://hadoop.apache.org/releases.html
  - SVN: http://svn.apache.org/repos/asf/hadoop/common/branches/
  2. CDH (Cloudera's Distribution Including Apache Hadoop, 100% open source) download locations:
  - http://archive.cloudera.com/cdh4/cdh/4/ (tar.gz files)
  - http://archive.cloudera.com/cdh5/cdh/ (tar.gz files)
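2.2 Notes on the official release
  (1)  Official site: http://hadoop.apache.org
  (2)  Download the Hadoop package

  (3)  Problem with the official release
  The native libraries in the official release were compiled on 32-bit Linux, so running it on 64-bit Linux triggers a warning:

  - Warning: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  - The native library shipped in the official binary package is 32-bit. You can check it with:
  $ file $HADOOP_PREFIX/lib/native/libhadoop.so.1.0.0
  The output shows that the library is 32-bit:
  libhadoop.so.1.0.0: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x9eb1d49b05f67d38454e42b216e053a27ae8bac9, not stripped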
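The check above can be scripted. The following is a minimal sketch, not part of the original post: `elf_bits` is a hypothetical helper name, and it only inspects the text that `file` prints, assuming the "ELF 32-bit" / "ELF 64-bit" wording shown in the output above.

```shell
#!/bin/sh
# elf_bits: print 32 or 64 depending on the ELF class named in a line of
# `file` output; print "unknown" for anything else (archives, symlinks, ...).
elf_bits() {
    case "$1" in
        *"ELF 32-bit"*) echo 32 ;;
        *"ELF 64-bit"*) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Intended use (HADOOP_PREFIX is the Hadoop install directory, as in the command above):
#   elf_bits "$(file "$HADOOP_PREFIX/lib/native/libhadoop.so.1.0.0")"
```

On a 64-bit machine, a result of 32 for libhadoop.so.1.0.0 means the native library will not load and the source build described in this post is needed.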
2.3 Official build instructions
  The downloaded hadoop-2.2.0-src.tar.gz package contains a BUILDING.txt file that describes the build steps in detail:
  Build instructions for Hadoop
  ----------------------------------------------------------------------------------
  Requirements:
  * Unix System          (this guide uses CentOS 6.4, 64-bit)
  * JDK 1.6+
  * Maven 3.0 or later   (3.0.5 is recommended)
  * Findbugs 1.3.9 (if running findbugs)
  * ProtocolBuffer 2.5.0
  * CMake 2.6 or newer (if compiling native code)
  * Internet connection for first build (to fetch all Maven and Hadoop dependencies)
  ----------------------------------------------------------------------------------
  Maven main modules:
  hadoop                       (Main Hadoop project)
  - hadoop-project             (Parent POM for all Hadoop Maven modules.
                                All plugins & dependencies versions are defined here.)
  - hadoop-project-dist        (Parent POM for modules that generate distributions.)
  - hadoop-annotations         (Generates the Hadoop doclet used to generate the Javadocs)
  - hadoop-assemblies          (Maven assemblies used by the different modules)
  - hadoop-common-project      (Hadoop Common)
  - hadoop-hdfs-project        (Hadoop HDFS)
  - hadoop-mapreduce-project   (Hadoop MapReduce)
  - hadoop-tools               (Hadoop tools like Streaming, Distcp, etc.)
  - hadoop-dist                (Hadoop distribution assembler)
  ----------------------------------------------------------------------------------
  Where to run Maven from?
  It can be run from any module. The only catch is that if not run from trunk, all modules that are not part of the build run must be installed in the local Maven cache or available in a Maven repository.
  ----------------------------------------------------------------------------------
  Maven build goals:
  * Clean                     : mvn clean
  * Compile                   : mvn compile [-Pnative]
  * Run tests                 : mvn test [-Pnative]
  * Create JAR                : mvn package
  * Run findbugs              : mvn compile findbugs:findbugs
  * Run checkstyle            : mvn compile checkstyle:checkstyle
  * Install JAR in M2 cache   : mvn install
  * Deploy JAR to Maven repo  : mvn deploy
  * Run clover                : mvn test -Pclover [-DcloverLicenseLocation=${user.name}/.clover.license]
  * Run Rat                   : mvn apache-rat:check
  * Build javadocs            : mvn javadoc:javadoc
  * Build distribution        : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar]
  * Change Hadoop version     : mvn versions:set -DnewVersion=NEWVERSION
  Build options:
  * Use -Pnative to compile/bundle native code
  * Use -Pdocs to generate & bundle the documentation in the distribution (using -Pdist)
  * Use -Psrc to create a project source TAR.GZ
  * Use -Dtar to create a TAR with the distribution (using -Pdist)
  Snappy build options:
  Snappy is a compression library that can be utilized by the native code. It is currently an optional component, meaning that Hadoop can be built with or without this dependency.
  * Use -Drequire.snappy to fail the build if libsnappy.so is not found. If this option is not specified and the snappy library is missing, we silently build a version of libhadoop.so that cannot make use of snappy. This option is recommended if you plan on making use of snappy and want to get more repeatable builds.
  * Use -Dsnappy.prefix to specify a nonstandard location for the libsnappy header files and library files. You do not need this option if you have installed snappy using a package manager.
  * Use -Dsnappy.lib to specify a nonstandard location for the libsnappy library files. Similarly to snappy.prefix, you do not need this option if you have installed snappy using a package manager.
  * Use -Dbundle.snappy to copy the contents of the snappy.lib directory into the final tar file. This option requires that -Dsnappy.lib is also given, and it ignores the -Dsnappy.prefix option.
  ---------------------------------------------------------------------------------
  Building components separately
  If you are building a submodule directory, all the hadoop dependencies this submodule has will be resolved as all other 3rd party dependencies. This is, from the Maven cache or from a Maven repository (if not available in the cache or the SNAPSHOT 'timed out').
  An alternative is to run 'mvn install -DskipTests' from Hadoop source top level once; and then work from the submodule. Keep in mind that SNAPSHOTs time out after a while, using the Maven '-nsu' will stop Maven from trying to update SNAPSHOTs from external repos.
  ----------------------------------------------------------------------------------
  Protocol Buffer compiler
  The version of Protocol Buffer compiler, protoc, must match the version of the protobuf JAR.
  If you have multiple versions of protoc in your system, you can set in your build shell the HADOOP_PROTOC_PATH environment variable to point to the one you want to use for the Hadoop build. If you don't define this environment variable, protoc is looked up in the PATH.
  ----------------------------------------------------------------------------------
  Importing projects to eclipse
  When you import the project to eclipse, install hadoop-maven-plugins at first.
  $ cd hadoop-maven-plugins
  $ mvn install
  Then, generate eclipse project files.
  $ mvn eclipse:eclipse -DskipTests
  At last, import to eclipse by specifying the root directory of the project via
  [File] > [Import] > [Existing Projects into Workspace].
  ----------------------------------------------------------------------------------
  Building distributions:
  Create binary distribution without native code and without documentation:
  $ mvn package -Pdist -DskipTests -Dtar
  Create binary distribution with native code and with documentation:
  $ mvn package -Pdist,native,docs -DskipTests -Dtar
  Create source distribution:
  $ mvn package -Psrc -DskipTests
  Create source and binary distributions with native code and documentation:
  $ mvn package -Pdist,native,docs,src -DskipTests -Dtar
  Create a local staging version of the website (in /tmp/hadoop-site)
  $ mvn clean site; mvn site:stage -DstagingDirectory=/tmp/hadoop-site
  ----------------------------------------------------------------------------------
  Handling out of memory errors in builds
  If the build process fails with an out of memory error, you should be able to fix it by increasing the memory used by Maven, which can be done via the environment variable MAVEN_OPTS.
  Here is an example setting to allocate between 256 and 512 MB of heap space to Maven
  export MAVEN_OPTS="-Xms256m -Xmx512m"
  ----------------------------------------------------------------------------------
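2.4 Build steps
Step 1: Install VMware 10 (omitted)
Step 2: Install a 64-bit Linux operating system (omitted)
  This guide uses CentOS 6.4, 64-bit. Download: http://www.centoscn.com/CentosSoft/
Step 3: Configure networking on Linux
  (1)  Set the VMware virtual machine's network mode to NAT.
  (2)  Set the Linux network type to DHCP (obtain the address automatically) so that the guest shares the host's network connection.
  (3)  Test: ping www.baidu.com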
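Step 4: Install the JDK
  Note: JDK 1.6 or later is required, in a 64-bit build (this environment uses jdk-6u45-linux-x64.bin).
  (1)  Use a file transfer tool (WinSCP or FileZilla) to upload jdk-6u45-linux-x64.bin to the /software/ directory on the Linux system.
  (2)  Install the JDK
  cd /software/
  chmod u+x jdk-6u45-linux-x64.bin    # grant execute permission
  mkdir /workDir                      # create a software installation directory (personal preference)
  cp jdk-6u45-linux-x64.bin /workDir  # copy the installer to /workDir
  cd /workDir                         # extract inside /workDir so that JAVA_HOME below resolves
  ./jdk-6u45-linux-x64.bin            # run the self-extracting installer
  mv jdk1.6.0_45 jdk6u45              # rename the directory for convenience
  (3)  Configure environment variables
  vi /etc/profile
  Add the following:
  export JAVA_HOME=/workDir/jdk6u45
  export PATH=.:$PATH:$JAVA_HOME/bin
  (4)  Apply the environment variables
  source /etc/profile
  (5)  Verify that the JDK is installed
  java -version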
Step 5: Install dependency packages
  yum install autoconf -y
  yum install automake -y
  yum install libtool -y
  yum install cmake -y
  yum install ncurses-devel -y
  yum install openssl-devel -y
  yum install gcc -y
  yum install gcc-c++ -y
  yum install lzo-devel -y
  yum install zlib-devel -y
  Note: -y answers "yes" automatically to every prompt during installation.
  Verify:
  rpm -qa | grep autoconf
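Checking the packages one at a time with grep is tedious. A minimal sketch of a batch check follows; `missing_packages` and `REQUIRED` are hypothetical names introduced here, and the package list is simply the one installed in this step.

```shell
#!/bin/sh
# Packages installed in Step 5 (taken from the yum commands above).
REQUIRED="autoconf automake libtool cmake ncurses-devel openssl-devel gcc gcc-c++ lzo-devel zlib-devel"

# missing_packages: reads installed package names on stdin (one per line)
# and prints every required package that is not among them.
missing_packages() {
    installed=$(cat)
    for pkg in $REQUIRED; do
        # -x matches whole lines only, -F treats "gcc-c++" as a literal string
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}

# Intended use on the build machine:
#   rpm -qa --qf '%{NAME}\n' | missing_packages
```

Empty output means every package from the list is installed; any names printed still need a `yum install`.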
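  [About the yum command]
  yum (Yellowdog Updater, Modified) is a command-line package manager used on Fedora, Red Hat, and CentOS. It is built on RPM: it downloads RPM packages from the configured servers, installs them, and resolves dependencies automatically, so all required packages are installed in one pass instead of being downloaded and installed one by one. yum provides commands to search for, install, and remove a single package, a group of packages, or all packages, and the commands are short and easy to remember.
  The general form of a yum command is: yum [options] [command] [package ...]
  [options] are optional and include -h (help), -y (answer "yes" to every prompt), and -q (quiet, do not show installation progress). [command] is the operation to perform, and [package ...] is what it operates on.
  - Some commonly used commands:
  Install the fastest-mirror plugin:          yum install yum-fastestmirror
  Install the yum graphical front end:        yum install yumex
  List the package groups that can be installed: yum grouplist
  - Installing
  yum install package1        install the specified package package1
  yum groupinstall group1     install the package group group1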
Step 6: Install Maven
  (1)  Maven version: apache-maven-3.0.5-bin.tar.gz
  Note: do not use the newer Maven 3.1.1. The Hadoop 2.2.0 source is incompatible with Maven 3.1.x, and the build fails with:
  java.lang.NoClassDefFoundError: org/sonatype/aether/graph/DependencyFilter
  Maven 3.0.5 is recommended.
  (2)  Download
  URL: http://maven.apache.org/download.cgi
  Select apache-maven-3.0.5-bin.tar.gz.
  (3)  Upload it to Linux and extract it into the installation directory
  tar -zxvf apache-maven-3.0.5-bin.tar.gz -C /workDir
  (4)  Set environment variables
  vi /etc/profile
  Add:
  export MAVEN_HOME=/workDir/apache-maven-3.0.5
  export PATH=$PATH:$MAVEN_HOME/bin
  Run: source /etc/profile   or   . /etc/profile
  Verify:
  mvn -v
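Because the build depends on the Maven 3.0.x line, the version check can be made explicit. This is a minimal sketch: `is_maven_30x` is a hypothetical helper, and it assumes that the first line of `mvn -v` starts with "Apache Maven" followed by the version number.

```shell
#!/bin/sh
# is_maven_30x: succeeds when the given first line of `mvn -v` output
# reports a Maven 3.0.x release, fails otherwise.
is_maven_30x() {
    case "$1" in
        "Apache Maven 3.0."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Intended use:
#   is_maven_30x "$(mvn -v | head -n 1)" || echo "use Maven 3.0.5 for this build"
```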
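Step 7: Configure a Maven mirror in China
  (1)  Edit settings.xml
  Go to the directory /workDir/apache-maven-3.0.5/conf
  * Add the following inside <mirrors>:
  <mirror>
    <id>nexus-osc</id>
    <mirrorOf>*</mirrorOf>
    <name>Nexus osc</name>
    <url>http://maven.oschina.net/content/groups/public/</url>
  </mirror>
  * Add the following inside <profiles>:
  <profile>
    <id>jdk-1.6</id>
    <activation>
      <jdk>1.6</jdk>
    </activation>
    <repositories>
      <repository>
        <id>nexus</id>
        <name>local private nexus</name>
        <url>http://maven.oschina.net/content/groups/public/</url>
        <releases>
          <enabled>true</enabled>
        </releases>
        <snapshots>
          <enabled>false</enabled>
        </snapshots>
      </repository>
    </repositories>
    <pluginRepositories>
      <pluginRepository>
        <id>nexus</id>
        <name>local private nexus</name>
        <url>http://maven.oschina.net/content/groups/public/</url>
        <releases>
          <enabled>true</enabled>
        </releases>
        <snapshots>
          <enabled>false</enabled>
        </snapshots>
      </pluginRepository>
    </pluginRepositories>
  </profile>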
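  (2)  Copy the configuration
  Note: copy settings.xml into the user's home directory so that Maven uses this configuration on every run.
  cd /home/hadoop    # check whether the home directory /home/hadoop contains a .m2 folder; create it if it does not
  mkdir .m2
  cp /workDir/apache-maven-3.0.5/conf/settings.xml ~/.m2    # copy the file
  (3)  Configure DNS
  vi /etc/resolv.conf
  Change it to:
  nameserver 8.8.8.8
  nameserver 8.8.4.4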
Step 8: Install protobuf
  (1)  Download protobuf-2.5.0.tar.gz
  https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
  (2)  Extract it into the installation directory
  cd /software
  tar -zxvf protobuf-2.5.0.tar.gz -C /workDir
  (3)  Install the following three dependency packages (skip this if they are already installed)
  yum install gcc -y
  yum install gcc-c++ -y
  yum install make -y
  [Note]: without these three packages protobuf cannot be built, and the Hadoop build later fails with the following error:
  [ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:2.2.0:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: 'protoc --version' did not return a version -> [Help 1]
  [ERROR]
  [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
  [ERROR] Re-run Maven using the -X switch to enable full debug logging.
  [ERROR]
  [ERROR] For more information about the errors and possible solutions, please read the following articles:
  [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
  [ERROR]
  [ERROR] After correcting the problems, you can resume the build with the command
  [ERROR]   mvn <goals> -rf :hadoop-common
  (4)  Run the configure script
  Go into the extracted directory and run configure:
  cd /workDir/protobuf-2.5.0      # go into the extracted directory
  ./configure                     # run the configure script
  (5)  Build and install
  make && make check && make install
  Note: building protobuf requires the gcc and gcc-c++ system packages (no need to reinstall them if they were installed earlier).
  (6)  Configure environment variables
  vi /etc/profile
  Add:
  export PROTOBUF_HOME=/workDir/protobuf-2.5.0
  export PATH=$PATH:$PROTOBUF_HOME/bin
  (With the default ./configure prefix, make install places protoc in /usr/local/bin.)
  Apply the configuration:
  source /etc/profile   or   . /etc/profile
  Verify:
  protoc --version
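BUILDING.txt requires protoc to match the protobuf JAR version (2.5.0 here), so it can help to compare the reported version exactly. A minimal sketch, assuming `protoc --version` prints a single line of the form "libprotoc 2.5.0"; `protoc_is` is a hypothetical helper name.

```shell
#!/bin/sh
# protoc_is: succeeds when the `protoc --version` output ($1)
# names exactly the expected version ($2).
protoc_is() {
    [ "$1" = "libprotoc $2" ]
}

# Intended use:
#   protoc_is "$(protoc --version)" 2.5.0 || echo "protoc 2.5.0 is required"
```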
Step 9: Install findbugs-3.0.0
  (1)  Download findbugs-3.0.0.tar.gz
  http://sourceforge.jp/projects/sfnet_findbugs/releases/
  (2)  Extract it into the installation directory
  cd /software
  tar -zxvf findbugs-3.0.0.tar.gz -C /workDir
  (3)  Set environment variables
  vi /etc/profile
  Add the following:
  export FINDBUGS_HOME=/workDir/findbugs-3.0.0
  export PATH=$PATH:$FINDBUGS_HOME/bin
  (4)  Apply the environment variables
  source /etc/profile   or   . /etc/profile
  (5)  Verify
  findbugs -version
  Important
  If an error appears at this point, the JDK version is incompatible. findbugs-2.5.0 and findbugs-3.0.0 were compiled with JDK 7 or later, so JDK 7 must be installed on Linux to run them.

Step 10: Build the hadoop-2.2.0-src source
  (1)  Download hadoop-2.2.0-src.tar.gz
  http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz
  (2)  Extract it into the installation directory
  cd /software
  tar -zxvf hadoop-2.2.0-src.tar.gz -C /workDir
  (3)  Patch the source
  - Important: the hadoop-2.2.0 source contains a bug that is documented in the official Apache JIRA:
  JIRA: https://issues.apache.org/jira/browse/HADOOP-10110
  - The fix
  Index: hadoop-common-project/hadoop-auth/pom.xml
  ===================================================================
  --- hadoop-common-project/hadoop-auth/pom.xml  (revision 1543124)
  +++ hadoop-common-project/hadoop-auth/pom.xml  (working copy)
  @@ -54,6 +54,11 @@
       </dependency>
       <dependency>
         <groupId>org.mortbay.jetty</groupId>
  +      <artifactId>jetty-util</artifactId>
  +      <scope>test</scope>
  +    </dependency>
  +    <dependency>
  +      <groupId>org.mortbay.jetty</groupId>
         <artifactId>jetty</artifactId>
         <scope>test</scope>
       </dependency>
  As the official fix above shows, edit pom.xml in the directory $HADOOP_SRC_HOME/hadoop-common-project/hadoop-auth and add the following after line 55:
  <dependency>
    <groupId>org.mortbay.jetty</groupId>
    <artifactId>jetty-util</artifactId>
    <scope>test</scope>
  </dependency>
  Otherwise the build fails with the following error:
  [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-auth: Compilation failure: Compilation failure:
  [ERROR] /home/chuan/trunk/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[84,13] cannot access org.mortbay.component.AbstractLifeCycle
  [ERROR] class file for org.mortbay.component.AbstractLifeCycle not found
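Before starting the long build, it is worth confirming that the pom.xml edit is in place. The following is a minimal sketch, not taken from the JIRA issue: `has_jetty_util` is a hypothetical helper that only looks for the jetty-util artifactId element in the given file.

```shell
#!/bin/sh
# has_jetty_util: succeeds when the POM file ($1) already declares
# the jetty-util artifact, i.e. the HADOOP-10110 fix has been applied.
has_jetty_util() {
    grep -q '<artifactId>jetty-util</artifactId>' "$1"
}

# Intended use, from the top of the source tree:
#   has_jetty_util hadoop-common-project/hadoop-auth/pom.xml || echo "pom.xml is not patched yet"
```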
  (4)  Build
  Official build instruction:
  Create source and binary distributions with native code and documentation:
  $ mvn package -Pdist,native,docs,src -DskipTests -Dtar
  This guide builds only the binary distribution with native code:
  cd /workDir/hadoop-2.2.0-src
  mvn package -DskipTests -Pdist,native -Dtar
  Note: if the build runs out of memory, increase the heap available to Maven:
  export MAVEN_OPTS="-Xms256m -Xmx512m"
  The build takes a long time because the dependencies have to be downloaded from the internet.
  The build has succeeded when you see the following:
  [INFO] ------------------------------------------------------------------------
  [INFO] BUILD SUCCESS
  [INFO] ------------------------------------------------------------------------
  [INFO] Total time: 11:53.144s
  [INFO] Finished at: Fri Nov 22 16:58:32 CST 2013
  [INFO] Final Memory: 70M/239M
  [INFO] ------------------------------------------------------------------------
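Since the build runs for a long time, saving the output to a log makes the result easy to check afterwards. A minimal sketch follows; `build_succeeded` and the log file name build.log are assumptions made for illustration, and the check only looks for the BUILD SUCCESS line shown above.

```shell
#!/bin/sh
# build_succeeded: succeeds when the Maven log file ($1)
# contains a BUILD SUCCESS line.
build_succeeded() {
    grep -q 'BUILD SUCCESS' "$1"
}

# Intended use (build.log is a hypothetical file name):
#   export MAVEN_OPTS="-Xms256m -Xmx512m"
#   mvn package -DskipTests -Pdist,native -Dtar 2>&1 | tee build.log
#   build_succeeded build.log && echo "build OK"
```

Piping through tee keeps the output visible on the terminal while also writing it to the log.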
Step 11: After the build
  1.   Inspect the build output
  The build output is located in: hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
  cd /workDir/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
  ll   # list the built directory
  Contents of the built hadoop-2.2.0 directory:
  drwxr-xr-x. 2 root root 4096 Aug 11 12:00 bin
  drwxr-xr-x. 3 root root 4096 Aug 11 12:00 etc
  drwxr-xr-x. 2 root root 4096 Aug 11 12:00 include
  drwxr-xr-x. 3 root root 4096 Aug 11 12:00 lib
  drwxr-xr-x. 2 root root 4096 Aug 11 12:00 libexec
  drwxr-xr-x. 2 root root 4096 Aug 11 12:00 sbin
  drwxr-xr-x. 4 root root 4096 Aug 11 12:00 share
  Go into the bin directory and run the hadoop script to check the version:
  cd bin
  ./hadoop version
  The version information is printed:
  [root@localhost bin]# ./hadoop version
  Hadoop 2.2.0
  Subversion Unknown -r Unknown
  Compiled by root on 2014-08-11T18:34Z
  Compiled with protoc 2.5.0
  From source with checksum 79e53ce7994d1628b240f09af91e1af4
  This command was run using /workDir/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar
  2.   Check the architecture of the native libraries
  cd /workDir/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
  file lib//native/*
  The shared libraries are now 64-bit:
  [root@localhost hadoop-2.2.0]# file lib//native/*
  lib//native/libhadoop.a:        current ar archive
  lib//native/libhadooppipes.a:   current ar archive
  lib//native/libhadoop.so:       symbolic link to `libhadoop.so.1.0.0'
  lib//native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
  lib//native/libhadooputils.a:   current ar archive
  lib//native/libhdfs.a:          current ar archive
  lib//native/libhdfs.so:         symbolic link to `libhdfs.so.0.0.0'
  lib//native/libhdfs.so.0.0.0:   ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
  The build is now complete.

