Apache Oozie Installation
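Oozie is a workflow coordination system used mainly to manage Hadoop jobs. It is a web application made up of two components, the Oozie client and the Oozie server; the server is a web program that runs inside a Java servlet container (Tomcat). Because HUE depends on Oozie, this post covers installing and configuring Oozie first; a document on installing and configuring HUE will follow.
1. Environment
The Hadoop cluster services were set up beforehand, as shown in the figure below:
[Figure: Hadoop cluster services]
2. Building from source
The package downloaded here is the source release, so it has to be compiled.
2.1 Change the Java version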
$ wget https://mirrors.cnnic.cn/apache/oozie/4.3.0/oozie-4.3.0.tar.gz
$ tar -xzf oozie-4.3.0.tar.gz;mv oozie-4.3.0 oozie
$ cd oozie
$ vi pom.xml    # change targetJavaVersion in this file to 1.8
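The manual edit above can also be scripted. The following is a minimal sketch, assuming the setting appears in pom.xml as a single-line targetJavaVersion element; the function name set_target_java is made up for illustration and is not part of Oozie.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the <targetJavaVersion> element in a pom.xml.
# Assumes the element sits on one line; a .bak copy of the file is kept.
set_target_java() {
    pom_file=$1
    java_version=$2
    sed -i.bak \
        "s|<targetJavaVersion>[^<]*</targetJavaVersion>|<targetJavaVersion>${java_version}</targetJavaVersion>|" \
        "$pom_file"
}

# Usage (run from the top of the Oozie source tree):
#   set_target_java pom.xml 1.8
#   grep targetJavaVersion pom.xml
```

Keeping the .bak copy makes it easy to diff the change before starting the long build.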
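2.2 Change the Hadoop version
In the pom.xml files under hadooplibs/hadoop-auth-2, hadooplibs/hadoop-distcp-2, and hadooplibs/hadoop-utils-2, change the Hadoop version to the one the cluster is currently running, 2.7.4.
2.3 Build
The build downloads a large number of dependencies and takes a while. With a reasonably fast connection it took about one hour in total here.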
$ cd /u01/oozie
$ bin/mkdistro.sh -DskipTests -Dhadoop.version=2.7.4
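After a successful build, the binary package is generated under /u01/oozie/distro/target, as shown in the figure below:
[Figure: contents of /u01/oozie/distro/target after the build]
3. Installation and configuration
3.1 Install
Oozie is again installed under /u01. The earlier oozie source folder has been moved to another directory, so adjust the tarball path in the next command if it no longer lives at /u01/oozie.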
$ cd /u01;tar -xzf /u01/oozie/distro/target/oozie-4.3.0-distro.tar.gz
$ mv oozie-4.3.0 oozie;cd oozie
Edit conf/oozie-site.xml and add the following properties:
oozie.service.HadoopAccessorService.hadoop.configurations=*=/u01/hadoop/etc/hadoop
oozie.service.WorkflowAppService.system.libpath=hdfs://192.168.120.96:9000/user/hadoop/share/lib
oozie.service.ProxyUserService.proxyuser.#USER#.
oozie.service.ProxyUserService.proxyuser.#USER#.groups
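Replace #USER# above with the system user that runs the Oozie service, which is hadoop here, as shown in the figure below:
[Figure: oozie-site.xml with #USER# replaced by hadoop]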
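Since oozie-site.xml is a Hadoop-style XML configuration file, each name=value pair listed above goes into its own property element. Below is a minimal sketch that prints such elements; emit_property is a hypothetical helper name, and it does no XML escaping, so it only suits simple values like the ones used here.

```shell
#!/bin/sh
# Hypothetical helper: print one <property> element for oozie-site.xml.
# No XML escaping is done; values containing &, < or > need manual care.
emit_property() {
    printf '    <property>\n'
    printf '        <name>%s</name>\n' "$1"
    printf '        <value>%s</value>\n' "$2"
    printf '    </property>\n'
}

# Usage: paste the output inside <configuration>...</configuration>.
#   emit_property oozie.service.HadoopAccessorService.hadoop.configurations '*=/u01/hadoop/etc/hadoop'
#   emit_property oozie.service.WorkflowAppService.system.libpath 'hdfs://192.168.120.96:9000/user/hadoop/share/lib'
```

Quoting the value matters for the first property, because an unquoted * would be expanded by the shell.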
3.2 ExtJS library
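The Oozie web console needs the ExtJS library as well as the Hadoop libraries, so copy the Hadoop JAR files into libext and download the JavaScript library ext-2.2.zip. The Oozie server uses Tomcat 6.0.41 by default, and Hadoop ships its own embedded server. Copying every Hadoop JAR with the two cp commands below can therefore cause conflicts, because the two servers may well use different servlet and JSP versions. For that reason, delete the following JARs from libext (only jsp-api was present in my case):
- jasper-compiler-5.5.23.jar
- jasper-runtime-5.5.23.jar
- jsp-api-2.1.jar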
$ mkdir libext
$ cd libext/
$ wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
$ cp /u01/hive/lib/mysql-connector-java-5.1.44-bin.jar .
$ cp /u01/hadoop/share/hadoop/*/*.jar .
$ cp /u01/hadoop/share/hadoop/*/lib/*.jar .
$ rm -rf jsp-api-2.1.jar
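The Oozie server also depends on a database, where it stores metadata and workflow run information. The MySQL driver JAR is copied into libext here as well, ready for the switch to MySQL later on.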
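The cleanup of the three conflicting JARs listed earlier in this section can be wrapped in a small loop, so that JARs that are absent are simply skipped. This is a sketch only; remove_conflicting_jars is a hypothetical name, and the JAR versions are the ones given above and may differ in other Hadoop releases.

```shell
#!/bin/sh
# Hypothetical helper: delete the servlet/JSP JARs named in this section from
# a libext directory, skipping any that are not present.
remove_conflicting_jars() {
    libext_dir=$1
    for jar in jasper-compiler-5.5.23.jar jasper-runtime-5.5.23.jar jsp-api-2.1.jar; do
        if [ -f "$libext_dir/$jar" ]; then
            rm -f "$libext_dir/$jar"
            echo "removed $jar"
        fi
    done
}

# Usage:
#   remove_conflicting_jars /u01/oozie/libext
```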
3.3 Package the libraries
$ oozie-setup.sh prepare-war
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
……..
New Oozie WAR file with added 'ExtJS library, JARs' at /u01/oozie/oozie-server/webapps/oozie.war
INFO: Oozie is ready to be started
3.4 Create the shared library path
Create the path that holds the shared libraries directly on HDFS:
$ oozie-setup.sh sharelib create -fs hdfs://hdp01:9000
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
……
the destination path for sharelib is: /user/hadoop/share/lib/lib_20171214123559
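When scripting this step, the destination path can be pulled out of the output shown above and then checked on HDFS. A minimal sketch, assuming the message format stays exactly as printed here and that the message goes to standard output; sharelib_path is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read oozie-setup.sh output on stdin and print the
# sharelib destination path, matching the message format shown above.
sharelib_path() {
    sed -n 's/^the destination path for sharelib is: //p'
}

# Usage (the hdfs listing is one possible way to confirm the upload):
#   dest=$(oozie-setup.sh sharelib create -fs hdfs://hdp01:9000 | sharelib_path)
#   hdfs dfs -ls "$dest"
```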
3.5 Create the Oozie metadata database
By default, Oozie stores its metadata in the embedded Derby database.
$ oozie-setup.sh db create -run
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Validate DB Connection
DONE
DB schema does not exist
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
DONE
Create OOZIE_SYS table
DONE
Oozie DB has been created for Oozie version '4.3.0'
The SQL commands have been written to: /tmp/ooziedb-8084517656754581469.sql
3.6 Start Oozie
$ oozied.sh start
Setting OOZIE_HOME: /u01/oozie
Setting OOZIE_CONFIG: /u01/oozie/conf
Sourcing: /u01/oozie/conf/oozie-env.sh
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Setting OOZIE_DATA: /u01/oozie/data
Setting OOZIE_LOG: /u01/oozie/logs
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD:10
Setting OOZIE_HTTP_HOSTNAME: hdp01
Setting OOZIE_HTTP_PORT: 11000
Setting OOZIE_ADMIN_PORT: 11001
Setting OOZIE_HTTPS_PORT: 11443
Setting OOZIE_BASE_URL: http://hdp01:11000/oozie
Setting CATALINA_BASE: /u01/oozie/oozie-server
Setting OOZIE_HTTPS_KEYSTORE_FILE: /home/hadoop/.keystore
Setting OOZIE_HTTPS_KEYSTORE_PASS: password
Setting OOZIE_INSTANCE_ID: hdp01
Setting CATALINA_OUT: /u01/oozie/logs/catalina.out
Setting CATALINA_PID: /u01/oozie/oozie-server/temp/oozie.pid
Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/u01/oozie/logs/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/u01/oozie -Doozie.config.dir=/u01/oozie/conf -Doozie.log.dir=/u01/oozie/logs -Doozie.data.dir=/u01/oozie/data -Doozie.instance.id=hdp01 -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=hdp01 -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://hdp01:11000/oozie -Doozie.https.keystore.file=/home/hadoop/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=
Setting up oozie DB
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Validate DB Connection
DONE
DB schema exists
The SQL commands have been written to: /tmp/ooziedb-4054281256507508551.sql
Using CATALINA_BASE: /u01/oozie/oozie-server
Using CATALINA_HOME: /u01/oozie/oozie-server
Using CATALINA_TMPDIR: /u01/oozie/oozie-server/temp
Using JRE_HOME: /usr/java/jdk1.8.0_152
Using CLASSPATH: /u01/oozie/oozie-server/bin/bootstrap.jar
Using CATALINA_PID: /u01/oozie/oozie-server/temp/oozie.pid
3.7 Verify
$ oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL
A result of NORMAL from the command above means the server started correctly.
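Tomcat can take a few seconds to come up, so a status check run right after startup may fail even on a healthy install. The sketch below polls the same status command until it reports NORMAL; wait_for_oozie, the retry count, and the delay are illustrative assumptions, not Oozie features.

```shell
#!/bin/sh
# Hypothetical helper: poll the status command shown above until the server
# reports NORMAL, or give up after a number of attempts.
wait_for_oozie() {
    oozie_url=$1
    max_tries=${2:-30}   # assumed default: 30 attempts
    delay=${3:-2}        # assumed default: 2 seconds between attempts
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if oozie admin -oozie "$oozie_url" -status 2>/dev/null | grep -q 'System mode: NORMAL'; then
            echo "Oozie is up (attempt $try)"
            return 0
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    echo "Oozie did not report NORMAL after $max_tries attempts" >&2
    return 1
}

# Usage:
#   wait_for_oozie http://localhost:11000/oozie
```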
4. MySQL DataStore for Oozie
A remote MySQL database is used here.
4.1 Create the database
mysql> create database oozie;
mysql> create user oozie identified by "abcABC@12";
mysql> grant all privileges on oozie.* to 'oozie'@'%' identified by "abcABC@12";
mysql> flush privileges;
4.2 Edit oozie-site.xml and add the following properties:
oozie.service.JPAService.create.db.schema=false
oozie.service.JPAService.jdbc.driver=com.mysql.jdbc.Driver
oozie.service.JPAService.jdbc.url=jdbc:mysql://mydb01:3306/oozie?useSSL=false
oozie.service.JPAService.jdbc.username=oozie
oozie.service.JPAService.jdbc.password=abcABC@12
oozie.service.HadoopAccessorService.hadoop.configurations=*=/u01/hadoop/etc/hadoop
oozie.service.WorkflowAppService.system.libpath=hdfs://192.168.120.96:9000/user/hadoop/share/lib
Because oozie.service.JPAService.create.db.schema is set to false, Oozie does not create its tables at startup. The schema has to exist in MySQL first, which can be done by rerunning the oozie-setup.sh db create -run command from section 3.5 once these settings are saved.
4.3 Start Oozie
$ oozie-start.sh
WARN: Use of this script is deprecated; use 'oozied.sh start' instead
Setting OOZIE_HOME: /u01/oozie
Setting OOZIE_CONFIG: /u01/oozie/conf
Sourcing: /u01/oozie/conf/oozie-env.sh
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Setting OOZIE_DATA: /u01/oozie/data
Setting OOZIE_LOG: /u01/oozie/logs
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD:10
Setting OOZIE_HTTP_HOSTNAME: hdp01
Setting OOZIE_HTTP_PORT: 11000
Setting OOZIE_ADMIN_PORT: 11001
Setting OOZIE_HTTPS_PORT: 11443
Setting OOZIE_BASE_URL: http://hdp01:11000/oozie
Setting CATALINA_BASE: /u01/oozie/oozie-server
Setting OOZIE_HTTPS_KEYSTORE_FILE: /home/hadoop/.keystore
Setting OOZIE_HTTPS_KEYSTORE_PASS: password
Setting OOZIE_INSTANCE_ID: hdp01
Setting CATALINA_OUT: /u01/oozie/logs/catalina.out
Setting CATALINA_PID: /u01/oozie/oozie-server/temp/oozie.pid
Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/u01/oozie/logs/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/u01/oozie -Doozie.config.dir=/u01/oozie/conf -Doozie.log.dir=/u01/oozie/logs -Doozie.data.dir=/u01/oozie/data -Doozie.instance.id=hdp01 -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=hdp01 -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://hdp01:11000/oozie -Doozie.https.keystore.file=/home/hadoop/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=
Setting up oozie DB
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Validate DB Connection
DONE
DB schema exists
The SQL commands have been written to: /tmp/ooziedb-1436191594180946798.sql
Using CATALINA_BASE: /u01/oozie/oozie-server
Using CATALINA_HOME: /u01/oozie/oozie-server
Using CATALINA_TMPDIR: /u01/oozie/oozie-server/temp
Using JRE_HOME: /usr/java/jdk1.8.0_152
Using CLASSPATH: /u01/oozie/oozie-server/bin/bootstrap.jar
Using CATALINA_PID: /u01/oozie/oozie-server/temp/oozie.pid
$ oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL