Installing Hive 1.2 in Docker
Building the data warehouse (on Slave1)
1. Extract the Hive package
tar -zxvf apache-hive-1.2.2-bin.tar.gz -C /usr/local/
cd /usr/local/
mv apache-hive-1.2.2-bin hive
2. Add environment variables for Hive
Edit the /etc/profile file and add the Hive-related environment variable settings:
https://s1.运维网.com/images/blog/201901/09/327ab1dd7c8261058b45ad005318e2ee.png
After editing the profile file, run the following command to make the configuration take effect:
https://s1.运维网.com/images/blog/201901/09/643ac1fb5efce4fd2f242e55c02a8f0b.png
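The exact lines are only shown in the screenshots above; a typical addition to /etc/profile, assuming Hive was unpacked to /usr/local/hive as in step 1, looks like this:

```shell
# Hive environment variables appended to /etc/profile
# (paths assume the install location used in this article)
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
```

The command that makes it take effect in the current shell is `source /etc/profile`.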
3. Configure hive-site.xml
Set up the hive-site.xml configuration:
cd /usr/local/hive/conf
cp hive-default.xml.template hive-site.xml
Then create the directories with Hadoop:
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -mkdir -p /tmp/hive/
Grant read/write permissions on the newly created directories:
hadoop fs -chmod 777 /user/hive/warehouse
hadoop fs -chmod 777 /tmp/hive
These directories are needed because hive-site.xml contains the following configuration:
https://s1.运维网.com/images/blog/201901/09/998bc1dcfe907a52a2c595c0bd718492.png
https://s1.运维网.com/images/blog/201901/09/681d18398cfb6ff052d45cb2b3a75e85.png
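The screenshots show the relevant properties; they correspond to the standard hive-site.xml defaults below (the values match the directories created above, but verify against your own copy of the file):

```xml
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
</property>
```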
Check that the directories were created successfully:
hadoop fs -ls /user/hive/
hadoop fs -ls /tmp/
https://s1.运维网.com/images/blog/201901/09/0097c874a35bb2dba69c897c482ee811.png
Change the temporary directory in hive-site.xml
Replace every occurrence of ${system:java.io.tmpdir} in hive-site.xml with Hive's temporary directory (here I use /usr/local/hive/tmp), replace every ${system:user.name} with root, and grant the directory read/write permissions.
https://s1.运维网.com/images/blog/201901/09/15315da68f9d1c06339dcd33ebc65044.png
https://s1.运维网.com/images/blog/201901/09/0c01f9b59abeaf8458808a6c21b6f953.png
Note that every occurrence must be replaced; a sed substitution handles them all at once:
sed -i 's@${system:java.io.tmpdir}@/usr/local/hive/tmp/@g' hive-site.xml
sed -i 's@${system:user.name}@root@g' hive-site.xml
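To see what the substitution does, here is the same sed pair run against a one-line sample file (GNU sed is assumed for `-i`):

```shell
# A sample line shaped like the ones in hive-site.xml
echo '<value>${system:java.io.tmpdir}/${system:user.name}</value>' > /tmp/demo.xml
# Same substitutions as above; @ is the delimiter, so the / in the paths needs no escaping
sed -i 's@${system:java.io.tmpdir}@/usr/local/hive/tmp/@g' /tmp/demo.xml
sed -i 's@${system:user.name}@root@g' /tmp/demo.xml
cat /tmp/demo.xml
```

The sample line becomes `<value>/usr/local/hive/tmp//root</value>`; the doubled slash comes from the trailing slash in the replacement and is harmless.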
Change the database-related configuration in hive-site.xml
Search for javax.jdo.option.ConnectionURL and change its value to the MySQL address:
https://s1.运维网.com/images/blog/201901/09/311267c2dfca5022bf846094780eb598.png
https://s1.运维网.com/images/blog/201901/09/66863cc2b778931dedfa2dde27c1fee9.png
Search for javax.jdo.option.ConnectionDriverName and change its value to the MySQL driver class name:
https://s1.运维网.com/images/blog/201901/09/28ad50398e2b3648ea231cf7e8e323d2.png
Search for javax.jdo.option.ConnectionUserName and change its value to the MySQL login user:
https://s1.运维网.com/images/blog/201901/09/7d3d96b2e494f20a332bab032ec0749b.png
Search for javax.jdo.option.ConnectionPassword and change its value to the MySQL login password:
https://s1.运维网.com/images/blog/201901/09/cb3f2601216b064adeed8bb1eec306e1.png
Search for hive.metastore.schema.verification and change its value to false:
https://s1.运维网.com/images/blog/201901/09/1bf78ad53562717cc62f3894fcbd1655.png
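The exact values are only visible in the screenshots; assuming MySQL runs on this same node with a database, user, and password all named "hive" (as created in step 6 below), the properties would look like this:

```xml
<!-- Example values; host, database, user, and password are assumptions -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
</property>
```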
4. Load the MySQL driver
Upload the MySQL driver jar to the lib directory:
cd /usr/local/hive/lib/
wget ftp://172.18.79.77/mysql-connector-java-5.1.5-bin.jar
Note that this address is a private one of mine; you can find the MySQL driver jar with a web search, or download it from my blog (the site does not allow uploading duplicate material, which is annoying, so I bundled the driver jar with a throwaway txt file into a single archive before uploading).
https://s1.运维网.com/images/blog/201901/09/a402a7962e9b3946422c4e2c513955ee.png
5. Modify hive-env.sh
Create the hive-env.sh file and edit it:
cd /usr/local/hive/conf
Make a copy of the hive-env.sh.template file and rename it to hive-env.sh:
cp hive-env.sh.template hive-env.sh
Open hive-env.sh and add the following:
export HADOOP_HOME=/usr/local/hadoop
export HIVE_CONF_DIR=/usr/local/hive/conf
export HIVE_AUX_JARS_PATH=/usr/local/hive/lib
https://s1.运维网.com/images/blog/201901/09/304ba99b787f79b029b1b5c01a37b354.png
6. Initialize MySQL
Initialize the MySQL database.
Log in to MySQL and create the hive user and the hive database:
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
grant all privileges on *.* to 'hive'@'localhost' identified by 'hive';
create database hive;
Reload the grant tables:
flush privileges;
Exit:
exit;
Test that the hive user was created successfully:
mysql -uhive -p
Enter password: (type hive)
Go to Hive's bin directory:
cd /usr/local/hive/bin
Initialize the metastore database:
./schematool -initSchema -dbType mysql
Once it succeeds, the hive database in MySQL contains a whole set of freshly created metastore tables.
https://s1.运维网.com/images/blog/201901/09/d3635bd5d7e4e013aa2aff4d5e0814d4.png
7. Test Hive
Start Hive and run a quick test.
Go to Hive's bin directory:
cd /usr/local/hive/bin
Run the hive script to start it:
./hive
https://s1.运维网.com/images/blog/201901/09/781b558a85b4fbba2b8eb216801a554a.png
Show the details of the sum function:
desc function sum;
https://s1.运维网.com/images/blog/201901/09/e557cc53c6676887791a65dc2f86f5d6.png
Create a new database with a Hive command:
create database db_hive;
https://s1.运维网.com/images/blog/201901/09/5f0da5b474ac861bb72163de21584f27.png
Create a table in the database you just created, using these Hive commands:
use db_hive;
create table student(id int, name string) row format delimited fields terminated by '\t';
https://s1.运维网.com/images/blog/201901/09/a97e3d5051878b4885fab58db41b1ca6.png
Load data from a file into the table
Create a new file inside the /usr/local/hive directory.
Run this Linux command (preferably in a freshly opened second terminal):
touch /usr/local/hive/std.txt
https://s1.运维网.com/images/blog/201901/09/d50dfc19038400ad455fd50879a22c80.png
https://s1.运维网.com/images/blog/201901/09/314f2f13a43a707df5a51b2bf1692caa.png
Note: the separator between the id and the name must be a TAB character, not spaces, because the table was created with terminated by '\t'; if copy-and-paste mangles the TABs, type them by hand. Also, there must be no blank lines between rows, otherwise the load below will store NULL rows in the table. The file must use Unix line endings; if you edit it in a Windows text editor and then upload it to the server, convert it from Windows to Unix format first, for example with Notepad++.
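If typing real TABs is awkward, printf writes them reliably. A sketch (the two sample rows are made up, and /tmp is used here so the snippet is self-contained; the tutorial's actual path is /usr/local/hive/std.txt):

```shell
# \t emits a literal TAB between id and name; no blank lines between rows
printf '1\tzhangsan\n2\tlisi\n' > /tmp/std.txt
# cat -A shows TABs as ^I so the separators can be verified by eye
cat -A /tmp/std.txt
```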
Once the steps above are done, the file /usr/local/hive/std.txt exists on disk and has content in it; from the Hive command line, run the load command:
load data local inpath '/usr/local/hive/std.txt' into table db_hive.student;
https://s1.运维网.com/images/blog/201901/09/f0d936cf53c7dd621e64d326428366e5.png
Run a query to check that the rows from the file were written successfully:
https://s1.运维网.com/images/blog/201901/09/b8de5535fb30ce4da4b43478d0e2a3d7.png
Check the NameNode web UI:
http://172.18.74.105:50070/explorer.html#/user/hive/warehouse/db_hive.db/student
https://s1.运维网.com/images/blog/201901/09/ab5cca5cdc43c2623b5d5e7ebbd91baa.png
Click std.txt:
https://s1.运维网.com/images/blog/201901/09/6f50f8aff6d6dd820b7282bb2959af7a.png
Run a SELECT statement in the MySQL database to see the tables Hive created:
select * from hive.TBLS;
https://s1.运维网.com/images/blog/201901/09/a5dce6ce8fb6165ff08e262e387d9f41.png