0. System version information
OS: Debian-7.9
JDK: 1.8.0_131
Hadoop: 2.8.1
ZooKeeper: 3.4.9
HBase: 1.3.1
Hive: 2.1.1
Host information
192.168.74.128 master
192.168.74.132 slave-2
192.168.74.133 slave-3
192.168.74.134 slave-4
1. Modify the Hive configuration file (Hadoop, ZooKeeper, and HBase must all be running), then create the related test tables
A: Add the following property to hive-site.xml
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave-2,slave-3,slave-4</value>
<description></description>
</property>
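Before launching Hive, make sure HDFS, ZooKeeper, and HBase are all up. A minimal start-up sketch, assuming each component's bin/sbin directory is on the PATH (adjust to your own installation):
start-dfs.sh        # HDFS
start-yarn.sh       # YARN, needed for the MapReduce jobs Hive launches
zkServer.sh start   # run on every ZooKeeper node
start-hbase.sh      # HMaster and RegionServers
jps                 # check that NameNode, QuorumPeerMain, HMaster, etc. are running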
B: Run ./hive from the bin directory of the Hive installation
cd /home/hadoop/opt/hive-2.1.1/bin
./hive
# or equivalently run
/home/hadoop/opt/hive-2.1.1/bin/hive
C: Create a Hive table backed by HBase
hive>
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "table_1");
D: In the hbase shell, list now shows the newly created table_1
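A quick check in the hbase shell (the exact output depends on your cluster; this is only a sketch):
hbase> list
hbase> describe 'table_1'
list should now include table_1, and describe should show the cf1 column family declared in the Hive mapping.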
2. The HBase table does not exist yet: create the mapped table through Hive, then insert data to test
There are two directions to verify:
1: Data inserted through Hive can be queried in HBase
2: Data inserted through HBase can be queried in Hive
A: Create a staging table in Hive to load the data into
hive> create table ccc(foo int,bar string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
B: Create the data file data.txt
touch /home/hadoop/software/data.txt
vim data.txt
1 zhangsan
2 lisi
3 wangwu
C: Load the data into the staging Hive table; fields are separated by tabs and records by newlines
hive>load data local inpath '/home/hadoop/software/data.txt' overwrite into table ccc;
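A quick sanity check that the load worked (the expected rows assume the sample data.txt above):
hive> select * from ccc;
With that file, the query should return (1, zhangsan), (2, lisi) and (3, wangwu).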
D: Insert the data from the staging table into HBase
hive> insert overwrite table hbase_table_1 select * from ccc where foo=1;
E: In the hbase shell, check whether the data was inserted successfully
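For example, with the sample data only the foo=1 row should show up, since that is all the insert selected:
hbase> scan 'table_1'
# expected: one row, rowkey 1, column cf1:val with value zhangsan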
This shows that data inserted through Hive is ultimately stored in the underlying HBase table.
Next, confirm that records inserted through HBase can also be queried from Hive. Insert data in the hbase shell:
hbase>put 'table_1','4','cf1:val','zhaoliu'
hbase>scan 'table_1'
hive>select * from hbase_table_1;
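With the sample data, this query should now return both the row written from Hive (1, zhangsan) and the row put from the hbase shell (4, zhaoliu), confirming that both sides see the same table.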
3. The HBase table already exists: create the mapped table through Hive, insert data, and test
A: Create a new table in the hbase shell and insert some rows
hbase>create 'student','info'
hbase>put "student",'1','info:name','tom'
hbase>put "student",'2','info:name','lily'
hbase>put "student",'3','info:name','wwn'
hbase>scan 'student'
B: Create an external Hive table mapped to the HBase table to read its data
hive>CREATE EXTERNAL TABLE hbase_table_2(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "info:name")
TBLPROPERTIES("hbase.table.name" = "student");
4. Mapping multiple columns and column families between Hive and HBase
Create an HBase table named customer with the following layout:
rowkey   | address                         | info          | contact
         | province  | city      | country | age | company | phone
zhangsan | hubei     | wuhan     | china   | 24  | douyu   | 101
lisi     | guangdong | guangzhou | china   | 24  | netease | 102
hbase>create 'customer','address','info', 'contact'
hbase>put 'customer','zhangsan','contact:phone','101'
hbase>put 'customer','zhangsan','address:province','hubei'
hbase>put 'customer','zhangsan','address:city','wuhan'
hbase>put 'customer','zhangsan','address:country','china'
hbase>put 'customer','zhangsan','info:age','24'
hbase>put 'customer','zhangsan','info:company','douyu'
hbase>put 'customer','lisi','contact:phone','102'
hbase>put 'customer','lisi','address:province','guangdong'
hbase>put 'customer','lisi','address:city','guangzhou'
hbase>put 'customer','lisi','address:country','china'
hbase>put 'customer','lisi','info:age','24'
hbase>put 'customer','lisi','info:company','netease'
hive>CREATE EXTERNAL TABLE hbase_table_3(key string, province string, city string, country string, age int, company string, phone string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,address:province, address:city, address:country, info:age, info:company,contact:phone")
TBLPROPERTIES("hbase.table.name" = "customer");