Counting the rows of a table with org.apache.hadoop.hbase.coprocessor.AggregateImplementation

sxyzy · posted 2015-11-11 10:48:12

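HBase ships with an aggregation coprocessor class: org.apache.hadoop.hbase.coprocessor.AggregateImplementation. It can be used to count the total number of rows in a table.
Of course, you can also run count '<table_name>' in the hbase shell. I compared the execution time of the two. On a table of mine with a bit over 7 million rows, count in the hbase shell took a full 12 minutes, while counting through the coprocessor took only 78 seconds! The difference comes from where the work happens: with the coprocessor each region counts its own rows on the server side and only the partial counts travel back to the client, whereas the shell count pulls every row through a single client-side scan.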
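To make the "count per region, then sum" idea concrete, here is a minimal sketch in plain Java. It uses no HBase classes at all: the class name RegionCountSketch, the method countRows and the nested lists standing in for regions are all hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the aggregation idea; this is not HBase code.
class RegionCountSketch {

    // Each inner list stands in for the rows held by one region.
    static long countRows(List<List<String>> regions) {
        return regions.parallelStream()
                .mapToLong(List::size) // partial count, computed per region
                .sum();                // the caller only adds up the partial counts
    }

    public static void main(String[] args) {
        List<List<String>> regions = Arrays.asList(
                Arrays.asList("row1", "row2", "row3"),
                Arrays.asList("row4", "row5"),
                Arrays.asList("row6"));
        System.out.println("rowcount:" + countRows(regions)); // prints rowcount:6
    }
}
```

In the sketch only one number per region reaches the caller, which is the property that keeps the coprocessor count cheap compared with shipping every row to the client.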
　　

　　

Adding the coprocessor through the HBase API:

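import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.coprocessor.AggregateImplementation;

// TABLE_NAME is defined elsewhere in the program (not shown in this post).
Configuration hbaseconfig = HBaseConfiguration.create();
HBaseAdmin hbaseAdmin = new HBaseAdmin(hbaseconfig);
// The table is taken offline while its descriptor is modified.
hbaseAdmin.disableTable(TABLE_NAME);
HTableDescriptor htd = hbaseAdmin.getTableDescriptor(TABLE_NAME);
htd.addCoprocessor(AggregateImplementation.class.getName());
hbaseAdmin.modifyTable(TABLE_NAME, htd);
hbaseAdmin.enableTable(TABLE_NAME);
hbaseAdmin.close();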

Using the aggregation coprocessor provided by HBase:

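import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
import org.apache.hadoop.hbase.util.Bytes;

// hbaseconfig and TABLE_NAME are the same as in the previous snippet.
AggregationClient aggregationClient = new AggregationClient(hbaseconfig);
Scan scan = new Scan();
scan.addFamily(Bytes.toBytes("fr")); // restrict the scan to the "fr" column family
long start = System.currentTimeMillis();
// rowCount is declared to throw Throwable, so the caller must handle or declare it.
long rowcount = aggregationClient.rowCount(TABLE_NAME, new LongColumnInterpreter(), scan);
long end = System.currentTimeMillis();
System.out.println("rowcount:" + rowcount);
System.out.println("timecost:" + (end - start) + " ms");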
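If you want to time several counting approaches the same way, the measurement can be pulled into a small helper. This is only a sketch in plain Java: TimedCount and its time method are hypothetical names, and the lambda in main is a placeholder rather than a real HBase call. Because rowCount is declared to throw Throwable, a real call would have to be wrapped in a try/catch inside the lambda.

```java
import java.util.function.LongSupplier;

// Hypothetical timing helper; no HBase classes are involved.
class TimedCount {

    // Runs the supplied counting call and returns {result, elapsedMillis}.
    static long[] time(LongSupplier countCall) {
        long startNanos = System.nanoTime(); // monotonic clock, suited to measuring intervals
        long result = countCall.getAsLong();
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        return new long[] { result, elapsedMillis };
    }

    public static void main(String[] args) {
        long[] measured = time(() -> 42L); // placeholder for the real counting call
        System.out.println("rowcount:" + measured[0]);
        System.out.println("timecost:" + measured[1] + " ms");
    }
}
```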

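Adding a coprocessor from the hbase shell. The attribute value has the form jar path|class name|priority|arguments; the example below loads a custom coprocessor class from a jar on HDFS:

disable 'member'

alter 'member', METHOD => 'table_att', 'coprocessor' => 'hdfs://master24:9000/user/hadoop/jars/test.jar|mycoprocessor.SampleCoprocessor|1001|'

enable 'member'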
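The value passed to 'coprocessor' is easy to get wrong by hand, so here is a minimal sketch that assembles and splits it. CoprocessorSpec, build and parse are hypothetical helper names, not part of HBase; the only thing taken from HBase is the four-field layout jar path|class name|priority|arguments.

```java
// Hypothetical helper for the 'coprocessor' table attribute value; not an HBase class.
class CoprocessorSpec {

    // Joins the four fields with '|'; a null field is written as an empty string.
    static String build(String jarPath, String className, int priority, String arguments) {
        return (jarPath == null ? "" : jarPath) + "|" + className + "|" + priority + "|"
                + (arguments == null ? "" : arguments);
    }

    // Splits a spec back into its fields; the -1 limit keeps a trailing empty arguments field.
    static String[] parse(String spec) {
        return spec.split("\\|", -1);
    }

    public static void main(String[] args) {
        String spec = build("hdfs://master24:9000/user/hadoop/jars/test.jar",
                "mycoprocessor.SampleCoprocessor", 1001, "");
        System.out.println(spec);
        System.out.println(parse(spec).length + " fields");
    }
}
```

As far as I know, the jar path may be left empty when the class is already on the region server classpath, which should be the case for the built-in AggregateImplementation; treat that as an assumption to verify against your HBase version.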

　　

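Removing a coprocessor from the hbase shell. Coprocessors are stored as table attributes named coprocessor$1, coprocessor$2, and so on (describe 'member' shows them), and the attribute name is what you pass as NAME:

disable 'member'

alter 'member', METHOD => 'table_att_unset', NAME => 'coprocessor$1'

enable 'member'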
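When a table carries more than one coprocessor, you first need to know which coprocessor$N attribute holds the class you want to remove. The sketch below does that lookup over a plain map of attribute names to values, such as you might copy out of the describe output. CoprocessorKeyLookup and findKey are hypothetical names, and the map contents in main are made-up sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup over table attributes; not an HBase class.
class CoprocessorKeyLookup {

    // Returns the attribute name whose spec names the given class, or null if none does.
    static String findKey(Map<String, String> tableAttributes, String className) {
        for (Map.Entry<String, String> entry : tableAttributes.entrySet()) {
            if (!entry.getKey().startsWith("coprocessor$")) {
                continue; // not a coprocessor attribute
            }
            // Spec layout: jar path|class name|priority|arguments
            String[] fields = entry.getValue().split("\\|", -1);
            if (fields.length >= 2 && fields[1].trim().equals(className)) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("coprocessor$1",
                "|org.apache.hadoop.hbase.coprocessor.AggregateImplementation||");
        attributes.put("coprocessor$2",
                "hdfs://master24:9000/user/hadoop/jars/test.jar|mycoprocessor.SampleCoprocessor|1001|");
        // prints coprocessor$2
        System.out.println(findKey(attributes, "mycoprocessor.SampleCoprocessor"));
    }
}
```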

　　

Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.
