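2. Edit conf/hbase-site.xml: specify the HDFS address that HBase connects to, HBase's local tmp directory, that HBase runs in distributed mode, which nodes host the built-in ZooKeeper (the start script launches ZooKeeper on those machines), and the ZooKeeper data directory. Note that ZooKeeper should be deployed on an odd number of nodes, because its internal consensus election requires a majority.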
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://m1-reader-q1preonline07.m1.baidu.com:8010/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/home/liangdong/hadoop/hbase/tmp</value>
<description>Temporary directory on the local filesystem.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>m1-reader-q1preonline07.m1.baidu.com,m1-reader-q1preonline08.m1.baidu.com,m1-reader-q1preonline09.m1.baidu.com</value>
<description>Comma separated list of servers in the ZooKeeper Quorum.
For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
By default this is set to localhost for local and pseudo-distributed modes
of operation. For a fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop ZooKeeper on.
</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/liangdong/hadoop/hbase/zookeeper</value>
<description>Property from ZooKeeper's config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
</configuration>
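The odd-number rule for ZooKeeper comes from majority voting: an ensemble of n nodes needs n // 2 + 1 of them alive, so 3 nodes tolerate 1 failure and a 4th node adds no extra tolerance. As a rough sketch (not part of HBase itself), the following Python script reads an hbase-site.xml string and reports the quorum size. The function names `load_properties` and `quorum_report`, and the hosts in `SAMPLE_XML`, are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample config; the host names are placeholders, not real machines.
SAMPLE_XML = """<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
</configuration>"""


def load_properties(xml_text):
    """Return {name: value} for every <property> in an hbase-site.xml string."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def quorum_report(props):
    """Summarize the ZooKeeper quorum listed in hbase.zookeeper.quorum."""
    raw = props.get("hbase.zookeeper.quorum", "")
    hosts = [h.strip() for h in raw.split(",") if h.strip()]
    size = len(hosts)
    majority = size // 2 + 1  # nodes that must stay alive
    return {
        "hosts": hosts,
        "size": size,
        "odd": size % 2 == 1,
        "majority": majority,
        "tolerated_failures": max(size - majority, 0),
    }


if __name__ == "__main__":
    report = quorum_report(load_properties(SAMPLE_XML))
    print(report)
    if not report["odd"]:
        print("warning: quorum size is even; an odd number of nodes is recommended")
```

To check a real file, read its contents and pass them to `load_properties` in place of `SAMPLE_XML`.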
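3. Edit conf/regionservers: this lists the nodes on which the HBase cluster starts RegionServers. Nodes can be added at any time; HBase will automatically split the key ranges held by some RegionServers and let the idle nodes load them and start serving. Here I filled in the addresses of my 3 Hadoop datanodes: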