A solution for keeping the index and the data on the same RegionServer

This post was last edited by howtodown on 2013-12-16 18:19

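1. Overall Solution

Approach:

Each user table has one corresponding index table.
Index creation and updates are implemented entirely in coprocessors on the RegionServer (RS) side.
Core idea: each actual region corresponds to one index region.
- The mapping is one-to-one, and both regions must be on the same RS.
- After a balance or split operation, the affected actual region or index region must follow with the corresponding action.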

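Supported features:

Multiple indexes on a table
Multi-column indexes
Indexes on part of a column value
Scans with equality and range conditions through the index
Bulk loading of data into the index table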

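2. Implementation Details

2.1 Overall architecture:

As the architecture diagram shows, HBase secondary index consists of three main parts: Client Ext, Balancer, and Coprocessor. Their respective roles are:

Client Ext: an extended HBase client. The main addition is that the details of an index can be specified when a table is created; otherwise it does not differ from the native HBase API. It is used as follows: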

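// Describe the user table together with its index definitions
IndexedHTableDescriptor htd = new IndexedHTableDescriptor(usertableName);
// One IndexSpecification per index
IndexSpecification iSpec = new IndexSpecification(indexName);
HColumnDescriptor hcd = new HColumnDescriptor(columnFamily);
// Index the given qualifier of this column family as a String value
iSpec.addIndexColumn(hcd, indexColumnQualifier, ValueType.String, 10);
htd.addFamily(hcd);
htd.addIndex(iSpec);
// admin is assumed to be an existing HBaseAdmin; the index is declared at table creation
admin.createTable(htd);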

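Balancer: Huawei's HBase secondary index reimplements the balancer, mainly to guarantee that each region of the user table is placed on the same RegionServer as its one-to-one corresponding region of the index table. If a user region is moved, its index region must be moved to the same place; likewise, if an index region is moved, its user region must be moved along with it.
Coprocessor: Huawei's HBase secondary index adds more coprocessor hooks than the original HBase core provides, mainly to ensure that when data in the user table is updated, the index data in the index table is updated promptly as well. This raises a question: do the user table update and the index table update together form a transaction, i.e. either both succeed or nothing happens? In HBase, an operation on a single row such as put (including prePut, put, and postPut) is atomic, so performing the index table update through the coprocessor lets it satisfy the transactional requirement together with the user table update.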
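The colocation rule described for the Balancer can be illustrated with a toy model. This is only a sketch under stated assumptions: the class ColocationSketch, its methods, and the region name R1_idx are hypothetical and do not come from the Huawei code; the real balancer works on HBase region plans rather than on plain maps.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the colocation rule; not the actual balancer code. */
final class ColocationSketch {
    // region name -> hosting RegionServer
    private final Map<String, String> assignment = new HashMap<>();
    // user region <-> index region, stored in both directions
    private final Map<String, String> partner = new HashMap<>();

    /** Registers a user region and its index region on the same server. */
    void pair(String userRegion, String indexRegion, String server) {
        partner.put(userRegion, indexRegion);
        partner.put(indexRegion, userRegion);
        assignment.put(userRegion, server);
        assignment.put(indexRegion, server);
    }

    /** Moving either region of a pair drags its partner to the same server. */
    void move(String region, String targetServer) {
        assignment.put(region, targetServer);
        String other = partner.get(region);
        if (other != null) {
            assignment.put(other, targetServer);
        }
    }

    String serverOf(String region) {
        return assignment.get(region);
    }

    public static void main(String[] args) {
        ColocationSketch cluster = new ColocationSketch();
        cluster.pair("R1", "R1_idx", "RS1");
        cluster.move("R1_idx", "RS2");              // the index region is moved first
        System.out.println(cluster.serverOf("R1")); // the user region has followed it
    }
}
```

The point of the model is that a move is never applied to one region of a pair alone, whichever of the two triggers it.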

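2.2 Data operation flow:

create table

Figure 2: create table

When a table is created, the columns to be indexed are specified. After the table is created, its regions are distributed across the RegionServers by the balancer. As Figure 2 shows, the table is divided into two regions, with R1 on RS1 and R2 on RS2. Correspondingly, the index region of R1 (the small yellow block in Figure 2) is also placed on RS1, and likewise the index region of R2 is placed on RS2.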

put operation

Figure 3: put

user table: tab1, family -> cf1, qualifier -> c1, c2

Indexes to create: idx1 on cf1:c1, idx2 on cf1:c2

index table: tab1_idx

Operations:
put 'tab1', 'abd', 'cf1:c1', '5'
put 'tab1', 'abd', 'cf1:c2', 'z1'

When writing to the index table, the rowkey is composed as: region startkey + index name + indexed column value + user table rowkey

After the coprocessor receives the put on the user table, it inserts the following into the index table (the region startkey in this example is aa):

put 'tab1_idx', 'aaidx15abd'
put 'tab1_idx', 'aaidx2z1abd'
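The rowkey rule above can be checked against the example with a few lines of Java. This is a minimal sketch, assuming plain string concatenation; the class and method names are hypothetical, and a real implementation would need fixed-width fields or some other encoding so that the boundaries between the four parts stay unambiguous.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: plain string concatenation, no padding or byte encoding. */
final class IndexRowKeySketch {
    /** region startkey + index name + indexed column value + user table rowkey */
    static String indexRowKey(String regionStartKey, String indexName,
                              String columnValue, String userRowKey) {
        return regionStartKey + indexName + columnValue + userRowKey;
    }

    public static void main(String[] args) {
        String startKey = "aa"; // region startkey implied by the example keys
        List<String> indexRowKeys = new ArrayList<>();
        // put 'tab1', 'abd', 'cf1:c1', '5'  -> entry for idx1
        indexRowKeys.add(indexRowKey(startKey, "idx1", "5", "abd"));
        // put 'tab1', 'abd', 'cf1:c2', 'z1' -> entry for idx2
        indexRowKeys.add(indexRowKey(startKey, "idx2", "z1", "abd"));
        for (String key : indexRowKeys) {
            System.out.println("put 'tab1_idx', '" + key + "'");
        }
    }
}
```

Because the region startkey comes first, every index entry sorts inside the key range of the index region that is paired with the user region holding the row.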

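scan operation

scan condition: c1 = 5

Figure 4: how a scan is initiated

Figure 5: scan flow on the RegionServer side

The coprocessor creates a scanner on an index region with the startrow and endrow range [aaidx15, aaidx16). Once a matching rowkey is found in the index region, the exact row of the user table can be parsed out of that rowkey, and the exact row is then used to look up the actual data in the user table region.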
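The range [aaidx15, aaidx16) and the parsing step can be sketched as follows. This is an illustration under assumptions, not the coprocessor's code: the helper names are hypothetical, the stop row is derived by incrementing the last byte of the prefix, and the user rowkey is recovered by stripping the prefix, which is only unambiguous if indexed values have a known length.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Sketch of an equality scan range on the index table. */
final class IndexScanRangeSketch {
    /** Start row for "index = value": region startkey + index name + value. */
    static byte[] startRow(String regionStartKey, String indexName, String value) {
        return (regionStartKey + indexName + value).getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Exclusive stop row: the prefix with its last byte incremented.
     * Assumes a non-empty prefix whose last byte is not 0xFF.
     */
    static byte[] stopRow(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        stop[stop.length - 1]++;
        return stop;
    }

    /** Strips the scanned prefix to recover the user table rowkey. */
    static String userRowKey(String indexRowKey, String prefix) {
        return indexRowKey.substring(prefix.length());
    }

    public static void main(String[] args) {
        byte[] start = startRow("aa", "idx1", "5");
        byte[] stop = stopRow(start);
        System.out.println("[" + new String(start, StandardCharsets.UTF_8)
                + ", " + new String(stop, StandardCharsets.UTF_8) + ")");
        System.out.println(userRowKey("aaidx15abd", "aaidx15"));
    }
}
```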

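Where query performance improves once an index is created:

Figure 6: overall scan flow

Scan flow with an index: 1 -> 2 -> 3 -> 6 -> 7

Scan flow without an index: 1 -> 2 -> 4 -> 5

The comparison shows that a scan with an index obtains the exact rowkeys matching the condition in step 3, so the subsequent seek in the user table skips most blocks and quickly locates the relevant ones. A scan without an index can only traverse the user table block by block and check each row against the condition.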
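The difference between the two paths can be mimicked with sorted maps standing in for the tables. This is a rough sketch, not HBase code: a TreeMap plays the role of a sorted table, the class and method names are hypothetical, and the sample rows abe and abf are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy comparison of a full scan and an index-driven lookup. */
final class ScanPathSketch {
    /** Without an index: visit every row and test the condition. */
    static List<String> fullScan(TreeMap<String, String> userTable, String wanted) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> row : userTable.entrySet()) {
            if (row.getValue().equals(wanted)) {
                hits.add(row.getKey());
            }
        }
        return hits;
    }

    /** With an index: range-scan the sorted index; only these rows are then sought. */
    static List<String> indexedScan(TreeMap<String, String> indexTable,
                                    String startRow, String stopRow) {
        // subMap is [startRow, stopRow), like an HBase scan range
        return new ArrayList<>(indexTable.subMap(startRow, stopRow).values());
    }

    public static void main(String[] args) {
        TreeMap<String, String> userTable = new TreeMap<>(); // rowkey -> value of cf1:c1
        userTable.put("abd", "5");
        userTable.put("abe", "9");
        userTable.put("abf", "5");

        TreeMap<String, String> indexTable = new TreeMap<>(); // index rowkey -> user rowkey
        indexTable.put("aaidx15abd", "abd");
        indexTable.put("aaidx19abe", "abe");
        indexTable.put("aaidx15abf", "abf");

        System.out.println(fullScan(userTable, "5"));
        System.out.println(indexedScan(indexTable, "aaidx15", "aaidx16"));
    }
}
```

Both calls return the same rowkeys, but the first one examines every row of the user table while the second one touches only the index entries inside the range.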

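2.3 Region split:

To keep each main table region and its index region on the same RegionServer, both automatic and manual splits of the index table must be disabled; a split of an index region can only be triggered by a split of the main table region. When a main table region splits, its index region is divided according to the data each half corresponds to.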
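One possible reading of "divided according to the corresponding data" is sketched below. Everything here is an assumption made for illustration: the Entry class, the split method, and in particular the idea that entries of the upper daughter are re-keyed with the daughter's startkey are not confirmed by the source. The only fixed point is that an index entry must end up with the daughter that holds its user row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of splitting an index region alongside its user region. */
final class IndexSplitSketch {
    /** One index entry, kept as separate fields so the user rowkey is known. */
    static final class Entry {
        final String indexName;
        final String value;
        final String userRowKey;

        Entry(String indexName, String value, String userRowKey) {
            this.indexName = indexName;
            this.value = value;
            this.userRowKey = userRowKey;
        }

        /** Index rowkey under a given region startkey (assumed re-keying). */
        String rowKey(String regionStartKey) {
            return regionStartKey + indexName + value + userRowKey;
        }
    }

    /**
     * Returns [lower, upper]. Region ranges are [start, end), so a user rowkey
     * equal to the split key belongs to the upper daughter.
     */
    static List<List<Entry>> split(List<Entry> entries, String splitKey) {
        List<Entry> lower = new ArrayList<>();
        List<Entry> upper = new ArrayList<>();
        for (Entry e : entries) {
            (e.userRowKey.compareTo(splitKey) < 0 ? lower : upper).add(e);
        }
        return Arrays.asList(lower, upper);
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("idx1", "5", "abd"));
        entries.add(new Entry("idx1", "7", "mno"));
        List<List<Entry>> halves = split(entries, "h"); // user region splits at "h"
        for (Entry e : halves.get(0)) System.out.println("lower: " + e.rowKey("aa"));
        for (Entry e : halves.get(1)) System.out.println("upper: " + e.rowKey("h"));
    }
}
```

The sketch also shows why the index table cannot be split on its own: index rowkeys are ordered by index name and value, so only the user rowkey embedded in each entry tells which daughter it belongs to.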

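3. Salient aspects

3.1 Design

Supports multiple indexes on a table
Multi-column indexes
Indexes on part of a column value
Scans with equality and range conditions through the index
Supports bulk loading of data into the index table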

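3.2 App Usage

Client applications do not need to be modified.
A scan does not need to specify an index; the filter intelligently selects the best index.

3.3 Upgrade/Integration

Only a small number of changes were made to HBase core, mainly the addition of many coprocessor hooks plus the handling of balance and split. (The changes are said to be small, with little impact on HBase upgrades, but how this works out in practice is uncertain.)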

4. Roadmap

Dynamically adding and dropping indexes
Integrating secondary index into the HBase shell
Optimizing range scans
Adding table index support to hbck
WAL optimization for the secondary index table
Make Scan Evaluation Intelligence Pluggable

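5. Summary

Huawei's HBase secondary index makes good use of HBase coprocessors and, built on the core idea that an actual region and its index region always stay on the same RegionServer, implements secondary indexing for HBase. However, its invasive changes to HBase core mean that any HBase upgrade has to consider whether it still fits, especially since HBase 0.96 made many changes, including ones that are incompatible with earlier versions. In addition, Huawei's HBase secondary index was open-sourced only recently and has not yet been tested in real production environments, so its maturity is still an open question. Also, as the roadmap shows, it currently does not support adding or dropping indexes dynamically; indexes can only be specified when the table is created, which is unacceptable in practice. One last problem: it cannot build an index over an existing table; the index is only built up gradually as data grows after the table is created.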