HFile File Format and HBase Reads and Writes

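Last edited by howtodown on 2014-10-1 00:15
Guiding questions
1. What file organization does HBase use to store data?
2. What are the characteristics of an HFile?
3. How does the HFile V2 write path work?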

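HFile is the file format HBase uses to store data. Its characteristics:
1) An HFile consists of DataBlocks, Meta information (Index, BloomFilter), Info, and related sections.
2) Each DataBlock consists of one or more KeyValues.
3) Entries within the file are sorted by Key.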

HFile V1 data layout:
The DataBlock region and the MetaBlock (bloomfilter) are stored separately from FileInfo, DataBlockIndex, MetaBlockIndex, and Trailer.
Opening an HFile requires loading FileInfo, DataBlockIndex, MetaBlockIndex, and the Fixed File Trailer into memory.

As shown in the figure below:

The HFile data format was upgraded from V1 to V2 in HBase 0.92.

The HFile V2 data layout is shown in the figure below:

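Compared with V1, the differences are:
1) The file is divided into three parts: the Scanned block section, the Non-scanned block section, and the Opening-time data section.
2) The DataBlockIndex becomes a multi-level index. It is split into Leaf Index Blocks and a Root Data Index (or a multi-level Root Data Index; the purple Meta Index region in the figure). A Leaf Index Block stores the offset, length, and first key of each DataBlock. The Root Data Index stores, for each Leaf Index Block, its offset, its length, the first key it records, and the cumulative number of DataBlocks recorded up to and including that Leaf Index Block. If there are enough DataBlocks and the HFile is large enough that the Root Data Index would itself exceed the chunk size (128KB by default), the index is split into more levels. The final structure can then be ROOT INDEX -> Intermediate Level ROOT INDEX (possibly several levels) -> Leaf Index Block.
The ROOT INDEX also records the information for the Mid Key, which helps locate the middle Row quickly during a File Split or a binary search.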
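The lookup path through a two-level index can be sketched roughly as follows. This is a simplified in-memory model, not HBase's HFileBlockIndex: the class and method names (IndexLookupSketch, Entry, floorIndex, locateDataBlock) are hypothetical, keys are Strings rather than byte arrays, and the leaf blocks are assumed to be already in memory instead of being read from the file on demand.

```java
import java.util.Arrays;
import java.util.List;

final class IndexLookupSketch {
    /** One index entry: the first key covered, plus the child block's offset and length. */
    static final class Entry {
        final String firstKey;
        final long offset;
        final int length;

        Entry(String firstKey, long offset, int length) {
            this.firstKey = firstKey;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Binary search for the last entry whose firstKey <= key; -1 if key sorts before all entries. */
    static int floorIndex(List<Entry> entries, String key) {
        int lo = 0, hi = entries.size() - 1, found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (entries.get(mid).firstKey.compareTo(key) <= 0) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    /**
     * Root -> leaf -> data block. leaves.get(i) holds the entries of the leaf
     * index block that root.get(i) points to (an assumption of this sketch).
     */
    static Entry locateDataBlock(List<Entry> root, List<List<Entry>> leaves, String key) {
        int leafPos = floorIndex(root, key);
        if (leafPos < 0) {
            return null; // key sorts before the first key in the file
        }
        List<Entry> leaf = leaves.get(leafPos);
        int blockPos = floorIndex(leaf, key);
        return blockPos < 0 ? null : leaf.get(blockPos);
    }

    public static void main(String[] args) {
        List<Entry> root = Arrays.asList(new Entry("a", 1000, 32), new Entry("m", 2000, 32));
        List<List<Entry>> leaves = Arrays.asList(
                Arrays.asList(new Entry("a", 0, 100), new Entry("f", 100, 100)),
                Arrays.asList(new Entry("m", 200, 100), new Entry("t", 300, 100)));
        Entry hit = locateDataBlock(root, leaves, "g");
        System.out.println("key g -> data block at offset " + hit.offset);
    }
}
```

With more levels, the same floor search would simply be repeated once per intermediate level before reaching the leaf.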

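// A note on Split: when a Region splits, by default it picks the midkey of the largest StoreFile in the largest Store of that Region. The midkey is simply the value that was recorded in the HFile earlier and is read back from it. When splits are triggered automatically, most of them follow a Compaction, because compaction merges the Region's files into larger StoreFiles, which makes it easier to choose a better Split Point.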

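HFile V2 write path:
1) Append the KV to the Data Block. Before each append, the writer checks whether the current DataBlock has exceeded the configured size. If it is within the threshold, the KV is written to the output stream. If the threshold has been exceeded, finishBlock() is executed: the DataBlock is encoded and compressed according to the Table-CF settings and then written to the HFile. // Encoding and compressing one Block at a time carries some performance overhead; see the article "HBase in Practice Series 1: Compression and Encoding Techniques".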

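2) Depending on the data volume, write Leaf index blocks and Bloom blocks.
Leaf index Block: each time a DataBlock is flushed, one entry is added to the current Leaf index block, and the size of that block is checked against the threshold (128KB by default). Once the threshold is exceeded, a Leaf index block is written right after the DataBlock. The controlling class is HFileBlockIndex, which contains BlockIndexChunk, BlockIndexReader, and BlockIndexWriter (which implements the InlineBlockWriter interface).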
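The interplay between data block rollover and inline leaf index blocks described in steps 1 and 2 can be sketched as below. This is only a toy model under stated assumptions: HFileWriteSketch and its fields are hypothetical names, nothing is actually encoded, compressed, or written to disk, and an index entry is assumed to cost an 8-byte offset, a 4-byte length, and the first key's bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class HFileWriteSketch {
    private final int dataBlockLimit;  // soft size limit of one data block, in bytes
    private final int leafChunkLimit;  // size limit of one leaf index chunk, in bytes

    private int currentBlockBytes = 0;
    private String currentBlockFirstKey = null;
    private int leafChunkBytes = 0;

    /** Ordered log of what this sketch "wrote": "DATA" or "LEAF_INDEX". */
    final List<String> written = new ArrayList<>();

    HFileWriteSketch(int dataBlockLimit, int leafChunkLimit) {
        this.dataBlockLimit = dataBlockLimit;
        this.leafChunkLimit = leafChunkLimit;
    }

    private static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    void append(String key, String value) {
        // The check happens before the append: a block that has already
        // reached the limit is finished first, then the KV starts a new block.
        if (currentBlockBytes >= dataBlockLimit) {
            finishBlock();
        }
        if (currentBlockFirstKey == null) {
            currentBlockFirstKey = key;
        }
        currentBlockBytes += utf8Length(key) + utf8Length(value);
    }

    void finishBlock() {
        if (currentBlockBytes == 0) {
            return; // nothing buffered
        }
        // A real writer would encode and compress the block here.
        written.add("DATA");
        // Add one entry for this data block to the current leaf index chunk.
        leafChunkBytes += 8 + 4 + utf8Length(currentBlockFirstKey);
        currentBlockBytes = 0;
        currentBlockFirstKey = null;
        // When the chunk reaches its limit, a leaf index block goes inline,
        // right after the data block that was just written.
        if (leafChunkBytes >= leafChunkLimit) {
            written.add("LEAF_INDEX");
            leafChunkBytes = 0;
        }
    }

    public static void main(String[] args) {
        HFileWriteSketch w = new HFileWriteSketch(10, 30);
        for (int i = 1; i <= 5; i++) {
            w.append("k" + i, "vvvvvvvv"); // 10 bytes per KV
        }
        w.finishBlock();
        System.out.println(w.written); // [DATA, DATA, DATA, LEAF_INDEX, DATA, DATA]
    }
}
```

Because the size check comes before the write, a block in this sketch can overshoot its limit by up to one KeyValue, so the limit behaves as a soft threshold.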

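Bloom Block settings: the MURMUR hash is used by default, and each Block defaults to 128KB. The number of Keys a BloomBlock can accept depends on the capacity of the block and on the errorRate, and is computed with the formula below. With the system defaults, the formula gives 109396 Keys per BloomBlock.
Note: the number of BloomBlocks is determined by the number of KeyValues in the HFile, the errorRate, and the BlockSize. Tune these parameters to suit the needs of the application.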

/**
   * The maximum number of keys we can put into a Bloom filter of a certain
   * size to maintain the given error rate, assuming the number of hash
   * functions is chosen optimally and does not even have to be an integer
   * (hence the "ideal" in the function name).
   *
   * @param bitSize
   * @param errorRate
   * @return maximum number of keys that can be inserted into the Bloom filter
   * @see #computeMaxKeys(long, double, int) for a more precise estimate
   */
  public static long idealMaxKeys(long bitSize, double errorRate) {
    // The reason we need to use floor here is that otherwise we might put
    // more keys in a Bloom filter than is allowed by the target error rate.
    // Note: bitSize here is byteSizeHint * 8. With the default settings this is roughly
    // 128 * 1024 * 8 * (Math.log(2) * Math.log(2) / -Math.log(0.01)) = 109396.
    return (long) (bitSize * (LOG2_SQUARED / -Math.log(errorRate)));
  }
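
To check the 109396 figure, the formula can be lifted into a small standalone program. This is a sketch rather than HBase code: the class name BloomCapacitySketch and the helper estimateBloomBlocks are hypothetical, LOG2_SQUARED is defined locally as ln(2) squared, and the block count is only the idealized estimate implied by the formula above.

```java
final class BloomCapacitySketch {
    /** ln(2)^2, defined locally so the sketch is self-contained. */
    static final double LOG2_SQUARED = Math.log(2) * Math.log(2);

    /** Same formula as the excerpt above: ideal key capacity for a Bloom filter of bitSize bits. */
    static long idealMaxKeys(long bitSize, double errorRate) {
        return (long) (bitSize * (LOG2_SQUARED / -Math.log(errorRate)));
    }

    /** Hypothetical helper: rough number of Bloom blocks needed for totalKeys keys. */
    static long estimateBloomBlocks(long totalKeys, int blockBytes, double errorRate) {
        long keysPerBlock = idealMaxKeys(blockBytes * 8L, errorRate);
        return (totalKeys + keysPerBlock - 1) / keysPerBlock; // ceiling division
    }

    public static void main(String[] args) {
        // 128KB block, 1% error rate
        System.out.println(idealMaxKeys(128 * 1024 * 8L, 0.01));               // 109396
        // One million keys at 109396 keys per block
        System.out.println(estimateBloomBlocks(1_000_000L, 128 * 1024, 0.01)); // 10
    }
}
```

Lowering the errorRate or shrinking the block reduces the keys each block can hold, so the same HFile needs more BloomBlocks.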

Each BloomBlock has a corresponding index entry, stored in the Meta Index region.

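As a result, opening a file only requires loading the RootDataIndex and the IntermediateLevelRootIndex, each no larger than 128KB, rather than all of the leaf-level index information as in HFile V1. Likewise, only the BloomBlockIndex needs to be loaded into memory. This avoids the overhead that HFile V1 incurred by loading an oversized DataBlockIndex, and it speeds up Region loading.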
