分享

Hive外连HBase表

pig2 2014-8-2 10:59:21 发表于 实操演练 [显示全部楼层] 回帖奖励 阅读模式 关闭右栏 1 9143
问题导读:
1.如何创建hive外部表与之对应的hbase表?
2.hive与hbase字段之间是如何对应的?





Hive外连HBase表

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type
[COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...)
[SORTED BY (col_name [ASC|DESC], ...)]
INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION hdfs_path]

连接HBase中已存在的表

下面语句为hive语句:

CREATE EXTERNAL TABLE table_name(id int, value string)

STORED BY 'org.apache.Hadoop.hive.hbase.HBaseStorageHandler'

WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")

TBLPROPERTIES ("hbase.table.name" = "xyz");

红色:hive表
绿色:HBase表,其中cf1代表列簇名,val代表列名,:key代表HBase的row赋予第一个字段,即上面例子中的蓝色标注的id





已有(1)人评论

跳转到指定楼层
pig2 发表于 2014-8-2 11:02:50
这里再补充一些内容:

HBase查询award表
  1. hbase(main):003:0> scan 'award' ,LIMIT=>2
  2. ROW COLUMN+CELL
  3. 2012-04-27 06:55:00:102713629 column=info:MPID, timestamp=1341890254281, value=2947
  4. 2012-04-27 06:55:00:102713629 column=info:MatchID, timestamp=1341890254281, value=433203828
  5. 2012-04-27 06:55:00:102713629 column=info:MatchName, timestamp=1341890254281, value=xxx
  6. 2012-04-27 06:55:00:102713629 column=info:Rank, timestamp=1341890254281, value=2
  7. 2012-04-27 06:55:00:102713629 column=info:TourneyID, timestamp=1341890254281, value=1027102
  8. 2012-04-27 06:55:00:102713629 column=info:UserId, timestamp=1341890254281, value=102713629
  9. 2012-04-27 06:55:00:102713629 column=info:gameID, timestamp=1341890254281, value=1001
  10. 2012-04-27 06:55:00:102713629 column=info:loginId, timestamp=1341890254281, value=715878221
  11. 2012-04-27 06:55:00:102713629 column=info:nickName, timestamp=1341890254281, value=xxx
  12. 2012-04-27 06:55:00:102713629 column=info:platform, timestamp=1341890254281, value=ios
  13. 2012-04-27 06:55:00:102713629 column=info:ware, timestamp=1341890254281, value=1984:1 2082:1
  14. 2012-04-27 06:55:00:106788559 column=info:MPID, timestamp=1341890254281, value=478
  15. 2012-04-27 06:55:00:106788559 column=info:MatchID, timestamp=1341890254281, value=433203930
  16. 2012-04-27 06:55:00:106788559 column=info:MatchName, timestamp=1341890254281, value=xxx
  17. 2012-04-27 06:55:00:106788559 column=info:Rank, timestamp=1341890254281, value=19
  18. 2012-04-27 06:55:00:106788559 column=info:TourneyID, timestamp=1341890254281, value=1014780
  19. 2012-04-27 06:55:00:106788559 column=info:UserId, timestamp=1341890254281, value=106788559
  20. 2012-04-27 06:55:00:106788559 column=info:gameID, timestamp=1341890254281, value=1001
  21. 2012-04-27 06:55:00:106788559 column=info:gold, timestamp=1341890254281, value=1
  22. 2012-04-27 06:55:00:106788559 column=info:loginId, timestamp=1341890254281, value=13835155880
  23. 2012-04-27 06:55:00:106788559 column=info:nickName, timestamp=1341890254281, value=xxx
  24. 2012-04-27 06:55:00:106788559 column=info:platform, timestamp=1341890254281, value=android
  25. 2 row(s) in 0.0420 seconds
复制代码

hive中创建外部表
  1. hive> CREATE EXTERNAL TABLE hive_award(key string, productid int,matchid string, rank string, tourneyid string, userid bigint,gameid int,gold int,loginid string,nick string,plat string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:MPID,info:MatchID,info:Rank,info:TourneyID,info:UserId,info:gameID,info:gold,info:loginId,info:nickName,info:platform") TBLPROPERTIES("hbase.table.name" = "award");
复制代码

此命令可以看到HBase的列族如何和hive的列对应,表名如何对应。SERDEPROPERTIES表示serializer properties, 序列化属性。deserializer 表示反序列化。TBLPROPERTIES表示table 表属性。


  1. hive> desc hive_award;
  2. key string from deserializer
  3. productid int from deserializer
  4. matchid string from deserializer
  5. rank string from deserializer
  6. tourneyid string from deserializer
  7. userid bigint from deserializer
  8. gameid int from deserializer
  9. gold int from deserializer
  10. loginid string from deserializer
  11. nick string from deserializer
  12. plat string from deserializer
复制代码



hive查询
  1. hive> select * from hive_award limit 3;
  2. OK
  3. 2012-04-27 06:55:00:102713629 2947 433203828 2 1027102 102713629 1001 NULL 715878221 xxx ios
  4. 2012-04-27 06:55:00:106788559 478 433203930 19 1014780 106788559 1001 1 13835155880 xxx android
  5. 2012-04-27 06:55:00:114298440 478 433203930 20 1014780 114298440 1001 1 1131024406 xxx android
  6. Time taken: 0.104 seconds
复制代码










回复

使用道具 举报

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

关闭

推荐上一条 /2 下一条