A Typical HBase Example

Last edited by pig2 on 2014-2-4 00:25

1. Example 1

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scanner;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.io.Cell;
import org.apache.hadoop.hbase.io.RowResult;

public class MyClient {

public static void main(String args[]) throws IOException {
// You need a configuration object to tell the client where to connect.
// But don't worry, the defaults are pulled from the local config file.
HBaseConfiguration config = new HBaseConfiguration();

// This instantiates an HTable object that connects you to the "myTable"
// table.
HTable table = new HTable(config, "myTable");

// To do any sort of update on a row, you use an instance of the BatchUpdate
// class. A BatchUpdate takes a row and optionally a timestamp which your
// updates will affect.
BatchUpdate batchUpdate = new BatchUpdate("myRow");

// The BatchUpdate#put method takes a column name that describes what cell you want
// to put a value into, and a byte array that is the value you want to
// store. Note that if you want to store strings, you have to getBytes()
// from the string for HBase to understand how to store it. (The same goes
// for primitives like ints and longs and user-defined classes - you must
// find a way to reduce it to bytes.)
batchUpdate.put("myColumnFamily:columnQualifier1",
"columnQualifier1 value!".getBytes());

// Deletes are batch operations in HBase as well.
batchUpdate.delete("myColumnFamily:cellIWantDeleted");

// Once you've done all the puts you want, you need to commit the results.
// The HTable#commit method takes the BatchUpdate instance you've been
// building and pushes the batch of changes you made into HBase.
table.commit(batchUpdate);

// Now, to retrieve the data we just wrote. The values that come back are
// Cell instances. A Cell is a combination of the value as a byte array and
// the timestamp the value was stored with. If you happen to know that the
// value contained is a string and want an actual string, then you must
// convert it yourself.
Cell cell = table.get("myRow", "myColumnFamily:columnQualifier1");
String valueStr = new String(cell.getValue());

// Sometimes, you won't know the row you're looking for. In this case, you
// use a Scanner. This will give you cursor-like interface to the contents
// of the table.
Scanner scanner =
   // we want to get back only "myColumnFamily:columnQualifier1" when we iterate
   table.getScanner(new String[]{"myColumnFamily:columnQualifier1"});

// Scanners in HBase 0.2 return RowResult instances. A RowResult is like the
// row key and the columns all wrapped up in a single interface.
// RowResult#getRow gives you the row key. RowResult also implements
// Map, so you can get to your column results easily.

// Now, for the actual iteration. One way is to use a while loop like so:
RowResult rowResult = scanner.next();

while(rowResult != null) {
   // print out the row we found and the columns we were looking for
   System.out.println("Found row: " + new String(rowResult.getRow()) + " with value: " +
   rowResult.get("myColumnFamily:columnQualifier1".getBytes()));

   rowResult = scanner.next();
}

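// The other approach is to use a foreach loop. Scanners are iterable!
// Use one approach or the other: the while loop above has already consumed
// this scanner, so this second loop will find no further rows.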
for (RowResult result : scanner) {
   // print out the row we found and the columns we were looking for
   System.out.println("Found row: " + new String(result.getRow()) + " with value: " +
   result.get("myColumnFamily:columnQualifier1".getBytes()));
}

// Make sure you close your scanners when you are done!
scanner.close();
}
}

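This example uses many HBase concepts, including:
HBaseConfiguration: tells the client how to connect and which HBase server to connect to.
HTable: represents an HBase table.
BatchUpdate: updates a single row of a table, including adding a column, changing a column's value, and deleting a column.
commit: a method of the table. It makes a BatchUpdate take effect, similar to a commit in a database.

Cell: the value stored in the table at a given (row key, column, timestamp).
To get a Cell, for example:
table.get("myRow", "myColumnFamily:columnQualifier1");

scanner: used to iterate over the table.
rowResult: holds the data of one row during iteration.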
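The comments in the first example point out that strings, ints, longs and other values must be reduced to bytes before HBase can store them. Here is a minimal sketch of that conversion using only the JDK. ByteCodec and its method names are made up for illustration and are not part of HBase (the second example uses HBase's own Bytes utility for the same job). Unlike the bare getBytes() call in the example, the sketch names the charset explicitly so the result does not depend on the platform default.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Plain-JDK sketch (hypothetical helper, not an HBase class):
// turning Java values into byte arrays and back again.
final class ByteCodec {
    static byte[] fromString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String toStringValue(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    static byte[] fromLong(long v) {
        // ByteBuffer writes big-endian by default; a long takes 8 bytes.
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    static long toLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    static byte[] fromInt(int v) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(v).array();
    }

    static int toInt(byte[] b) {
        return ByteBuffer.wrap(b).getInt();
    }

    public static void main(String[] args) {
        byte[] raw = fromLong(42L);
        System.out.println(raw.length + " bytes, decoded: " + toLong(raw));
        System.out.println(toStringValue(fromString("columnQualifier1 value!")));
    }
}
```

The reader has to know which type was written: the stored bytes carry no type information, so decoding a long as a string (or the reverse) gives garbage rather than an error.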

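2. Example 2

Managing HBase from Java code covers table administration, data operations, and so on.

1. Creating, deleting, listing, and altering tables is done through HBaseAdmin. Once a table has been created, you access it through an HTable instance and can add data to it.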

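2. Inserting data

Create a Put object. On it you specify which columns to write, optionally with a timestamp, and then submit it by calling HTable.put(Put). Note that a Put must be given a row key, passed as a constructor argument when the object is created.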

3. Getting data

To read data, use a Get object. Like Put, Get has several constructors; you normally pass in the row key to say which row to fetch, then call HTable.get(Get).

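4. Scanning rows

A Scan lets you walk the rows of a table and read each row's contents, such as column names and timestamps. Calling HTable.getScanner(Scan) returns a ResultScanner, which acts as a cursor: each call to next() returns the next row. HTable.get(Get) returns a Result directly, while the ResultScanner returned by HTable.getScanner(Scan) yields one Result per row. A Result is a list of KeyValue objects.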
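The cursor behaviour of a scan can be pictured with a toy model that uses only the JDK. This is not HBase code: ToyScanner is a hypothetical class, rows are plain strings, and a TreeMap stands in for the table because HBase returns rows in sorted row-key order (String ordering matches byte ordering only for ASCII keys like these).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: rows are kept sorted by key and read through a cursor.
final class ToyScanner {
    private final Iterator<Map.Entry<String, String>> cursor;

    ToyScanner(TreeMap<String, String> rows, String startRow) {
        // tailMap(startRow, true) is a view of every row whose key is >= startRow.
        this.cursor = rows.tailMap(startRow, true).entrySet().iterator();
    }

    /** Returns the next row as "key=value", or null once the scan is exhausted. */
    String next() {
        if (!cursor.hasNext()) {
            return null;
        }
        Map.Entry<String, String> e = cursor.next();
        return e.getKey() + "=" + e.getValue();
    }

    /** Drains a fresh scanner with the same null-terminated loop the examples use. */
    static List<String> scanAll(TreeMap<String, String> rows, String startRow) {
        List<String> out = new ArrayList<>();
        ToyScanner s = new ToyScanner(rows, startRow);
        for (String r = s.next(); r != null; r = s.next()) {
            out.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> rows = new TreeMap<>();
        rows.put("row3", "c");
        rows.put("row1", "a");
        rows.put("row2", "b");
        System.out.println(scanAll(rows, "row2"));
    }
}
```

Once next() has returned null the cursor stays exhausted, which is why looping over the same scanner a second time finds nothing; open a new scanner to read the rows again.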

5. Deleting

Use Delete to remove data, and call HTable.delete(Delete) to execute it. (Note: deletes are a little special, in that the data is not removed from the table immediately.)
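The note about deletes not taking effect on the stored data right away can be illustrated with a toy model. As I understand it, HBase records a delete as a marker (a tombstone) that hides the cell from reads, and the underlying data is only dropped during a later compaction. The sketch below is plain JDK code, not HBase; ToyTombstoneStore and its methods are hypothetical, and it ignores timestamps entirely.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: a delete hides a cell first and frees the storage later.
final class ToyTombstoneStore {
    private final Map<String, String> cells = new TreeMap<>();
    private final Set<String> tombstones = new HashSet<>();

    void put(String key, String value) {
        // Simplification: a new write clears the marker. A real store would
        // compare timestamps instead of dropping the marker outright.
        tombstones.remove(key);
        cells.put(key, value);
    }

    void delete(String key) {
        // Only a marker is recorded; the stored value is left in place.
        tombstones.add(key);
    }

    String get(String key) {
        // Reads treat a marked cell as if it did not exist.
        return tombstones.contains(key) ? null : cells.get(key);
    }

    int storedCells() {
        return cells.size();
    }

    void compact() {
        // Compaction is the point where marked cells are physically removed.
        cells.keySet().removeAll(tombstones);
        tombstones.clear();
    }

    public static void main(String[] args) {
        ToyTombstoneStore store = new ToyTombstoneStore();
        store.put("myLittleRow/someQualifier", "Some Value");
        store.delete("myLittleRow/someQualifier");
        System.out.println("visible: " + store.get("myLittleRow/someQualifier")
                + ", still stored: " + store.storedCells());
        store.compact();
        System.out.println("stored after compaction: " + store.storedCells());
    }
}
```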

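6. Locks

Puts, Gets and Deletes take out a lock on the row they operate on for the duration of the operation; scans do not.

7. Accessing the cluster

Client code finds the cluster by querying ZooKeeper, so the ZooKeeper quorum to use must be available on the client's CLASSPATH. In practice this means the client must be able to find hbase-site.xml.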
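To make the hbase-site.xml requirement more concrete, here is a rough sketch of what the client needs from that file. It uses only the JDK's XML parser and reads the XML from a string rather than from the classpath. The property name hbase.zookeeper.quorum is, as far as I know, the one HBase uses for the quorum; the host names and the SiteXmlPeek class are made up for illustration. The real client reads the file through HBaseConfiguration, not through code like this.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustration only: looks up one property in a Hadoop-style configuration file.
final class SiteXmlPeek {
    /** Returns the value of the named property, or null if it is not present. */
    static String lookup(String xml, String wanted) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                if (name.equals(wanted)) {
                    return p.getElementsByTagName("value").item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            // Malformed XML or a property without name/value ends up here.
            throw new IllegalStateException("could not read configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host names; a real file lists your own ZooKeeper servers.
        String xml = "<configuration>"
                + "<property>"
                + "<name>hbase.zookeeper.quorum</name>"
                + "<value>zk1.example.com,zk2.example.com,zk3.example.com</value>"
                + "</property>"
                + "</configuration>";
        System.out.println(lookup(xml, "hbase.zookeeper.quorum"));
    }
}
```

If the property is missing from the client's view of the configuration, the client has no way to know which ZooKeeper servers to ask, which is why the file has to be findable on the CLASSPATH.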

Below is an example. It assumes you have already created a table named myLittleHBaseTable with a column family named myLittleFamily:

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Class that has nothing but a main.
// Does a Put, Get and a Scan against an hbase table.
public class MyLittleHBaseClient {
  public static void main(String[] args) throws IOException {
    // You need a configuration object to tell the client where to connect.
    // When you create a HBaseConfiguration, it reads in whatever you've set
    // into your hbase-site.xml and in hbase-default.xml, as long as these can
    // be found on the CLASSPATH
    HBaseConfiguration config = new HBaseConfiguration();

    // This instantiates an HTable object that connects you to
    // the "myLittleHBaseTable" table.
    HTable table = new HTable(config, "myLittleHBaseTable");

    // To add to a row, use Put. A Put constructor takes the name of the row
    // you want to insert into as a byte array. In HBase, the Bytes class has
    // utility for converting all kinds of java types to byte arrays. In the
    // below, we are converting the String "myLittleRow" into a byte array to
    // use as a row key for our update. Once you have a Put instance, you can
    // adorn it by setting the names of columns you want to update on the row,
    // the timestamp to use in your update, etc. If no timestamp, the server
    // applies current time to the edits.
    Put p = new Put(Bytes.toBytes("myLittleRow"));

    // To set the value you'd like to update in the row 'myLittleRow', specify
    // the column family, column qualifier, and value of the table cell you'd
    // like to update. The column family must already exist in your table
    // schema. The qualifier can be anything. All must be specified as byte
    // arrays as hbase is all about byte arrays. Let's pretend the table
    // 'myLittleHBaseTable' was created with a family 'myLittleFamily'.
    p.add(Bytes.toBytes("myLittleFamily"), Bytes.toBytes("someQualifier"),
        Bytes.toBytes("Some Value"));

    // Once you've adorned your Put instance with all the updates you want to
    // make, to commit it do the following (The HTable#put method takes the
    // Put instance you've been building and pushes the changes you made into
    // hbase)
    table.put(p);

    // Now, to retrieve the data we just wrote. The values that come back are
    // Result instances. Generally, a Result is an object that will package up
    // the hbase return into the form you find most palatable.
    Get g = new Get(Bytes.toBytes("myLittleRow"));
    Result r = table.get(g);
    byte[] value = r.getValue(Bytes.toBytes("myLittleFamily"),
        Bytes.toBytes("someQualifier"));

    // If we convert the value bytes, we should get back 'Some Value', the
    // value we inserted at this location.
    String valueStr = Bytes.toString(value);
    System.out.println("GET: " + valueStr);

    // Sometimes, you won't know the row you're looking for. In this case, you
    // use a Scanner. This will give you cursor-like interface to the contents
    // of the table. To set up a Scanner, do like you did above making a Put
    // and a Get, create a Scan. Adorn it with column names, etc.
    Scan s = new Scan();
    s.addColumn(Bytes.toBytes("myLittleFamily"), Bytes.toBytes("someQualifier"));
    ResultScanner scanner = table.getScanner(s);
    try {
      // Scanners return Result instances.
      // Now, for the actual iteration. One way is to call next() in a loop
      // until it returns null, like so:
      for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
        // print out the row we found and the columns we were looking for
        System.out.println("Found row: " + rr);
      }

      // The other approach is to use a foreach loop. Scanners are iterable!
      // for (Result rr : scanner) {
      //   System.out.println("Found row: " + rr);
      // }
    } finally {
      // Make sure you close your scanners when you are done!
      // That's why we have it inside a try/finally clause
      scanner.close();
    }
  }
}
