Flood of "Got brand-new compressor" log messages with HBase 0.90.5 + Hadoop 0.20.2-append

Posted by yuwenge on 2013-12-11 15:10 in the Q&A board.
Running HBase 0.90.5 on Hadoop 0.20.2-append, with the HBase table using GZ compression, the region server log was flooded with "Got brand-new compressor" messages, and JVM old-generation usage quickly climbed above 90%.
Here is a summary of how the problem was tracked down and fixed, shared for anyone who hits the same thing.
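For context, a GZ-compressed column family is typically declared like this with the 0.90-era Java client (a minimal sketch; the table and family names here are made up for illustration, and minor details of HBaseAdmin/HColumnDescriptor may differ between 0.90.x releases):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CreateGzTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor table = new HTableDescriptor("demo_table");   // hypothetical table name
        HColumnDescriptor family = new HColumnDescriptor("cf");        // hypothetical family name
        family.setCompressionType(Compression.Algorithm.GZ);           // HFiles for this family are GZ-compressed
        table.addFamily(family);
        admin.createTable(table);
    }
}

With that in place, the region server log looked like this: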


INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
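The message itself comes from Hadoop's CodecPool: getCompressor()/getDecompressor() print "Got brand-new compressor/decompressor" whenever there is no pooled instance to hand out and a fresh one has to be constructed, so a steady stream of these lines means (de)compressor objects are being created over and over instead of being reused. The intended get/return pattern is roughly the following (a minimal sketch of the pool contract, using DefaultCodec/DEFLATE rather than the GZ codec the table uses; not code taken from HBase):

import java.io.ByteArrayOutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.Compressor;
import org.apache.hadoop.io.compress.DefaultCodec;

public class CodecPoolSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        DefaultCodec codec = new DefaultCodec();
        codec.setConf(conf);
        // On a pool miss this is where "Got brand-new compressor" gets logged.
        Compressor compressor = CodecPool.getCompressor(codec);
        try {
            ByteArrayOutputStream raw = new ByteArrayOutputStream();
            CompressionOutputStream out = codec.createOutputStream(raw, compressor);
            out.write("hello".getBytes("UTF-8"));
            out.close();
            System.out.println("compressed to " + raw.size() + " bytes");
        } finally {
            // Returning the instance lets the next caller reuse it instead of building another one.
            CodecPool.returnCompressor(compressor);
        }
    }
}

Meanwhile jstat showed the old generation pinned above 95% while the eden space churned: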
jstat -gcutil -h5 2421 3s 100
S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 0.00 21.27 95.46 59.71 8352 218.916 4518 718.946 937.862
0.00 0.00 33.70 95.46 59.71 8352 218.916 4518 718.946 937.862
0.00 0.00 43.97 95.46 59.71 8352 218.916 4518 718.946 937.862
0.00 0.00 56.42 95.46 59.71 8352 218.916 4518 718.946 937.862
0.00 0.00 74.33 95.46 59.71 8352 218.916 4518 718.946 937.862



After 24 hours:
2013-04-22T19:15:32.960+0800: [GC [ParNew: 18222K->926K(19136K), 0.0012810 secs] 44891K->27688K(83008K) icms_dc=0 , 0.0013570 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
"hbase-xm99-regionserver-centos1.out" 74488L, 12139018C 1,1 顶端
2013-04-24T09:20:00.032+0800: [Full GC [CMS2013-04-24T09:20:00.544+0800: [CMS-concurrent-mark: 0.512/25.895 secs] [Times: user=10.57 sys=8.59, real=25.89 secs]
2013-04-24T09:20:04.141+0800: [GC [1 CMS-initial-mark: 3917391K(3917440K)] 4166241K(4166656K), 0.0047850 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2013-04-24T09:20:04.147+0800: [Full GC [CMS2013-04-24T09:20:04.575+0800: [CMS-concurrent-mark: 0.427/0.429 secs] [Times: user=1.70 sys=0.00, real=0.43 secs]
at org.apache.hadoop.hbase.client.Put.readFields(Put.java:495)
at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:555)
at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:475)
at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:475)
at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116)
at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:555)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:127)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:978)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
2013-04-24T09:20:09.821+0800: [Full GC [CMS2013-04-24T09:20:10.582+0800: [CMS-concurrent-mark: 0.509/1.505 secs] [Times: user=2.08 sys=0.00, real=1.50 secs]
(concurrent mode failure): 3917391K->3917390K(3917440K), 4.0398780 secs] 4166559K->4166377K(4166656K), [CMS Perm : 18899K->18896K(31596K)] icms_dc=100 , 4.0399890 secs] [Times: user=5.50 sys=0.03, real=4.04 secs]
2013-04-24T09:20:14.125+0800: [GC [1 CMS-initial-mark: 3917390K(3917440K)] 4166506K(4166656K), 0.0047060 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
2013-04-24T09:20:14.141+0800: [Full GC [CMS2013-04-24T09:20:14.600+0800: [CMS-concurrent-mark: 0.458/0.470 secs] [Times: user=1.82 sys=0.01, real=0.47 secs]
(concurrent mode failure): 3917390K->3917388K(3917440K), 4.2988240 secs] 4166544K->4165827K(4166656K), [CMS Perm : 18897K->18893K(31596K)] icms_dc=100 , 4.2989130 secs] [Times: user=5.62 sys=0.01, real=4.30 secs]
2013-04-24T09:20:18.441+0800: [GC [1 CMS-initial-mark: 3917388K(3917440K)] 4166124K(4166656K), 0.0051100 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:936)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2013-04-24T09:20:18.479+0800: [Full GC [CMS2013-04-24T09:20:18.858+0800: [CMS-concurrent-mark: 0.407/0.412 secs] [Times: user=1.67 sys=0.02, real=0.42 secs]
(concurrent mode failure): 3917388K->3917386K(3917440K), 3.7944930 secs] 4166433K->4166190K(4166656K), [CMS Perm : 18894K->18892K(31596K)] icms_dc=100 , 3.7945760 secs] [Times: user=4.93 sys=0.01, real=3.79 secs]

The region server ended up in constant full GCs, each CMS cycle failing with a concurrent mode failure, and finally died with java.lang.OutOfMemoryError: Java heap space.

Also checked open file descriptors for the HRegionServer process, PID 14084:
lsof |grep 14084 |more
java 14084 xm99 108u IPv4 5184211 0t0 TCP centos1:54938->centos1:50010 (ESTABLISHED)
java 14084 xm99 109u sock 0,5 0t0 5154331 can't identify protocol
java 14084 xm99 110u IPv4 5135298 0t0 TCP centos1:48634->centos1:50010 (CLOSE_WAIT)
java 14084 xm99 111u sock 0,5 0t0 5136629 can't identify protocol
java 14084 xm99 112u sock 0,5 0t0 5091184 can't identify protocol
java 14084 xm99 113u sock 0,5 0t0 5091615 can't identify protocol
java 14084 xm99 114u sock 0,5 0t0 5122457 can't identify protocol

The system-wide file descriptor limit in /proc/sys/fs/file-max:
cat /proc/sys/fs/file-max
1585044

ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 137216
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 32768
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 137216
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

lsof | grep "can't identify protocol" |wc -l
4885

Solution, from the HBase reference guide (http://hbase.apache.org/book.html):
12.9.2.10. Logs flooded with '2011-01-10 12:40:48,407 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor' messages
We are not using the native versions of compression libraries. See HBASE-1900 Put back native support when hadoop 0.21 is released. Copy the native libs from hadoop under hbase lib dir or symlink them into place and the message should go away.

Under HBase's lib directory, symlink the Hadoop native libraries:
ln -s /usr/local/hadoop/lib/native native
Then restart the HBase cluster for the change to take effect. The "Got brand-new compressor" flood did not come back.
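To double-check that the JVM really picks up the native zlib bindings after the symlink, a small probe like the one below can be run with the Hadoop/HBase classpath and -Djava.library.path pointing at the native directory (a hedged sketch; NativeCodeLoader and ZlibFactory are the stock Hadoop classes for this check):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.zlib.ZlibFactory;
import org.apache.hadoop.util.NativeCodeLoader;

public class CheckNativeZlib {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // True only when libhadoop.so was found on java.library.path.
        System.out.println("native hadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded());
        // True only when the native zlib bindings are usable; while this stays false,
        // GZ compression keeps falling back to the pure-Java codec and CodecPool
        // keeps announcing brand-new (de)compressors.
        System.out.println("native zlib loaded:   " + ZlibFactory.isNativeZlibLoaded(conf));
    }
}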

