
HMaster dies a few minutes after HBase starts, help needed

梦~天涯 posted on 2015-11-23 15:04:54
After starting the HBase cluster, I checked a little later and found that the HMaster had died. The error log is below. Any help would be appreciated!

2015-11-23 14:52:14,596 INFO  [cmcc2,60000,1448260249337.splitLogManagerTimeoutMonitor] master.SplitLogManager: total tasks = 1 unassigned = 0 tasks={/hbase/splitWAL/WALs%2Fcmcc5%2C60020%2C1446430898091-splitting%2Fcmcc5%252C60020%252C1446430898091.1446711751417.meta=last_update = 1448261485384 last_version = 31 cur_worker_name = mobile13,60020,1448260248619 status = in_progress incarnation = 0 resubmits = 0 batch = installed = 1 done = 0 error = 0}
2015-11-23 14:52:17,474 INFO  [main-EventThread] master.SplitLogManager: task /hbase/splitWAL/WALs%2Fcmcc5%2C60020%2C1446430898091-splitting%2Fcmcc5%252C60020%252C1446430898091.1446711751417.meta entered state: ERR mobile13,60020,1448260248619
2015-11-23 14:52:17,474 WARN  [main-EventThread] master.SplitLogManager: Error splitting /hbase/splitWAL/WALs%2Fcmcc5%2C60020%2C1446430898091-splitting%2Fcmcc5%252C60020%252C1446430898091.1446711751417.meta
2015-11-23 14:52:17,475 WARN  [master:cmcc2:60000] master.SplitLogManager: error while splitting logs in [hdfs://cmcc2:9000/hbase/WALs/cmcc5,60020,1446430898091-splitting] installed = 1 but only 0 done
2015-11-23 14:52:17,476 FATAL [master:cmcc2:60000] master.HMaster: Master server abort: loaded coprocessors are: []
2015-11-23 14:52:17,479 FATAL [master:cmcc2:60000] master.HMaster: Unhandled exception. Starting shutdown.
java.io.IOException: error or interrupted while splitting logs in [hdfs://cmcc2:9000/hbase/WALs/cmcc5,60020,1446430898091-splitting] Task = installed = 1 done = 0 error = 1
        at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:359)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:417)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:309)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:300)
        at org.apache.hadoop.hbase.master.HMaster.splitMetaLogBeforeAssignment(HMaster.java:1182)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:954)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:684)
        at java.lang.Thread.run(Thread.java:745)
2015-11-23 14:52:17,485 INFO  [master:cmcc2:60000] master.HMaster: Aborting
2015-11-23 14:52:17,487 DEBUG [master:cmcc2:60000] master.HMaster: Stopping service threads
2015-11-23 14:52:17,488 INFO  [master:cmcc2:60000] ipc.RpcServer: Stopping server on 60000
2015-11-23 14:52:17,488 DEBUG [main-EventThread] master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitWAL/WALs%2Fcmcc5%2C60020%2C1446430898091-splitting%2Fcmcc5%252C60020%252C1446430898091.1446711751417.meta
2015-11-23 14:52:17,489 INFO  [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2015-11-23 14:52:17,489 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2015-11-23 14:52:17,491 INFO  [master:cmcc2:60000] master.HMaster: Stopping infoServer

2 replies

NEOGX posted on 2015-11-23 15:28:17
Last edited by NEOGX on 2015-11-23 15:30

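The HBase WAL split failed. There are several possible causes.
Try the following, and first make sure the Hadoop cluster itself is healthy; restarting Hadoop is a good idea.
1. With the system firewall enabled, hostname-to-IP resolution is wrong. Delete the HBase tmp directory on every node and restart. The 127.0.0.1 entry for the hostname also has to be removed.
2. The Hadoop cluster has entered safe mode. Run hadoop dfsadmin -safemode leave to exit it.
3. Some of the data stored in HBase has been lost. Recover it through Hadoop's trash mechanism, or delete the HBase data.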
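
Before forcing the NameNode out of safe mode (cause 2), it can help to confirm that it is actually in safe mode. The sketch below is a hedged illustration, not part of the original answer: it assumes the hdfs client is on the PATH and that hdfs dfsadmin -safemode get prints its usual "Safe mode is ON" or "Safe mode is OFF" wording. The helper names are hypothetical.

```python
import subprocess


def safemode_is_on(report: str) -> bool:
    """Interpret the text printed by `hdfs dfsadmin -safemode get`.

    Assumption: each relevant line contains "Safe mode is ON" or
    "Safe mode is OFF" (HA clusters print one line per NameNode).
    """
    lines = [ln for ln in report.splitlines() if "Safe mode is" in ln]
    if not lines:
        raise ValueError("unrecognized dfsadmin output")
    return any("Safe mode is ON" in ln for ln in lines)


def query_safemode() -> bool:
    """Run the read-only check; needs a host with the hdfs client installed."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-safemode", "get"],
        capture_output=True, text=True, check=True,
    )
    return safemode_is_on(result.stdout)


# Parsing demo on literal strings, so it runs without a cluster.
print(safemode_is_on("Safe mode is OFF"))
print(safemode_is_on("Safe mode is ON"))
```

On a cluster node, query_safemode() gives the same answer from the live NameNode. The leave command from the answer is only relevant when it returns True.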

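梦~天涯 posted on 2015-11-24 10:15:16
NEOGX posted on 2015-11-23 15:28
The HBase WAL split failed. There are several possible causes.
Try the following, and first make sure the Hadoop cluster itself is healthy; restarting Hadoop is a good idea.
1. ...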

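Thanks! The problem is solved. It does look like the HBase data loss case: a colleague had earlier changed the location where HDFS stores its data, and some of the data went missing on the restart.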
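
Data lost after moving the HDFS storage directory is the kind of problem hdfs fsck reports. As a hedged sketch, the Python below reads the verdict line from an fsck report. It assumes the summary ends with the usual "The filesystem under path '...' is HEALTHY" or "... is CORRUPT" line. The sample text is invented for illustration and is not output from the poster's cluster, and fsck_verdict is a hypothetical helper name.

```python
import re

# Final verdict line of an `hdfs fsck <path>` report (assumed wording).
VERDICT = re.compile(
    r"The filesystem under path '(?P<path>[^']+)' is (?P<state>HEALTHY|CORRUPT)"
)


def fsck_verdict(report: str) -> tuple[str, str]:
    """Return (path, state) from an fsck report, state being HEALTHY or CORRUPT."""
    match = VERDICT.search(report)
    if match is None:
        raise ValueError("no fsck verdict line found")
    return match.group("path"), match.group("state")


# Invented sample report, for illustration only.
sample = (
    "Status: CORRUPT\n"
    "The filesystem under path '/hbase/WALs' is CORRUPT\n"
)
print(fsck_verdict(sample))
```

A CORRUPT verdict for the WAL directory would be consistent with the split task failing on every retry, since the worker cannot read the file it is asked to split.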