HBase + ZooKeeper + Hadoop 2.6.0 ResourceManager HA: High-Availability Cluster Configuration

小伙425 posted on 2016-4-27 10:14:53
Alkaloid0515 posted on 2016-4-26 20:06
yarn.resourcemanager.address.rm1 yarn.resourcemanager.address.rm2
These two are not recognized; don't copy them blindly.

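Then how should these two be configured? Aren't the names the rm1 and rm2 defined above? I configured mine with slave1 and slave2. What is wrong with that?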
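
For reference, a minimal sketch of how the rm1/rm2 naming usually fits together in yarn-site.xml. As far as I know, rm1 and rm2 are logical IDs declared in yarn.resourcemanager.ha.rm-ids, and the per-ResourceManager properties take that ID as a suffix, while the real host names (slave1, slave2) go in the values. The cluster-id value and the ZooKeeper host names below are placeholders, not taken from the original article.

```xml
<!-- yarn-site.xml: sketch of ResourceManager HA settings.
     rm1/rm2 are logical IDs; slave1/slave2 are the hosts mentioned in this thread.
     yarn-cluster and zk1..zk3 are assumed placeholder values. -->
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <!-- Declares the logical IDs that every per-RM property suffix must match -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Host names go in the value, not in the property name -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>slave1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>slave2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>zk1:2181,zk2:2181,zk3:2181</value>
  </property>
</configuration>
```

If the suffix on a property such as yarn.resourcemanager.address.rm1 does not match an ID listed in yarn.resourcemanager.ha.rm-ids, the setting is probably ignored, which may be what "not recognized" refers to above. Setting only hostname.<rm-id> lets the address and port properties fall back to their defaults.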

orclcast posted on 2016-4-27 14:51:41
A humble question: the zkfc process was never even started, so how did you get automatic failover to work?
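
Some context that may help here: zkfc (DFSZKFailoverController) is the failover daemon for HDFS NameNodes. ResourceManager HA does not use it, because the ResourceManager carries its own embedded ZooKeeper-based elector. A missing zkfc process would therefore affect automatic NameNode failover rather than ResourceManager failover. A rough sketch of the settings involved, with assumed ZooKeeper host names:

```xml
<!-- hdfs-site.xml fragment: enable automatic NameNode failover (handled by zkfc) -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml fragment: ZooKeeper ensemble used by zkfc.
     zk1..zk3 are assumed placeholder host names. -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>

<!-- yarn-site.xml fragment: ResourceManager automatic failover.
     No separate zkfc daemon is needed for this. -->
<property>
  <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```

With the HDFS settings in place, the usual sequence is to initialize the ZooKeeper state once with `hdfs zkfc -formatZK`; after that a zkfc process should appear in `jps` on each NameNode host once HDFS is started.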

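kermit posted on 2016-9-25 14:52:24
All of my HBase regionservers started successfully, the backup master started too, and the master on the machine where I ran start-hbase.sh also started, but that master died a second or two after starting.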
2016-09-25 01:03:47,105 INFO  [main] zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-09-25 01:03:47,105 INFO  [main] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2016-09-25 01:03:47,105 INFO  [main] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2016-09-25 01:03:47,105 INFO  [main] zookeeper.ZooKeeper: Client environment:os.name=Linux
2016-09-25 01:03:47,106 INFO  [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
2016-09-25 01:03:47,106 INFO  [main] zookeeper.ZooKeeper: Client environment:os.version=3.10.0-327.28.3.el7.x86_64
2016-09-25 01:03:47,106 INFO  [main] zookeeper.ZooKeeper: Client environment:user.name=sony
2016-09-25 01:03:47,106 INFO  [main] zookeeper.ZooKeeper: Client environment:user.home=/home/sony
2016-09-25 01:03:47,106 INFO  [main] zookeeper.ZooKeeper: Client environment:user.dir=/home/sony/soft/hbase-1.1.6
2016-09-25 01:03:47,109 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=EB37:2181,node1:2181,node2:2181 sessionTimeout=90000 watcher=master:160000x0, quorum=EB37:2181,node1:2181,node2:2181, baseZNode=/hbase
2016-09-25 01:03:47,135 INFO  [main-SendThread(node1:2181)] zookeeper.ClientCnxn: Opening socket connection to server node1/192.168.31.151:2181. Will not attempt to authenticate using SASL (unknown error)
2016-09-25 01:03:47,170 INFO  [main-SendThread(node1:2181)] zookeeper.ClientCnxn: Socket connection established to node1/192.168.31.151:2181, initiating session
2016-09-25 01:03:47,300 INFO  [main-SendThread(node1:2181)] zookeeper.ClientCnxn: Session establishment complete on server node1/192.168.31.151:2181, sessionid = 0x1000030cd960011, negotiated timeout = 40000
2016-09-25 01:03:47,448 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: starting
2016-09-25 01:03:47,449 INFO  [RpcServer.listener,port=16000] ipc.RpcServer: RpcServer.listener,port=16000: starting
2016-09-25 01:03:47,537 INFO  [main] mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-09-25 01:03:47,541 INFO  [main] http.HttpRequestLog: Http request log for http.requests.master is not defined
2016-09-25 01:03:47,552 INFO  [main] http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
2016-09-25 01:03:47,555 INFO  [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context master
2016-09-25 01:03:47,555 INFO  [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-09-25 01:03:47,555 INFO  [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-09-25 01:03:47,573 INFO  [main] http.HttpServer: Jetty bound to port 16010
2016-09-25 01:03:47,573 INFO  [main] mortbay.log: jetty-6.1.26
2016-09-25 01:03:48,019 INFO  [main] mortbay.log: Started SelectChannelConnector@0.0.0.0:16010
2016-09-25 01:03:48,024 INFO  [main] master.HMaster: hbase.rootdir=hdfs://clusterA/hbase, hbase.cluster.distributed=true
2016-09-25 01:03:48,042 INFO  [main] master.HMaster: Adding backup master ZNode /hbase/backup-masters/eb37,16000,1474736626010
2016-09-25 01:03:48,273 INFO  [master/EB37/192.168.31.150:16000] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x784623e2 connecting to ZooKeeper ensemble=EB37:2181,node1:2181,node2:2181
2016-09-25 01:03:48,273 INFO  [master/EB37/192.168.31.150:16000] zookeeper.ZooKeeper: Initiating client connection, connectString=EB37:2181,node1:2181,node2:2181 sessionTimeout=90000 watcher=hconnection-0x784623e20x0, quorum=EB37:2181,node1:2181,node2:2181, baseZNode=/hbase
2016-09-25 01:03:48,276 INFO  [master/EB37/192.168.31.150:16000-SendThread(node2:2181)] zookeeper.ClientCnxn: Opening socket connection to server node2/192.168.31.152:2181. Will not attempt to authenticate using SASL (unknown error)
2016-09-25 01:03:48,279 INFO  [master/EB37/192.168.31.150:16000-SendThread(node2:2181)] zookeeper.ClientCnxn: Socket connection established to node2/192.168.31.152:2181, initiating session
2016-09-25 01:03:48,327 INFO  [master/EB37/192.168.31.150:16000-SendThread(node2:2181)] zookeeper.ClientCnxn: Session establishment complete on server node2/192.168.31.152:2181, sessionid = 0x2000031c262000a, negotiated timeout = 40000
2016-09-25 01:03:48,344 INFO  [master/EB37/192.168.31.150:16000] client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null
2016-09-25 01:03:48,381 INFO  [EB37:16000.activeMasterManager] master.ActiveMasterManager: Deleting ZNode for /hbase/backup-masters/eb37,16000,1474736626010 from backup master directory
2016-09-25 01:03:48,450 INFO  [EB37:16000.activeMasterManager] master.ActiveMasterManager: Registered Active Master=eb37,16000,1474736626010
2016-09-25 01:03:48,509 FATAL [EB37:16000.activeMasterManager] master.HMaster: Failed to become active master
java.lang.IllegalStateException
        at com.google.common.base.Preconditions.checkState(Preconditions.java:133)
        at org.apache.hadoop.ipc.Client.setCallIdAndRetryCount(Client.java:118)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:99)
        at com.sun.proxy.$Proxy17.setSafeMode(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
        at com.sun.proxy.$Proxy18.setSafeMode(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2419)
        at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1036)
        at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1020)
        at org.apache.hadoop.hbase.util.FSUtils.isInSafeMode(FSUtils.java:524)
        at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:970)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:417)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:146)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:126)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:652)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:184)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1664)
        at java.lang.Thread.run(Thread.java:745)
2016-09-25 01:03:48,511 FATAL [EB37:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
java.lang.IllegalStateException
        at com.google.common.base.Preconditions.checkState(Preconditions.java:133)
        at org.apache.hadoop.ipc.Client.setCallIdAndRetryCount(Client.java:118)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:99)
        at com.sun.proxy.$Proxy17.setSafeMode(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
        at com.sun.proxy.$Proxy18.setSafeMode(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2419)
        at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1036)
        at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1020)
        at org.apache.hadoop.hbase.util.FSUtils.isInSafeMode(FSUtils.java:524)
        at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:970)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:417)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:146)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:126)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:652)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:184)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1664)
        at java.lang.Thread.run(Thread.java:745)
2016-09-25 01:03:48,512 INFO  [EB37:16000.activeMasterManager] regionserver.HRegionServer: STOPPED: Unhandled exception. Starting shutdown.
2016-09-25 01:03:51,367 INFO  [master/EB37/192.168.31.150:16000] ipc.RpcServer: Stopping server on 16000
2016-09-25 01:03:51,367 INFO  [RpcServer.listener,port=16000] ipc.RpcServer: RpcServer.listener,port=16000: stopping
2016-09-25 01:03:51,368 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2016-09-25 01:03:51,368 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2016-09-25 01:03:51,368 INFO  [master/EB37/192.168.31.150:16000] regionserver.HRegionServer: Stopping infoServer
2016-09-25 01:03:51,375 INFO  [master/EB37/192.168.31.150:16000] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16010
2016-09-25 01:03:51,476 INFO  [master/EB37/192.168.31.150:16000] regionserver.HRegionServer: stopping server eb37,16000,1474736626010
2016-09-25 01:03:51,477 INFO  [master/EB37/192.168.31.150:16000] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x2000031c262000a
2016-09-25 01:03:51,484 INFO  [master/EB37/192.168.31.150:16000] zookeeper.ZooKeeper: Session: 0x2000031c262000a closed
2016-09-25 01:03:51,484 INFO  [master/EB37/192.168.31.150:16000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2016-09-25 01:03:51,489 INFO  [master/EB37/192.168.31.150:16000] regionserver.HRegionServer: stopping server eb37,16000,1474736626010; all regions closed.
2016-09-25 01:03:51,490 INFO  [master/EB37/192.168.31.150:16000] hbase.ChoreService: Chore service for: eb37,16000,1474736626010 had [] on shutdown
2016-09-25 01:03:51,508 INFO  [master/EB37/192.168.31.150:16000] ipc.RpcServer: Stopping server on 16000
2016-09-25 01:03:51,517 INFO  [master/EB37/192.168.31.150:16000] zookeeper.RecoverableZooKeeper: Node /hbase/rs/eb37,16000,1474736626010 already deleted, retry=false
2016-09-25 01:03:51,523 INFO  [master/EB37/192.168.31.150:16000] zookeeper.ZooKeeper: Session: 0x1000030cd960011 closed
2016-09-25 01:03:51,523 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2016-09-25 01:03:51,524 INFO  [master/EB37/192.168.31.150:16000] regionserver.HRegionServer: stopping server eb37,16000,1474736626010; zookeeper connection closed.
2016-09-25 01:03:51,524 INFO  [master/EB37/192.168.31.150:16000] regionserver.HRegionServer: master/EB37/192.168.31.150:16000 exiting

kermit posted on 2016-9-25 14:56:07
2016-09-25 01:03:48,042 INFO  [main] master.HMaster: Adding backup master ZNode /hbase/backup-masters/eb37,16000,1474736626010
This log line is strange. I ran the start script on the eb37 node, and backup-masters only lists node1 as the backup master, so eb37 should be the primary master, yet it adds itself under backup-masters in ZooKeeper.

kermit posted on 2016-9-25 15:18:06
超级肉丸 posted on 2015-7-1 15:20
**** Author banned or deleted; content automatically hidden ****

The fencing method is configured incorrectly.

        <property>
                <name>dfs.ha.fencing.methods</name>
                <value>sshfence</value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.private-key-files</name>
                <value>/home/sony/.ssh/id_rsa</value>
        </property>
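
A hedged variation on the snippet above, in case sshfence alone is not enough. sshfence needs passwordless SSH from each NameNode to the other as the user that owns the key file, and it fails if the other host is unreachable. The fallback line and the timeout below are my own additions, not from the original article.

```xml
<!-- hdfs-site.xml fragment: fencing methods are tried in order, one per line.
     shell(/bin/true) is an assumed fallback that always reports success. -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence
shell(/bin/true)</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/sony/.ssh/id_rsa</value>
</property>
<!-- SSH connect timeout in milliseconds; 30000 is the documented default -->
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
```

The trade-off: because shell(/bin/true) always succeeds, failover can proceed even when the old active NameNode's host is down, but nothing is actually fenced in that case. That is generally considered acceptable with the Quorum Journal Manager, which allows only one writer at a time, and riskier with shared NFS edits.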

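kermit posted on 2016-9-26 02:20:22
Sigh, it looks like the cause was that I replaced the hadoop-* jars bundled with HBase with the Hadoop 2.6.4 jars. But doesn't the official documentation say they need to be replaced? So frustrating.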

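ylp09 posted on 2017-7-2 20:34:14
A question for the original poster: is there a problem with your HBase cluster setup?
In your section 4.2,
"4.2 Modify the HBase configuration file $HBASE_HOME/conf/hbase-site.xml",
the hbase.rootdir setting in hbase-site.xml is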
<property>
<name>hbase.rootdir</name>
<value>hdfs://Master:9000/hbase</value>
</property>
<property>

Yet your note says: "Note: the host and port of hbase.rootdir in $HBASE_HOME/conf/hbase-site.xml must match the host and port of fs.default.name in $HADOOP_HOME/conf/core-site.xml."

But in section 2.2.2, the fs.default.name configured in core-site.xml is not "hdfs://Master" but "hdfs://masters".

So here is my question: if the Master node of the HDFS HA cluster goes down and the active role moves to one of the slave nodes,
can this HBase still be used?
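
A sketch of what I would expect to work with HDFS HA, assuming the nameservice really is called "masters" as quoted from core-site.xml above: point hbase.rootdir at the logical nameservice rather than at a single host and port, so the HDFS client resolves whichever NameNode is currently active.

```xml
<!-- hbase-site.xml fragment: use the HDFS nameservice ID, with no host or port.
     "masters" is the nameservice name quoted in this thread. -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://masters/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
```

For HBase to resolve "masters", it also has to see the HDFS client configuration that defines the nameservice (dfs.nameservices and the related NameNode address properties), for example by copying or symlinking hdfs-site.xml and core-site.xml into $HBASE_HOME/conf. With hdfs://Master:9000/hbase, HBase would most likely lose access to HDFS as soon as Master is no longer the active NameNode.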