Hadoop HA: after configuring automatic failover, the NameNode disappears on startup

aLivable posted on 2017-12-1 23:00:31
2017-12-01 09:33:17,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-senior.ibeifeng.com/192.168.1.110:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-12-01 09:33:17,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-senior01.ibeifeng.com/192.168.1.111:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-12-01 09:33:17,342 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-senior02.ibeifeng.com/192.168.1.112:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-12-01 09:33:17,343 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [192.168.1.110:8485, 192.168.1.111:8485, 192.168.1.112:8485], stream=null))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.1.111:8485: Call From hadoop-senior.ibeifeng.com/192.168.1.110 to hadoop-senior01.ibeifeng.com:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.1.112:8485: Call From hadoop-senior.ibeifeng.com/192.168.1.110 to hadoop-senior02.ibeifeng.com:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.1.110:8485: Call From hadoop-senior.ibeifeng.com/192.168.1.110 to hadoop-senior.ibeifeng.com:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createNewUniqueEpoch(QuorumJournalManager.java:182)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:436)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$7.apply(JournalSet.java:590)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:359)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:587)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1361)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1068)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1624)
        at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
        at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1502)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1197)
        at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
        at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
2017-12-01 09:33:17,350 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-12-01 09:33:17,352 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

6 replies
aLivable posted on 2017-12-1 23:01:42
Looking for an expert who can help...

desehawk posted on 2017-12-2 11:43:52
Last edited by desehawk on 2017-12-2 11:45
In reply to aLivable (2017-12-1 23:01):
Looking for an expert who can help...

Start the JournalNodes first, then start everything else.
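
One way to check this advice before starting the NameNode is to probe the JournalNode ports. The following is a minimal Python sketch, not part of Hadoop: the function names are hypothetical, and the addresses are the three JournalNodes from the log above (port 8485). It attempts a TCP connection to each JournalNode and reports whether a majority is reachable, which for three nodes is the 2/3 quorum the log complains about.

```python
import socket


def reachable_journalnodes(addresses, timeout=1.0):
    """Return the (host, port) pairs that accept a TCP connection."""
    up = []
    for host, port in addresses:
        try:
            # A successful connect only shows that something listens on the port.
            with socket.create_connection((host, port), timeout=timeout):
                up.append((host, port))
        except OSError:
            # Connection refused, timeout, or name resolution failure.
            pass
    return up


def has_write_quorum(addresses, timeout=1.0):
    """True when a strict majority of the listed JournalNodes is reachable."""
    needed = len(addresses) // 2 + 1  # 2 of 3, 3 of 5, ...
    return len(reachable_journalnodes(addresses, timeout)) >= needed


if __name__ == "__main__":
    # Addresses taken from the log in the first post.
    journalnodes = [
        ("192.168.1.110", 8485),
        ("192.168.1.111", 8485),
        ("192.168.1.112", 8485),
    ]
    print("quorum reachable:", has_write_quorum(journalnodes))
```

If this prints False, the NameNode would most likely hit the same "Connection refused" errors when it tries to become active, so the JournalNodes should be brought up first.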

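aLivable posted on 2017-12-2 20:36:49
In reply to desehawk (2017-12-2 11:43):
Start the JournalNodes first, then start everything else.

I start the ZooKeeper service with zkServer.sh first and then start the cluster. The NameNode starts and then exits on its own. When I check with jps, all the other services are running; only the NameNode is gone.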

aLivable posted on 2017-12-2 21:42:40
Problem solved.

aLivable posted on 2017-12-2 21:52:06
Add the following configuration to core-site.xml:
<property>
    <name>ipc.client.connect.max.retries</name>
    <value>20</value>
    <description>
        Indicates the number of retries a client will make to establish a server connection.
    </description>
</property>

<property>
    <name>ipc.client.connect.retry.interval</name>
    <value>5000</value>
    <description>
        Indicates the number of milliseconds a client will wait before retrying to establish a server connection.
    </description>
</property>
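
To see roughly what these two values change, here is a small Python sketch of the arithmetic. It is an approximation that only multiplies the retry count by the fixed sleep time and ignores the time each connection attempt itself takes; the function name is made up for illustration. The "before" values come from the retry policy printed in the log (maxRetries=10, sleepTime=1000 MILLISECONDS), the "after" values from the configuration above.

```python
def retry_window_seconds(max_retries, retry_interval_ms):
    """Approximate how long a client keeps retrying: retries times fixed sleep."""
    return max_retries * retry_interval_ms / 1000.0


# Values shown in the log's retry policy.
before = retry_window_seconds(10, 1000)
# Values from the core-site.xml snippet in this post.
after = retry_window_seconds(20, 5000)

print(before, after)  # prints: 10.0 100.0
```

So the NameNode goes from giving up after about 10 seconds to waiting about 100 seconds. One plausible reading is that this leaves enough time for JournalNodes that are started after the NameNode to begin listening on port 8485.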

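nextuser posted on 2017-12-2 22:09:18
In reply to aLivable (2017-12-2 21:52):
Add the following configuration to core-site.xml:
ipc.client.connect.max.retries

If it works after a few more connection attempts, then the cause is probably the network.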