
The Hadoop 2.6.0 cluster starts up normally, so why does it report that the number of datanodes is 0?

hapjin posted on 2015-4-29 17:47:00
                               
In this situation, do I need to reformat HDFS? (Hoping an expert can give some guidance, thanks!)

The namenode starts normally:

[screenshot: namenode]

The datanode and nodemanager did not start successfully:

[screenshot: datanode]

Since all of the data blocks in the Hadoop cluster were lost, I ran hdfs fsck -delete. Afterwards, running hadoop dfsadmin -report shows the following:

[screenshot: hadoop dfsadmin -report output]

Running hdfs fsck / shows the following:

[screenshot: hdfs fsck output]
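For reference, since the screenshots are not reproduced here, these are the two commands mentioned above written out with the spacing they need (the fsck path is assumed to be the root of HDFS, and "hadoop dfsadmin" is the older, deprecated spelling of "hdfs dfsadmin" in Hadoop 2.x):

  hdfs dfsadmin -report        # report capacity and the datanodes currently registered with the namenode
  hdfs fsck /                  # check the health of the whole filesystem
  hdfs fsck / -delete          # additionally delete files whose blocks are missing or corrupt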

The log output on the namenode is as follows (no errors):

2015-04-29 17:25:54,529 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2015-04-29 17:26:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:26:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2015-04-29 17:26:54,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:26:54,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
2015-04-29 17:27:24,529 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:27:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2015-04-29 17:27:54,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:27:54,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2015-04-29 17:28:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:28:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
2015-04-29 17:28:54,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:28:54,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2015-04-29 17:29:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:29:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
2015-04-29 17:29:54,529 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:29:54,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2015-04-29 17:30:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-04-29 17:30:24,530 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).


The log output on the datanode is as follows (no errors):

2015-04-29 17:33:48,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:49,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:50,061 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:51,061 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:52,062 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:53,063 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:54,063 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:55,064 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:56,064 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-29 17:33:57,065 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: controller/192.168.1.186:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
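
A side note on this datanode log: every connection attempt to the namenode RPC address controller/192.168.1.186:9000 is failing, so the datanode never registers and the namenode keeps reporting 0 datanodes. A rough way to check connectivity (the host name, IP and port are taken from the log above; the tools shown are common but may not all be installed on your machines):

  # On the namenode host (controller): is anything listening on port 9000?
  netstat -tlnp | grep 9000        # or: ss -ltnp | grep 9000

  # On a datanode host: does "controller" resolve, and is the port reachable?
  ping -c 3 controller
  nc -vz controller 9000           # or: telnet controller 9000

  # On both hosts: do they agree on the namenode address?
  hdfs getconf -confKey fs.defaultFS

If the namenode is listening only on 127.0.0.1, or /etc/hosts maps controller to a different address on the datanodes, retries like the ones above are exactly what you would see.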


                               
From group: Hadoop Technology Group

Comments (4)

langke93 posted on 2015-4-29 18:06:08
It may be a zombie (hung) process. Try running the command below and check whether the datanode actually gets stopped:
  stop-dfs.sh
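
A rough sketch of what that check might look like in practice (run jps on each datanode host; the DataNode PID is a placeholder and will differ on your machine):

  jps                          # list Java daemons; a lingering "DataNode" entry suggests a hung process
  stop-dfs.sh                  # run on the namenode host to stop HDFS daemons cluster-wide
  jps                          # on the datanode host: if a DataNode process is still listed...
  kill -9 <DataNode-pid>       # ...kill it by PID (substitute the real PID)
  start-dfs.sh                 # then bring HDFS back up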




levycui posted on 2015-4-30 09:39:47
In this situation you need to delete the temporary Hadoop files in /tmp (rm -rf /tmp/hadoop*),
then format again.
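
For what it's worth, the full sequence this suggestion implies would look roughly like the following. Note that hdfs namenode -format wipes the HDFS metadata, so only do it if the data can be thrown away, and the rm has to be run on every node (it assumes the default hadoop.tmp.dir under /tmp):

  stop-dfs.sh                  # stop all HDFS daemons first
  rm -rf /tmp/hadoop*          # on every node: remove the temporary Hadoop files in /tmp
  hdfs namenode -format        # reformat the namenode (destroys existing HDFS metadata)
  start-dfs.sh                 # start HDFS again
  hdfs dfsadmin -report        # verify the datanodes register this time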

hapjin posted on 2015-4-30 11:52:34
levycui posted on 2015-4-30 09:39:
In this situation you need to delete the temporary Hadoop files in /tmp (rm -rf /tmp/hadoop*),
then format again.

Do you mean the files specified under the <name>hadoop.tmp.dir</name> property???
And after reformatting, will the data blocks that were originally on the DataNodes no longer be accessible through the HDFS filesystem? Will those lost blocks keep occupying disk space forever?
Thanks a lot.


levycui posted on 2015-5-5 09:27:24
hapjin posted on 2015-4-30 11:52:
Do you mean the files specified under the hadoop.tmp.dir property???
And after reformatting, will the data blocks that were originally on the DataNodes ... HDF ...

If you still need the data, you cannot format, because the data would be lost.
If the data can be discarded, then format. The rm command is not deleting the contents of hadoop.tmp.dir;
rm -rf /tmp/hadoop* deletes the temporary Hadoop files under /tmp. After that, format.
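
On the disk-space part of the question: after a reformat, the old block files do keep occupying disk on the datanodes until someone removes them. A hedged sketch for locating and clearing them, assuming the default layout where blocks live under ${hadoop.tmp.dir}/dfs/data (your dfs.datanode.data.dir may point somewhere else, so check first):

  # Where does this cluster actually keep datanode blocks?
  hdfs getconf -confKey hadoop.tmp.dir
  hdfs getconf -confKey dfs.datanode.data.dir

  # If the old blocks are no longer wanted, remove that directory on each datanode;
  # otherwise the orphaned block files stay on disk even though HDFS can no longer read them.
  rm -rf <datanode-data-dir>       # placeholder -- substitute the directory reported above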

