
Oozie: jobs in the SUSPENDED state can be resumed with -resume

hapjin posted on 2016-5-16 14:53:48 in Q&A
For example, I submitted a job, and it was suspended because of a problem in the underlying HDFS.

2016-05-16 11:12:52,724 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[datanode1] USER[cdhfive] GROUP[-] TOKEN[] APP[map-reduce-wf] JOB[0000007-160516095026479-oozie-oozi-W] ACTION[0000007-160516095026479-oozie-oozi-W@mr-node] Error starting action [mr-node]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot delete /user/xxx/oozie-oozi/0000007-160516095026479-oozie-oozi-W/mr-node--map-reduce.tmp. Name node is in safe mode.
The reported blocks 2955 needs additional 3 blocks to reach the threshold 0.9990 of total blocks 2960.
The number of live datanodes 2 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1416)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4056)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4014)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3998)

--------------------------------------------------------
Once the HDFS problem is resolved (the NameNode has left safe mode), run:
oozie job -resume 0000007-160516095026479-oozie-oozi-W
This resumes the job.
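
The check-then-resume step can be sketched as a small shell function. This is only a sketch under assumptions: the function name `resume_if_hdfs_ready` is hypothetical, `hdfs` and `oozie` are assumed to be on the PATH, `OOZIE_URL` is assumed to be exported (otherwise pass `-oozie <url>` as in the documentation examples), and the safe mode check relies on `hdfs dfsadmin -safemode get` printing a line such as "Safe mode is OFF".

```shell
# Hypothetical helper: resume a suspended Oozie job only when HDFS is out of safe mode.
# Assumes `hdfs` and `oozie` are on PATH and OOZIE_URL is exported.
resume_if_hdfs_ready() {
    job_id="$1"
    # Assumed output: "Safe mode is ON" or "Safe mode is OFF"
    # (HA clusters may print one such line per NameNode).
    status="$(hdfs dfsadmin -safemode get)"
    case "$status" in
        *"Safe mode is ON"*)
            # At least one NameNode is still in safe mode: do not resume yet.
            echo "HDFS is still in safe mode; not resuming $job_id" >&2
            return 1
            ;;
        *"Safe mode is OFF"*)
            oozie job -resume "$job_id"
            ;;
        *)
            # Unexpected output: stay on the safe side and do nothing.
            echo "Could not determine safe mode state: $status" >&2
            return 2
            ;;
    esac
}
```

It would be called as `resume_if_hdfs_ready 0000007-160516095026479-oozie-oozi-W`. Checking first avoids resuming a job that would immediately hit the same JA009 error again.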

hapjin posted on 2016-5-16 14:56:16
Suspending a Workflow, Coordinator or Bundle Job
Example:
$ oozie job -oozie http://localhost:11000/oozie -suspend 14-20090525161321-oozie-joe
The suspend option suspends a workflow job in RUNNING status. After the command is executed the workflow job will be in SUSPENDED status.

Resuming a Workflow, Coordinator or Bundle Job
Example:
$ oozie job -oozie http://localhost:11000/oozie -resume 14-20090525161321-oozie-joe
The resume option resumes a workflow job in SUSPENDED status.
After the command is executed the workflow job will be in RUNNING status.
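
To confirm the status change described above, the job status can be read back with `oozie job -info`. The sketch below is an assumption-laden illustration: the function name `job_status` is hypothetical, `OOZIE_URL` is assumed to be exported, and the parsing assumes the `-info` output contains a line of the form `Status        : RUNNING`.

```shell
# Hypothetical helper: print the status of an Oozie job (e.g. RUNNING, SUSPENDED).
# Assumes `oozie job -info <id>` prints a line like "Status        : SUSPENDED".
job_status() {
    oozie job -info "$1" |
        awk -F':' '/^Status[ \t]*:/ {
            gsub(/[ \t]/, "", $2)   # strip padding around the value
            print $2
            exit                    # only the first (job-level) Status line
        }'
}
```

For example, `job_status 14-20090525161321-oozie-joe` would be expected to print `SUSPENDED` after a suspend and `RUNNING` after a resume, provided the output format matches the assumption.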