
Help needed: problem with Flume writing to HDFS!

u010363909 posted on 2013-10-16 13:39:57
My Flume configuration is as follows:
[demoe3base@kf-app1 conf]$ cat flume-conf.conf
# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1
agent1.sources = source1
agent1.sinks = hdfssink1
# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 100000
agent1.channels.ch1.transactionCapacity = 100000
agent1.channels.ch1.keep-alive = 30
# Define an Avro source called avro-source1 on agent1 and tell it
# to bind to 0.0.0.0:41414. Connect it to channel ch1.
agent1.sources.source1.channels = ch1
agent1.sources.source1.type = avro
agent1.sources.source1.bind = 172.21.3.60
agent1.sources.source1.port = 44444
agent1.sources.source1.threads = 5
# Define a logger sink that simply logs all events it receives
# and connect it to the other end of the same channel.
agent1.sinks.hdfssink1.channel = ch1
agent1.sinks.hdfssink1.type = hdfs
agent1.sinks.hdfssink1.hdfs.path = hdfs://kf-app1:8020/flume
agent1.sinks.hdfssink1.hdfs.writeFormat = Text
agent1.sinks.hdfssink1.hdfs.fileType = DataStream
agent1.sinks.hdfssink1.hdfs.rollInterval = 0
agent1.sinks.hdfssink1.hdfs.rollSize = 60554432
agent1.sinks.hdfssink1.hdfs.rollCount = 0
agent1.sinks.hdfssink1.hdfs.batchSize = 1000
agent1.sinks.hdfssink1.hdfs.txnEventMax = 1000
agent1.sinks.hdfssink1.hdfs.callTimeout = 60000
agent1.sinks.hdfssink1.hdfs.appendTimeout = 60000
I start the agent with: bin/flume-ng agent --conf ./conf/ -f conf/flume-conf.conf -n agent1
Everything comes up fine, and flume.log looks normal.
Then I send a file with bin/flume-ng avro-client -H kf-app1 -p 44444 -F /chunk1/demo/flume/test2.txt, and flume.log shows:
[demoe3base@kf-app1 logs]$ tail -f flume.log
08 五月 2013 14:34:31,370 INFO  [lifecycleSupervisor-1-3] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:82)  - Component type: SOURCE, name: source1 started
08 五月 2013 14:34:31,370 INFO  [lifecycleSupervisor-1-3] (org.apache.flume.source.AvroSource.start:155)  - Avro source source1 started.
08 五月 2013 14:34:45,932 INFO  [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x34bf1d3b, /172.21.3.61:39262 => /172.21.3.60:44444] OPEN
08 五月 2013 14:34:45,938 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x34bf1d3b, /172.21.3.61:39262 => /172.21.3.60:44444] BOUND: /172.21.3.60:44444
08 五月 2013 14:34:45,938 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x34bf1d3b, /172.21.3.61:39262 => /172.21.3.60:44444] CONNECTED: /172.21.3.61:39262
08 五月 2013 14:34:46,267 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x34bf1d3b, /172.21.3.61:39262 :> /172.21.3.60:44444] DISCONNECTED
08 五月 2013 14:34:46,267 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x34bf1d3b, /172.21.3.61:39262 :> /172.21.3.60:44444] UNBOUND
08 五月 2013 14:34:46,268 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x34bf1d3b, /172.21.3.61:39262 :> /172.21.3.60:44444] CLOSED
08 五月 2013 14:34:46,268 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.channelClosed:209)  - Connection to /172.21.3.61:39262 disconnected.
08 五月 2013 14:34:46,922 INFO  [hdfs-hdfssink1-call-runner-0] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189)  - Creating hdfs://kf-app1:8020//FlumeData.1367994886244.tmp
Now my questions:
1. Why does the file remain the temporary "FlumeData.1367994886244.tmp" instead of being closed? Only after I forcibly kill or stop the agent does the log print "08 五月 2013 14:21:17,556 INFO  [hdfs-hdfssink1-call-runner-5] (org.apache.flume.sink.hdfs.BucketWriter.renameBucket:379)  - Renaming hdfs://kf-app1:8020/flume/FlumeData.1367993804350.tmp to hdfs://kf-app1:8020/flume/FlumeData.1367993804350". Does this mean the agent cannot close the file on its own?
2. After I send a second file, the log shows UNBOUND. Does that mean one channel can only receive one file? The log is:
08 五月 2013 14:30:47,202 INFO  [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x5a9b8ff9, /172.21.3.61:38652 => /172.21.3.60:44444] OPEN
08 五月 2013 14:30:47,203 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x5a9b8ff9, /172.21.3.61:38652 => /172.21.3.60:44444] BOUND: /172.21.3.60:44444
08 五月 2013 14:30:47,203 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x5a9b8ff9, /172.21.3.61:38652 => /172.21.3.60:44444] CONNECTED: /172.21.3.61:38652
08 五月 2013 14:30:47,913 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x5a9b8ff9, /172.21.3.61:38652 :> /172.21.3.60:44444] DISCONNECTED
08 五月 2013 14:30:47,913 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x5a9b8ff9, /172.21.3.61:38652 :> /172.21.3.60:44444] UNBOUND
08 五月 2013 14:30:47,913 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x5a9b8ff9, /172.21.3.61:38652 :> /172.21.3.60:44444] CLOSED
08 五月 2013 14:30:47,914 INFO  [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.channelClosed:209)  - Connection to /172.21.3.61:38652 disconnected.
If anyone understands these two issues or has run into them before, please share some pointers.
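For context on question 1: the HDFS sink only closes and renames a .tmp file when one of its roll conditions fires. With rollInterval = 0 and rollCount = 0 as in the configuration above, the only remaining trigger is rollSize = 60554432 bytes (about 58 MB), so a small test file never reaches it and stays open until the agent is killed. A hedged sketch of settings that should make files close on their own; the values are illustrative assumptions, not something tested against this setup:

# close and rename the current file every 600 seconds
agent1.sinks.hdfssink1.hdfs.rollInterval = 600
# in newer Flume releases, close a file after 60 seconds with no new writes
agent1.sinks.hdfssink1.hdfs.idleTimeout = 60

With either setting in place the .tmp file should be renamed without stopping the agent.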
              
               
                                    

Replies (10)

tntzbzc posted on 2013-10-16 13:40:49

This is long; bumping for now, I'll read through it carefully later.

tntzbzc posted on 2013-10-16 13:41:40

I don't know how to solve this either; I'll add some points for the OP and wait for a Flume expert to answer.

u010363909 posted on 2013-10-16 13:42:35

Still hoping an expert can weigh in; I haven't been able to solve this yet.

u010363909 posted on 2013-10-16 13:43:34

Quoting reply #2 from tntzbzc: "I don't know how to solve this either; I'll add some points for the OP and wait for a Flume expert to answer."
Thanks for the help, moderator. This problem has been bothering me for ages and it is still unsolved.

sptoor posted on 2013-10-16 13:44:07

I think it works like this:
1. Avro collects your log events into one file, and only when that file reaches the configured size is the "Renaming" step performed (or when the agent is forcibly killed).
2. UNBOUND also puzzled me for a while. My conclusion is that it is not an error; look closely and you will see there is no "ERROR" or similar on that line. UNBOUND simply means the current log file has not yet reached the configured size, so it does not need to be "Renamed" into a standalone file. After a "Renaming", a new *.tmp file is normally opened and writing continues.
That is my understanding; corrections are welcome.
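A quick way to watch this behaviour (just a sketch; the path is taken from the hdfs.path in the OP's configuration): list the target directory while sending files and note that, with rollInterval = 0 and rollCount = 0, nothing is renamed until roughly 60554432 bytes (about 58 MB) have accumulated.

hadoop fs -ls hdfs://kf-app1:8020/flume
# small test files keep showing as FlumeData.<timestamp>.tmp until a roll condition fires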
Quoting the OP (u010363909): [the original configuration and log output, as posted above]

virgo777 posted on 2013-10-16 13:44:55

Just here to learn.

duanze82 posted on 2013-10-16 13:45:33

I don't really understand this, but bumping anyway and picking up some knowledge along the way.

小小布衣 posted on 2014-8-28 18:32:08
Uploading to HDFS was already throwing errors for me, so I tried receiving with avro and writing to the local file system instead, to see whether that would work, but I got this error:
14/08/28 18:16:52 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: r2. src.open-connection.count == 0
14/08/28 18:16:52 INFO source.AvroSource: Avro source r2 stopped. Metrics: SOURCE:r2{src.events.accepted=1430, src.events.received=1430, src.append.accepted=0, src.append-batch.accepted=286, src.open-connection.count=0, src.append-batch.received=286, src.append.received=0}
14/08/28 18:16:52 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to send events
        at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:382)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: 14.18.203.70, port: 50010 }: Failed to send batch
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:294)
        at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:366)
        ... 3 more
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: 14.18.203.70, port: 50010 }: Interrupted in handshake
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:341)
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:282)
        ... 4 more
Caused by: java.lang.InterruptedException
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:400)
        at java.util.concurrent.FutureTask.get(FutureTask.java:199)
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:336)
        ... 5 more
^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C14/08/28 18:16:57 INFO sink.AbstractRpcSink: Rpc sink k2 stopping...
14/08/28 18:16:57 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: k2 stopped
14/08/28 18:16:57 INFO instrumentation.MonitoredCounterGroup:


I don't know whether you ever solved yours; could someone take a look at mine?
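One detail in the trace that may matter (an observation and an assumption on my part, not something confirmed in this thread): the failing NettyAvroRpcClient is trying to reach 14.18.203.70:50010, and 50010 is the default HDFS DataNode data-transfer port rather than a Flume Avro listener. An avro sink is supposed to point at the host and port where a downstream agent's avro source is listening; a minimal sketch of such a pairing, reusing the k2/r2 names from the log above and with an illustrative port:

# sending agent: avro sink forwards events to the receiving agent
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = 14.18.203.70
a1.sinks.k2.port = 44444

# receiving agent: avro source listens on the same port
a2.sources.r2.type = avro
a2.sources.r2.bind = 0.0.0.0
a2.sources.r2.port = 44444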
