How do I use Flume to collect Kafka data into HDFS or HBase?

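yunge2016 posted on 2017-7-1 18:43:46

yunge2016 posted on 2017-7-2 17:20:58
In this directory I can now see the topic's .log and .index data files, and opening them shows the data. What I want to know is how to have Flume collect the Kafka data into HDFS. I am not sure where the problem is right now; do you have a complete configuration file? I am getting all kinds of errors. I specified the HDFS address, and now the error says the host was not found in the configuration file. Do I need to specify a host at startup?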

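nextuser posted on 2017-7-2 19:00:29
yunge2016 posted on 2017-7-2 17:20
In this directory I can now see the topic's .log and .index data files, and opening them shows the data. What I want to know is how to have Flume collect the Ka ...

Please post the error.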

yunge2016 posted on 2017-7-2 19:52:22
Error when Flume collects the Kafka data:
17/07/02 18:43:11 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /usr/hadoop3data/testdata.txt to /usr/hadoop3data/testdata.txt.COMPLETED
17/07/02 18:43:11 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
17/07/02 18:43:18 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391653.tmp
17/07/02 18:43:28 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391653.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more
17/07/02 18:43:33 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391654.tmp
17/07/02 18:43:43 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391654.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more
17/07/02 18:43:48 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391655.tmp
17/07/02 18:43:58 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391655.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more
17/07/02 18:44:05 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391656.tmp
17/07/02 18:44:15 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391656.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more

yunge2016 posted on 2017-7-2 20:37:02
After starting Flume, it reports that the agent a2 was not found in the configuration file:
17/07/02 20:08:53 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
17/07/02 20:08:53 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/opt/modules/flume/conf/final-hdfs.conf
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: LogAgent
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 WARN conf.FlumeConfiguration: Configuration empty for: LogAgent.Removed.
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [LogAgent]
17/07/02 20:08:53 WARN node.AbstractConfigurationProvider: No configuration found for this host:a2
17/07/02 20:08:53 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{} channels:{} }

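nextuser posted on 2017-7-3 08:31:19
yunge2016 posted on 2017-7-2 20:37
After starting Flume, it reports that the agent a2 was not found in the configuration file:
17/07/02 20:08:53 INFO node.PollingPropertiesFileConfigu ...

There are too many errors here. I suggest first understanding what the basic configuration options mean; otherwise you will never finish patching the holes.
Is a2 supposed to be the host or the agent name?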

yunge2016 posted on 2017-7-3 10:15:18
a2 is the agent name. What is the host? It is not written anywhere in the configuration file.

nextuser posted on 2017-7-7 08:56:15
Last edited by nextuser on 2017-7-7 08:57
yunge2016 posted on 2017-7-3 10:15
a2 is the agent name. What is the host? It is not written anywhere in the configuration file.
The whole file seems to define flumetohdfs_agent, not a2:
WARN node.AbstractConfigurationProvider: No configuration found for this host:a2
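
To illustrate the point about agent names: the "host" in that warning refers to the agent name passed to `flume-ng agent --name`, so every property key in the file has to start with that same name. A minimal sketch of a Kafka-to-HDFS configuration that uses `a2` throughout might look like the following. The component names (`kafkaSrc`, `memCh`, `hdfsSink`), the broker address `kafka-host:9092`, the topic `mytopic`, the consumer group, and all roll and capacity values are placeholders, not values from this thread. The Kafka source keys assume Flume 1.7 or later; older releases use different keys such as `zookeeperConnect` and `topic`. Only the HDFS address is taken from the log above.

```properties
# Sketch only: Kafka source -> memory channel -> HDFS sink.
# The prefix "a2" must equal the value given to --name at startup.
a2.sources = kafkaSrc
a2.channels = memCh
a2.sinks = hdfsSink

# Kafka source (keys as in Flume 1.7+; broker, topic and group are placeholders).
a2.sources.kafkaSrc.type = org.apache.flume.source.kafka.KafkaSource
a2.sources.kafkaSrc.kafka.bootstrap.servers = kafka-host:9092
a2.sources.kafkaSrc.kafka.topics = mytopic
a2.sources.kafkaSrc.kafka.consumer.group.id = flume-hdfs
a2.sources.kafkaSrc.channels = memCh

# Memory channel; transactionCapacity should be at least the source batch size.
a2.channels.memCh.type = memory
a2.channels.memCh.capacity = 10000
a2.channels.memCh.transactionCapacity = 1000

# HDFS sink writing plain text files named by date under /data.
a2.sinks.hdfsSink.type = hdfs
a2.sinks.hdfsSink.channel = memCh
a2.sinks.hdfsSink.hdfs.path = hdfs://192.168.137.3:8020/data
a2.sinks.hdfsSink.hdfs.filePrefix = %Y-%m-%d
a2.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
a2.sinks.hdfsSink.hdfs.fileType = DataStream
a2.sinks.hdfsSink.hdfs.writeFormat = Text
# Roll a new file every 300 seconds; disable size- and count-based rolling.
a2.sinks.hdfsSink.hdfs.rollInterval = 300
a2.sinks.hdfsSink.hdfs.rollSize = 0
a2.sinks.hdfsSink.hdfs.rollCount = 0
# Default is 10000 ms, which matches the "Callable timed out" errors in the log.
a2.sinks.hdfsSink.hdfs.callTimeout = 60000
```

Such a file would be started with something like `flume-ng agent --conf conf --conf-file /opt/modules/flume/conf/final-hdfs.conf --name a2`. Raising `hdfs.callTimeout` only gives a slow cluster more time; if the timeouts persist, it is probably worth checking that the NameNode at 192.168.137.3:8020 is actually reachable from the Flume machine.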

liuyou2036 posted on 2020-7-16 14:25:55
It would be best to add comments.