How to back up MongoDB data to HDFS with Java

aqi915 posted on 2015-12-8 16:39:55
Hello everyone:
       I'd like to ask how to back up MongoDB data to HDFS using Java.

4 replies

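xuanxufeng posted on 2015-12-8 17:28:09

Preparation

The easiest way to use the Mongo Hadoop adapter is to clone the Mongo-Hadoop project from GitHub and build it against a specific Hadoop version. Cloning the project requires a Git client.

This section assumes the Hadoop version you are using is CDH3.

The official download page for the Git client is: http://git-scm.com/downloads

On Windows, GitHub can be accessed through http://windows.github.com/.

On Mac, GitHub can be accessed through http://mac.github.com/.

The Mongo Hadoop adapter is available at https://github.com/mongodb/mongo-hadoop. The project has to be built for a specific Hadoop version. The resulting JAR file must be copied to the $HADOOP_HOME/lib directory on every node of the Hadoop cluster.

The Mongo Java driver must also be installed in the $HADOOP_HOME/lib directory on every node of the Hadoop cluster. The driver can be downloaded from https://github.com/mongodb/mongo-java-driver/downloads.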


Further reading:

Importing data from MongoDB into HDFS
http://www.aboutyun.com/thread-16508-1-1.html




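aqi915 posted on 2015-12-8 19:51:22
xuanxufeng posted on 2015-12-8 17:28
Preparation

The easiest way to use the Mongo Hadoop adapter is to clone the Mongo-Hadoop project from GitHub and build it against ...

Hello:
        I tested your example and it failed with the following error: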
Configuration: Configuration: core-default.xml, core-site.xml
15/12/08 19:46:14 INFO client.RMProxy: Connecting to ResourceManager at ktbigdata1/192.168.100.141:8032
15/12/08 19:46:14 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/12/08 19:46:15 INFO splitter.SingleMongoSplitter: SingleMongoSplitter calculating splits for mongodb://192.168.3.107:27017/mydb.tree
15/12/08 19:46:15 INFO mapreduce.JobSubmitter: number of splits:1
15/12/08 19:46:15 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/12/08 19:46:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449480684665_0027
15/12/08 19:46:16 INFO impl.YarnClientImpl: Submitted application application_1449480684665_0027
15/12/08 19:46:16 INFO mapreduce.Job: The url to track the job: http://ktbigdata1:8088/proxy/application_1449480684665_0027/
15/12/08 19:46:16 INFO mapreduce.Job: Running job: job_1449480684665_0027
15/12/08 19:46:22 INFO mapreduce.Job: Job job_1449480684665_0027 running in uber mode : false
15/12/08 19:46:22 INFO mapreduce.Job:  map 0% reduce 0%
15/12/08 19:46:27 INFO mapreduce.Job: Task Id : attempt_1449480684665_0027_m_000000_0, Status : FAILED
Error: java.lang.NullPointerException
        at test.ImportWeblogsFromMongo$ReadWeblogsFromMongo.map(ImportWeblogsFromMongo.java:21)
        at test.ImportWeblogsFromMongo$ReadWeblogsFromMongo.map(ImportWeblogsFromMongo.java:1)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

15/12/08 19:46:32 INFO mapreduce.Job:  map 100% reduce 0%
15/12/08 19:46:32 INFO mapreduce.Job: Task Id : attempt_1449480684665_0027_m_000000_1, Status : FAILED
Error: java.lang.NullPointerException
        at test.ImportWeblogsFromMongo$ReadWeblogsFromMongo.map(ImportWeblogsFromMongo.java:21)
        at test.ImportWeblogsFromMongo$ReadWeblogsFromMongo.map(ImportWeblogsFromMongo.java:1)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

15/12/08 19:46:33 INFO mapreduce.Job:  map 0% reduce 0%
15/12/08 19:46:35 INFO mapreduce.Job: Task Id : attempt_1449480684665_0027_m_000000_2, Status : FAILED
Error: java.lang.NullPointerException
        at test.ImportWeblogsFromMongo$ReadWeblogsFromMongo.map(ImportWeblogsFromMongo.java:21)
        at test.ImportWeblogsFromMongo$ReadWeblogsFromMongo.map(ImportWeblogsFromMongo.java:1)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

15/12/08 19:46:40 INFO mapreduce.Job:  map 100% reduce 0%
15/12/08 19:46:40 INFO mapreduce.Job: Job job_1449480684665_0027 failed with state FAILED due to: Task failed task_1449480684665_0027_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

15/12/08 19:46:40 INFO mapreduce.Job: Counters: 9
        Job Counters
                Failed map tasks=4
                Launched map tasks=4
                Other local map tasks=3
                Rack-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=12125
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=12125
                Total vcore-seconds taken by all map tasks=12125
                Total megabyte-seconds taken by all map tasks=12416000

What could be causing this?


easthome001 posted on 2015-12-8 20:08:22
aqi915 posted on 2015-12-8 19:51
Hello:
        I tested your example and it failed with the following error:
Configuration: Configuration: core-default.xml, core-si ...

The class ImportWeblogsFromMongo.class may not have been found.

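aqi915 posted on 2015-12-9 09:11:48
easthome001 posted on 2015-12-8 20:08
The class ImportWeblogsFromMongo.class may not have been found.

Hello:
         The class is found now, but it looks like the values are not being passed in. What could be the reason?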
Error: java.lang.NullPointerException
        at test.ImportWeblogsFromMongo$ReadWeblogsFromMongo.map(ImportWeblogsFromMongo.java:21)
        at test.ImportWeblogsFromMongo$ReadWeblogsFromMongo.map(ImportWeblogsFromMongo.java:1)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
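
One possible cause, offered as a guess because the mapper source is not shown in this thread: the trace points at line 21 inside map(), and a NullPointerException there is what you would see if the code calls toString() on a field that the documents in mydb.tree do not contain. The BSONObject passed to map() returns null from get() for a missing field. The sketch below shows a null-safe way to read fields. It uses a plain java.util.Map as a stand-in for BSONObject so that it runs without Hadoop on the classpath; the class name NullSafeFields, the helpers fieldAsString and toLine, and the field names md5, name and depth are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the field handling inside a mapper. A plain Map plays the role
// of the BSONObject value; both return null from get() when a field is absent.
final class NullSafeFields {

    /** Returns the field as a string, or the fallback when it is missing or null. */
    static String fieldAsString(Map<String, Object> doc, String field, String fallback) {
        Object raw = (doc == null) ? null : doc.get(field);
        return (raw == null) ? fallback : raw.toString();
    }

    /** Builds one tab-separated output line from the requested fields. */
    static String toLine(Map<String, Object> doc, String... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append('\t');
            }
            line.append(fieldAsString(doc, fields[i], ""));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Hypothetical document; the real fields of mydb.tree are not known here.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "root");
        doc.put("depth", 0);
        // "md5" is absent: an unguarded doc.get("md5").toString() would throw
        // a NullPointerException, while the helper substitutes an empty string.
        System.out.println(toLine(doc, "md5", "name", "depth"));
    }
}
```

In a real mapper the same check would be applied to each value.get(...) result before calling toString(). Printing the value at the top of map() once is a quick way to confirm which field names the collection actually has.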

