
Importing multiple tables from a relational database into HDFS at once: only two tables get imported

smfswxj posted on 2017-9-18 09:39:40
Importing multiple tables from a relational database into HDFS at once:
sqoop-import-all-tables --connect jdbc:mysql://secondmgt:3306/spice  --username hive --password hive --as-textfile --warehouse-dir /output/

Only two tables get imported per run. What's going on?

8 comments

yuwenge posted on 2017-9-18 09:56:09
Every table needs a primary key.
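
A quick way to double-check which tables actually have one (a minimal sketch, assuming shell access to the MySQL host; the host, user, and database names below are taken from the command in this thread, substitute your own):

    # List every base table in the `spice` database that has no primary key.
    mysql -h secondmgt -u hive -p -e "
    SELECT t.table_name
    FROM information_schema.tables t
    LEFT JOIN information_schema.table_constraints c
      ON  c.table_schema    = t.table_schema
      AND c.table_name      = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
    WHERE t.table_schema = 'spice'
      AND t.table_type   = 'BASE TABLE'
      AND c.constraint_name IS NULL;"

Any table this query returns will trip up sqoop-import-all-tables unless you give Sqoop a split column or fall back to a single mapper.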

smfswxj posted on 2017-9-18 10:00:02
Quoting yuwenge (2017-9-18 09:56):
Every table needs a primary key.

They definitely have primary keys.

yuwenge posted on 2017-9-18 10:02:20
Quoting smfswxj (2017-9-18 10:00):
They definitely have primary keys.

That shouldn't happen. From the information provided, though, I really can't see any other problem. Could a configuration step have been missed, or is there some other issue?

smfswxj posted on 2017-9-18 10:02:37
Quoting yuwenge (2017-9-18 09:56):
Every table needs a primary key.

[screenshots attached: QQ截图20170918100109.jpg, QQ截图20170918100121.jpg]
Only the customers and employees tables get imported; none of the others do. The products table, for example, does have a primary key, yet it still won't import.

smfswxj posted on 2017-9-18 10:03:45
Quoting yuwenge (2017-9-18 10:02):
That shouldn't happen. From the information provided, though, I really can't see any other problem. ...

Could you add me on QQ and walk me through it? My QQ is 121651934. Much appreciated!

yuwenge posted on 2017-9-18 10:07:50
Quoting smfswxj (2017-9-18 10:02):
Only the customers and employees tables get imported; none of the others do. The products table, for example ...

Then there must be an error somewhere. Check the logs.

smfswxj posted on 2017-9-18 10:23:03
Quoting yuwenge (2017-9-18 10:07):
Then there must be an error somewhere. Check the logs.

[root@tiger wxj]# sqoop-import-all-tables --connect jdbc:mysql://elephant:3306/test  --username wxj --password wxj --as-textfile --warehouse-dir=/user/hive/warehouse
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/09/18 10:18:39 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.11.0
17/09/18 10:18:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/09/18 10:18:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/09/18 10:18:40 INFO tool.CodeGenTool: Beginning code generation
17/09/18 10:18:40 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
17/09/18 10:18:40 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
17/09/18 10:18:40 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/62107688fc04cf93137972a110739fbb/customers.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/09/18 10:18:44 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/62107688fc04cf93137972a110739fbb/customers.jar
17/09/18 10:18:44 WARN manager.MySQLManager: It looks like you are importing from mysql.
17/09/18 10:18:44 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
17/09/18 10:18:44 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
17/09/18 10:18:44 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
17/09/18 10:18:44 INFO mapreduce.ImportJobBase: Beginning import of customers
17/09/18 10:18:45 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/09/18 10:18:46 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
17/09/18 10:18:46 INFO client.RMProxy: Connecting to ResourceManager at monkey/192.168.1.241:8032
17/09/18 10:18:51 INFO db.DBInputFormat: Using read commited transaction isolation
17/09/18 10:18:51 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`cust_id`), MAX(`cust_id`) FROM `customers`
17/09/18 10:18:51 INFO db.IntegerSplitter: Split size: 50343; Num splits: 4 from: 1000000 to: 1201374
17/09/18 10:18:51 INFO mapreduce.JobSubmitter: number of splits:4
17/09/18 10:18:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1505701000334_0002
17/09/18 10:18:52 INFO impl.YarnClientImpl: Submitted application application_1505701000334_0002
17/09/18 10:18:52 INFO mapreduce.Job: The url to track the job: http://monkey:8088/proxy/application_1505701000334_0002/
17/09/18 10:18:52 INFO mapreduce.Job: Running job: job_1505701000334_0002
17/09/18 10:19:08 INFO mapreduce.Job: Job job_1505701000334_0002 running in uber mode : false
17/09/18 10:19:08 INFO mapreduce.Job:  map 0% reduce 0%
17/09/18 10:19:35 INFO mapreduce.Job:  map 25% reduce 0%
17/09/18 10:19:38 INFO mapreduce.Job:  map 50% reduce 0%
17/09/18 10:19:53 INFO mapreduce.Job:  map 75% reduce 0%
17/09/18 10:19:54 INFO mapreduce.Job:  map 100% reduce 0%
17/09/18 10:19:55 INFO mapreduce.Job: Job job_1505701000334_0002 completed successfully
17/09/18 10:19:55 INFO mapreduce.Job: Counters: 32
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=599144
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=481
                HDFS: Number of bytes written=12577346
                HDFS: Number of read operations=16
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=8
        Job Counters
                Launched map tasks=4
                Other local map tasks=4
                Total time spent by all maps in occupied slots (ms)=82934
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=82934
                Total vcore-milliseconds taken by all map tasks=82934
                Total megabyte-milliseconds taken by all map tasks=84924416
        Map-Reduce Framework
                Map input records=201375
                Map output records=201375
                Input split bytes=481
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=734
                CPU time spent (ms)=24030
                Physical memory (bytes) snapshot=770482176
                Virtual memory (bytes) snapshot=6157479936
                Total committed heap usage (bytes)=430833664
                Peak Map Physical memory (bytes)=244813824
                Peak Map Virtual memory (bytes)=1574080512
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=12577346
17/09/18 10:19:55 INFO mapreduce.ImportJobBase: Transferred 11.9947 MB in 69.2712 seconds (177.3114 KB/sec)
17/09/18 10:19:55 INFO mapreduce.ImportJobBase: Retrieved 201375 records.
17/09/18 10:19:55 INFO tool.CodeGenTool: Beginning code generation
17/09/18 10:19:55 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employees` AS t LIMIT 1
17/09/18 10:19:55 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/62107688fc04cf93137972a110739fbb/employees.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/09/18 10:19:57 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/62107688fc04cf93137972a110739fbb/employees.jar
17/09/18 10:19:57 INFO mapreduce.ImportJobBase: Beginning import of employees
17/09/18 10:19:57 INFO client.RMProxy: Connecting to ResourceManager at monkey/192.168.1.241:8032
17/09/18 10:19:59 INFO db.DBInputFormat: Using read commited transaction isolation
17/09/18 10:19:59 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`emp_id`), MAX(`emp_id`) FROM `employees`
17/09/18 10:19:59 WARN db.TextSplitter: Generating splits for a textual index column.
17/09/18 10:19:59 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
17/09/18 10:19:59 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.
17/09/18 10:20:00 INFO mapreduce.JobSubmitter: number of splits:6
17/09/18 10:20:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1505701000334_0003
17/09/18 10:20:00 INFO impl.YarnClientImpl: Submitted application application_1505701000334_0003
17/09/18 10:20:00 INFO mapreduce.Job: The url to track the job: http://monkey:8088/proxy/application_1505701000334_0003/
17/09/18 10:20:00 INFO mapreduce.Job: Running job: job_1505701000334_0003
17/09/18 10:20:16 INFO mapreduce.Job: Job job_1505701000334_0003 running in uber mode : false
17/09/18 10:20:16 INFO mapreduce.Job:  map 0% reduce 0%
17/09/18 10:20:30 INFO mapreduce.Job:  map 17% reduce 0%
17/09/18 10:20:31 INFO mapreduce.Job:  map 33% reduce 0%
17/09/18 10:20:40 INFO mapreduce.Job:  map 50% reduce 0%
17/09/18 10:20:45 INFO mapreduce.Job:  map 67% reduce 0%
17/09/18 10:20:50 INFO mapreduce.Job:  map 83% reduce 0%
17/09/18 10:20:55 INFO mapreduce.Job:  map 100% reduce 0%
17/09/18 10:20:55 INFO mapreduce.Job: Job job_1505701000334_0003 completed successfully
17/09/18 10:20:55 INFO mapreduce.Job: Counters: 32
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=898932
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=823
                HDFS: Number of bytes written=6706056
                HDFS: Number of read operations=24
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=12
        Job Counters
                Launched map tasks=6
                Other local map tasks=6
                Total time spent by all maps in occupied slots (ms)=66583
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=66583
                Total vcore-milliseconds taken by all map tasks=66583
                Total megabyte-milliseconds taken by all map tasks=68180992
        Map-Reduce Framework
                Map input records=61712
                Map output records=61712
                Input split bytes=823
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=701
                CPU time spent (ms)=17580
                Physical memory (bytes) snapshot=1135460352
                Virtual memory (bytes) snapshot=9213046784
                Total committed heap usage (bytes)=642056192
                Peak Map Physical memory (bytes)=243220480
                Peak Map Virtual memory (bytes)=1570648064
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=6706056
17/09/18 10:20:55 INFO mapreduce.ImportJobBase: Transferred 6.3954 MB in 58.5077 seconds (111.932 KB/sec)
17/09/18 10:20:55 INFO mapreduce.ImportJobBase: Retrieved 61712 records.
17/09/18 10:20:55 INFO tool.CodeGenTool: Beginning code generation
17/09/18 10:20:55 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `order_details` AS t LIMIT 1
17/09/18 10:20:55 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/62107688fc04cf93137972a110739fbb/order_details.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/09/18 10:20:56 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/62107688fc04cf93137972a110739fbb/order_details.jar
17/09/18 10:20:56 ERROR tool.ImportAllTablesTool: Error during import: No primary key could be found for table order_details. Please specify one with --split-by or perform a sequential import with '-m 1'.
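
The final line is the diagnosis: sqoop-import-all-tables works through the tables one by one and aborts the whole run at the first table it cannot split, so the run dies at order_details (which has no primary key) right after customers and employees, and products is never even attempted. One way out is to import the failing table on its own with an explicit split column. A minimal sketch only; order_id is an assumed column name for illustration, use any integral column that actually exists in order_details:

    # Hypothetical standalone import of the table that has no primary key;
    # --split-by tells Sqoop which column to use to partition work across mappers.
    sqoop import --connect jdbc:mysql://elephant:3306/test \
      --username wxj -P \
      --table order_details \
      --split-by order_id \
      --as-textfile --warehouse-dir=/user/hive/warehouse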



tntzbzc posted on 2017-9-18 19:42:01 (last edited by tntzbzc on 2017-9-18 20:07)
Quoting smfswxj (2017-9-18 10:23):
[root@tiger wxj]# sqoop-import-all-tables --connect jdbc:mysql://elephant:3306/test  --username wx ...

Run the statement below. With a single mapper (-m 1) Sqoop no longer needs a split column, so tables without a primary key import as well, just sequentially:

sqoop-import-all-tables --connect jdbc:mysql://secondmgt:3306/spice  --username hive --password hive --as-textfile --warehouse-dir /output/  -m 1
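
If parallel import is still wanted for the tables that do have primary keys, two variants may help. A sketch, assuming Sqoop 1.4.5 or newer (the log shows 1.4.6-cdh5.11.0), which is when --autoreset-to-one-mapper appeared:

    # Option 1: keep parallelism, and fall back to a single mapper only
    # for tables that have no primary key.
    sqoop-import-all-tables --connect jdbc:mysql://elephant:3306/test \
      --username wxj -P --as-textfile \
      --warehouse-dir=/user/hive/warehouse \
      --autoreset-to-one-mapper

    # Option 2: skip the offending table here and import it separately afterwards.
    sqoop-import-all-tables --connect jdbc:mysql://elephant:3306/test \
      --username wxj -P --as-textfile \
      --warehouse-dir=/user/hive/warehouse \
      --exclude-tables order_details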


