I have a three-node Hadoop cluster.
For Hive I picked two of the nodes: one runs the metastore, configured as follows:
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>
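One detail about this setup, in case it matters: the ConnectionURL above uses embedded Derby, which only allows a single process to open the database at a time, so on this node the metastore has to run as a standalone Thrift service and everything else connects through it. Starting it looks roughly like this (a sketch; it assumes `hive` is on the PATH and port 9083 is free):

```shell
# Start the Hive metastore as a standalone Thrift service on port 9083,
# the port the clients point hive.metastore.uris at (thrift://s2.bw.com:9083).
hive --service metastore -p 9083
```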
The other node runs the hive CLI directly, configured as follows:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/home/yufan/hivetmp/tmp</value>
</property>
<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://s2.bw.com:9083</value>
</property>
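One side note on this client config (assuming a Hive release of 0.10 or later): `hive.metastore.local` is ignored in those versions, and remote mode is selected solely by `hive.metastore.uris` being non-empty. So the part of the config that actually makes this node a remote metastore client reduces to:

```xml
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://s2.bw.com:9083</value>
</property>
```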
Plain `select * from` statements work fine, but any query that has to run a MapReduce job, such as a `group by`, fails with the following:
2015-04-07 19:44:26,264 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-04-07 19:44:26,264 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2015-04-07 19:44:26,284 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2015-04-07 19:44:26,284 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1428415999428_0021, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@a39579)
2015-04-07 19:44:26,379 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2015-04-07 19:44:26,745 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /hadoop-learn/modules/hadoop-2.5.1/tmp/nm-local-dir/usercache/hadoop/appcache/application_1428415999428_0021
2015-04-07 19:44:26,866 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-04-07 19:44:26,868 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-04-07 19:44:27,068 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2015-04-07 19:44:27,739 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2015-04-07 19:44:28,023 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: Paths:/data/pv/pv.1.txt:1342177280+134217728,/data/pv/pv.1.txt:1476395008+134217728 InputFormatClass: hive.HiveInputFormatForPV
2015-04-07 19:44:28,068 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-04-07 19:44:28,070 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/hadoop/hive_2015-04-07_19-44-09_279_8480002767511807217/-mr-10001/93bdaf63-3673-48c3-82d5-24a68afbfce3 (No such file or directory)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:224)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:381)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:374)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:536)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:168)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.FileNotFoundException: /tmp/hadoop/hive_2015-04-07_19-44-09_279_8480002767511807217/-mr-10001/93bdaf63-3673-48c3-82d5-24a68afbfce3 (No such file or directory)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:138)
	at java.io.FileInputStream.<init>(FileInputStream.java:97)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:215)
	... 12 more
2015-04-07 19:44:28,074 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2015-04-07 19:44:28,078 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2015-04-07 19:44:28,078 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2015-04-07 19:44:28,078 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.
So I shut down the other NodeManagers and ran only the NM on the node where Hive itself runs, and the file-not-found error went away.
My suspicion is that Hive writes some intermediate artifact to the local filesystem during execution, so the other nodes can't find it, because it doesn't exist on their local disks.
But how do I configure this so that it goes to HDFS instead? I had it configured that way before and it worked; after some recent tinkering it broke again, and I'd like to understand the details of what's going on.
Can anyone who knows point me in the right direction? It's probably something simple, but I haven't been able to find a solution.
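For context on what "goes to HDFS" meant when it worked: the file the map task can't find (the `.../-mr-10001/...` entry) is the serialized query plan that Hive writes under its scratch directory, so if that directory resolves to a local path, only the submitting node has the file. What I previously had was roughly the following sketch, with the scratch dir as an explicit HDFS URI (the NameNode address `s1.bw.com:8020` here is a placeholder, not the real one from my cluster):

```xml
<!-- Sketch only: an explicit hdfs:// scratch dir, so every NodeManager
     can read the serialized plan. NameNode address is a placeholder. -->
<property>
  <name>hive.exec.scratchdir</name>
  <value>hdfs://s1.bw.com:8020/tmp/hive</value>
</property>
```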