A question about spark-sql

My code is as follows:
[mw_shl_code=scala,true] val sparkConf = new SparkConf().setAppName("FemaleInfo")
val sc = new SparkContext(sparkConf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

// Convert RDD to DataFrame through the implicit conversion, then register table.
sc.textFile(args(0)).map(_.split(","))
   .map(p => FemaleInfo(p(0), p(1), p(2).trim.toInt))
   .toDF.registerTempTable("FemaleInfoTable")

// Use a SQL statement to select the online time of female users and sum it per name.
val femaleTimeInfo = sqlContext.sql("select name,sum(stayTime) as " +
   "stayTime from FemaleInfoTable where gender = 'female' group by name")[/mw_shl_code]

How can I view the table FemaleInfoTable? I looked for it from the spark-sql command line and could not find it.

nextuser · Posted on 2017-2-25 10:05:34

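This is a "temporary table". It exists only in the SQLContext that registered it and is gone once the application exits; it is never written to the Hive metastore, so a separate spark-sql session cannot see it. That is why you can't find it.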
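To look at the temp table, one option is to query it from inside the same application, through the same SQLContext that registered it. The following is a minimal sketch under the assumptions of the original post (Spark 1.x, a comma-separated input file passed as args(0)); the FemaleInfo field types, the object name InspectTempTable, and the parseLine helper are assumptions made for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Same shape as the case class used in the original post (field types are assumed).
case class FemaleInfo(name: String, gender: String, stayTime: Int)

object InspectTempTable {
  // Hypothetical helper: turn one "name,gender,stayTime" line into a FemaleInfo.
  def parseLine(line: String): FemaleInfo = {
    val p = line.split(",")
    FemaleInfo(p(0), p(1), p(2).trim.toInt)
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("InspectTempTable"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    sc.textFile(args(0)).map(parseLine).toDF.registerTempTable("FemaleInfoTable")

    // The temp table is listed only in this SQLContext's catalog.
    sqlContext.tableNames().foreach(println)

    // Print the schema and the first rows of the temp table.
    val table = sqlContext.table("FemaleInfoTable")
    table.printSchema()
    table.show(20)

    sc.stop()
  }
}
```

Anything that has to be visible from another session needs to be written out, either as files or as a persistent table.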

Wyy_Ck · Posted on 2017-2-25 10:17:13

nextuser posted on 2017-2-25 10:05
This is a "temporary table"; it is dropped once you are done with it, so the OP won't find it.

Thanks for the answer.

What if I want to save the data to HDFS? Is there a way to do that?

Here is my code:
[mw_shl_code=scala,true] val userPrincipal = "miner"
val userKeytabPath = "/opt/user.keytab"
val krb5ConfPath = "/opt/krb5.conf"
System.setProperty("hadoop.home.dir", "/opt/FI-Client/HDFS/hadoop/")
val hadoopConf: Configuration = new Configuration()
LoginUtil.login(userPrincipal, userKeytabPath, krb5ConfPath, hadoopConf)

// Configure Spark application name
val sparkConf = new SparkConf().setAppName("FemaleInfo")
val sc = new SparkContext(sparkConf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

// Convert RDD to DataFrame through the implicit conversion, then register table.
sc.textFile(args(0)).map(_.split(","))
   .map(p => FemaleInfo(p(0), p(1), p(2).trim.toInt))
   .toDF.registerTempTable("FemaleInfoTable")

// Use a SQL statement to select the online time of female users and sum it per name.
val femaleTimeInfo = sqlContext.sql("select name,sum(stayTime) as " +
   "stayTime from FemaleInfoTable where gender = 'female' group by name")

// Print the female users who spent at least 2 hours (120 minutes) online.
// foreach returns Unit, so there is nothing useful to assign to a val here.
femaleTimeInfo.filter("stayTime >= 120").collect().foreach(println)
sc.stop()  }[/mw_shl_code]

nextuser · Posted on 2017-2-25 10:24:01

Wyy_Ck posted on 2017-2-25 10:17
Thanks for the answer.

What if I want to save the data to HDFS? Is there a way to do that?

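You can call save on the DataFrame afterwards. Note that registerTempTable returns Unit, so save cannot be chained onto it; keep a reference to the DataFrame first:
val femaleInfo = sc.textFile(args(0)).map(_.split(","))
  .map(p => FemaleInfo(p(0), p(1), p(2).trim.toInt))
  .toDF
femaleInfo.registerTempTable("FemaleInfoTable")
// Spark 1.4+ writer API; on Spark 1.3 the equivalent call is
// femaleInfo.save("hdfs://path/to/data.parquet", "parquet", SaveMode.Append)
femaleInfo.write.format("parquet").mode("append").save("hdfs://path/to/data.parquet")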
Further reading:
Spark SQL 1.3.0 DataFrame: introduction, usage, and some complete data-writing examples
http://www.aboutyun.com/forum.php?mod=viewthread&tid=12358
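If the goal is to keep the query result rather than the raw table, the same idea applies to the aggregated DataFrame. The sketch below assumes Spark 1.4 or later (for the write and read APIs), the FemaleInfo field types used earlier in the thread, and an output directory passed as args(1); the object name SaveResultToHdfs and the longStays helper are made up for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Same shape as the case class used in the original post (field types are assumed).
case class FemaleInfo(name: String, gender: String, stayTime: Int)

object SaveResultToHdfs {
  // Total stay time per female user, keeping users with at least `minMinutes`.
  // Expects FemaleInfoTable to be registered in the given SQLContext.
  def longStays(sqlContext: SQLContext, minMinutes: Int): DataFrame =
    sqlContext.sql(
      "select name, sum(stayTime) as stayTime from FemaleInfoTable " +
        "where gender = 'female' group by name")
      .filter(s"stayTime >= $minMinutes")

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SaveResultToHdfs"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    sc.textFile(args(0)).map(_.split(","))
      .map(p => FemaleInfo(p(0), p(1), p(2).trim.toInt))
      .toDF.registerTempTable("FemaleInfoTable")

    // args(1) is a hypothetical output directory, for example an hdfs:// URI.
    val outputPath = args(1)
    longStays(sqlContext, 120).write.format("parquet").mode("overwrite").save(outputPath)

    // Read the files back to confirm what was written.
    sqlContext.read.parquet(outputPath).show()

    sc.stop()
  }
}
```

The mode string decides what happens when the target already exists: "overwrite" replaces it, "append" adds to it, and the default raises an error.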

nextuser · Posted on 2017-2-25 10:32:06

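DataFrame.save
Writes the DataFrame to the specified external data source.
DataFrame.saveAsTable
Saves the DataFrame as a SQL table: the metadata goes into the Hive metastore and the data is written to the specified location.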
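For the original question (seeing the table from spark-sql), saveAsTable is probably the closer fit. Here is a rough sketch, assuming Spark 1.4 or later built with Hive support, a Hive metastore that both the application and spark-sql are configured to use, and the FemaleInfo field types used earlier in the thread. The table name female_info and the object name SaveAsHiveTable are made up for this example. As far as I know, on Spark 1.x a plain SQLContext refuses to create persistent tables, which is why a HiveContext is used here.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Same shape as the case class used in the original post (field types are assumed).
case class FemaleInfo(name: String, gender: String, stayTime: Int)

object SaveAsHiveTable {
  // Hypothetical table name, chosen for this example only.
  val TableName = "female_info"

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SaveAsHiveTable"))
    // HiveContext (from the spark-hive module) talks to the Hive metastore.
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._

    val femaleInfo = sc.textFile(args(0)).map(_.split(","))
      .map(p => FemaleInfo(p(0), p(1), p(2).trim.toInt))
      .toDF

    // Persist both data and metadata; unlike a temp table, this outlives the application.
    femaleInfo.write.mode("overwrite").saveAsTable(TableName)

    sc.stop()
  }
}
```

After the job finishes, running select * from female_info limit 10; in spark-sql should return the rows, provided it points at the same metastore.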
