Timestamps from Hive, Spark, etc. are 8 hours ahead of Impala's

desehawk posted on 2017-11-24 20:16:17

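Many people run into this problem.
For the same data, a query through Impala returns timestamps 8 hours earlier, while a query through Spark or Hive, for example, returns timestamps 8 hours later.
Here is a quoted explanation: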
Based on this discussion it seems that when support for saving timestamps in Parquet was added to Hive, the primary goal was to be compatible with Impala's implementation, which probably predates the addition of the timestamp_millis type to the Parquet specification.

Impala's timestamp representation maps to the int96 Parquet type (4 bytes for the date, 8 bytes for the time, details in the linked discussion).

So no, storing a Hive timestamp in Parquet does not use the timestamp_millis type, but Impala's int96 timestamp representation instead.
In other words, the discrepancy is probably caused by the different timestamp formats they use.
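To make the int96 layout mentioned in the quote concrete, here is a minimal Python sketch that packs and unpacks such a value. It assumes the commonly described layout: 8 bytes of nanoseconds within the day followed by 4 bytes holding the Julian day number, both little-endian. The helper names `encode_int96` and `decode_int96` are hypothetical and are not part of Hive, Impala, or Parquet. Note that the 12 bytes carry no time zone at all, so the writer and the reader each have to decide how to interpret the wall-clock value.

```python
import struct
from datetime import datetime, timedelta

# Julian day number of 1970-01-01, used to map Julian days to calendar dates.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
EPOCH = datetime(1970, 1, 1)


def encode_int96(ts: datetime) -> bytes:
    """Pack a naive datetime into 12 bytes (hypothetical helper).

    Assumed layout: 8-byte nanoseconds-of-day, then 4-byte Julian day,
    both little-endian.
    """
    delta = ts - EPOCH
    julian_day = JULIAN_DAY_OF_UNIX_EPOCH + delta.days
    nanos_of_day = (delta.seconds * 1_000_000 + delta.microseconds) * 1000
    return struct.pack("<qi", nanos_of_day, julian_day)


def decode_int96(raw: bytes) -> datetime:
    """Unpack 12 bytes back into a naive datetime (microsecond precision)."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return EPOCH + timedelta(
        days=julian_day - JULIAN_DAY_OF_UNIX_EPOCH,
        microseconds=nanos_of_day // 1000,
    )


sample = datetime(2017, 11, 24, 20, 16, 17)
raw = encode_int96(sample)
roundtrip = decode_int96(raw)
```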
Once we know this behavior exists, converting the value with a function is enough to work around it.
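The post does not say which function to use, so the following Python sketch only illustrates the arithmetic of such a conversion. It rests on two assumptions that are not stated in the post: that the cluster's local time zone is UTC+8, and that the writer normalizes the wall-clock time to UTC before storing it while the reader returns the stored value unchanged. Under those assumptions the raw reading is 8 hours behind, and shifting it from UTC back to local time restores the original value. The names `stored_utc_value` and `utc_to_local` are hypothetical; in SQL, the built-in `from_utc_timestamp`, which exists in both Hive and Impala, plays a similar role.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the cluster's local time zone is UTC+8.
LOCAL_TZ = timezone(timedelta(hours=8))


def stored_utc_value(local_wall_clock: datetime) -> datetime:
    """What a writer that normalizes to UTC would store (hypothetical helper)."""
    aware = local_wall_clock.replace(tzinfo=LOCAL_TZ)
    return aware.astimezone(timezone.utc).replace(tzinfo=None)


def utc_to_local(utc_wall_clock: datetime) -> datetime:
    """Shift a raw UTC reading back to local wall-clock time."""
    aware = utc_wall_clock.replace(tzinfo=timezone.utc)
    return aware.astimezone(LOCAL_TZ).replace(tzinfo=None)


written = datetime(2017, 11, 24, 20, 16, 17)  # local time that was inserted
raw_reading = stored_utc_value(written)       # what an unadjusted reader shows
adjusted = utc_to_local(raw_reading)          # reading after the conversion

print(raw_reading)  # 2017-11-24 12:16:17
print(adjusted)     # 2017-11-24 20:16:17
```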

Reference:
http://blog.csdn.net/bsf5521/article/details/72682996
