设为首页 收藏本站
查看: 669|回复: 0

[经验分享] 第57课:Spark SQL on Hive配置及实战

[复制链接]

尚未签到

发表于 2018-10-22 06:25:20 | 显示全部楼层 |阅读模式
scala> hc.sql("show tables").collect.foreach(println)  
[sougou,false]
  
[t1,false]
  
scala> hc.sql("select count(*) from sougou").collect.foreach(println)
  
16/03/14 23:15:58 INFO parse.ParseDriver: Parsing command: select count(*) from sougou
  
16/03/14 23:16:00 INFO parse.ParseDriver: Parse Completed
  
16/03/14 23:16:01 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
  
16/03/14 23:16:02 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 474.9 KB, free 474.9 KB)
  
16/03/14 23:16:02 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 41.6 KB, free 516.4 KB)
  
16/03/14 23:16:02 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.199.100:41635 (size: 41.6 KB, free: 517.4 MB)
  
16/03/14 23:16:02 INFO spark.SparkContext: Created broadcast 0 from collect at :30
  
16/03/14 23:16:03 INFO mapred.FileInputFormat: Total input paths to process : 1
  
16/03/14 23:16:03 INFO spark.SparkContext: Starting job: collect at :30
  
16/03/14 23:16:03 INFO scheduler.DAGScheduler: Registering RDD 5 (collect at :30)
  
16/03/14 23:16:03 INFO scheduler.DAGScheduler: Got job 0 (collect at :30) with 1 output partitions
  
16/03/14 23:16:03 INFO scheduler.DAGScheduler: Final stage: ResultStage 1 (collect at :30)
  
16/03/14 23:16:03 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
  
16/03/14 23:16:04 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 0)
  
16/03/14 23:16:04 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[5] at collect at :30), which has no missing parents
  
16/03/14 23:16:04 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 13.8 KB, free 530.2 KB)
  
16/03/14 23:16:04 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.9 KB, free 537.1 KB)
  
16/03/14 23:16:04 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.199.100:41635 (size: 6.9 KB, free: 517.4 MB)
  
16/03/14 23:16:04 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
  
16/03/14 23:16:04 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[5] at collect at :30)
  
16/03/14 23:16:04 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
  
16/03/14 23:16:04 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, spark-worker2, partition 0,NODE_LOCAL, 2152 bytes)
  
16/03/14 23:16:04 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, spark-worker1, partition 1,NODE_LOCAL, 2152 bytes)
  
16/03/14 23:16:05 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on spark-worker2:55899 (size: 6.9 KB, free: 146.2 MB)
  
16/03/14 23:16:05 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on spark-worker1:38231 (size: 6.9 KB, free: 146.2 MB)
  
16/03/14 23:16:09 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on spark-worker1:38231 (size: 41.6 KB, free: 146.2 MB)
  
16/03/14 23:16:10 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on spark-worker2:55899 (size: 41.6 KB, free: 146.2 MB)
  
16/03/14 23:16:16 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 12015 ms on spark-worker1 (1/2)
  
16/03/14 23:16:16 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (collect at :30) finished in 12.351 s
  
16/03/14 23:16:16 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 12341 ms on spark-worker2 (2/2)
  
16/03/14 23:16:16 INFO scheduler.DAGScheduler: looking for newly runnable stages
  
16/03/14 23:16:16 INFO scheduler.DAGScheduler: running: Set()
  
16/03/14 23:16:16 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 1)
  
16/03/14 23:16:16 INFO scheduler.DAGScheduler: failed: Set()
  
16/03/14 23:16:16 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[8] at collect at :30), which has no missing parents
  
16/03/14 23:16:16 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
  
16/03/14 23:16:16 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 12.9 KB, free 550.1 KB)
  
16/03/14 23:16:16 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 6.4 KB, free 556.5 KB)
  
16/03/14 23:16:16 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.199.100:41635 (size: 6.4 KB, free: 517.4 MB)
  
16/03/14 23:16:16 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
  
16/03/14 23:16:16 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[8] at collect at :30)
  
16/03/14 23:16:16 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
  
16/03/14 23:16:16 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, spark-worker1, partition 0,NODE_LOCAL, 1999 bytes)
  
16/03/14 23:16:16 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on spark-worker1:38231 (size: 6.4 KB, free: 146.1 MB)
  
16/03/14 23:16:17 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to spark-worker1:43568
  
16/03/14 23:16:17 INFO spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 158 bytes
  
16/03/14 23:16:18 INFO scheduler.DAGScheduler: ResultStage 1 (collect at :30) finished in 1.288 s
  
16/03/14 23:16:18 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 1279 ms on spark-worker1 (1/1)
  
16/03/14 23:16:18 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
  
16/03/14 23:16:18 INFO scheduler.DAGScheduler: Job 0 finished: collect at :30, took 14.285673 s
  
[1000000]



运维网声明 1、欢迎大家加入本站运维交流群:群②:261659950 群⑤:202807635 群⑦870801961 群⑧679858003
2、本站所有主题由该帖子作者发表,该帖子作者与运维网享有帖子相关版权
3、所有作品的著作权均归原作者享有,请您和我们一样尊重他人的著作权等合法权益。如果您对作品感到满意,请购买正版
4、禁止制作、复制、发布和传播具有反动、淫秽、色情、暴力、凶杀等内容的信息,一经发现立即删除。若您因此触犯法律,一切后果自负,我们对此不承担任何责任
5、所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其内容的准确性、可靠性、正当性、安全性、合法性等负责,亦不承担任何法律责任
6、所有作品仅供您个人学习、研究或欣赏,不得用于商业或者其他用途,否则,一切后果均由您自己承担,我们对此不承担任何法律责任
7、如涉及侵犯版权等问题,请您及时通知我们,我们将立即采取措施予以解决
8、联系人Email:admin@iyunv.com 网址:www.yunweiku.com

所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其承担任何法律责任,如涉及侵犯版权等问题,请您及时通知我们,我们将立即处理,联系人Email:kefu@iyunv.com,QQ:1061981298 本贴地址:https://www.yunweiku.com/thread-624638-1-1.html 上篇帖子: 第56课:Spark SQL和DataFrame的本质 下篇帖子: 安装SQL2005提示“SQL Server 2005 COM+ 目录要求”警告 解决方法
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

扫码加入运维网微信交流群X

扫码加入运维网微信交流群

扫描二维码加入运维网微信交流群,最新一手资源尽在官方微信交流群!快快加入我们吧...

扫描微信二维码查看详情

客服E-mail:kefu@iyunv.com 客服QQ:1061981298


QQ群⑦:运维网交流群⑦ QQ群⑧:运维网交流群⑧ k8s群:运维网kubernetes交流群


提醒:禁止发布任何违反国家法律、法规的言论与图片等内容;本站内容均来自个人观点与网络等信息,非本站认同之观点.


本站大部分资源是网友从网上搜集分享而来,其版权均归原作者及其网站所有,我们尊重他人的合法权益,如有内容侵犯您的合法权益,请及时与我们联系进行核实删除!



合作伙伴: 青云cloud

快速回复 返回顶部 返回列表