506629361 posted on 2018-11-26 12:34:47

Hands-on: Running Apache Spark 2.3 on Kubernetes


[*]Download the source code and extract it
Download link

tar -zxvf v2.3.2.tar.gz
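
  The download link itself is not preserved in the post; assuming the tarball comes from the GitHub source archive for the v2.3.2 tag, the fetch would look like:

# assumed source URL -- substitute the Apache mirror if that is where the tarball actually came from
wget https://github.com/apache/spark/archive/v2.3.2.tar.gz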

[*]Build

cd spark-2.3.2
build/mvn install -DskipTests
build/mvn compile -Pkubernetes -pl resource-managers/kubernetes/core -am -DskipTests
build/mvn install -Pkubernetes -pl resource-managers/kubernetes/core -am -DskipTests
# ls -la assembly/target/scala-2.11/jars/ | grep spark-kub
-rw-r--r-- 1 root root   381120 Sep 26 09:56 spark-kubernetes_2.11-2.3.2.jar
dev/make-distribution.sh --tgz -Phadoop-2.7 -Pkubernetes
  Build a tarball with R and Hive support:

./dev/make-distribution.sh --name inspur-spark --pip --r --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pkubernetes
  This fails with:

++ echo 'Cannot find '\''R_HOME'\''. Please specify '\''R_HOME'\'' or make sure R is properly installed.'
Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed.
  Since this exercise only tests Spark running on Kubernetes, the issue can be left unresolved for now.
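
  If R support were needed later, the usual remedy is to install R and point R_HOME at it before rerunning make-distribution.sh; the package name and path below are assumptions for a yum-based host:

# assumed fix for the R_HOME error -- package name and R_HOME location vary by distribution
yum install -y R
export R_HOME=/usr/lib64/R
./dev/make-distribution.sh --name inspur-spark --pip --r --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pkubernetes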


[*]Build the Docker image
./bin/docker-image-tool.sh -r bigdata.registry.com:5000 -t 2.3.2 build
./bin/docker-image-tool.sh -r bigdata.registry.com:5000 -t 2.3.2 push
  Since a project named insight has been created in the local private Harbor registry, push the image with the following commands:

docker tag bigdata.registry.com:5000/spark:2.3.2 bigdata.registry.com:5000/insight/spark:2.3.2
docker push bigdata.registry.com:5000/insight/spark:2.3.2
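  If the Harbor registry enforces authentication, a docker login against it is needed before the push; the credentials are not part of the original post:

# only required when Harbor does not allow anonymous pushes
docker login bigdata.registry.com:5000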
[*]Upload the examples jar to the httpd server
# ll dist/examples/jars/spark-examples_2.11-2.3.2.jar
-rw-r--r-- 1 root root 1997551 Sep 26 09:56 dist/examples/jars/spark-examples_2.11-2.3.2.jar
# cp dist/examples/jars/spark-examples_2.11-2.3.2.jar /opt/mnt/www/html/spark/
# ll /opt/mnt/www/html/spark/
-rw-r--r-- 1 root root 1997551 Sep 26 10:26 spark-examples_2.11-2.3.2.jar
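  A quick sanity check that the jar is now reachable over HTTP (this assumes httpd serves /opt/mnt/www/html as its document root, which matches the URL used in the submit step below):

curl -I http://10.221.129.22/spark/spark-examples_2.11-2.3.2.jar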
[*]Prepare the Kubernetes environment, i.e. grant the required permissions
kubectl create serviceaccount spark -n spark
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=spark:spark --namespace=spark
  In --serviceaccount=spark:spark, the first spark is the namespace and the second is the service account name.
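  The spark namespace must already exist for the commands above to succeed; creating it and verifying the resulting permissions could look like this (the auth check is an extra step, not from the original post):

kubectl create namespace spark
kubectl auth can-i create pods -n spark --as=system:serviceaccount:spark:spark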

[*]Test
bin/spark-submit \
--master k8s://http://10.221.129.20:8080 \
--deploy-mode cluster \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.container.image=bigdata.registry.com:5000/insight/spark:2.3.2 \
--conf spark.kubernetes.namespace=spark \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
http://10.221.129.22/spark/spark-examples_2.11-2.3.2.jar
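
While the job runs, the driver and executor pods can be watched from a second terminal (not shown in the original post):

# the driver in the log below actually lands in the default namespace, so check both
kubectl get pods -n spark -w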

Run log:

2018-09-26 10:27:54 WARN Utils:66 - Kubernetes master URL uses HTTP instead of HTTPS.
2018-09-26 10:28:25 WARN Config:347 - Error reading service account token from: . Ignoring.
2018-09-26 10:28:27 INFO LoggingPodStatusWatcherImpl:54 - State changed, new state:
pod name: spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver
namespace: default
labels: spark-app-selector -> spark-74d52904a3794e8986895a12322c5cd9, spark-role -> driver
pod uid: d9bce33c-c133-11e8-b988-fa163e609d06
creation time: 2018-09-26T02:28:27Z
service account name: default
volumes: spark-init-properties, download-jars-volume, download-files-volume, default-token-7mnhw
node name: N/A
start time: N/A
container images: N/A
phase: Pending
status: []
2018-09-26 10:28:27 INFO LoggingPodStatusWatcherImpl:54 - State changed, new state:
pod name: spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver
namespace: default
labels: spark-app-selector -> spark-74d52904a3794e8986895a12322c5cd9, spark-role -> driver
pod uid: d9bce33c-c133-11e8-b988-fa163e609d06
creation time: 2018-09-26T02:28:27Z
service account name: default
volumes: spark-init-properties, download-jars-volume, download-files-volume, default-token-7mnhw
node name: master2
start time: N/A
container images: N/A
phase: Pending
status: []
2018-09-26 10:28:27 INFO LoggingPodStatusWatcherImpl:54 - State changed, new state:
pod name: spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver
namespace: default
labels: spark-app-selector -> spark-74d52904a3794e8986895a12322c5cd9, spark-role -> driver
pod uid: d9bce33c-c133-11e8-b988-fa163e609d06
creation time: 2018-09-26T02:28:27Z
service account name: default
volumes: spark-init-properties, download-jars-volume, download-files-volume, default-token-7mnhw
node name: master2
start time: 2018-09-26T02:28:27Z
container images: bigdata.registry.com:5000/insight/spark:2.3.2
phase: Pending
status:
2018-09-26 10:28:28 INFO Client:54 - Waiting for application spark-pi to finish...
2018-09-26 10:28:51 INFO LoggingPodStatusWatcherImpl:54 - State changed, new state:
pod name: spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver
namespace: default
labels: spark-app-selector -> spark-74d52904a3794e8986895a12322c5cd9, spark-role -> driver
pod uid: d9bce33c-c133-11e8-b988-fa163e609d06
creation time: 2018-09-26T02:28:27Z
service account name: default
volumes: spark-init-properties, download-jars-volume, download-files-volume, default-token-7mnhw
node name: master2
start time: 2018-09-26T02:28:27Z
container images: bigdata.registry.com:5000/insight/spark:2.3.2
phase: Pending
status:
2018-09-26 10:28:56 INFO LoggingPodStatusWatcherImpl:54 - State changed, new state:
pod name: spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver
namespace: default
labels: spark-app-selector -> spark-74d52904a3794e8986895a12322c5cd9, spark-role -> driver
pod uid: d9bce33c-c133-11e8-b988-fa163e609d06
creation time: 2018-09-26T02:28:27Z
service account name: default
volumes: spark-init-properties, download-jars-volume, download-files-volume, default-token-7mnhw
node name: master2
start time: 2018-09-26T02:28:27Z
container images: bigdata.registry.com:5000/insight/spark:2.3.2
phase: Pending
status:
2018-09-26 10:28:57 INFO LoggingPodStatusWatcherImpl:54 - State changed, new state:
pod name: spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver
namespace: default
labels: spark-app-selector -> spark-74d52904a3794e8986895a12322c5cd9, spark-role -> driver
pod uid: d9bce33c-c133-11e8-b988-fa163e609d06
creation time: 2018-09-26T02:28:27Z
service account name: default
volumes: spark-init-properties, download-jars-volume, download-files-volume, default-token-7mnhw
node name: master2
start time: 2018-09-26T02:28:27Z
container images: bigdata.registry.com:5000/insight/spark:2.3.2
phase: Running
status:
2018-09-26 10:29:05 INFO LoggingPodStatusWatcherImpl:54 - State changed, new state:
pod name: spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver
namespace: default
labels: spark-app-selector -> spark-74d52904a3794e8986895a12322c5cd9, spark-role -> driver
pod uid: d9bce33c-c133-11e8-b988-fa163e609d06
creation time: 2018-09-26T02:28:27Z
service account name: default
volumes: spark-init-properties, download-jars-volume, download-files-volume, default-token-7mnhw
node name: master2
start time: 2018-09-26T02:28:27Z
container images: bigdata.registry.com:5000/insight/spark:2.3.2
phase: Failed
status:
2018-09-26 10:29:05 INFO LoggingPodStatusWatcherImpl:54 - Container final statuses:
Container name: spark-kubernetes-driver
Container image: bigdata.registry.com:5000/insight/spark:2.3.2
Container state: Terminated
Exit code: 1
2018-09-26 10:29:05 INFO Client:54 - Application spark-pi finished.
2018-09-26 10:29:05 INFO ShutdownHookManager:54 - Shutdown hook called
2018-09-26 10:29:05 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-53c85221-619e-41c6-8b94-80b950852b7e
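
Note that although spark.kubernetes.namespace=spark was set, the log above shows the driver pod in the default namespace with the default service account, and it ends in phase Failed with exit code 1. The usual next step is to read the driver pod's own log and events (pod name taken from the output above):

kubectl logs spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver -n default
kubectl describe pod spark-pi-7b0ffe8a4023370a872acdd679f024b1-driver -n default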


  Submitting from code:

val args = Array( // master candidates: 10.110.25.114, 10.221.129.20
  "--master", "k8s://http://10.221.129.20:8080",
  "--deploy-mode", "cluster",
  "--name", "spark-pi",
  "--class", "org.apache.spark.examples.SparkPi",
  "--conf", "spark.kubernetes.container.image=bigdata.registry.com:5000/insight/spark:2.3.2",
  "--conf", "spark.kubernetes.container.image.pullPolicy=Always",
  "--conf", "spark.kubernetes.namespace=spark",
  "--conf", "spark.executor.instances=1",
  "--conf", "spark.kubernetes.authenticate.driver.serviceAccountName=spark",
  "http://10.221.129.22/spark/spark-examples_2.11-2.3.2.jar",
  "1000"
)
for (arg <- args) print(arg + " ") // echo the arguments; the post is truncated here, presumably the array is then handed to spark-submit (e.g. org.apache.spark.deploy.SparkSubmit.main(args))