Spark on k8s
Starting with Spark 2.3, the distribution ships with Docker image build scripts.
This article mainly covers building the 2.x image and deploying Spark on k8s. I haven't gotten 3.x working yet; the problem I hit with 3.0 is that after the pod starts on k8s, it reports a permission error when creating the logs directory.
1. Download a 2.x release from the Spark archive (https://archive.apache.org/dist/spark/); I downloaded spark-2.4.8-bin-hadoop2.6.
2. Extract spark-2.4.8-bin-hadoop2.6.
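Steps 1 and 2 can be scripted roughly like this (the version and archive path are the ones used above; adjust as needed):

```shell
#!/bin/sh
# Download and unpack a Spark 2.x release from the Apache archive.
SPARK_VERSION=2.4.8
HADOOP_VERSION=2.6
DIST="spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}"

wget "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${DIST}.tgz"
tar -xzf "${DIST}.tgz"
cd "${DIST}"
```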
3. Edit the sbin/spark-daemon.sh script and remove the "--" (after nohup); otherwise the k8s startup later fails with the following error:
failed to launch: nice -n 0 /opt/spark/bin/spark-class org.apache.spark.deploy.master.Master --host spark-manager-84c8795878-4n8x9 --port 7077 --webui-port 8080
nohup: can't execute '--': No such file or directory
full log in /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-spark-manager-84c8795878-4n8x9.out
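The error means the "--" separator reached nohup itself, which the busybox nohup in the Alpine-based image does not understand. Assuming the script contains a `nohup -- ` invocation (which the error message suggests), a sed one-liner can strip it instead of editing by hand; the `.bak` suffix keeps a backup copy:

```shell
#!/bin/sh
# Remove the "--" after nohup in spark-daemon.sh (in-place, keeping a .bak copy).
# busybox nohup rejects the separator with: nohup: can't execute '--'
sed -i.bak 's/nohup -- /nohup /' sbin/spark-daemon.sh
```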
4. Run the image build command:
./bin/docker-image-tool.sh -t <docker tag> build
5. Run docker images | grep spark
Two images will show up; I used the one without -r.
6. Next, create the Spark resources from a YAML file. I'll just paste the file here; there's nothing special about it, the only thing to note is to change the image to your own.

---
apiVersion: v1
kind: Service
metadata:
  name: spark-manager
spec:
  type: ClusterIP
  ports:
  - name: rpc
    port: 7077
  - name: ui
    port: 8080
  selector:
    app: spark
    component: sparkmanager
---
apiVersion: v1
kind: Service
metadata:
  name: spark-manager-rest
spec:
  type: NodePort
  ports:
  - name: rest
    port: 8080
    nodePort: 30221
    targetPort: 8080
  selector:
    app: spark
    component: sparkmanager
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-manager
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark
      component: sparkmanager
  template:
    metadata:
      labels:
        app: spark
        component: sparkmanager
    spec:
      containers:
      - name: sparkmanager
        image: spark:my_spark_2.4_hadoop_2.7
        workingDir: /opt/spark
        command: ["/bin/bash", "-c", "/opt/spark/sbin/start-master.sh && while true;do echo hello;sleep 6000;done"]
        ports:
        - containerPort: 7077
          name: rpc
        - containerPort: 8080
          name: ui
        livenessProbe:
          tcpSocket:
            port: 7077
          initialDelaySeconds: 30
          periodSeconds: 60
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spark
      component: worker
  template:
    metadata:
      labels:
        app: spark
        component: worker
    spec:
      containers:
      - name: sparkworker
        image: spark:my_spark_2.4_hadoop_2.7
        workingDir: /opt/spark
        command: ["/bin/bash", "-c", "/opt/spark/sbin/start-slave.sh spark://spark-manager:7077 && while true;do echo hello;sleep 6000;done"]
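Assuming the manifest above is saved as spark-standalone.yaml (the filename is my choice), it can be applied and checked like this:

```shell
#!/bin/sh
# Apply the manifest and verify the pods come up.
MANIFEST=spark-standalone.yaml   # hypothetical filename for the YAML above
kubectl apply -f "$MANIFEST"
kubectl get pods -l app=spark      # expect one spark-manager pod and two worker pods
kubectl logs deploy/spark-manager  # master log should eventually report state ALIVE
```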
7. The NodePort service exposes the Spark UI so it can be accessed in a browser.
8. Exec into the master pod and run a spark-submit test job (Pi calculation):
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local examples/jars/spark-examples_2.11-2.4.8.jar
The job completes successfully.
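Note that --master local runs the Pi example inside the master pod's own JVM, so it proves the image works but does not exercise the workers. To run the job through the standalone cluster just deployed, point spark-submit at the master's Service instead (the service name and port below match the YAML above; the trailing 100 is SparkPi's optional partition-count argument):

```shell
#!/bin/sh
# Submit SparkPi to the standalone master via its k8s Service name.
MASTER_URL="spark://spark-manager:7077"
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master "$MASTER_URL" \
  examples/jars/spark-examples_2.11-2.4.8.jar 100
```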
9. That completes Spark on k8s. If you have questions, leave a comment; I'm still exploring this myself and drew on a lot of material found online.