[Troubleshooting in Practice] Three K8s Errors Solved: Node Join Failure, API Server Connection Errors, Resource Fetch Errors
Problem 1: Node fails to join the cluster (leftover files / port in use)
While building a three-node k8s cluster (the control plane had already been initialized), the remaining nodes reported an error when joining the cluster.
1. Error snippet
```
[root@node01 docker]# kubeadm join 11.0.1.173:6443 --token dj2kc8.h3h0ep3jkmtry355 \
    --discovery-token-ca-cert-hash sha256:7f01426026c45d3dcd60e67f825356d4896e681de0a08f7846322eb01f4e9b76
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
	[ERROR Port-10250]: Port 10250 is in use
	[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
```
2. Root cause
The three errors all point in the same direction:

```
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
```

The node still carries leftover files (and, given port 10250 is busy, a still-running kubelet) from a previous k8s installation, so the preflight checks fail and the join is aborted.
3. Fix
(1) Clean up the leftovers
```
# 1. Reset kubeadm (if this node was initialized/joined before)
sudo kubeadm reset --force
systemctl status kubelet          # confirm no stale service is still running
# 2. Manually remove leftover files
sudo rm -rf /etc/kubernetes/*
sudo rm -rf /var/lib/kubelet/*
sudo rm -rf /var/lib/etcd/*
sudo rm -rf ~/.kube/
# 3. Clean up CNI network interfaces (Flannel/Calico)
sudo ip link delete cni0
sudo ip link delete flannel.1 2>/dev/null
# 4. Free the occupied port (10250)
sudo kill -9 $(sudo lsof -i :10250 -t)
```
(2) Rejoin the cluster
```
sudo kubeadm join 11.0.1.173:6443 \
    --token dj2kc8.h3h0ep3jkmtry355 \
    --discovery-token-ca-cert-hash sha256:7f01426026c45d3dcd60e67f825356d4896e681de0a08f7846322eb01f4e9b76 \
    --ignore-preflight-errors=Port-10250   # only if you have confirmed the port conflict is safe to ignore
```
(3) Notes
------- Token expiry (tokens are valid for 24h)
```
# Regenerate a valid token on the master node
kubeadm token create --print-join-command
```
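To see whether the original token has in fact expired before generating a new one, list the tokens on the master; the TTL column shows the remaining lifetime:

```
kubeadm token list
```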
------- Network plugin (its pods must be Running)
```
# Check the network plugin status
kubectl get pods -n kube-system
```
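To narrow the output down to the CNI pods only, you can filter by label; the label below assumes a Flannel deployment with its default labels (adjust it for Calico or another plugin, and note that recent Flannel releases install into the kube-flannel namespace instead):

```
kubectl get pods -n kube-system -l app=flannel -o wide
```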
------- Time sync, firewall, docker/containerd
```
# Clocks must be in sync on all nodes
sudo timedatectl set-ntp true
# Open the required firewall ports
sudo firewall-cmd --add-port={6443,10250,2379,2380}/tcp --permanent
sudo firewall-cmd --reload
# Restart docker/containerd
sudo systemctl restart docker containerd
```
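A quick way to verify that both settings took effect:

```
# NTP should report active and synchronized
timedatectl status
# The four ports should be listed
sudo firewall-cmd --list-ports
```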
If the join still fails after working through the steps above, check the kubelet logs with `journalctl -u kubelet` and continue troubleshooting from there.
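A useful starting invocation is to look at just the most recent kubelet output (standard journalctl flags):

```
journalctl -u kubelet --since "10 min ago" --no-pager | tail -n 50
```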
Problem 2: API Server connection failure (kubeconfig / DNS issues)
Running `kubectl get nodes` on a node reported the following error.
1. Error snippet
```
[root@node01 docker]# kubectl get noeds
E0527 15:38:23.832617  126941 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp: lookup localhost on 8.8.8.8:53: no such host
E0527 15:38:23.882251  126941 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp: lookup localhost on 8.8.8.8:53: no such host
E0527 15:38:23.934448  126941 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp: lookup localhost on 8.8.8.8:53: no such host
E0527 15:38:23.986633  126941 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp: lookup localhost on 8.8.8.8:53: no such host
E0527 15:38:24.053044  126941 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp: lookup localhost on 8.8.8.8:53: no such host
Unable to connect to the server: dial tcp: lookup localhost on 8.8.8.8:53: no such host
```
2. Root cause
The error shows that kubectl never reached the API Server: with no kubeconfig file or KUBECONFIG environment variable configured, it falls back to the default http://localhost:8080, and even the hostname localhost fails to resolve (the DNS lookup against 8.8.8.8 returns "no such host").
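For context, kubectl picks its configuration in a fixed order: the `--kubeconfig` flag, then the `KUBECONFIG` environment variable, then `~/.kube/config`; it only falls back to localhost:8080 when none of these yields a usable config. You can see what kubectl actually resolved with:

```
kubectl config view
```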
3. Fix
(1) Check whether a kubectl config file exists, either at $KUBECONFIG or at the default path (if the file is missing or its content is wrong, kubectl falls back to localhost:8080, so the failure above is expected).
```
[root@node01 ~]# echo $KUBECONFIG
/root/.kube/config
[root@node01 ~]# ls ~/.kube/config
ls: cannot access '/root/.kube/config': No such file or directory
```
The following commands repair this:
```
mkdir -p ~/.kube
cp -i /etc/kubernetes/admin.conf ~/.kube/config
chown $(id -u):$(id -g) ~/.kube/config
```
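Note that `/etc/kubernetes/admin.conf` normally exists only on control-plane nodes; on a worker node you would first copy it over from the master, for example (assuming the master is reachable as `master` over SSH):

```
scp master:/etc/kubernetes/admin.conf ~/.kube/config
```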
In my case I simply created the file by hand; once ~/.kube/config exists, `kubectl get nodes` shows the status of every node as expected.
(2) Given the `lookup localhost on 8.8.8.8:53: no such host` part of the error, check /etc/hosts and confirm it contains an entry for localhost.
```
# Check
cat /etc/hosts | grep localhost
# If the entry is missing, add it
echo "127.0.0.1 localhost" >> /etc/hosts
# Retry
kubectl get nodes
```
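To confirm that the resolver now answers for localhost, query it directly:

```
getent hosts localhost
```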
(3) Proxies
If you have the http_proxy/https_proxy variables configured, kubectl and DNS can also misbehave.
```
# Check for proxy settings
env | grep -i proxy
# Temporarily unset them
unset http_proxy
unset https_proxy
unset HTTP_PROXY
unset HTTPS_PROXY
# Retry
kubectl get nodes
```
Problem 3: Resource fetch failure (misconfiguration / version incompatibility)
1. Error snippet
```
[root@node02 ~]# kubectl get nodes
E0527 21:56:54.255182  614678 memcache.go:265] couldn't get current server API group list: the server could not find the requested resource
E0527 21:56:54.255898  614678 memcache.go:265] couldn't get current server API group list: the server could not find the requested resource
E0527 21:56:54.258469  614678 memcache.go:265] couldn't get current server API group list: the server could not find the requested resource
E0527 21:56:54.259326  614678 memcache.go:265] couldn't get current server API group list: the server could not find the requested resource
E0527 21:56:54.261615  614678 memcache.go:265] couldn't get current server API group list: the server could not find the requested resource
Error from server (NotFound): the server could not find the requested resource
```
2. Root cause
This time kubectl does reach an API endpoint, but the server cannot find the requested resource. Likely causes: the kubeconfig points at the wrong API Server address (we are on a worker node, not the master), or the kubectl version is incompatible with the API Server.
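To check for version skew (kubectl officially supports one minor version of difference in either direction from the API Server), compare the client and server versions:

```
kubectl version
```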
3. Fix
Copy the kubeconfig over and fix the server address.
```
# Run on the master node
scp ~/.kube/config node02:/root/.kube/config
```
Then edit the ~/.kube/config file on the corresponding node:
```
# Find this line
server: https://127.0.0.1:6443
# Change it to
server: https://<master-node-ip>:6443
# Verify
kubectl get nodes
```
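If you prefer a one-liner, a sed edit does the same thing; here 11.0.1.173 is the master address from problem 1, so substitute your own:

```
sed -i 's#server: https://127.0.0.1:6443#server: https://11.0.1.173:6443#' ~/.kube/config
```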
That covers the kubeconfig case; if the problem is version compatibility instead, upgrade or downgrade kubectl so that it is within one minor version of the API Server.
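As a sketch of the version fix on a yum-based system (the version number here is only an example, and the exact package version string depends on your repository; pick one within one minor version of your API Server):

```
sudo yum install -y kubectl-1.28.2 --disableexcludes=kubernetes
```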
Summary
These three cases cover the common pain points of deploying K8s nodes. The core approach: clean up leftovers → check the configuration → verify connectivity. Combine the logs (`journalctl -u kubelet`) with the cluster state (`kubectl get pods -A`) for a complete picture.
> Parts of this article were drafted with AI assistance and organized from my own hands-on practice; corrections and discussion are welcome.