A Kubernetes node can run into all sorts of hardware, kernel, or runtime problems, and these may in turn disrupt the services running on it. Node Problem Detector (NPD) is the service that monitors for such problems. It runs as a DaemonSet on every Node and, when a problem occurs, updates the corresponding NodeCondition (such as KernelDeadlock, DockerHung, or BadDisk) or emits a Node Event (such as an OOM kill).

To view the current Node's events, run:

```sh
kubectl describe node <node-name>
```

These events are usually helpful for diagnosing problems on the Node.
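As a quick sketch, the NodeConditions and Node Events mentioned above can also be queried directly with `kubectl get`; the NPD-specific condition types only show up if NPD is actually deployed on the cluster:

```sh
# Conditions recorded on the node; NPD adds types such as KernelDeadlock
# next to the built-in Ready/MemoryPressure/DiskPressure conditions.
kubectl get node <node-name> \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'

# Events referencing the node, including those emitted by NPD (e.g. OOM kills).
kubectl get events \
  --field-selector involvedObject.kind=Node,involvedObject.name=<node-name>
```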
To dig deeper, log in to the Node itself over SSH:

```sh
ssh <user>@<node-ip>
```

If a temporary SSH server Pod was created from a manifest such as ssh.yaml in order to reach the Node, remember to delete it when you are done:

```sh
kubectl delete -f ssh.yaml
```
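On clusters with kubectl v1.20 or newer, a sketch of an alternative that avoids maintaining an SSH Pod altogether is the `kubectl debug` subcommand; the busybox image here is just an example:

```sh
# Start an interactive debugging Pod on the node; the node's root
# filesystem is mounted at /host inside the container.
kubectl debug node/<node-name> -it --image=busybox

# Delete the debugging Pod afterwards (its generated name is printed
# by the command above).
kubectl delete pod <debug-pod-name>
```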
Docker 1.13 and later set the default policy of the iptables FORWARD chain to DROP, which can break traffic between containers on different nodes. Enable IP forwarding for Docker containers with:

```sh
iptables -P FORWARD ACCEPT
```
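A short sketch for confirming the symptom and persisting the fix; note that `iptables -P` does not survive a reboot, and the save path below assumes an iptables-persistent style setup:

```sh
# "-P FORWARD DROP" in the first line of output confirms the problem.
iptables -S FORWARD | head -1

# Apply the fix and persist it across reboots (path is distro-dependent).
iptables -P FORWARD ACCEPT
iptables-save > /etc/iptables/rules.v4
```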
If the Node still looks unhealthy, run

```sh
kubectl describe node <node-name>
```

and check the Kubelet logs for error messages. The common problems and their fixes are the following:
The Kubelet fails to start with `Failed to start ContainerManager failed to initialise top level QOS containers` (see #43856). A temporary workaround is to add the `--exec-opt native.cgroupdriver=systemd` option to the docker.service configuration.
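One way to apply that docker.service change is a systemd drop-in, sketched below; the unit and binary paths are illustrative and vary by distribution:

```sh
# Override docker.service to switch the cgroup driver to systemd.
mkdir -p /etc/systemd/system/docker.service.d
cat <<'EOF' > /etc/systemd/system/docker.service.d/10-cgroup-driver.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --exec-opt native.cgroupdriver=systemd
EOF
systemctl daemon-reload
systemctl restart docker

# The Kubelet's cgroup driver must then match Docker's,
# e.g. --cgroup-driver=systemd.
```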
If the Kubelet is started with `--cgroups-per-qos=false`, the node's events will show a `Failed to update Node Allocatable Limits` warning every minute. These warnings concern the Node Allocatable feature, which the upstream documentation describes as follows:

> Kubernetes nodes can be scheduled to `Capacity`.
> Pods can consume all the available capacity on a node by default. This is an issue because nodes typically run quite a few system daemons that power the OS and Kubernetes itself. Unless resources are set aside for these system daemons, pods and system daemons compete for resources and lead to resource starvation issues on the node.
>
> The `kubelet` exposes a feature named `Node Allocatable` that helps to reserve compute resources for system daemons. Kubernetes recommends cluster administrators to configure `Node Allocatable`
> based on their workload density on each node.

```
       Node Capacity
---------------------------
|     kube-reserved       |
|-------------------------|
|     system-reserved     |
|-------------------------|
|    eviction-threshold   |
|-------------------------|
|                         |
|      allocatable        |
|   (available for pods)  |
|                         |
|                         |
---------------------------
```
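As a sketch of how these reservations are configured in practice, they map directly to Kubelet flags; the amounts below are examples, not recommendations:

```sh
# Reserve resources for Kubernetes daemons, OS daemons, and evictions.
kubelet \
  --enforce-node-allocatable=pods \
  --kube-reserved=cpu=200m,memory=512Mi \
  --system-reserved=cpu=200m,memory=512Mi \
  --eviction-hard=memory.available<500Mi

# Allocatable = Capacity - kube-reserved - system-reserved - eviction-threshold;
# the resulting value can be checked with:
kubectl get node <node-name> -o jsonpath='{.status.allocatable}'
```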
A different class of node problems involves kube-proxy: if it complains that the conntrack binary is missing, installing the `conntrack-tools` package and restarting kube-proxy is enough to fix it.
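A sketch of that fix; the package name and the restart method depend on the distribution and on how kube-proxy is deployed:

```sh
# Install the conntrack userspace tools.
yum install -y conntrack-tools    # RHEL/CentOS
# apt-get install -y conntrack    # Debian/Ubuntu

# If kube-proxy runs as a DaemonSet (the label below is the usual
# kubeadm convention), deleting its Pod restarts it.
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
```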
Another known problem appears when a node runs a large number of Pods that each reference a ConfigMap. As one user reported:

> This worked well on version 1.11 of Kubernetes. After upgrading to 1.12 or 1.13, I've noticed that doing this will cause the cluster to significantly slow down, up to the point where nodes are being marked as NotReady and no new work is being scheduled. For example, consider a scenario in which I schedule 400 jobs, each with its own ConfigMap, which print "Hello World" on a single-node cluster.
> On v1.11, it takes about 10 minutes for the cluster to process all jobs, and new jobs can still be scheduled. On v1.12 and v1.13, it takes about 60 minutes for the cluster to process all jobs, and after this no new jobs can be scheduled.

This is related to the maximum number of concurrent HTTP/2 streams and to a change in the Kubelet's ConfigMap manager. By default, the HTTP/2 server in kube-apiserver allows 250 concurrent streams per connection, and from version 1.13.x on, the Kubelet consumes at least one stream to watch each ConfigMap. When too many Pods with ConfigMaps are scheduled to a node, the Kubelet gets stuck communicating with kube-apiserver and the node eventually becomes NotReady. A workaround is to set the kube-apiserver's `http2-max-streams-per-connection` option to a larger value.
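A sketch of applying that workaround on a kubeadm-style control plane, where kube-apiserver runs as a static Pod; the manifest path and the chosen limit are illustrative:

```sh
# Add the flag to the API server arguments in the static Pod manifest;
# the kubelet restarts the Pod automatically once the file changes:
#
#     - --http2-max-streams-per-connection=1000
#
vi /etc/kubernetes/manifests/kube-apiserver.yaml

# Verify that the new limit is in effect.
kubectl -n kube-system get pod -l component=kube-apiserver -o yaml \
  | grep http2-max-streams-per-connection
```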