Overview

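Troubleshooting Kubernetes clusters and applications primarily involves:

  • Troubleshooting abnormal cluster status
  • Troubleshooting abnormal Pod behavior
  • Troubleshooting network problems
  • Troubleshooting persistent storage problems
    • AzureDisk issues
    • AzureFile issues
  • Troubleshooting Windows container problems
  • Troubleshooting cloud platform problems
    • Azure issues
  • Essential troubleshooting tools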

With kube-copilot, which is powered by OpenAI, you can automatically diagnose problems in your cluster and interact with the cluster in natural language.

kubectl is the primary troubleshooting tool and is usually the starting point for locating an error. The commands below are the ones most frequently needed when troubleshooting.

Checking Pod status and running nodes

kubectl get pods -o wide
kubectl -n kube-system get pods -o wide
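
On a busy cluster the full listing can be long. The following is a minimal sketch, assuming you only want to see Pods whose phase is not Running. The function name list_unhealthy_pods is hypothetical; --all-namespaces and --field-selector are standard kubectl flags.

```shell
# Hypothetical helper: list Pods in every namespace whose phase is not Running.
# Completed (Succeeded) Pods are listed too, because their phase is not Running.
list_unhealthy_pods() {
  kubectl get pods --all-namespaces -o wide --field-selector=status.phase!=Running
}
```

Pods stuck in CrashLoopBackOff usually still report the Running phase, so this filter complements the full listing rather than replacing it.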

Inspect Pod events

kubectl describe pod <pod-name>
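
The events shown by kubectl describe can also be queried on their own. This is a sketch under the assumption that you want the events of a single Pod in chronological order; pod_events and its default namespace are hypothetical choices.

```shell
# Hypothetical helper: show the events recorded for one Pod, oldest first.
# Usage: pod_events <pod-name> [namespace]
pod_events() {
  local pod="$1" ns="${2:-default}"
  kubectl -n "$ns" get events \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=${pod}" \
    --sort-by=.lastTimestamp
}
```

For example, pod_events web-0 prints the events for a Pod named web-0 (a placeholder name) in the default namespace.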

Surveying Node status

kubectl get nodes
kubectl describe node <node-name>
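
To spot problem nodes quickly, the Ready condition can be extracted with a JSONPath query. The sketch below assumes that any status other than True (False or Unknown) deserves a closer look; both function names are hypothetical.

```shell
# Hypothetical helper: print "<node-name><TAB><Ready status>" for every node.
node_ready_status() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'
}

# Hypothetical helper: print only the names of nodes whose Ready status is not True.
not_ready_nodes() {
  node_ready_status | awk -F'\t' '$2 != "True" { print $1 }'
}
```

Each name printed by not_ready_nodes can then be passed to kubectl describe node for details.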

kube-apiserver logs

PODNAME=$(kubectl -n kube-system get pod -l component=kube-apiserver -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system logs $PODNAME --tail 100

The commands above assume that the control plane runs as Kubernetes static Pods. If kube-apiserver is managed by systemd instead, log in to the master node and use journalctl -u kube-apiserver to view its logs.

kube-controller-manager logs

PODNAME=$(kubectl -n kube-system get pod -l component=kube-controller-manager -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system logs $PODNAME --tail 100

As above, these commands assume that the control plane runs as Kubernetes static Pods. If kube-controller-manager is managed by systemd, log in to the master node and use journalctl -u kube-controller-manager to view its logs.

kube-scheduler logs

PODNAME=$(kubectl -n kube-system get pod -l component=kube-scheduler -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system logs $PODNAME --tail 100

Again, these commands assume that the control plane runs as Kubernetes static Pods. If kube-scheduler is managed by systemd, log in to the master node and use journalctl -u kube-scheduler to view its logs.
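
The three control-plane snippets differ only in the component label, so they can be folded into one function. This is a sketch, assuming the static Pods carry the component=<name> label used above; the name component_logs and the default of 100 lines are illustrative.

```shell
# Hypothetical helper: tail the logs of a static-Pod control-plane component.
# Usage: component_logs <component> [lines]
component_logs() {
  local component="$1" lines="${2:-100}" pod
  # Look up the first Pod carrying the component label.
  pod=$(kubectl -n kube-system get pod -l "component=${component}" \
        -o jsonpath='{.items[0].metadata.name}') || return 1
  if [ -z "$pod" ]; then
    echo "no Pod found with label component=${component}" >&2
    return 1
  fi
  kubectl -n kube-system logs "$pod" --tail="$lines"
}
```

For example, component_logs kube-scheduler 50 prints the last 50 log lines of the first kube-scheduler Pod.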

kube-dns logs

kube-dns is usually deployed as an addon, and each of its Pods contains three containers. The most important logs come from the kubedns container:

PODNAME=$(kubectl -n kube-system get pod -l k8s-app=kube-dns -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system logs $PODNAME -c kubedns
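
The container name passed to -c depends on which DNS addon is installed. As an assumption that goes beyond this page, clusters running CoreDNS commonly keep the k8s-app=kube-dns label but name the container differently. The sketch below lists the container names first; pod_containers is a hypothetical helper.

```shell
# Hypothetical helper: print the container names of the first Pod matching a label.
# Usage: pod_containers <label-selector> [namespace]
pod_containers() {
  local selector="$1" ns="${2:-kube-system}"
  kubectl -n "$ns" get pod -l "$selector" \
    -o jsonpath='{.items[0].spec.containers[*].name}'
}
```

Running pod_containers k8s-app=kube-dns shows which names are valid for the -c flag on your cluster.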

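Kubelet logs

Kubelet is typically managed by systemd. To view its logs, you first need a shell on the Node. Rather than assigning a public IP address to each node for SSH access, it is recommended to use the kubectl-node-shell plugin. For example:

curl -LO https://github.com/kvaps/kubectl-node-shell/raw/master/kubectl-node_shell
chmod +x ./kubectl-node_shell
sudo mv ./kubectl-node_shell /usr/local/bin/kubectl-node_shell

kubectl node-shell <node>
journalctl -l -u kubelet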
Kube-proxy logs

Kube-proxy is usually deployed as a DaemonSet, so its logs can be queried directly with kubectl:

$ kubectl -n kube-system get pod -l component=kube-proxy
NAME               READY     STATUS    RESTARTS   AGE
kube-proxy-42zpn   1/1       Running   0          1d
kube-proxy-7gd4p   1/1       Running   0          3d
kube-proxy-87dbs   1/1       Running   0          4d
$ kubectl -n kube-system logs kube-proxy-42zpn
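
Because a DaemonSet runs one Pod per node, it can be convenient to read all kube-proxy logs in one call. This sketch assumes the same component=kube-proxy label as the listing above and a kubectl version that supports the --prefix flag; kube_proxy_logs is a hypothetical name.

```shell
# Hypothetical helper: tail logs from all kube-proxy Pods at once.
# --prefix tags each log line with the Pod and container it came from.
# Usage: kube_proxy_logs [lines-per-pod]
kube_proxy_logs() {
  local lines="${1:-20}"
  kubectl -n kube-system logs -l component=kube-proxy --tail="$lines" --prefix
}
```

If the selector matches nothing on your cluster, check the labels on the kube-proxy Pods first, since they can differ between installers.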

Further Reading

hjacobs/kubernetes-failure-stories collects public Kubernetes failure cases.

https://docs.microsoft.com/en-us/azure/aks/troubleshooting shares general guidance on troubleshooting AKS.

https://cloud.google.com/kubernetes-engine/docs/troubleshooting describes general strategies for troubleshooting problems in GKE.

https://www.oreilly.com/ideas/kubernetes-recipes-maintenance-and-troubleshooting