Troubleshooting Windows containers

This chapter covers troubleshooting for Windows containers.

RDP to Windows Node

When investigating Windows container issues, a common first step is to RDP into the node and check component status and logs. You could assign a public IP to the Node or set up port forwarding on a router, but a simpler way is to expose RDP through a Service (replace <node-ip> with your own node's IP):

# rdp.yaml
apiVersion: v1
kind: Service
metadata:
  name: rdp
spec:
  type: LoadBalancer
  ports:
  - protocol: TCP
    port: 3389
    targetPort: 3389
---
kind: Endpoints
apiVersion: v1
metadata:
  name: rdp
subsets:
  - addresses:
      - ip: <node-ip>
    ports:
      - port: 3389

$ kubectl create -f rdp.yaml
$ kubectl get svc rdp
NAME      TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE
rdp       LoadBalancer   <cluster-ip>    <external-ip>    3389:32008/TCP   5m

Then connect to the node through the external IP of the rdp Service, e.g. mstsc.exe -v <external-ip>

Don't forget to delete the Service after use: kubectl delete -f rdp.yaml.
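
To avoid reading the external IP out of the table by hand, it can be extracted with kubectl's jsonpath output. The following is a minimal sketch: the function name service_external_ip is hypothetical, and it assumes the cloud provider reports an IP address (not a hostname) in the Service status. Until the address has been allocated, the command may print nothing.

```shell
# Hypothetical helper: print the external IP of a LoadBalancer Service.
# $1 = Service name (defaults to "rdp", the Service created above)
service_external_ip() {
  local svc="${1:-rdp}"
  # Assumes the load balancer publishes an IP under status.loadBalancer.ingress
  kubectl get svc "$svc" -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

Calling service_external_ip rdp prints the address to pass to the RDP client.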

Windows Pod stuck in ContainerCreating

Besides the reasons described in Troubleshooting Pod, other possible causes include:

  • the pause image is misconfigured
  • the container image is not compatible with the Windows version running on the Node
    • Containers on Windows Server 1709 should use images with 1709 tags, e.g.
      • microsoft/aspnet:4.7.2-windowsservercore-1709
      • microsoft/windowsservercore:1709
      • microsoft/iis:windowsservercore-1709
    • Containers on Windows Server 1803 should use images with 1803 tags, e.g.
      • microsoft/aspnet:4.7.2-windowsservercore-1803
      • microsoft/windowsservercore:1803
      • microsoft/iis:windowsservercore-1803
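
The tag convention above can be checked mechanically before deploying. The sketch below is only a naming check under the assumption that images follow the tag pattern shown in the list (the tag either equals the release or ends with "-<release>"); the function name image_matches_release is hypothetical, and the check does not inspect the image itself.

```shell
# Hypothetical helper: succeed when the image tag names the given Windows release.
# $1 = Windows release on the Node (e.g. 1709), $2 = image reference
image_matches_release() {
  local release="$1" image="$2"
  local tag="${image##*:}"   # text after the last colon
  case "$tag" in
    "$release"|*-"$release") return 0 ;;   # e.g. "1709" or "windowsservercore-1709"
    *) return 1 ;;
  esac
}
```

For example, image_matches_release 1803 microsoft/iis:windowsservercore-1709 fails, which flags an image that would likely leave the Pod stuck on an 1803 Node.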

Windows Pod failed to resolve DNS

This is a known issue. After a Windows Node reboots, the HNS policies need to be cleaned up (repeat this after every reboot):

# On Windows Node
Start-BitsTransfer -Source
Import-Module .\hns.psm1

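# Stop the node components before touching HNS
Stop-Service kubeproxy
Stop-Service kubelet
# Remove the l2Bridge network and all HNS policy lists
Get-HnsNetwork | ? Name -eq l2Bridge | Remove-HnsNetwork
Get-HnsPolicyList | Remove-HnsPolicyList
# Restart the node components
Start-Service kubelet
Start-Service kubeproxy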

Even after this, the kube-dns clusterIP may still not work. A workaround is to configure the kube-dns Pods' IP addresses directly as DNS servers inside regular Pods, e.g.

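# In Windows container, e.g. kubectl exec -i -t <pod-name> powershell
# Assumes the container has a single network adapter
$adapter = Get-NetAdapter
Set-DnsClientServerAddress -InterfaceIndex $adapter.ifIndex -ServerAddresses <kube-dns-pod-ip-1>,<kube-dns-pod-ip-2>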
Set-DnsClient -InterfaceIndex $adapter.ifIndex -ConnectionSpecificSuffix "default.svc.cluster.local"

The kube-dns Pods' IP addresses can be obtained with:

$ kubectl -n kube-system describe endpoints kube-dns
Name:         kube-dns
Namespace:    kube-system
Labels:       k8s-app=kube-dns
Annotations:  <none>
Subsets:
  Addresses:          <kube-dns-pod-ip-1>,<kube-dns-pod-ip-2>
  NotReadyAddresses:  <none>
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    dns      53    UDP
    dns-tcp  53    TCP

Events:  <none>
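
When only the addresses are needed, jsonpath output avoids parsing the describe text. This is a minimal sketch: the function name kube_dns_pod_ips is hypothetical, and it assumes the Endpoints object is named kube-dns in kube-system, as above. It joins the addresses with commas so the result can be pasted after -ServerAddresses.

```shell
# Hypothetical helper: print the ready kube-dns Pod IPs as a comma-separated list.
kube_dns_pod_ips() {
  # jsonpath prints the matched IPs separated by spaces; tr turns them into commas
  kubectl -n kube-system get endpoints kube-dns \
    -o jsonpath='{.subsets[*].addresses[*].ip}' | tr ' ' ','
}
```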

If your Kubernetes cluster was deployed by acs-engine, acs-engine#2378 may fix this issue (redeploy the cluster with this patch, or change the existing files according to it).

If the Kubernetes cluster runs on Azure with a custom VNET, the VNET subnet must be associated with the route table that was created when the cluster was provisioned:

rt=$(az network route-table list -g acs-custom-vnet -o json | jq -r '.[].id')
az network vnet subnet update -n KubernetesSubnet \
-g acs-custom-vnet \
--vnet-name KubernetesCustomVNET \
--route-table $rt

where acs-custom-vnet is the resource group, KubernetesSubnet is the name of the VNET subnet, and KubernetesCustomVNET is the name of the custom VNET itself.

An example in bash form if the VNET is in a separate ResourceGroup:

rt=$(az network route-table list -g RESOURCE_GROUP_NAME_KUBE -o json | jq -r '.[].id')
az network vnet subnet update \
--route-table $rt \
--ids "/subscriptions/SUBSCRIPTION_ID/resourceGroups/RESOURCE_GROUP_NAME_VNET/providers/Microsoft.Network/VirtualNetworks/KUBERNETES_CUSTOM_VNET/subnets/KUBERNETES_SUBNET"
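
After the update, it can be useful to confirm that the subnet really references the route table. The following is a sketch under assumptions: the function name subnet_route_table is hypothetical, and the arguments are your own resource group, VNET, and subnet names. It prints the ID of the attached route table, or nothing if none is attached.

```shell
# Hypothetical helper: print the route table ID attached to a subnet.
# $1 = resource group of the VNET, $2 = VNET name, $3 = subnet name
subnet_route_table() {
  az network vnet subnet show -g "$1" --vnet-name "$2" -n "$3" \
    --query routeTable.id -o tsv
}
```

For the first example above, this would be called as subnet_route_table acs-custom-vnet KubernetesCustomVNET KubernetesSubnet, and the output should match the value of $rt.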

Remote endpoint creation failed: HNS failed with error: The switch-port was not found

This error occurs in kube-proxy when it provisions load balancer rules for Kubernetes Services. Install KB4089848 to fix it:

wusa.exe windows10.0-kb4089848-x64_db7c5aad31c520c6983a937c3d53170e84372b11.msu

After the Node reboots, verify that the fix has been installed:

PS C:\k> Get-HotFix

Source        Description      HotFixID      InstalledBy          InstalledOn
------        -----------      --------      -----------          -----------
27171k8s9000  Update           KB4087256     NT AUTHORITY\SYSTEM  3/22/2018 12:00:00 AM
27171k8s9000  Update           KB4089848     NT AUTHORITY\SYSTEM  4/4/2018 12:00:00 AM

If DNS resolution still fails, apply the steps from the previous section, e.g. restart kubelet/kube-proxy and set DnsClientServerAddress.

Windows Pod failed to get ServiceAccount Secret

This is a known issue in older Windows releases. The fix is included in Windows 1803, so upgrade the Node to that release.

Windows Pod failed to access Kubernetes API

If you are using a Hyper-V virtual machine, ensure that MAC spoofing is enabled on the network adapter(s).

Windows node cannot access services clusterIP

This is a known limitation of the current networking stack on Windows. Only Pods can reach a Service ClusterIP; the Windows Node itself cannot.


© Pengfei Ni, all rights reserved. Powered by GitBook. Updated at 2018-08-13 08:16:52
