Pods in Kubernetes fail to resolve domain names

General debugging approach

For more techniques, see the Kubernetes documentation on debugging DNS resolution.

apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
spec:
  containers:
  - name: dnsutils
    image: mydlqclub/dnsutils:1.3
    imagePullPolicy: IfNotPresent
    command: ["sleep","3600"]

# Deploy it into the problem namespace; the image ships ping, nslookup and netstat
kubectl create -f dnsutils.yaml -n xxx
kubectl exec -it dnsutils -n xxx -- /bin/sh
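
It may be worth confirming the pod actually reached Running before exec'ing into it:

# Expect STATUS to show Running
kubectl get pod dnsutils -n xxx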

Steps

  • Inspect the contents of the pod's resolv.conf file

    kubectl exec -ti dnsutils -- cat /etc/resolv.conf
    
  • Resolve a cluster-internal name; an error here indicates a problem with the CoreDNS (or kube-dns) add-on or its related services

    kubectl exec -i -t dnsutils -- nslookup kubernetes.default
    
  • Check whether the DNS pods are running

    # Check that the coredns pods are up
    kubectl get pods --namespace=kube-system -l k8s-app=kube-dns

    NAME                      READY   STATUS    RESTARTS      AGE
    coredns-95db45d46-5wggq   1/1     Running   1 (86s ago)   25h
    coredns-95db45d46-8ss4w   1/1     Running   1 (86s ago)   25h

    # coredns logs
    kubectl logs --namespace=kube-system -l k8s-app=kube-dns

    # Check that the DNS service is up
    kubectl get svc --namespace=kube-system

    NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
    kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   72d

    # Check the pod IPs behind the endpoints
    kubectl get endpoints kube-dns -n kube-system

    NAME       ENDPOINTS                                            AGE
    kube-dns   10.1.0.81:53,10.1.0.82:53,10.1.0.81:53 + 3 more...   72d

    # Inspect the coredns ConfigMap (the Corefile)
    kubectl -n kube-system edit configmap coredns
    
  • Test that a service resolves

    # Does the name resolve? Make sure your service is in the namespace you query
    kubectl exec -it dnsutils -- nslookup <service-name>
    kubectl exec -it dnsutils -- nslookup <service-name>.<namespace>
    kubectl exec -it dnsutils -- nslookup <service-name>.<namespace>.svc.cluster.local
    
  • Check the pod's and service's DNS settings

    That is, check whether the workload or service carries any unusual configuration: the pod's FQDN (hostname/subdomain), its dnsPolicy/dnsConfig overrides, search-domain limits, and so on; a quick way to dump those fields is sketched after this list.
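
A minimal sketch for dumping a pod's DNS-related fields with kubectl (the pod name and namespace are placeholders):

# DNS policy (ClusterFirst by default) and any explicit dnsConfig override
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.dnsPolicy}{"\n"}{.spec.dnsConfig}{"\n"}'
# hostname/subdomain determine the pod's FQDN
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.hostname} {.spec.subdomain}{"\n"}'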

bridge-nf-call-iptables was modified

Symptom: domain resolution fails even though coredns is running normally; connections to the service's cluster IP time out, while connecting to the pod IP directly works fine. nslookup reported:


;; reply from unexpected source: coredns pod ip#53, expected cluster ip#53

# nslookup works when told to use a coredns pod IP directly
nslookup domain  pod-ip

# ...but connecting to the service cluster IP fails
telnet cluster-ip 53

This means the cluster-ip-to-pod-ip translation done by iptables is broken. I double-checked the endpoints and the mappings were fine, so the problem had to be in iptables itself. On plain Kubernetes you would also check kube-proxy; I was running k3s. A few checks worth running at this point are sketched below.
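
A few host-side checks that narrow this down, assuming an iptables-mode service proxy:

# Is bridged traffic passed through iptables? (should print 1)
cat /proc/sys/net/bridge/bridge-nf-call-iptables
# Is the br_netfilter module loaded?
lsmod | grep br_netfilter
# Are the NAT rules for the DNS service present?
iptables-save | grep kube-dns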

In the end I found this fix:

CentOS:

echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables

echo 'net.bridge.bridge-nf-call-iptables=1' >> /etc/sysctl.conf
sysctl -p

"I'm running Kubernetes 1.8.0 on Ubuntu 16. To get rid of the 'reply from unexpected source' error you have to run: modprobe br_netfilter (see http://ebtables.netfilter.org/documentation/bridge-nf.html)"

"For CentOS, I fixed this issue using: echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables"

The problem is that I did it before, it was working and for some reason it changed back to 0 again after a while. I will have to keep monitoring.

The k3s server/agent configures the required kernel parameters automatically at startup (see Configure below); evidently the parameter had been changed back by something else afterwards.

// Configure loads required kernel modules and sets sysctls required for other components to
// function properly.
func Configure(enableIPv6 bool, config *kubeproxyconfig.KubeProxyConntrackConfiguration) {
	loadKernelModule("overlay")
	loadKernelModule("nf_conntrack")
	loadKernelModule("br_netfilter")
	loadKernelModule("iptable_nat")
	if enableIPv6 {
		loadKernelModule("ip6table_nat")
	}

	// Kernel is inconsistent about how devconf is configured for
	// new network namespaces between ipv4 and ipv6. Make sure to
	// enable forwarding on all and default for both ipv4 and ipv6.
	sysctls := map[string]int{
		"net/ipv4/conf/all/forwarding":       1,
		"net/ipv4/conf/default/forwarding":   1,
		"net/bridge/bridge-nf-call-iptables": 1,
	}

	if enableIPv6 {
		sysctls["net/ipv6/conf/all/forwarding"] = 1
		sysctls["net/ipv6/conf/default/forwarding"] = 1
		sysctls["net/bridge/bridge-nf-call-ip6tables"] = 1
		sysctls["net/core/devconf_inherit_init_net"] = 1
	}

	if conntrackMax := getConntrackMax(config); conntrackMax > 0 {
		sysctls["net/netfilter/nf_conntrack_max"] = conntrackMax
	}
	.....
}

Why Kubernetes environments require bridge-nf-call-iptables to be enabled

Enabling the bridge-nf-call-iptables kernel parameter (setting it to 1) means that bridge devices also invoke the layer-3 rules configured in iptables (including conntrack) when forwarding at layer 2. Turning it on therefore fixes the same-node Service communication problem described above, and this is why most Kubernetes environments require bridge-nf-call-iptables to be enabled. A sketch for making the setting persistent follows.
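
A minimal sketch for making the setting survive reboots, using the standard modules-load.d/sysctl.d layout (the file names are arbitrary):

# Load br_netfilter at boot; without the module the sysctl key does not even exist
echo 'br_netfilter' > /etc/modules-load.d/br_netfilter.conf
modprobe br_netfilter

# Persist the sysctl and apply everything now
echo 'net.bridge.bridge-nf-call-iptables = 1' > /etc/sysctl.d/99-bridge-nf.conf
sysctl --system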

Something odd in resolv.conf

Inside the dnsutils pod:

/ # cat /etc/resolv.conf 
nameserver 10.43.0.10
search minio-tenant1.svc.cluster.local svc.cluster.local cluster.local openstacklocal
options ndots:5

# Names ending in .svc.cluster.local fail to resolve, while short names work
/ # ping kubernetes.default.svc.cluster.local
ping: bad address 'kubernetes.default.svc.cluster.local'

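# Note: the short name does resolve (to the ClusterIP); ICMP to a ClusterIP
# usually gets no reply, so the 100% packet loss below is expected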
/ # ping kubernetes.default
PING kubernetes.default (10.43.0.1): 56 data bytes
^C
--- kubernetes.default ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss

The coredns logs:

[ERROR] plugin/errors: 2 kubernetes.default.svc.cluster.local.openstacklocal. A: read udp 10.42.0.64:36643->114.114.114.114:53: i/o timeout
[ERROR] plugin/errors: 2 kubernetes.default.svc.cluster.local.openstacklocal. AAAA: read udp 10.42.0.64:48004->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 kubernetes.default.svc.cluster.local.openstacklocal. A: read udp 10.42.0.64:52739->114.114.114.114:53: i/o timeout
[ERROR] plugin/errors: 2 kubernetes.default.svc.cluster.local.openstacklocal. AAAA: read udp 10.42.0.64:49995->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 kubernetes.default.svc.cluster.local.openstacklocal. AAAA: read udp 10.42.0.64:36894->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 kubernetes.default.svc.cluster.local.openstacklocal. A: read udp 10.42.0.64:40524->8.8.8.8:53: i/o timeout

The pod's resolv.conf clearly contains an inexplicable openstacklocal search domain. Because of options ndots:5, any name with fewer than five dots is first tried with the search suffixes appended, so kubernetes.default.svc.cluster.local (four dots) becomes kubernetes.default.svc.cluster.local.openstacklocal., which CoreDNS cannot answer and instead forwards upstream, where it times out. After removing the bogus search domain, resolution works again with no errors:

/ # nslookup kubernetes.default.svc.cluster.local
Server:         10.43.0.10
Address:        10.43.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.43.0.1
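
Incidentally, a fully qualified query with a trailing dot bypasses the search list entirely, which is a quick way to confirm that search-domain expansion is the culprit:

# The trailing dot marks the name as absolute, so no search suffix is appended
nslookup kubernetes.default.svc.cluster.local. 10.43.0.10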

A pod's DNS configuration is in fact generated by the kubelet; the main parameters are:

--resolv-conf=/etc/resolv.conf   # resolver configuration file, used as the base for container DNS resolution
--cluster-dns=10.43.0.10         # comma-separated list of cluster DNS server IPs

The startup arguments and configuration can be viewed with docker inspect kubelet | less (on setups where the kubelet runs as a container); a few other places to look are sketched below.
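
On other setups the flags live elsewhere; a few places to check (the paths follow kubeadm/k3s conventions and may differ on your distro):

# Flags of the running kubelet process
ps -ef | grep kubelet | tr ' ' '\n' | grep -E 'resolv-conf|cluster-dns'
# kubeadm-style kubelet config file
grep -E 'resolvConf|clusterDNS' /var/lib/kubelet/config.yaml
# k3s embeds the kubelet, so check the service logs for the generated args
journalctl -u k3s | grep -i resolv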

From GetPodDNS in Kubernetes 1.23 we know that the default ClusterFirst policy combines the parameters above to generate the pod's configuration; the relevant source is excerpted below.

func (c *Configurer) GetPodDNS(pod *v1.Pod) (*runtimeapi.DNSConfig, error) {
  
  // Get the current host's DNS config, i.e. parse the host's resolv.conf passed via --resolv-conf
	dnsConfig, err := c.getHostDNSConfig()
	if err != nil {
		return nil, err
	}
// Get the pod's DNS policy and generate the DNS config accordingly; the default is ClusterFirst
  // For the other policies see https://kubernetes.io/zh-cn/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
	dnsType, err := getPodDNSType(pod)
	if err != nil {
		klog.ErrorS(err, "Failed to get DNS type for pod. Falling back to DNSClusterFirst policy.", "pod", klog.KObj(pod))
		dnsType = podDNSCluster
	}
  
	switch dnsType {
	case podDNSNone:
		// DNSNone should use empty DNS settings as the base.
		dnsConfig = &runtimeapi.DNSConfig{}
	case podDNSCluster:
		if len(c.clusterDNS) != 0 {
			// For a pod with DNSClusterFirst policy, the cluster DNS server is
			// the only nameserver configured for the pod. The cluster DNS server
			// itself will forward queries to other nameservers that is configured
			// to use, in case the cluster DNS server cannot resolve the DNS query
			// itself.
      
      // By default, combine the pod's and host's DNS configs to produce the final config, de-duplicating the DNS search-domain list
			dnsConfig.Servers = []string{}
			for _, ip := range c.clusterDNS {
				dnsConfig.Servers = append(dnsConfig.Servers, ip.String())
			}
			dnsConfig.Searches = c.generateSearchesForDNSClusterFirst(dnsConfig.Searches, pod)
			dnsConfig.Options = defaultDNSOptions
			break
		}
		// clusterDNS is not known. Pod with ClusterDNSFirst Policy cannot be created.
		nodeErrorMsg := fmt.Sprintf("kubelet does not have ClusterDNS IP configured and cannot create Pod using %q policy. Falling back to %q policy.", v1.DNSClusterFirst, v1.DNSDefault)
		c.recorder.Eventf(c.nodeRef, v1.EventTypeWarning, "MissingClusterDNS", nodeErrorMsg)
		c.recorder.Eventf(pod, v1.EventTypeWarning, "MissingClusterDNS", "pod: %q. %s", format.Pod(pod), nodeErrorMsg)
		// Fallback to DNSDefault.
		fallthrough
	case podDNSHost:
		// When the kubelet --resolv-conf flag is set to the empty string, use
		// DNS settings that override the docker default (which is to use
		// /etc/resolv.conf) and effectively disable DNS lookups. According to
		// the bind documentation, the behavior of the DNS client library when
		// "nameservers" are not specified is to "use the nameserver on the
		// local machine". A nameserver setting of localhost is equivalent to
		// this documented behavior.
		if c.ResolverConfig == "" {
			for _, nodeIP := range c.nodeIPs {
				if utilnet.IsIPv6(nodeIP) {
					dnsConfig.Servers = append(dnsConfig.Servers, "::1")
				} else {
					dnsConfig.Servers = append(dnsConfig.Servers, "127.0.0.1")
				}
			}
			if len(dnsConfig.Servers) == 0 {
				dnsConfig.Servers = append(dnsConfig.Servers, "127.0.0.1")
			}
			dnsConfig.Searches = []string{"."}
		}
	}

	if pod.Spec.DNSConfig != nil {
		dnsConfig = appendDNSConfig(dnsConfig, pod.Spec.DNSConfig)
	}
	return c.formDNSConfigFitsLimits(dnsConfig, pod), nil
}

So this openstacklocal search domain must have been inherited from the host; indeed, the pod's resolv.conf above is exactly the three cluster search domains generated by the kubelet plus openstacklocal from the host. Looking at /etc/resolv.conf on the host, it turned out the file could not simply be edited: it was owned by another program.

ll /etc/resolv.conf

lrwxrwxrwx 1 root root 39 Feb 14  2019 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf

cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "systemd-resolve --status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver xxxx
options edns0
search openstacklocal

Following the article "Taking back control of /etc/resolv.conf" and the output above, it is clear the file is managed by systemd-resolved.
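
The usual fix, sketched under the assumption of a stock systemd-resolved setup, is either of the following:

# Option 1: point the kubelet at the real resolver file maintained by
# systemd-resolved (kubeadm does this automatically on such hosts)
#   --resolv-conf=/run/systemd/resolve/resolv.conf

# Option 2: replace the stub symlink with the full resolver file
ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf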

After making the change, restart the kubelet, the affected pods, and coredns.

coredns may then log a fatal error:

[FATAL] plugin/loop: Loop (127.0.0.1:48066 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 2066162189351134310.8810881223121065474."

This is caused by a DNS configuration that contains a 127.0.x.x nameserver: CoreDNS forwards queries back to itself and detects a loop. Remove the nameserver 127.0.x.x entry from the host's /etc/resolv.conf and restart the affected components. A quick check is sketched below.
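
A quick sketch for locating the loop (the grep patterns are illustrative only):

# Where does the Corefile forward to?
kubectl -n kube-system get configmap coredns -o yaml | grep -n forward
# Is there a loopback nameserver in the file CoreDNS forwards to?
grep '^nameserver 127' /etc/resolv.conf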