Flannel Networking Demystify
In my previous article, Deploy a Ubuntu Based Flannel K8S Cluster in Azure with ARM Template and Kubeadm, I provided an Azure ARM template to deploy a K8S cluster with flannel networking on Azure. But how does flannel networking actually work? In this article we will look at some of its internals, especially the VXLAN, udp and host-gw modes.
0 What is Flannel
Refer to Flannel
Flannel is a simple and easy way to configure a layer 3 network fabric designed for Kubernetes.
Flannel creates a flat network that runs on top of the hosts' network; it is an overlay networking solution for K8S clusters.
1 How it Works
Refer to Flannel
Flannel runs a small, single binary agent called flanneld on each host, and is responsible for allocating a subnet lease to each host out of a larger, preconfigured address space. Flannel uses either the Kubernetes API or etcd directly to store the network configuration, the allocated subnets, and any auxiliary data (such as the host's public IP). Packets are forwarded using one of several backend mechanisms including VXLAN and various cloud integrations.
flanneld runs inside the kube-flannel-ds-* containers; these containers are created and configured when flannel networking is applied to the Kubernetes cluster.
kubectl get pods --namespace=kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
...
kube-flannel-ds-kklgx 1/1 Running 4 21d 172.16.0.4 k8snode-342zzth442uje-0
kube-flannel-ds-rk2k2 1/1 Running 3 21d 172.16.0.5 k8snode-342zzth442uje-1
...
This article covers three Flannel backends
- VXLAN
- UDP
- Host-GW
2 Networking Details
Refer to Flannel
Flannel is responsible for providing a layer 3 IPv4 network between multiple nodes in a cluster. Flannel does not control how containers are networked to the host, only how the traffic is transported between hosts. However, flannel does provide a CNI plugin for Kubernetes and a guidance on integrating with Docker.
To illustrate how flannel networking works, we deployed a two-node flannel K8S cluster to Azure. The networking diagram is shown below.
2.1 Flannel Networking Space
By default, flannel allocates a 10.244.X.0/24 subnet to each node, and K8S Pods on that node use IP addresses from that subnet's address space.
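As a rough illustration of this addressing scheme, the sketch below uses Python's standard ipaddress module to carve 10.244.0.0/16 into per-node /24 subnets. The node-to-subnet order and the helper name first_host are assumptions made for the example; in a real cluster flanneld decides which lease each node receives.

```python
import ipaddress

# Cluster-wide flannel network used in this article.
cluster_network = ipaddress.ip_network("10.244.0.0/16")

# Each node leases one /24 out of the /16.
node_subnets = list(cluster_network.subnets(new_prefix=24))

# Hypothetical assignment: node0 gets the first subnet, node1 the second.
node0_subnet, node1_subnet = node_subnets[0], node_subnets[1]

def first_host(subnet):
    """Return the 10.244.X.1 address of a node subnet."""
    return subnet.network_address + 1

print(len(node_subnets))         # 256
print(node1_subnet)              # 10.244.1.0/24
print(first_host(node1_subnet))  # 10.244.1.1
```

A Pod address such as 10.244.1.5 can then be checked with `ipaddress.ip_address("10.244.1.5") in node1_subnet`.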
2.2 Veth Pair
For each K8S Pod, flannel creates a pair of veth devices. Taking node1 as an example, eth0 (container n) is created in the Pod's network namespace and vethxxxxxxxx is created in the host network namespace. Refer to veth:
The veth devices are virtual Ethernet devices. They can act as tunnels between network namespaces to create a bridge to a physical network device in another namespace.
ifconfig -a
...
veth43b57597 Link encap:Ethernet HWaddr 92:e6:95:33:81:fc
inet6 addr: fe80::90e6:95ff:fe33:81fc/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:218343 errors:0 dropped:0 overruns:0 frame:0
TX packets:247970 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:41250604 (41.2 MB) TX bytes:28812043 (28.8 MB)
veth822966d6 Link encap:Ethernet HWaddr 8a:fa:7b:db:62:3e
inet6 addr: fe80::88fa:7bff:fedb:623e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:5123 errors:0 dropped:0 overruns:0 frame:0
TX packets:9927 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:489234 (489.2 KB) TX bytes:953998 (953.9 KB)
...
2.3 Cni0 Bridge
Interface vethxxxxxxxx is connected to interface cni0. cni0 is a Linux network bridge device; all veth devices connect to this bridge, so all containers on the same node can communicate with each other. cni0 has the IP address 10.244.X.1.
To check cni0 details, run ip -d link show cni0
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 0a:58:0a:f4:01:01 brd ff:ff:ff:ff:ff:ff promiscuity 0
bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q addrgenmode eui64
We can verify that the veth device is connected to cni0 by running ip -d link show veth43b57597
...
6: veth43b57597@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
link/ether 92:e6:95:33:81:fc brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 1
veth
bridge_slave state forwarding priority 32 cost 2 hairpin on guard off root_block off fastleave off learning on flood on addrgenmode eui64
It shows that veth43b57597 is a bridge_slave and its master is cni0.
bridge vlan show
port vlan ids
docker0 1 PVID Egress Untagged
cni0 1 PVID Egress Untagged
veth43b57597 1 PVID Egress Untagged
veth822966d6 1 PVID Egress Untagged
2.4 VXLAN Device
When the VXLAN backend is used by flannel, a VXLAN device named flannel.<vni> is created, where <vni> stands for VXLAN Network Identifier. By default flannel sets the VNI to 1, so the default device name is flannel.1. Running ip -d link show flannel.1 shows details about this VXLAN device
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 8e:d0:f8:0a:41:19 brd ff:ff:ff:ff:ff:ff promiscuity 0
vxlan id 1 local 172.16.0.5 dev eth0 srcport 0 0 dstport 8472 nolearning ageing 300 udpcsum addrgenmode eui64
As the output above shows, the vxlan id is 1, the eth0 device is used for tunneling, and the VXLAN UDP port is 8472. The nolearning flag disables source-address learning, meaning multicast is not used; instead, unicast with static L3 entries for the peers is used.
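The mtu 1450 value in the output above follows from the VXLAN encapsulation overhead. The short calculation below is a sketch that assumes an underlay MTU of 1500 on eth0, an outer IPv4 header without options, and no VLAN tags; the constant and function names are illustrative only.

```python
# Header sizes in bytes for VXLAN over IPv4 (no IP options, no VLAN tags).
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14

def vxlan_mtu(underlay_mtu):
    """MTU left for the inner IP packet after VXLAN encapsulation."""
    overhead = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET
    return underlay_mtu - overhead

print(vxlan_mtu(1500))  # 1450
```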
The VXLAN device flannel.1 is linked with the physical network device eth0 and sends VXLAN traffic out through the physical network. The flanneld agent populates the node's ARP table as well as the bridge forwarding database, so flannel.1 knows how to forward traffic within the physical network. When a new Kubernetes node is found (either during startup or when it's created), flanneld adds
- An ARP entry for the remote node's VXLAN device (VXLAN device IP -> VXLAN device MAC)
- A VXLAN fdb entry for the remote host (VXLAN device MAC -> remote node IP)
Sample ARP entry and FDB entry from node1
# Permanent ARP entry programmed by flanneld
ip neigh show dev flannel.1
10.244.0.0 lladdr 76:34:2f:c5:51:ec PERMANENT
# Static fdb database programmed by flanneld
bridge fdb show flannel.1
33:33:00:00:00:01 dev eth0 self permanent
01:00:5e:00:00:01 dev eth0 self permanent
33:33:ff:a3:fa:d5 dev eth0 self permanent
33:33:00:00:00:01 dev docker0 self permanent
01:00:5e:00:00:01 dev docker0 self permanent
02:42:cf:eb:2d:10 dev docker0 master docker0 permanent
02:42:cf:eb:2d:10 dev docker0 vlan 1 master docker0 permanent
76:34:2f:c5:51:ec dev flannel.1 dst 172.16.0.4 self permanent
33:33:00:00:00:01 dev cni0 self permanent
01:00:5e:00:00:01 dev cni0 self permanent
33:33:ff:c4:13:6e dev cni0 self permanent
0a:58:0a:f4:01:01 dev cni0 master cni0 permanent
0a:58:0a:f4:01:01 dev cni0 vlan 1 master cni0 permanent
92:e6:95:33:81:fc dev veth43b57597 vlan 1 master cni0 permanent
92:e6:95:33:81:fc dev veth43b57597 master cni0 permanent
0a:58:0a:f4:01:04 dev veth43b57597 master cni0
33:33:00:00:00:01 dev veth43b57597 self permanent
01:00:5e:00:00:01 dev veth43b57597 self permanent
33:33:ff:33:81:fc dev veth43b57597 self permanent
8a:fa:7b:db:62:3e dev veth822966d6 vlan 1 master cni0 permanent
8a:fa:7b:db:62:3e dev veth822966d6 master cni0 permanent
33:33:00:00:00:01 dev veth822966d6 self permanent
01:00:5e:00:00:01 dev veth822966d6 self permanent
33:33:ff:db:62:3e dev veth822966d6 self permanent
2.5 VXLAN Routing
Traffic between cni0 and flannel.1 is forwarded by the kernel's IP forwarding and accepted by iptables FORWARD rules; the configuration is shown below.
# Check ip_forward is enabled or not
cat /proc/sys/net/ipv4/ip_forward
1
# Check iptables setting, forwarding is configured for flannel's networking address space
iptables-save
...
-A FORWARD -s 10.244.0.0/16 -j ACCEPT
-A FORWARD -d 10.244.0.0/16 -j ACCEPT
...
If we look at the host node's routing table, we find that containers on the same host node communicate with each other over the cni0 Linux bridge (each container gets its own network namespace, which is connected to the cni0 bridge via a pair of veth interfaces), while traffic to containers on other host nodes goes via the flannel.1 interface, as per the routing table rule for the 10.244.0.0/24 subnet.
route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
...
10.244.0.0 10.244.0.0 255.255.255.0 UG 0 0 0 flannel.1
10.244.1.0 * 255.255.255.0 U 0 0 0 cni0
...
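To make the routing decision concrete, here is a minimal Python sketch of a longest-prefix-match lookup over the two overlay routes shown above. The ROUTES list and the lookup function are illustrative names, not part of flannel or the kernel, and only the two overlay entries from node1's table are modelled.

```python
import ipaddress

# Overlay routes from node1's routing table: (destination, gateway, interface).
ROUTES = [
    (ipaddress.ip_network("10.244.0.0/24"), "10.244.0.0", "flannel.1"),
    (ipaddress.ip_network("10.244.1.0/24"), None, "cni0"),
]

def lookup(destination):
    """Return (gateway, interface) of the most specific matching route, or None."""
    address = ipaddress.ip_address(destination)
    matches = [route for route in ROUTES if address in route[0]]
    if not matches:
        return None
    best = max(matches, key=lambda route: route[0].prefixlen)
    return best[1], best[2]

print(lookup("10.244.0.5"))  # ('10.244.0.0', 'flannel.1')
print(lookup("10.244.1.7"))  # (None, 'cni0')
```

A Pod on the other node resolves to flannel.1 with gateway 10.244.0.0, while a Pod on the same node stays on the cni0 bridge.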
2.6 Connecting the Dots(VXLAN)
Connecting the dots, the network flow looks like the following.
Imagine Pod-A on node1 is going to send data to Pod-B on node0. Pod-A's IP address is 10.244.1.5 and Pod-B's IP address is 10.244.0.5.
- Pod-A's outbound packet goes to the cni0 bridge.
- It is then forwarded to device flannel.1 based on the routing table entry 10.244.0.0 10.244.0.0 255.255.255.0 UG 0 0 0 flannel.1
- flannel.1 uses 10.244.0.0's MAC address 76:34:2f:c5:51:ec (populated by flanneld) as the destination MAC for the inner Ethernet frame.
- Next, flannel.1 needs the destination IP of the VXLAN Tunnel Endpoint (VTEP) to send the packet out. By looking up 76:34:2f:c5:51:ec in the bridge fdb database, flannel.1 obtains the IP address 172.16.0.4 of the destination node, and a wrapped VXLAN packet is sent to node0.
- node0 passes this packet up to Pod-B by applying the reverse packet processing logic.
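The two table lookups in the steps above can be sketched in Python as a pair of dictionary lookups. The table contents are copied from the node1 output earlier in this article; the names ARP_TABLE, FDB_TABLE and resolve_vtep are hypothetical and only model what the kernel does with the entries flanneld programs.

```python
# Entries copied from the node1 output shown earlier in this article.
ARP_TABLE = {"10.244.0.0": "76:34:2f:c5:51:ec"}  # ip neigh show dev flannel.1
FDB_TABLE = {"76:34:2f:c5:51:ec": "172.16.0.4"}  # bridge fdb entry on flannel.1

def resolve_vtep(gateway_ip):
    """Map a route gateway to (inner destination MAC, outer destination IP)."""
    inner_dst_mac = ARP_TABLE[gateway_ip]    # step 1: ARP lookup on flannel.1
    outer_dst_ip = FDB_TABLE[inner_dst_mac]  # step 2: FDB lookup gives the remote node
    return inner_dst_mac, outer_dst_ip

print(resolve_vtep("10.244.0.0"))  # ('76:34:2f:c5:51:ec', '172.16.0.4')
```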
3 A Real Example of VXLAN Backend
Let's run an interactive shell in Kubernetes and name it busybox
kubectl run -i --tty busybox --image=busybox -- sh
We can see that busybox is running on node k8snode-342zzth442uje-1
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
busybox-5858cc4697-5jc7f 1/1 Running 0 3m 10.244.1.5 k8snode-342zzth442uje-1
From busybox's shell prompt, we can see that its IP address is 10.244.1.5, its subnet is 10.244.1.0/24, and its MAC address is 0A:58:0A:F4:01:05
# ifconfig
eth0 Link encap:Ethernet HWaddr 0A:58:0A:F4:01:05
inet addr:10.244.1.5 Bcast:0.0.0.0 Mask:255.255.255.0
inet6 addr: fe80::6063:d9ff:fe07:194b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:19 errors:0 dropped:0 overruns:0 frame:0
TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1574 (1.5 KiB) TX bytes:766 (766.0 B)
...
Let's see how routing is configured inside busybox. It appears that 10.244.1.1 is the gateway for the 16-bit subnet 10.244.0.0/16.
# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 10.244.1.1 0.0.0.0 UG 0 0 0 eth0
10.244.0.0 10.244.1.1 255.255.0.0 UG 0 0 0 eth0
10.244.1.0 * 255.255.255.0 U 0 0 0 eth0
So who has the IP address 10.244.1.1? The answer is cni0. If we run ifconfig -a on node k8snode-342zzth442uje-1, we can see that cni0 has the IP address 10.244.1.1.
cni0 Link encap:Ethernet HWaddr 0a:58:0a:f4:01:01
inet addr:10.244.1.1 Bcast:0.0.0.0 Mask:255.255.255.0
inet6 addr: fe80::5c62:5eff:fec4:136e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:4838 errors:0 dropped:0 overruns:0 frame:0
TX packets:5489 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:847131 (847.1 KB) TX bytes:640488 (640.4 KB)
...
So if we ping 10.244.0.5 from busybox, then based on busybox's routing table, the ICMP ping packet is forwarded to cni0.
Then, based on node k8snode-342zzth442uje-1's routing table, the ICMP ping packet is forwarded to flannel.1.
route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
...
10.244.0.0 10.244.0.0 255.255.255.0 UG 0 0 0 flannel.1
...
flannel.1's IP configuration is below
flannel.1 Link encap:Ethernet HWaddr 8e:d0:f8:0a:41:19
inet addr:10.244.1.0 Bcast:0.0.0.0 Mask:255.255.255.255
inet6 addr: fe80::8cd0:f8ff:fe0a:4119/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:1554 errors:0 dropped:0 overruns:0 frame:0
TX packets:1554 errors:0 dropped:73 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:130878 (130.8 KB) TX bytes:129775 (129.7 KB)
Putting it all together, we have the following network devices with their IP and MAC addresses
Network Device | IP Address | MAC Address |
---|---|---|
eth0(busybox) | 10.244.1.5 | 0a:58:0a:f4:01:05 |
veth822966d6 | | 8a:fa:7b:db:62:3e |
cni0 | 10.244.1.1 | 0a:58:0a:f4:01:01 |
flannel.1 | 10.244.1.0 | 8e:d0:f8:0a:41:19 |
eth0 | 172.16.0.5 | 00:0d:3a:a3:fa:d5 |
We will keep the ICMP ping running from busybox and see how the ICMP ping packet appears on each interface
# ping 10.244.0.5
PING 10.244.0.5 (10.244.0.5): 56 data bytes
64 bytes from 10.244.0.5: seq=0 ttl=62 time=0.915 ms
64 bytes from 10.244.0.5: seq=1 ttl=62 time=0.665 ms
64 bytes from 10.244.0.5: seq=2 ttl=62 time=0.972 ms
...
Network trace from tcpdump -i veth822966d6 -n -e
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth822966d6, link-type EN10MB (Ethernet), capture size 262144 bytes
03:49:53.612452 0a:58:0a:f4:01:05 > 0a:58:0a:f4:01:01, ethertype IPv4 (0x0800), length 98: 10.244.1.5 > 10.244.0.5: ICMP echo request, id 3584, seq 22, length 64
03:49:53.615386 0a:58:0a:f4:01:01 > 0a:58:0a:f4:01:05, ethertype IPv4 (0x0800), length 98: 10.244.0.5 > 10.244.1.5: ICMP echo reply, id 3584, seq 22, length 64
Network trace from tcpdump -i cni0 -n -e "icmp"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cni0, link-type EN10MB (Ethernet), capture size 262144 bytes
03:51:48.765967 0a:58:0a:f4:01:05 > 0a:58:0a:f4:01:01, ethertype IPv4 (0x0800), length 98: 10.244.1.5 > 10.244.0.5: ICMP echo request, id 3584, seq 137, length 64
03:51:48.766957 0a:58:0a:f4:01:01 > 0a:58:0a:f4:01:05, ethertype IPv4 (0x0800), length 98: 10.244.0.5 > 10.244.1.5: ICMP echo reply, id 3584, seq 137, length 64
Network trace from tcpdump -i flannel.1 -n -e "icmp"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
03:52:47.857269 8e:d0:f8:0a:41:19 > 76:34:2f:c5:51:ec, ethertype IPv4 (0x0800), length 98: 10.244.1.5 > 10.244.0.5: ICMP echo request, id 3584, seq 196, length 64
03:52:47.858080 76:34:2f:c5:51:ec > 8e:d0:f8:0a:41:19, ethertype IPv4 (0x0800), length 98: 10.244.0.5 > 10.244.1.5: ICMP echo reply, id 3584, seq 196, length 64
Network trace from tcpdump -i eth0 -n -e "udp"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
03:56:12.054767 00:0d:3a:a3:fa:d5 > 12:34:56:78:9a:bc, ethertype IPv4 (0x0800), length 148: 172.16.0.5.38330 > 172.16.0.4.8472: OTV, flags [I] (0x08), overlay 0, instance 1
8e:d0:f8:0a:41:19 > 76:34:2f:c5:51:ec, ethertype IPv4 (0x0800), length 98: 10.244.1.5 > 10.244.0.5: ICMP echo request, id 3584, seq 400, length 64
03:56:12.055200 a0:3d:6f:01:0f:ef > 00:0d:3a:a3:fa:d5, ethertype IPv4 (0x0800), length 148: 172.16.0.4.56724 > 172.16.0.5.8472: OTV, flags [I] (0x08), overlay 0, instance 1
76:34:2f:c5:51:ec > 8e:d0:f8:0a:41:19, ethertype IPv4 (0x0800), length 98: 10.244.0.5 > 10.244.1.5: ICMP echo reply, id 3584, seq 400, length 64
Fully expanded ICMP request packet captured from eth0
Frame 7: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits)
Encapsulation type: Ethernet (1)
Arrival Time: May 31, 2018 21:09:53.823688000 China Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1527772193.823688000 seconds
[Time delta from previous captured frame: 0.132288000 seconds]
[Time delta from previous displayed frame: 0.000000000 seconds]
[Time since reference or first frame: 0.799452000 seconds]
Frame Number: 7
Frame Length: 148 bytes (1184 bits)
Capture Length: 148 bytes (1184 bits)
[Frame is marked: False]
[Frame is ignored: False]
[Protocols in frame: eth:ethertype:ip:udp:vxlan:eth:ethertype:ip:icmp:data]
[Coloring Rule Name: ICMP]
[Coloring Rule String: icmp || icmpv6]
Ethernet II, Src: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5), Dst: 12:34:56:78:9a:bc (12:34:56:78:9a:bc)
Destination: 12:34:56:78:9a:bc (12:34:56:78:9a:bc)
Address: 12:34:56:78:9a:bc (12:34:56:78:9a:bc)
.... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
Address: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 172.16.0.5, Dst: 172.16.0.4
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 134
Identification: 0x1171 (4465)
Flags: 0x0000
0... .... .... .... = Reserved bit: Not set
.0.. .... .... .... = Don't fragment: Not set
..0. .... .... .... = More fragments: Not set
...0 0000 0000 0000 = Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0x10cd [validation disabled]
[Header checksum status: Unverified]
Source: 172.16.0.5
Destination: 172.16.0.4
User Datagram Protocol, Src Port: 38330, Dst Port: 8472
Source Port: 38330
Destination Port: 8472
Length: 114
Checksum: 0x58ad [unverified]
[Checksum Status: Unverified]
[Stream index: 0]
Virtual eXtensible Local Area Network
Flags: 0x0800, VXLAN Network ID (VNI)
0... .... .... .... = GBP Extension: Not defined
.... .... .0.. .... = Don't Learn: False
.... 1... .... .... = VXLAN Network ID (VNI): True
.... .... .... 0... = Policy Applied: False
.000 .000 0.00 .000 = Reserved(R): 0x0000
Group Policy ID: 0
VXLAN Network Identifier (VNI): 1
Reserved: 0
Ethernet II, Src: 8e:d0:f8:0a:41:19 (8e:d0:f8:0a:41:19), Dst: 76:34:2f:c5:51:ec (76:34:2f:c5:51:ec)
Destination: 76:34:2f:c5:51:ec (76:34:2f:c5:51:ec)
Address: 76:34:2f:c5:51:ec (76:34:2f:c5:51:ec)
.... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: 8e:d0:f8:0a:41:19 (8e:d0:f8:0a:41:19)
Address: 8e:d0:f8:0a:41:19 (8e:d0:f8:0a:41:19)
.... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.244.1.5, Dst: 10.244.0.5
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 84
Identification: 0x2b06 (11014)
Flags: 0x4000, Don't fragment
0... .... .... .... = Reserved bit: Not set
.1.. .... .... .... = Don't fragment: Set
..0. .... .... .... = More fragments: Not set
...0 0000 0000 0000 = Fragment offset: 0
Time to live: 63
Protocol: ICMP (1)
Header checksum: 0xf9b1 [validation disabled]
[Header checksum status: Unverified]
Source: 10.244.1.5
Destination: 10.244.0.5
Internet Control Message Protocol
Type: 8 (Echo (ping) request)
Code: 0
Checksum: 0xdfbe [correct]
[Checksum Status: Good]
Identifier (BE): 2560 (0x0a00)
Identifier (LE): 10 (0x000a)
Sequence number (BE): 103 (0x0067)
Sequence number (LE): 26368 (0x6700)
[Response frame: 8]
Data (56 bytes)
Fully expanded ICMP reply packet captured from eth0
Frame 8: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits)
Encapsulation type: Ethernet (1)
Arrival Time: May 31, 2018 21:09:53.824379000 China Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1527772193.824379000 seconds
[Time delta from previous captured frame: 0.000691000 seconds]
[Time delta from previous displayed frame: 0.000691000 seconds]
[Time since reference or first frame: 0.800143000 seconds]
Frame Number: 8
Frame Length: 148 bytes (1184 bits)
Capture Length: 148 bytes (1184 bits)
[Frame is marked: False]
[Frame is ignored: False]
[Protocols in frame: eth:ethertype:ip:udp:vxlan:eth:ethertype:ip:icmp:data]
[Coloring Rule Name: ICMP]
[Coloring Rule String: icmp || icmpv6]
Ethernet II, Src: Cisco_01:0f:ef (a0:3d:6f:01:0f:ef), Dst: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
Destination: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
Address: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Cisco_01:0f:ef (a0:3d:6f:01:0f:ef)
Address: Cisco_01:0f:ef (a0:3d:6f:01:0f:ef)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 172.16.0.4, Dst: 172.16.0.5
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 134
Identification: 0xedfb (60923)
Flags: 0x0000
0... .... .... .... = Reserved bit: Not set
.0.. .... .... .... = Don't fragment: Not set
..0. .... .... .... = More fragments: Not set
...0 0000 0000 0000 = Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0x3442 [validation disabled]
[Header checksum status: Unverified]
Source: 172.16.0.4
Destination: 172.16.0.5
User Datagram Protocol, Src Port: 56724, Dst Port: 8472
Source Port: 56724
Destination Port: 8472
Length: 114
Checksum: 0xd758 [unverified]
[Checksum Status: Unverified]
[Stream index: 1]
Virtual eXtensible Local Area Network
Flags: 0x0800, VXLAN Network ID (VNI)
0... .... .... .... = GBP Extension: Not defined
.... .... .0.. .... = Don't Learn: False
.... 1... .... .... = VXLAN Network ID (VNI): True
.... .... .... 0... = Policy Applied: False
.000 .000 0.00 .000 = Reserved(R): 0x0000
Group Policy ID: 0
VXLAN Network Identifier (VNI): 1
Reserved: 0
Ethernet II, Src: 76:34:2f:c5:51:ec (76:34:2f:c5:51:ec), Dst: 8e:d0:f8:0a:41:19 (8e:d0:f8:0a:41:19)
Destination: 8e:d0:f8:0a:41:19 (8e:d0:f8:0a:41:19)
Address: 8e:d0:f8:0a:41:19 (8e:d0:f8:0a:41:19)
.... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: 76:34:2f:c5:51:ec (76:34:2f:c5:51:ec)
Address: 76:34:2f:c5:51:ec (76:34:2f:c5:51:ec)
.... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.244.0.5, Dst: 10.244.1.5
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 84
Identification: 0xff42 (65346)
Flags: 0x0000
0... .... .... .... = Reserved bit: Not set
.0.. .... .... .... = Don't fragment: Not set
..0. .... .... .... = More fragments: Not set
...0 0000 0000 0000 = Fragment offset: 0
Time to live: 63
Protocol: ICMP (1)
Header checksum: 0x6575 [validation disabled]
[Header checksum status: Unverified]
Source: 10.244.0.5
Destination: 10.244.1.5
Internet Control Message Protocol
Type: 0 (Echo (ping) reply)
Code: 0
Checksum: 0xe7be [correct]
[Checksum Status: Good]
Identifier (BE): 2560 (0x0a00)
Identifier (LE): 10 (0x000a)
Sequence number (BE): 103 (0x0067)
Sequence number (LE): 26368 (0x6700)
[Request frame: 7]
[Response time: 0.691 ms]
Data (56 bytes)
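The VXLAN header and the length fields in the two dumps above can be reproduced with a few lines of Python. This is a sketch based on the 8-byte header layout from RFC 7348 (8 flag bits, 24 reserved bits, 24-bit VNI, 8 reserved bits); the function names are illustrative and the code does not build a complete packet.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag defined in RFC 7348

def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)

def unpack_vxlan_header(header):
    """Return (flags, vni) from an 8-byte VXLAN header."""
    first_word, second_word = struct.unpack("!II", header[:8])
    return first_word >> 24, second_word >> 8

header = pack_vxlan_header(1)
print(header.hex())                 # 0800000000000100
print(unpack_vxlan_header(header))  # (8, 1)

# Length fields from the capture: the inner Ethernet frame is 98 bytes.
inner_frame = 98
udp_length = 8 + len(header) + inner_frame  # 114, the UDP "Length" field
ip_total_length = 20 + udp_length           # 134, the outer IP "Total Length"
frame_length = 14 + ip_total_length         # 148, the captured frame size
print(udp_length, ip_total_length, frame_length)  # 114 134 148
```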
4 Flannel UDP Mode
Flannel also supports a mode called UDP that is meant for debugging; refer to Backends
UDP
Use UDP only for debugging if your network and kernel prevent you from using VXLAN or host-gw.
Type and options:
Type (string): udp
Port (number): UDP port to use for sending encapsulated packets. Defaults to 8285.
4.1 Configure Flannel Network to UDP Mode
Here are the steps to configure Flannel in UDP mode
- Dump Flannel configuration first
kubectl get configmaps kube-flannel-cfg -n=kube-system -o yaml > flannel-cfg.yml
- Edit flannel-cfg.yml by running vi flannel-cfg.yml, change the Backend type to udp, then save the file
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "udp"
}
}
- Apply the new configuration to K8S
kubectl apply -f flannel-cfg.yml
- Reboot all K8S nodes to apply the changes
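If you prefer to script the edit rather than use vi, the change to net-conf.json amounts to replacing one JSON field. The sketch below assumes the original content uses the vxlan backend and works on the JSON text only; the set_backend helper is hypothetical, and writing the result back into the ConfigMap YAML is left out.

```python
import json

# net-conf.json content before the change (assumed to use the vxlan backend).
original = '{"Network": "10.244.0.0/16", "Backend": {"Type": "vxlan"}}'

def set_backend(net_conf_text, backend_type):
    """Return net-conf.json text with Backend.Type replaced."""
    conf = json.loads(net_conf_text)
    conf.setdefault("Backend", {})["Type"] = backend_type
    return json.dumps(conf, indent=2)

print(set_backend(original, "udp"))
```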
4.2 Understand How Flannel UDP Mode Works
If we run ifconfig -a on a K8S node, we can see that there is no flannel.1 device anymore; instead a new device, flannel0, has been created.
cni0 Link encap:Ethernet HWaddr 0a:58:0a:f4:01:01
inet addr:10.244.1.1 Bcast:0.0.0.0 Mask:255.255.255.0
inet6 addr: fe80::8cd3:42ff:fe50:b881/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1472 Metric:1
RX packets:279993 errors:0 dropped:0 overruns:0 frame:0
TX packets:313453 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:49042038 (49.0 MB) TX bytes:36502523 (36.5 MB)
docker0 Link encap:Ethernet HWaddr 02:42:e7:d9:90:ad
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth0 Link encap:Ethernet HWaddr 00:0d:3a:a3:fa:d5
inet addr:172.16.0.5 Bcast:172.16.0.255 Mask:255.255.255.0
inet6 addr: fe80::20d:3aff:fea3:fad5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3286115 errors:0 dropped:0 overruns:0 frame:0
TX packets:2696890 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2743208258 (2.7 GB) TX bytes:852929710 (852.9 MB)
flannel0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:10.244.1.0 P-t-P:10.244.1.0 Mask:255.255.0.0
inet6 addr: fe80::92dc:d0db:d2b6:f5e8/64 Scope:Link
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1472 Metric:1
RX packets:45 errors:0 dropped:0 overruns:0 frame:0
TX packets:122 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:3836 (3.8 KB) TX bytes:7476 (7.4 KB)
...
flannel0 is a TUN device created by the flanneld daemon process. A TUN device provides packet reception and transmission for user space programs. It can be seen as a simple Point-to-Point or Ethernet device which, instead of receiving packets from physical media, receives them from a user space program, and instead of sending packets via physical media, writes them to the user space program. This means that in UDP mode, flanneld is the user space program that wraps each packet and sends it out via the eth0 device, and unwraps each packet received from eth0. Run ip -d link show flannel0 to get detailed information about flannel0.
4: flannel0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1472 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 500
link/none promiscuity 0
tun
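The user space wrapping described above can be sketched without touching a real TUN device. The Python below is a simplified model, not flanneld's actual code: it takes the raw bytes of an IP packet (as a program would read them from flannel0), reads the destination address from the IPv4 header, and picks the remote node and UDP port the unmodified packet would be sent to. The LEASES table, the function names, and the example Pod addresses are assumptions for illustration.

```python
import ipaddress
import struct

FLANNEL_UDP_PORT = 8285  # default port of the udp backend

# Hypothetical lease table: Pod subnet -> node address (values from this article's cluster).
LEASES = {ipaddress.ip_network("10.244.0.0/24"): "172.16.0.4"}

def inner_destination(ip_packet):
    """Read the destination address from bytes 16..19 of an IPv4 header."""
    return ipaddress.ip_address(ip_packet[16:20])

def encapsulate(ip_packet):
    """Pick the remote node for a packet read from the TUN device.

    Returns (node_ip, udp_port, udp_payload); the payload is the unmodified IP packet.
    """
    destination = inner_destination(ip_packet)
    for subnet, node_ip in LEASES.items():
        if destination in subnet:
            return node_ip, FLANNEL_UDP_PORT, ip_packet
    raise LookupError(f"no lease covers {destination}")

# A bare 20-byte IPv4 header for 10.244.1.14 -> 10.244.0.6 (checksum left as zero for brevity).
packet = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 20, 0, 0, 62, 1, 0,
    ipaddress.ip_address("10.244.1.14").packed,
    ipaddress.ip_address("10.244.0.6").packed,
)
print(encapsulate(packet)[:2])  # ('172.16.0.4', 8285)
```

Because every packet makes this round trip through user space, the udp backend is slower than VXLAN, which is consistent with the recommendation to use it only for debugging.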
flannel0 and flannel.1 also differ in their IP address assignment: flannel0 has a 16-bit netmask while flannel.1 has a 32-bit netmask. The following output is from ifconfig flannel0; as you can see, flannel0 uses 255.255.0.0 as its netmask.
flannel0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:10.244.1.0 P-t-P:10.244.1.0 Mask:255.255.0.0
inet6 addr: fe80::92dc:d0db:d2b6:f5e8/64 Scope:Link
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1472 Metric:1
RX packets:45 errors:0 dropped:0 overruns:0 frame:0
TX packets:122 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:3836 (3.8 KB) TX bytes:7476 (7.4 KB)
The host node's routing table is also slightly different: route shows a 10.244.0.0/16 routing table entry in udp mode, whereas vxlan mode uses 10.244.0.0/24 (per-node subnet routing)
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 172.16.0.1 0.0.0.0 UG 0 0 0 eth0
10.244.0.0 * 255.255.0.0 U 0 0 0 flannel0
10.244.1.0 * 255.255.255.0 U 0 0 0 cni0
168.63.129.16 172.16.0.1 255.255.255.255 UGH 0 0 0 eth0
169.254.169.254 172.16.0.1 255.255.255.255 UGH 0 0 0 eth0
172.16.0.0 * 255.255.255.0 U 0 0 0 eth0
172.17.0.0 * 255.255.0.0 U 0 0 0 docker0
4.3 How flannel0 Wraps the Packet in UDP
Let's capture a network trace to see how a packet gets wrapped in udp mode. On node k8snode-342zzth442uje-1, run the command below
tcpdump -i eth0 -s 65535 -w flannel_udp.cap "udp"
Using Wireshark to decode the UDP data as an IP packet, we can see the result below
- ICMP ping request
Frame 59: 126 bytes on wire (1008 bits), 126 bytes captured (1008 bits)
Encapsulation type: Ethernet (1)
Arrival Time: Jun 4, 2018 15:47:23.058527000 China Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1528098443.058527000 seconds
[Time delta from previous captured frame: 0.321470000 seconds]
[Time delta from previous displayed frame: 0.999498000 seconds]
[Time since reference or first frame: 2.731338000 seconds]
Frame Number: 59
Frame Length: 126 bytes (1008 bits)
Capture Length: 126 bytes (1008 bits)
[Frame is marked: False]
[Frame is ignored: False]
[Protocols in frame: eth:ethertype:ip:udp:ip:icmp:data]
[Coloring Rule Name: ICMP]
[Coloring Rule String: icmp || icmpv6]
Ethernet II, Src: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5), Dst: 12:34:56:78:9a:bc (12:34:56:78:9a:bc)
Destination: 12:34:56:78:9a:bc (12:34:56:78:9a:bc)
Address: 12:34:56:78:9a:bc (12:34:56:78:9a:bc)
.... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
Address: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 172.16.0.5, Dst: 172.16.0.4
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 112
Identification: 0x3db7 (15799)
Flags: 0x4000, Don't fragment
0... .... .... .... = Reserved bit: Not set
.1.. .... .... .... = Don't fragment: Set
..0. .... .... .... = More fragments: Not set
...0 0000 0000 0000 = Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0xa49c [validation disabled]
[Header checksum status: Unverified]
Source: 172.16.0.5
Destination: 172.16.0.4
User Datagram Protocol, Src Port: 8285, Dst Port: 8285
Source Port: 8285
Destination Port: 8285
Length: 92
Checksum: 0x5897 [unverified]
[Checksum Status: Unverified]
[Stream index: 0]
Internet Protocol Version 4, Src: 10.244.1.14, Dst: 10.244.0.6
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 84
Identification: 0x3fd8 (16344)
Flags: 0x4000, Don't fragment
0... .... .... .... = Reserved bit: Not set
.1.. .... .... .... = Don't fragment: Set
..0. .... .... .... = More fragments: Not set
...0 0000 0000 0000 = Fragment offset: 0
Time to live: 62
Protocol: ICMP (1)
Header checksum: 0xe5d5 [validation disabled]
[Header checksum status: Unverified]
Source: 10.244.1.14
Destination: 10.244.0.6
Internet Control Message Protocol
Type: 8 (Echo (ping) request)
Code: 0
Checksum: 0x451d [correct]
[Checksum Status: Good]
Identifier (BE): 1536 (0x0600)
Identifier (LE): 6 (0x0006)
Sequence number (BE): 32 (0x0020)
Sequence number (LE): 8192 (0x2000)
[Response frame: 60]
Data (56 bytes)
- ICMP ping reply
Frame 60: 126 bytes on wire (1008 bits), 126 bytes captured (1008 bits)
Encapsulation type: Ethernet (1)
Arrival Time: Jun 4, 2018 15:47:23.059315000 China Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1528098443.059315000 seconds
[Time delta from previous captured frame: 0.000788000 seconds]
[Time delta from previous displayed frame: 0.000788000 seconds]
[Time since reference or first frame: 2.732126000 seconds]
Frame Number: 60
Frame Length: 126 bytes (1008 bits)
Capture Length: 126 bytes (1008 bits)
[Frame is marked: False]
[Frame is ignored: False]
[Protocols in frame: eth:ethertype:ip:udp:ip:icmp:data]
[Coloring Rule Name: ICMP]
[Coloring Rule String: icmp || icmpv6]
Ethernet II, Src: Cisco_01:0f:ef (a0:3d:6f:01:0f:ef), Dst: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
Destination: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
Address: Microsof_a3:fa:d5 (00:0d:3a:a3:fa:d5)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Cisco_01:0f:ef (a0:3d:6f:01:0f:ef)
Address: Cisco_01:0f:ef (a0:3d:6f:01:0f:ef)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 172.16.0.4, Dst: 172.16.0.5
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 112
Identification: 0x4cbe (19646)
Flags: 0x4000, Don't fragment
0... .... .... .... = Reserved bit: Not set
.1.. .... .... .... = Don't fragment: Set
..0. .... .... .... = More fragments: Not set
...0 0000 0000 0000 = Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0x9595 [validation disabled]
[Header checksum status: Unverified]
Source: 172.16.0.4
Destination: 172.16.0.5
User Datagram Protocol, Src Port: 8285, Dst Port: 8285
Source Port: 8285
Destination Port: 8285
Length: 92
Checksum: 0x6652 [unverified]
[Checksum Status: Unverified]
[Stream index: 0]
Internet Protocol Version 4, Src: 10.244.0.6, Dst: 10.244.1.14
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 84
Identification: 0xd317 (54039)
Flags: 0x0000
0... .... .... .... = Reserved bit: Not set
.0.. .... .... .... = Don't fragment: Not set
..0. .... .... .... = More fragments: Not set
...0 0000 0000 0000 = Fragment offset: 0
Time to live: 62
Protocol: ICMP (1)
Header checksum: 0x9296 [validation disabled]
[Header checksum status: Unverified]
Source: 10.244.0.6
Destination: 10.244.1.14
Internet Control Message Protocol
Type: 0 (Echo (ping) reply)
Code: 0
Checksum: 0x4d1d [correct]
[Checksum Status: Good]
Identifier (BE): 1536 (0x0600)
Identifier (LE): 6 (0x0006)
Sequence number (BE): 32 (0x0020)
Sequence number (LE): 8192 (0x2000)
[Request frame: 59]
[Response time: 0.788 ms]
Data (56 bytes)
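The lengths reported in the two frames above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the helper name udp_backend_sizes is hypothetical, and the header sizes are the standard ones for Ethernet II, IPv4 without options, UDP and ICMP echo, which is what the capture shows.

```python
# Header sizes in bytes (standard values; IPv4 without options, as in the capture).
ETHERNET_HEADER = 14
IPV4_HEADER = 20
UDP_HEADER = 8
ICMP_HEADER = 8


def udp_backend_sizes(icmp_payload: int) -> dict:
    """Return the size of each layer for a ping carried by the udp backend.

    Hypothetical helper written for this article, not part of flannel.
    """
    inner_ip = IPV4_HEADER + ICMP_HEADER + icmp_payload  # Pod-to-Pod packet
    udp = UDP_HEADER + inner_ip                           # UDP datagram on port 8285
    outer_ip = IPV4_HEADER + udp                          # host-to-host packet
    frame = ETHERNET_HEADER + outer_ip                    # bytes on the wire
    return {"inner_ip": inner_ip, "udp": udp, "outer_ip": outer_ip, "frame": frame}


sizes = udp_backend_sizes(icmp_payload=56)
overhead = sizes["outer_ip"] - sizes["inner_ip"]
print(sizes)
print("encapsulation overhead:", overhead, "bytes")
```

With the 56 data bytes of the ping, the computed values line up with the capture: an inner Total Length of 84, a UDP Length of 92, an outer Total Length of 112 and a frame of 126 bytes. The outer IP and UDP headers therefore add 28 bytes to every packet in this mode.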
5 Flannel 'host-gw' Mode
In this mode, flannel simply configures each host node as a gateway and relies on the routing table to route traffic between the Pod network and the host. No flannel.1 or flannel0 interface is created; all traffic is routed through the eth0 interface.
cni0 Link encap:Ethernet HWaddr 0a:58:0a:f4:01:01
inet addr:10.244.1.1 Bcast:0.0.0.0 Mask:255.255.255.0
inet6 addr: fe80::706e:16ff:feb3:d611/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:5714 errors:0 dropped:0 overruns:0 frame:0
TX packets:5686 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:927660 (927.6 KB) TX bytes:660171 (660.1 KB)
docker0 Link encap:Ethernet HWaddr 02:42:b1:69:82:a9
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth0 Link encap:Ethernet HWaddr 00:0d:3a:a3:fa:d5
inet addr:172.16.0.5 Bcast:172.16.0.255 Mask:255.255.255.0
inet6 addr: fe80::20d:3aff:fea3:fad5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:62962 errors:0 dropped:0 overruns:0 frame:0
TX packets:50882 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:52712586 (52.7 MB) TX bytes:15592086 (15.5 MB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:11940 (11.9 KB) TX bytes:11940 (11.9 KB)
veth0edd4b41 Link encap:Ethernet HWaddr 86:d9:82:60:45:7b
inet6 addr: fe80::84d9:82ff:fe60:457b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:4892 errors:0 dropped:0 overruns:0 frame:0
TX packets:5591 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:928980 (928.9 KB) TX bytes:651249 (651.2 KB)
veth1bcc16bd Link encap:Ethernet HWaddr ca:46:7d:58:6d:8b
inet6 addr: fe80::c846:7dff:fe58:6d8b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:822 errors:0 dropped:0 overruns:0 frame:0
TX packets:213 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:78676 (78.6 KB) TX bytes:17870 (17.8 KB)
If the cluster is created in a cloud environment, the cloud provider also needs to make sure each node can act as a gateway. For example, to make this work on Azure, we also need to enable IP forwarding on each NIC attached to a host node.
A UDR (user-defined route) also needs to be configured like below (the UDR is created automatically when the Azure Kubernetes cloud provider is used).
On each node, the routing table will look like the one below. For each Pod subnet hosted on another node, such as 10.244.0.0/24, there is a specific routing table entry that sends the traffic to that node through the eth0 interface, for example 10.244.0.0 172.16.0.4 255.255.255.0 UG 0 0 0 eth0.
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 172.16.0.1 0.0.0.0 UG 0 0 0 eth0
10.244.0.0 172.16.0.4 255.255.255.0 UG 0 0 0 eth0
10.244.1.0 * 255.255.255.0 U 0 0 0 cni0
168.63.129.16 172.16.0.1 255.255.255.255 UGH 0 0 0 eth0
169.254.169.254 172.16.0.1 255.255.255.255 UGH 0 0 0 eth0
172.16.0.0 * 255.255.255.0 U 0 0 0 eth0
172.17.0.0 * 255.255.0.0 U 0 0 0 docker0
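To make the effect of this table concrete, here is a minimal sketch of a longest-prefix-match lookup over the same entries. It only approximates what the kernel does; the lookup helper is hypothetical, and the Genmask column has been rewritten as prefix lengths for convenience.

```python
import ipaddress

# Routes copied from the table above as (destination, gateway, interface).
# A gateway of None stands for "*", i.e. the destination is directly connected.
ROUTES = [
    ("0.0.0.0/0", "172.16.0.1", "eth0"),
    ("10.244.0.0/24", "172.16.0.4", "eth0"),
    ("10.244.1.0/24", None, "cni0"),
    ("168.63.129.16/32", "172.16.0.1", "eth0"),
    ("169.254.169.254/32", "172.16.0.1", "eth0"),
    ("172.16.0.0/24", None, "eth0"),
    ("172.17.0.0/16", None, "docker0"),
]


def lookup(destination: str):
    """Return (network, gateway, interface) of the most specific matching route."""
    address = ipaddress.ip_address(destination)
    best = None
    for cidr, gateway, iface in ROUTES:
        network = ipaddress.ip_network(cidr)
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, gateway, iface)
    # The default route matches everything, so best is never None here.
    return (str(best[0]), best[1], best[2])


print(lookup("10.244.0.7"))   # Pod on the other node
print(lookup("10.244.1.16"))  # Pod on this node
```

A Pod address on the other node resolves to the 10.244.0.0/24 entry with gateway 172.16.0.4 on eth0, while a local Pod address resolves to the directly connected cni0 bridge.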
Let's attach to busybox
kubectl attach busybox-5858cc4697-5jc7f -c busybox -i -t
And ping 10.244.0.7
PING 10.244.0.7 (10.244.0.7): 56 data bytes
64 bytes from 10.244.0.7: seq=0 ttl=62 time=1.715 ms
64 bytes from 10.244.0.7: seq=1 ttl=62 time=0.686 ms
64 bytes from 10.244.0.7: seq=2 ttl=62 time=0.889 ms
...
If we capture a network trace from the eth0 interface by issuing tcpdump -i eth0 -n -e "icmp or arp", we can see the traffic below. The Pod IP addresses appear directly on the wire, with no encapsulation.
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
01:32:20.355486 00:0d:3a:a3:fa:d5 > 12:34:56:78:9a:bc, ethertype IPv4 (0x0800), length 98: 10.244.1.16 > 10.244.0.7: ICMP echo request, id 1536, seq 39, length 64
01:32:20.356255 a0:3d:6f:01:0f:ef > 00:0d:3a:a3:fa:d5, ethertype IPv4 (0x0800), length 98: 10.244.0.7 > 10.244.1.16: ICMP echo reply, id 1536, seq 39, length 64
01:32:21.355645 00:0d:3a:a3:fa:d5 > 12:34:56:78:9a:bc, ethertype IPv4 (0x0800), length 98: 10.244.1.16 > 10.244.0.7: ICMP echo request, id 1536, seq 40, length 64
01:32:21.356351 a0:3d:6f:01:0f:ef > 00:0d:3a:a3:fa:d5, ethertype IPv4 (0x0800), length 98: 10.244.0.7 > 10.244.1.16: ICMP echo reply, id 1536, seq 40, length 64
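The capture can also be checked programmatically. The sketch below parses one line of the output above and tests whether both addresses belong to the Pod network 10.244.0.0/16. The regular expression is an assumption tailored to the line layout seen in this capture, and the parse helper is hypothetical.

```python
import ipaddress
import re

# One request line copied from the tcpdump output above.
LINE = ("01:32:20.355486 00:0d:3a:a3:fa:d5 > 12:34:56:78:9a:bc, "
        "ethertype IPv4 (0x0800), length 98: 10.244.1.16 > 10.244.0.7: "
        "ICMP echo request, id 1536, seq 39, length 64")

# Pattern for the `tcpdump -n -e` line layout seen in this capture only.
PATTERN = re.compile(
    r"^(?P<time>\S+) (?P<src_mac>[0-9a-f:]{17}) > (?P<dst_mac>[0-9a-f:]{17}), "
    r"ethertype (?P<ethertype>\S+) \(0x[0-9a-f]+\), length (?P<frame_len>\d+): "
    r"(?P<src_ip>[\d.]+) > (?P<dst_ip>[\d.]+): (?P<info>.*)$"
)

POD_NETWORK = ipaddress.ip_network("10.244.0.0/16")


def parse(line: str) -> dict:
    """Split a captured line into named fields (hypothetical helper)."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError("unexpected tcpdump line format")
    fields = match.groupdict()
    fields["frame_len"] = int(fields["frame_len"])
    return fields


packet = parse(LINE)
# True when both addresses seen on eth0 are Pod addresses, i.e. no outer header.
unencapsulated = all(
    ipaddress.ip_address(packet[key]) in POD_NETWORK for key in ("src_ip", "dst_ip")
)
print(packet["src_ip"], "->", packet["dst_ip"], packet["frame_len"], unencapsulated)
```

The frame is 98 bytes long, compared with the 126-byte frames in the udp capture earlier, because host-gw adds no outer IP or UDP header.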
6 Flannel Configuration in ETCD
Flannel stores its configuration either directly in etcd or through the Kubernetes API server, which persists it in etcd. In either case, we can access etcd directly to dump flannel's configuration.
Here are the steps:
- Attach to the etcd container with kubectl
kubectl exec -it etcd-k8snode-342zzth442uje-0 -n=kube-system -- /bin/sh
- Dump flannel's configuration
ETCDCTL_API=3 etcdctl --key /etc/kubernetes/pki/etcd/peer.key --cert /etc/kubernetes/pki/etcd/peer.crt --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints=https://localhost:2379 get /registry/configmaps/kube-system/kube-flannel-cfg
The result will be like
/registry/configmaps/kube-system/kube-flannel-cfg
k8s
v1 ConfigMap
kube-flannel-cfg
kube-system"*$d6b253a4-535a-11e8-94dd-000d3aa3fc012ȊՐZ
appflannelZ
tiernodeb
0kubectl.kubernetes.io/last-applied-configurationۄ{"apiVersion":"v1","data":{"cni-conf.json":"{\n \"name\": \"cbr0\",\n \"plugins\": [\n {\n \"type\": \"flannel\",\n \"delegate\": {\n \"hairpinMode\": true,\n \"isDefaultGateway\": true\n }\n },\n {\n \"type\": \"portmap\",\n \"capabilities\": {\n \"portMappings\": true\n }\n }\n ]\n}\n","net-conf.json":"{\n \"Network\": \"10.244.0.0/16\",\n \"Backend\": {\n \"Type\": \"udp\"\n }\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app":"flannel","tier":"node"},"name":"kube-flannel-cfg","namespace":"kube-system"}}
z
cni-conf.json{
"name": "cbr0",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
X
net-conf.json{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "udp"
}
}
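As a small illustration of how the net-conf.json value can be read, the sketch below parses it and lists the per-node subnets. It assumes each node leases a /24 out of the cluster network, which matches the 10.244.0.0/24 and 10.244.1.0/24 subnets seen in this article; the variable names are only illustrative.

```python
import ipaddress
import json

# net-conf.json as it appears in the ConfigMap dump above.
NET_CONF = """
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "udp"
  }
}
"""

conf = json.loads(NET_CONF)
network = ipaddress.ip_network(conf["Network"])
backend = conf["Backend"]["Type"]

# Assumption: one /24 lease per node, as observed on the two nodes in this cluster.
node_subnets = list(network.subnets(new_prefix=24))

print("backend:", backend)
print("cluster network:", network)
print("possible node subnets:", len(node_subnets))
print("first two:", node_subnets[0], node_subnets[1])
```

Under that assumption the /16 network leaves room for 256 node subnets, and the first two are the ones used by the two nodes here. The Backend Type field is what selects between the modes discussed in this article.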
"