
So let’s put this down into a crudely drawn diagram before moving on.
Here you can see the standard host networking, the docker0 bridge, and the container networking, along with the paired veth interfaces that essentially form a tunnel out of the container’s network namespace onto the host’s docker0 bridge.
Three more things I want to show you before we cut this one off.
First, what can I reach from the container?
#bridge docker0 IP? Yes!
[root@98c3dbff6afc /]# ping -c 3 172.17.0.1
PING 172.17.0.1 (172.17.0.1) 56(84) bytes of data.
64 bytes from 172.17.0.1: icmp_seq=1 ttl=64 time=0.081 ms
64 bytes from 172.17.0.1: icmp_seq=2 ttl=64 time=0.059 ms
64 bytes from 172.17.0.1: icmp_seq=3 ttl=64 time=0.056 ms

--- 172.17.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.056/0.065/0.081/0.012 ms

#host public IP? Yes!
[root@98c3dbff6afc /]# ping -c 3 10.0.0.205
PING 10.0.0.205 (10.0.0.205) 56(84) bytes of data.
64 bytes from 10.0.0.205: icmp_seq=1 ttl=64 time=0.058 ms
64 bytes from 10.0.0.205: icmp_seq=2 ttl=64 time=0.062 ms
64 bytes from 10.0.0.205: icmp_seq=3 ttl=64 time=0.057 ms

--- 10.0.0.205 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.057/0.059/0.062/0.002 ms

#host default gateway? Yes!
[root@98c3dbff6afc /]# ping -c 3 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=63 time=0.490 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=63 time=0.413 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=63 time=0.438 ms

--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.413/0.447/0.490/0.032 ms

#google public dns? Yes!!!
[root@98c3dbff6afc /]# ping -c 3 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=42 time=24.2 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=42 time=22.7 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=42 time=23.2 ms

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 22.756/23.393/24.212/0.633 ms
I’ll explore this in another post, but I wanted to show it up front because it’s interesting. We already saw that we could yum install a package, so this isn’t too earth shattering, but still…notice that by default my container can effectively reach anywhere my host can. That may be fine, or it may not be. It is important to understand how the networking is configured before you expose anything to the world.
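If you’re curious where that outbound reachability comes from before the follow-up post, the host-side pieces are easy to peek at. This is just a quick check, assuming the default iptables-based setup, where Docker adds a MASQUERADE rule for the docker0 subnet and relies on IP forwarding:

# On the host: look for a MASQUERADE rule covering 172.17.0.0/16
iptables -t nat -L POSTROUTING -n -v

# And confirm the kernel is forwarding packets between interfaces
sysctl net.ipv4.ip_forward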
Second, check out the network statistics on the veth in the host and the veth in the container:
#host
[root@dockernet ~]# ip -s addr show veth304f268
5: veth304f268@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP
<...snip...>
    RX: bytes  packets  errors  dropped overrun mcast
    422343     6330     0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    16219178   10005    0       0       0       0

#container
[root@98c3dbff6afc /]# ip -s addr show eth0
4: eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
<...snip...>
    RX: bytes  packets  errors  dropped overrun mcast
    16219178   10005    0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    422343     6330     0       0       0       0
You can see that they are obviously reflections of one another. A transmit from one is a receive on the other, and vice versa. This really demonstrates the veth pairing/peering mechanism as essentially just a pipeline from one network namespace to another.
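If you’d like to confirm the pairing without eyeballing mirrored counters, the veth driver also reports its peer’s interface index. A quick check from the host, assuming ethtool is installed:

# On the host: the veth driver exposes the ifindex of its peer.
# This should report "peer_ifindex: 4", matching veth304f268@if4 <-> eth0@if5.
ethtool -S veth304f268 | grep peer_ifindex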
Finally, I’m going to spin up a 2nd container and show some relevant output on the CLI. Note: -dit gives the container an interactive TTY (-it) but detaches it (-d) so it keeps running in the background.
[root@dockernet ~]# docker run -dit centos
15e8b19418afda223c13b7a47118b955f25665ec93354541a3c45ed6371bee21
[root@dockernet ~]# ip addr show
<...snip...>
5: veth304f268@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP
    link/ether 72:f9:71:83:12:22 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::70f9:71ff:fe83:1222/64 scope link
       valid_lft forever preferred_lft forever
13: veth1b64a59@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP
    link/ether 1e:93:b8:9a:55:8c brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::1c93:b8ff:fe9a:558c/64 scope link
       valid_lft forever preferred_lft forever
One thing you’ll notice is that the veth pairing here is 12/13. Nothing special about that; I just spun up some other containers as part of testing, and interface index assignment is sequential, so the next container will be 14/15. But the pairing is there just as we expect.
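You can also work the mapping out from the container side, since a container’s eth0 knows the interface index of its host-side peer. A rough sketch, using the new container’s ID from above:

# Inside the new container: print the peer ifindex of eth0
# (this should be 13, i.e. veth1b64a59 on the host)
docker exec 15e8b19418af cat /sys/class/net/eth0/iflink

# On the host: find the interface with that index
ip link show | grep '^13:'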
Also notice that the link-netnsid here is 1 instead of 0. So this is a totally different network namespace, which requires its own veth pairing. I can query this network namespace the same way, because my softlink is still in place.
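(If you’re jumping in at this point: the softlink is the same trick from earlier in the post, making Docker’s namespace files visible to ip netns. Roughly, and this is just a sketch of that approach:)

# ip netns looks for namespace files under /var/run/netns;
# pointing it at Docker's own netns directory makes every
# container's namespace show up in "ip netns list".
ln -s /var/run/docker/netns /var/run/netns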
[root@dockernet ~]# ip netns list
3cbc2c3d550b (id: 1)
3be322af84fc (id: 0)
[root@dockernet ~]# ip netns exec 3cbc2c3d550b ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
12: eth0@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.3/16 scope global eth0
       valid_lft forever preferred_lft forever
Much as you would expect here, the only oddity being that the link-netnsid is 0 rather than 1. I assume this is because netnsids are relative to the namespace doing the looking: the container knows nothing about the rest of the host, so from its point of view the host side of its veth pair is simply the first (and only) peer namespace it has seen, id 0.
And from my original container, can I ping the new container? You guessed it.
[root@98c3dbff6afc /]# ping 172.17.0.3
PING 172.17.0.3 (172.17.0.3) 56(84) bytes of data.
64 bytes from 172.17.0.3: icmp_seq=1 ttl=64 time=0.115 ms
64 bytes from 172.17.0.3: icmp_seq=2 ttl=64 time=0.055 ms
64 bytes from 172.17.0.3: icmp_seq=3 ttl=64 time=0.057 ms
^C
--- 172.17.0.3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.055/0.075/0.115/0.029 ms
So the second container would add to our diagram like this:
Notice that the first veth pair can’t be reused, as it is already paired with the other namespace. Hence we spin up a new pair and attach it to the same bridge. Each additional container simply gets another veth pair and takes another IP from the container subnet.
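Docker will also report that per-container addressing directly if you’d rather not piece it together from ip output; a quick check, assuming the default bridge network:

# On the host: show the containers attached to the default bridge
# network and the IPs they were handed out of 172.17.0.0/16
docker network inspect bridge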
So when we ping from one container to another, it travels down the veth pipeline, across the docker0 bridge, and up the other veth pipeline to the second container.
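Both legs of that path are visible from the host as ports on the bridge. A quick sanity check, using whichever of these tools is installed:

# On the host: list bridge port memberships; both veth304f268
# and veth1b64a59 should show "master docker0".
bridge link show

# Or, with bridge-utils installed:
brctl show docker0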
Hopefully this was helpful; it took me a while to wrap my head around all these different pieces and nuances. I plan to do some more explanatory posts around containers in general, and I hope you’ll be reading those too!