Jusene's Blog

Docker macvlan: Cross-Host Communication

2018/04/12

macvlan

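macvlan is a Linux kernel module that lets you configure multiple MAC addresses on a single physical NIC. In other words, one NIC carries multiple interfaces, and each interface can be given its own IP address.

The biggest advantage of macvlan is its performance. Unlike other implementations, macvlan does not need a Linux bridge; it connects to the physical network directly through the Ethernet interface.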
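To make the idea concrete outside of Docker, the sketch below prints the iproute2 commands that would add one macvlan interface on top of a parent NIC. The interface name macvlan0 and the address 10.211.55.60/24 are assumptions chosen for illustration; they do not come from the setup described in this post.

```shell
# Print the iproute2 commands that add one macvlan interface on top of a parent NIC.
# Arguments: parent interface, new interface name, address in CIDR form.
macvlan_cmds() {
    local parent=$1 name=$2 addr=$3
    # "mode bridge" lets macvlan interfaces on the same parent talk to each other.
    printf 'ip link add link %s name %s type macvlan mode bridge\n' "$parent" "$name"
    printf 'ip address add %s dev %s\n' "$addr" "$name"
    printf 'ip link set %s up\n' "$name"
}

# Hypothetical example values; review the output, then pipe it to `sh` as root to apply it.
macvlan_cmds eth0 macvlan0 10.211.55.60/24
```

The function only prints the commands, so nothing on the host changes until the output is deliberately executed.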

Environment setup

1. First, enable promiscuous mode on the NIC that the macvlan network will use

~]# ip link set eth0 promisc on
~]# ip link show eth0
2: eth0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 00:1c:42:a9:3a:a6 brd ff:ff:ff:ff:ff:ff
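The check above is done by eye: PROMISC has to appear in the flag list. A small helper can do the same check in a script. This is a minimal sketch; the function name has_promisc is made up for this example.

```shell
# Report whether the flags of an `ip link show` line contain PROMISC.
# Reads the command output on stdin, so it can be tried without touching a NIC.
has_promisc() {
    # The flags are the comma-separated list between < and > on the first line.
    head -n 1 | grep -Eq '<([A-Z_-]+,)*PROMISC(,[A-Z_-]+)*>'
}

# Usage on a live host (eth0 is the interface name used in this post):
#   ip link show eth0 | has_promisc && echo "promiscuous mode is on"
```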

2. Create the macvlan network
host1:

~]# docker network create -d macvlan --subnet 10.211.55.0/24 \
                                     --gateway=10.211.55.1 \
                                     -o parent=eth0 mac_net1

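Note: the same command must also be run on host2. For the containers on the two hosts to reach each other directly, both networks have to use the same subnet and gateway (the containers below use 10.211.55.50 and 10.211.55.51 in 10.211.55.0/24). What must differ between the hosts is the set of container IP addresses handed out.

  • -d macvlan sets the driver to macvlan
  • A macvlan network is a local-scope network, so to make cross-host communication work you have to manage the IP addresses within the subnet yourself
  • Unlike other network types, Docker does not create a gateway for macvlan. The gateway given here must really exist, otherwise traffic cannot be routed
  • -o parent sets the network interface to use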
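Because each host's Docker daemon allocates addresses without knowing about the other host, one way to manage the addresses is to give every host a disjoint --ip-range inside the shared subnet. The sketch below prints the docker network create command for each host. The two /27 ranges are hypothetical and would need to be adapted to the addresses that are actually free on your network.

```shell
# Print the `docker network create` command for one host.
# Arguments: subnet, gateway, per-host ip-range, parent interface, network name.
macvlan_net_cmd() {
    local subnet=$1 gateway=$2 range=$3 parent=$4 name=$5
    printf 'docker network create -d macvlan --subnet=%s --gateway=%s --ip-range=%s -o parent=%s %s\n' \
        "$subnet" "$gateway" "$range" "$parent" "$name"
}

# Assumed split: host1 hands out 10.211.55.64-95, host2 hands out 10.211.55.96-127.
macvlan_net_cmd 10.211.55.0/24 10.211.55.1 10.211.55.64/27 eth0 mac_net1   # run on host1
macvlan_net_cmd 10.211.55.0/24 10.211.55.1 10.211.55.96/27 eth0 mac_net1   # run on host2
```

With ranges like these, containers started without --ip draw from separate pools on each host.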

3. Create the containers
host1:

~]# docker run -itd --name bbox1 --ip=10.211.55.50 --network mac_net1 busybox

To avoid IP conflicts, it is best to set the address explicitly with --ip.

host2:

~]# docker run -itd --name bbox2 --ip=10.211.55.51 --network mac_net1 busybox

4. Verify connectivity

~]# docker exec bbox1 ping 10.211.55.51
PING 10.211.55.51 (10.211.55.51): 56 data bytes
64 bytes from 10.211.55.51: seq=0 ttl=64 time=0.503 ms
64 bytes from 10.211.55.51: seq=1 ttl=64 time=0.454 ms
64 bytes from 10.211.55.51: seq=2 ttl=64 time=0.466 ms
64 bytes from 10.211.55.51: seq=3 ttl=64 time=0.406 ms
...

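Docker does not provide DNS for macvlan networks across hosts, so the containers cannot reach each other by hostname and have to use IP addresses.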
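One possible workaround is to inject static name entries into the container's /etc/hosts with the --add-host flag of docker run. The sketch below prints such a command. The helper name peer_run_cmd is made up for this example, and the name-to-address pairs have to be maintained by hand.

```shell
# Print a `docker run` command that adds static host entries for peer containers.
# Arguments: container name, container IP, network, then any number of name:ip pairs.
peer_run_cmd() {
    local name=$1 ip=$2 network=$3
    shift 3
    local cmd="docker run -itd --name $name --ip=$ip --network $network"
    local peer
    for peer in "$@"; do
        # Each pair becomes one line in the container's /etc/hosts.
        cmd="$cmd --add-host $peer"
    done
    printf '%s busybox\n' "$cmd"
}

# On host1: let bbox1 resolve the name bbox2 to the address used on host2.
peer_run_cmd bbox1 10.211.55.50 mac_net1 bbox2:10.211.55.51
```

After starting the container this way, `docker exec bbox1 ping bbox2` would resolve the name from /etc/hosts rather than through DNS.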

Network structure

macvlan does not rely on a Linux bridge. Take a look at the container's network devices.
host1:

~]# docker exec bbox1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
34: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    link/ether 02:42:0a:d3:37:32 brd ff:ff:ff:ff:ff:ff
    inet 10.211.55.50/24 brd 10.211.55.255 scope global eth0
       valid_lft forever preferred_lft forever

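The container has only one interface, eth0. Note the @if2 suffix after eth0: it means this interface has a corresponding interface whose global index is 2. Given how macvlan works, it is reasonable to guess that this is the host's eth0, which the following confirms: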

~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1c:42:a9:3a:a6 brd ff:ff:ff:ff:ff:ff
    inet 10.211.55.17/24 brd 10.211.55.255 scope global dynamic eth0
       valid_lft 1387sec preferred_lft 1387sec
    inet6 fdb2:2c26:f4e4:0:3b2b:6db8:fa6e:5c8/64 scope global noprefixroute dynamic
       valid_lft 2591707sec preferred_lft 604507sec
    inet6 fe80::6499:5c43:d4fa:d8d/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:cc:94:b0:7e brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

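As you can see, the container's eth0 is an interface virtualized from the host's eth0 by macvlan. The container's interface connects directly to the host NIC, so the container can talk to external networks without NAT or port mapping (as long as a gateway exists). On the network it is indistinguishable from any other standalone host.

Multiple macvlan networks with sub-interfaces

A macvlan network takes exclusive use of the host NIC, which means one NIC can back only one macvlan network. Fortunately, macvlan can attach not only to an interface (eth0) but also to a sub-interface (such as eth0.10).

Create eth0.10 and eth0.20

host1: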
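The lookup done by eye above (read the index after @if, then find that index in the host's `ip a` output) can be sketched as two small text helpers. Both function names are made up for this example, and both work on text only, so they can be tried without a running container.

```shell
# Extract the peer interface index from a name such as "eth0@if2".
peer_ifindex() {
    # Keep only the digits after "@if"; print nothing if there is no such suffix.
    printf '%s\n' "$1" | sed -n 's/^.*@if\([0-9][0-9]*\)$/\1/p'
}

# Given an index as $1 and `ip a` or `ip link` output on stdin,
# print the name of the interface that has that index.
ifname_by_index() {
    awk -v idx="$1" -F': ' '$1 == idx { sub(/@.*/, "", $2); print $2; exit }'
}

# Usage on host1, with the container from this post:
#   idx=$(peer_ifindex "eth0@if2")
#   ip a | ifname_by_index "$idx"
```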

~]# ip address add 10.211.55.53 dev eth0 label eth0.10
~]# ip address add 10.211.55.55 dev eth0 label eth0.20
~]# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:cc:94:b0:7e  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
        inet 10.211.55.17  netmask 255.255.255.0  broadcast 10.211.55.255
        inet6 fdb2:2c26:f4e4:0:3b2b:6db8:fa6e:5c8  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::6499:5c43:d4fa:d8d  prefixlen 64  scopeid 0x20<link>
        ether 00:1c:42:a9:3a:a6  txqueuelen 1000  (Ethernet)
        RX packets 198313  bytes 103000165 (98.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 128036  bytes 17296662 (16.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0.10: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
        inet 10.211.55.53  netmask 255.255.255.255  broadcast 0.0.0.0
        ether 00:1c:42:a9:3a:a6  txqueuelen 1000  (Ethernet)

eth0.20: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
        inet 10.211.55.55  netmask 255.255.255.255  broadcast 0.0.0.0
        ether 00:1c:42:a9:3a:a6  txqueuelen 1000  (Ethernet)

host2:

~]# ip address add 10.211.55.54 dev eth0 label eth0.10
~]# ip address add 10.211.55.56 dev eth0 label eth0.20
~]# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:ee:64:71:0c  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
        inet 10.211.55.18  netmask 255.255.255.0  broadcast 10.211.55.255
        inet6 fdb2:2c26:f4e4:0:d8f5:c8d9:47f4:6fbf  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::a66a:4224:d693:4ecd  prefixlen 64  scopeid 0x20<link>
        ether 00:1c:42:7d:73:b1  txqueuelen 1000  (Ethernet)
        RX packets 198801  bytes 103116292 (98.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 128105  bytes 16995328 (16.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0.10: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
        inet 10.211.55.54  netmask 255.255.255.255  broadcast 0.0.0.0
        ether 00:1c:42:7d:73:b1  txqueuelen 1000  (Ethernet)

eth0.20: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
        inet 10.211.55.56  netmask 255.255.255.255  broadcast 0.0.0.0
        ether 00:1c:42:7d:73:b1  txqueuelen 1000  (Ethernet)
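A side note on the commands above: `ip address add ... label eth0.10` attaches an extra address to eth0 under an alias name, which is why the ifconfig entries show the same MAC address as eth0. An 802.1Q VLAN sub-interface, which is what VLAN 10 and VLAN 20 refer to later in this post, is a separate link created with `ip link add ... type vlan`. The sketch below prints the commands for that variant. It is an alternative under that assumption, not the procedure this post ran, and the function name is made up for this example.

```shell
# Print the iproute2 commands that create an 802.1Q VLAN sub-interface.
# Arguments: parent interface, VLAN ID. The new link is named <parent>.<id>.
vlan_subif_cmds() {
    local parent=$1 vid=$2
    printf 'ip link add link %s name %s.%s type vlan id %s\n' "$parent" "$parent" "$vid" "$vid"
    printf 'ip link set %s.%s up\n' "$parent" "$vid"
}

# Review the output, then pipe it to `sh` as root on each host to apply it.
vlan_subif_cmds eth0 10
vlan_subif_cmds eth0 20
```

Tagged traffic only passes between hosts if the switch (or virtual switch) in between carries those VLAN IDs.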

Create the macvlan networks:
host1:

~]# sysctl -w net.ipv4.ip_forward=1
~]# docker network create -d macvlan --subnet=192.168.1.0/24 --gateway=192.168.1.1 -o parent=eth0.10 mac_net10
~]# docker network create -d macvlan --subnet=192.168.2.0/24 --gateway=192.168.2.1 -o parent=eth0.20 mac_net20
~]# docker run -itd --name bbox1 --ip=192.168.1.2 --network mac_net10 busybox
~]# docker run -itd --name bbox2 --ip=192.168.2.2 --network mac_net20 busybox

host2:

~]# sysctl -w net.ipv4.ip_forward=1
~]# docker network create -d macvlan --subnet=192.168.1.0/24 --gateway=192.168.1.1 -o parent=eth0.10 mac_net10
~]# docker network create -d macvlan --subnet=192.168.2.0/24 --gateway=192.168.2.1 -o parent=eth0.20 mac_net20
~]# docker run -itd --name bbox1 --ip=192.168.1.3 --network mac_net10 busybox
~]# docker run -itd --name bbox2 --ip=192.168.2.3 --network mac_net20 busybox

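macvlan networks on different sub-interfaces are isolated from each other. The architecture is shown in the figure below:

Different macvlan networks cannot communicate at layer 2, but they can be connected at layer 3 through a gateway. Here a separate server (host3) is set up as a virtual router: it holds the gateway addresses and forwards traffic between VLAN 10 and VLAN 20.

host3: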

## host3 acts as the router, so IP forwarding must be enabled on it
~]# sysctl -w net.ipv4.ip_forward=1
~]# ip address add 192.168.1.1 dev eth0 label eth0.10
~]# ip address add 192.168.2.1 dev eth0 label eth0.20
## Configure iptables rules to forward packets between the different VLANs
~]# iptables -t nat -A POSTROUTING -o eth0.10 -j MASQUERADE
~]# iptables -t nat -A POSTROUTING -o eth0.20 -j MASQUERADE
~]# iptables -A FORWARD -i eth0.10 -o eth0.20 -m state --state RELATED,ESTABLISHED -j ACCEPT
~]# iptables -A FORWARD -i eth0.20 -o eth0.10 -m state --state RELATED,ESTABLISHED -j ACCEPT
~]# iptables -A FORWARD -i eth0.10 -o eth0.20 -j ACCEPT
~]# iptables -A FORWARD -i eth0.20 -o eth0.10 -j ACCEPT
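To check that the router actually connects the two networks, a container in one network can ping a container in the other. The sketch below reads ping output on stdin and reports the packet loss from the summary line. The helper names are made up for this example, and it assumes the summary format printed by busybox and iputils ping ("N% packet loss").

```shell
# Extract the packet-loss percentage from ping output read on stdin.
ping_loss() {
    sed -n 's/^.* \([0-9][0-9.]*\)% packet loss.*$/\1/p'
}

# Succeed only if the ping summary reports 0% loss.
ping_ok() {
    [ "$(ping_loss)" = "0" ]
}

# Usage on host1, assuming the containers and the host3 router from this post:
#   docker exec bbox1 ping -c 4 192.168.2.3 | ping_ok && echo "VLAN 10 -> VLAN 20 reachable"
```

Here 192.168.2.3 is bbox2 on host2 in mac_net20, so a successful run exercises both the cross-host path and the layer-3 hop through host3.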