OVS-DPDK Compilation and Usage

OVS is typically used together with virtual machines, so the host in this article already has SR-IOV and virtualization-related features enabled, and DPDK uses the vfio driver.

0x00 DPDK Compilation

See the separate article on compiling DPDK.
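
The configure step below reads the DPDK location from the usual RTE_SDK/RTE_TARGET variables. A minimal sketch, assuming a make-built DPDK tree (the install path is a placeholder; point it at wherever you unpacked and built DPDK):

# placeholder path; adjust to your own DPDK build tree
export RTE_SDK=/opt/dpdk-18.11.7
export RTE_TARGET=x86_64-native-linuxapp-gcc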

0x01 OVS Compilation

Download the source:

wget https://www.openvswitch.org/releases/openvswitch-x.x.x.tar.gz
tar -zxvf openvswitch-x.x.x.tar.gz

Install dependencies:

yum install -y python3 python36-six python2 python2-six gcc-c++ ...

Build:

./boot.sh

./configure --with-dpdk=$RTE_SDK/$RTE_TARGET --disable-ssl --disable-libcapng

make -j16 && make install

0x02 System Configuration

1. Bind the driver

# Load the vfio driver
modprobe vfio-pci

# Make sure the NIC is down first
ip link set ethx down

# Show the current NIC and driver status
./usertools/dpdk-devbind.py --status

# Bind the NIC with the DPDK helper script
./usertools/dpdk-devbind.py --bind=vfio-pci 00:09.0 ...

# Revert to the kernel driver
./usertools/dpdk-devbind.py --bind=ixgbe 00:09.0 ...
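
vfio-pci only works with the IOMMU enabled, which is part of the virtualization support mentioned at the top. A minimal sketch for enabling it at boot on a RHEL-family host (the grubby call assumes a GRUB-managed kernel command line):

# turn on the Intel IOMMU in passthrough mode; takes effect after a reboot
grubby --update-kernel=ALL --args="intel_iommu=on iommu=pt"
reboot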

2. Hugepage setup

# Check NUMA support; a NUMA machine lists per-node stats (requires numactl)
numastat

# Let the kernel distribute hugepages across nodes automatically
echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

# On a NUMA machine, assuming two nodes, allocate explicitly per node
# (to release the hugepages, write 0 instead)
echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages

# Mount the hugetlbfs filesystem
mkdir /mnt/huge
mount -t hugetlbfs nodev /mnt/huge
# mount -t hugetlbfs none /mnt/huge -o pagesize=2MB
# mount -t hugetlbfs none /mnt/huge -o pagesize=1GB

# Optionally persist the mount via /etc/fstab
echo "nodev /mnt/huge hugetlbfs defaults 0 0" >> /etc/fstab
# For 1GB pages
echo "nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0" >> /etc/fstab

0x03 Starting OVS

1. Create the default configuration database

mkdir -p /usr/local/etc/openvswitch
ovsdb-tool create /usr/local/etc/openvswitch/conf.db \
vswitchd/vswitch.ovsschema

2. Start ovsdb-server

mkdir -p /usr/local/var/run/openvswitch
mkdir -p /usr/local/var/log/openvswitch/ovsdb-server
ovsdb-server \
--remote=punix:/usr/local/var/run/openvswitch/db.sock \
--remote=db:Open_vSwitch,Open_vSwitch,manager_options \
--pidfile --detach --log-file

# with ssl
ovsdb-server \
--remote=punix:/usr/local/var/run/openvswitch/db.sock \
--remote=db:Open_vSwitch,Open_vSwitch,manager_options \
--private-key=db:Open_vSwitch,SSL,private_key \
--certificate=db:Open_vSwitch,SSL,certificate \
--bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert \
--pidfile --detach --log-file

If you built Open vSwitch without SSL support, omit --private-key, --certificate, and --bootstrap-ca-cert.

3. Start ovs-vswitchd

export PATH=$PATH:/usr/local/share/openvswitch/scripts
export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
ovs-vsctl --no-wait init
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-ctl --no-ovsdb-server --db-sock="$DB_SOCK" start
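
dpdk-init=true is the only setting DPDK strictly needs; everything else has defaults. A sketch of two commonly tuned options, with values that are merely illustrative for the two-node hugepage layout above:

# reserve 1024 MB of hugepage memory per NUMA node for DPDK
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
# run PMD threads on cores 1-2 (example mask)
ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0x6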

4. Verify

ovs-vsctl get Open_vSwitch . dpdk_initialized
true

ovs-vswitchd --version
ovs-vswitchd (Open vSwitch) 2.11.0
DPDK 18.11.7

ovs-vsctl get Open_vSwitch . dpdk_version
"DPDK 18.11.7"

0x04 Creating Bridges and Ports

1. Create a bridge using the DPDK datapath

ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

A bridge created with -- set bridge br0 datapath_type=netdev cannot communicate with the host directly: packets ping through, but TCP sessions fail. For a VM to talk to the host directly, create a second bridge without this option (see the sketch below) and attach the VM to that bridge in its XML.
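
A minimal sketch of that companion bridge (the name br1 is arbitrary):

# no datapath_type=netdev, so this bridge stays on the kernel datapath
ovs-vsctl add-br br1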

2. Add ports

Add the physical DPDK port to the bridge:

ovs-vsctl add-port br0 dpdk-p0 -- \
set Interface dpdk-p0 \
type=dpdk options:dpdk-devargs=0000:86:00.0

Create a vhostuser port:

ovs-vsctl add-port br0 dpdkvhostuser0 -- set Interface dpdkvhostuser0 type=dpdkvhostuser [ofport_request=<N>]

Create a vhostuserclient port:

ovs-vsctl add-port br0 dpdkvhostclient0 -- set Interface dpdkvhostclient0 type=dpdkvhostuserclient options:vhost-server-path=/usr/local/var/run/openvswitch/dpdkvhostclient0
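
To check that the ports attached cleanly:

# a port that failed to initialize shows an error in its status
ovs-vsctl show
ovs-vsctl get Interface dpdk-p0 status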

3. Use the ports

3.1 libvirt

Add the corresponding interface to the domain XML. Hugepage memory backing must also be configured so that QEMU can access hugepage memory and communicate with ovs-dpdk; once this is set, all of the VM's memory is backed by hugepages and ordinary memory is no longer used.

<memoryBacking>
  <hugepages>
    <page size='2' unit='M' nodeset='0'/>
  </hugepages>
</memoryBacking>
<cputune>
  <shares>4096</shares>
</cputune>
<cpu ...>
  ...
  <numa>
    <cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
  </numa>
</cpu>
  • Port derived via the OVS bridge
<interface type='vhostuser'>
  <mac address='de:ad:be:ef:ac:02'/>
  <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostclient0' mode='server'/>
  <model type='virtio'/>
  <driver queues='2'>
    <host mrg_rxbuf='on'/>
  </driver>
</interface>
  • Using vhostuser
<interface type='vhostuser'>
  <mac address='00:00:00:00:00:01'/>
  <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser0' mode='client'/>
  <model type='virtio'/>
  <driver queues='2'>
    <host mrg_rxbuf='on'/>
  </driver>
</interface>
  • Using vhostuserclient
<interface type='vhostuser'>
  <mac address='00:00:00:00:00:02'/>
  <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostclient0' mode='server'/>
  <model type='virtio'/>
  <driver queues='2'>
    <host mrg_rxbuf='on'/>
  </driver>
</interface>
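
Once the domain is running, it is worth confirming that its memory really is hugepage-backed. A sketch, assuming a single VM whose QEMU process is named qemu-kvm on this host:

# hugetlb mappings show up with non-zero Huge fields in smaps
grep -i huge /proc/$(pidof qemu-kvm)/smaps | head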

3.2 qemu

Add the corresponding options to the QEMU command line:

  • vhostuser

    -chardev socket,id=char1,path=/usr/local/var/run/openvswitch/dpdkvhostuser0 \
    -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce \
    -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1 \
    -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
    -numa node,memdev=mem -mem-prealloc
  • vhostuserclient

    -chardev socket,id=char1,path=$VHOST_USER_SOCKET_PATH,server \
    -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce \
    -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1 \
    -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
    -numa node,memdev=mem -mem-prealloc

0xff References

Open vSwitch on Linux, FreeBSD and NetBSD

Open vSwitch with DPDK

Configuring OVS-DPDK with VM

DPDK Virtual Devices

DPDK vHost User Ports

DPDK Physical Ports

OVS-DPDK: Migrating to vhostuser socket mode in Red Hat OpenStack