+++
title = "QEMU virtio configurations"

[taxonomies]
tags = ["linux", "qemu", "virtio"]
+++

For my own reference I wanted to document some minimal [`virtio`][virtio]
device configurations with qemu and the Linux kernel configuration required to
enable those devices. The devices we will use are `virtio console`, `virtio
blk` and `virtio net`. To make use of the virtio devices in qemu we are going
to build and boot into a busybox-based [`initramfs`][initramfs].

## Build initramfs

For the initramfs there is not much magic: we grab a copy of busybox,
configure it with the default config (`defconfig`) and enable static linking,
as we will use it as the rootfs.

For the `init` process we will use the one provided by busybox, but we have to
symlink it to `/init`, since during boot the kernel extracts the
cpio-compressed initramfs into the `rootfs` and looks for an `/init` file. If
that is not found, the kernel falls back to an older mechanism and tries to
mount a root partition (which we don't have).

> Optionally the init binary could be specified with the `rdinit=` kernel boot
> parameter.

We populate `/etc/inittab` and `/etc/init.d/rcS` with a minimal configuration
to mount the `proc`, `sys` and `dev` filesystems and drop into a shell after
the boot is completed. \
Additionally we set up `/etc/passwd` and `/etc/shadow` with an entry for the
`root` user with the password `1234`, so we can log in via the virtio console
later.

```sh,hide_lines=1-30 68-1000
{{ include(path="content/2021-12-02-toying-with-virtio/build_initramfs.sh") }}
```

The full build script is available under [build_initramfs.sh][build-initramfs].

## Virtio console

To enable support for the virtio console we enable the kernel configs shown
below. The pci configurations are enabled because in qemu the virtio console
front-end device (the one presented to the guest) is attached to the pci bus.

```sh,hide_lines=1-31 39-1000
{{ include(path="content/2021-12-02-toying-with-virtio/build_kernel.sh") }}
```

The full build script is available under [build_kernel.sh][build-kernel].

To boot up the guest we use the following qemu configuration.

```sh
qemu-system-x86_64 \
  -nographic \
  -cpu host \
  -enable-kvm \
  -kernel ./linux-$(VER)/arch/x86/boot/bzImage \
  -append "earlyprintk=ttyS0 console=ttyS0 root=/dev/ram0 ro" \
  -initrd ./initramfs.cpio.gz \
  -device virtio-serial-pci \
  -device virtconsole,chardev=vcon,name=console.0 \
  -chardev socket,id=vcon,ipv4=on,host=localhost,port=2222,server,telnet=on,wait=off
```

The important parts of this configuration are the last three lines.

The `virtio-serial-pci` device creates the serial bus the virtio console is
attached to.

The `virtconsole` creates the virtio console device exposed to the guest
(front-end). The `chardev=vcon` option specifies that the chardev with
`id=vcon` is attached as back-end to the virtio console. The back-end device
is the one we will have access to from the host running the emulation.

We configure the chardev back-end to be a `socket` running a telnet server
listening on port 2222. The `wait=off` option tells qemu that it can boot
directly without waiting for a client connection.

After booting the guest we are dropped into a shell and can verify that our
device is detected properly.

```sh
root@virtio-box ~ # ls /sys/bus/virtio/devices/
virtio0

root@virtio-box ~ # cat /sys/bus/virtio/devices/virtio0/virtio-ports/vport0p0/name
console.0
```

In `/etc/inittab`, we already configured `getty` to be spawned on the first
hypervisor console `/dev/hvc0`.
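The relevant entry could look like the following sketch, assuming busybox
init's `id::action:process` inittab format (the exact configuration is
generated in [build_initramfs.sh][build-initramfs]):

```sh
# Sketch of the getty related /etc/inittab entry (busybox init format).
# Respawn a login prompt on the first virtio hypervisor console.
::respawn:/sbin/getty 115200 hvc0
```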
This will effectively run `login(1)` over the serial console. From the host we
can directly connect to the back-end chardev with `telnet localhost 2222` and
log in to the guest with `root:1234`.

```sh
> telnet -4 localhost 2222
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

virtio-box login: root
Password:
root@virtio-box ~ #
```

## Virtio blk

To enable support for the virtio block device we enable the kernel configs
shown below. First we enable general support for block devices and then for
virtio block devices. Additionally we enable support for the `ext2`
filesystem, as we will back the virtio block device with an ext2 filesystem
image.

```sh,hide_lines=1-39 48-1000
{{ include(path="content/2021-12-02-toying-with-virtio/build_kernel.sh") }}
```

The full build script is available under [build_kernel.sh][build-kernel].

Next we create the ext2 filesystem image. We do this by creating a `128M` blob
which we format with ext2 afterwards. Then we can mount the image via a `loop`
device and populate the filesystem.

```sh,hide_lines=1-2 8-1000
{{ include(path="content/2021-12-02-toying-with-virtio/build_ext2.sh") }}
```

Before booting the guest we attach the virtio block device to the VM by adding
the `-drive` configuration to our previous qemu invocation.

```sh
qemu-system-x86_64 \
  ...
  -drive if=virtio,file=fs.ext2,format=raw
```

The `-drive` option is a shortcut for a `-device (front-end) / -blockdev
(back-end)` pair. The `if=virtio` flag specifies the interface of the
front-end device to be `virtio`. The `file` and `format` flags configure the
back-end to be a disk image.

After booting the guest we are dropped into a shell and can verify a few
things. First we check if the virtio block device is detected, then we check
if we have support for the ext2 filesystem and finally we mount the disk.

```sh
root@virtio-box ~ # ls -l /sys/block/
lrwxrwxrwx 1 root 0 0 Dec  3 22:46 vda -> ../devices/pci0000:00/0000:00:05.0/virtio1/block/vda

root@virtio-box ~ # cat /proc/filesystems
...
ext2

root@virtio-box ~ # mount -t ext2 /dev/vda /mnt
EXT2-fs (vda): warning: mounting unchecked fs, running e2fsck is recommended
ext2 filesystem being mounted at /mnt supports timestamps until 2038 (0x7fffffff)

root@virtio-box ~ # cat /mnt/hello
world
```

## Virtio net

To enable support for the virtio network device we enable the kernel configs
shown below. First we enable general support for networking and TCP/IP and
then enable the core networking driver and the virtio net driver.

```sh,hide_lines=1-48 63-1000
{{ include(path="content/2021-12-02-toying-with-virtio/build_kernel.sh") }}
```

The full build script is available under [build_kernel.sh][build-kernel].

For the qemu device emulation we already decided on the front-end device,
which will be our virtio net device. \
On the back-end we will choose the [`user`][qemu-user-net] option. This
enables a network stack implemented in userspace based on
[libslirp][libslirp], which has the benefit that we neither need to set up
additional network interfaces nor require any privileges. Fundamentally,
[libslirp][libslirp] works by replaying [Layer 2][osi-2] packets received from
the guest NIC via the socket API on the host ([Layer 4][osi-4]) and vice
versa.
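Later we will attach the device with the `-nic` shortcut; spelled out as an
explicit front-end / back-end pair, the same configuration could look like the
following sketch (the `id=net0` name is an arbitrary choice):

```sh
qemu-system-x86_64 \
  ...
  -netdev user,id=net0 \
  -device virtio-net-pci,netdev=net0
```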
User networking comes with a set of limitations, for example:

- We can not use `ping` inside the guest, as `ICMP` is not supported.
- The guest is not accessible from the host.

With the guest, qemu and the host in the picture, this looks something like
the following.

```
+--------------------------------------------+
|                    host                    |
|     +-------------------------+            |
|     |          guest          |            |
|     |                         |            |
|     |                    user |            |
|     +------+------------------+            |
|     | eth0 |           kernel |            |
|     +--+---+------------------+            |
|        |                                   |
|   +----v---------+                         |
|   | nic (virtio) |      qemu               |
|   +----+---------+                         |
|        |                                   |
|        | Layer 2 (eth frames)              |
|   +----v-----+                             |
|   | libslirp |                             |
|   +----+-----+                             |
|        |                                   |
|        | Layer 4 (socket API)         user |
+--------+-----------------------------------+
|   +----v-+                          kernel |
|   | eth0 |                                 |
|   +------+                                 |
+--------------------------------------------+
```

User networking implements a virtual NAT'ed sub-network with the address range
`10.0.2.0/24` and runs an internal dhcp server. By default, the dhcp server
assigns the following IP addresses which are interesting to us:

- `10.0.2.2` host running the qemu emulation
- `10.0.2.3` virtual DNS server

> The netdev options `net=addr/mask`, `host=addr`, `dns=addr` can be used to
> re-configure the sub-network (see [network options][qemu-nic-opts]).

With the details of the sub-network in mind we can add some additional setup
to the initramfs which performs the basic network configuration. We add the
virtual DNS server to `/etc/resolv.conf`, which will be used by the libc
resolver functions. Additionally we assign a static IP address to the `eth0`
network interface, bring the interface up and define the default route via the
host `10.0.2.2`.

```sh,hide_lines=1-68 86-1000
{{ include(path="content/2021-12-02-toying-with-virtio/build_initramfs.sh") }}
```

The full build script is available under [build_initramfs.sh][build-initramfs].

Before booting the guest we attach the virtio net device, configured to use
the user network stack, by adding the `-nic` configuration to our previous
qemu invocation.

```sh
qemu-system-x86_64 \
  ...
  -nic user,model=virtio-net-pci
```

The `-nic` option is a shortcut for a `-device (front-end) / -netdev
(back-end)` pair.

After booting the guest we are dropped into a shell and can verify a few
things. First we check if the virtio net device is detected. Then we check if
the interface got configured and brought up correctly.

```sh
root@virtio-box ~ # ls -l /sys/class/net/
lrwxrwxrwx 1 root 0 0 Dec  4 16:56 eth0 -> ../../devices/pci0000:00/0000:00:03.0/virtio0/net/eth0
lrwxrwxrwx 1 root 0 0 Dec  4 16:56 lo -> ../../devices/virtual/net/lo

root@virtio-box ~ # ip -o a
2: eth0    inet 10.0.2.15/24 scope global eth0 ...

root@virtio-box ~ # ip route
default via 10.0.2.2 dev eth0
10.0.2.0/24 dev eth0 scope link  src 10.0.2.15
```

We can resolve our domain and see that the virtual DNS server gets contacted.

```sh
root@virtio-box ~ # nslookup memzero.de
Server:         10.0.2.3
Address:        10.0.2.3:53

Non-authoritative answer:
Name:   memzero.de
Address: 46.101.148.203
```

Additionally we can try to access a service running on the host.
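For that we run a simple http server on the host (where we launched qemu) with
the following command, which listens for any incoming address at port `1234`.

```sh
# On the host: serve the current working directory over http.
# From the guest, the server is reachable as 10.0.2.2:1234.
python3 -m http.server --bind 0.0.0.0 1234
```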
From within the guest we can manually craft a simple http `GET` request and
send it to the http server running on the host. As destination we use the IP
address `10.0.2.2`, under which the dhcp server exposes the host.

```sh
root@virtio-box ~ # echo "GET / HTTP/1.0" | nc 10.0.2.2 1234
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.9.7
Date: Sat, 04 Dec 2021 16:58:56 GMT
Content-type: text/html; charset=utf-8
Content-Length: 917

...
```

The response body, elided above, is the html page with the directory listing
for `/`.
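Alternatively, since the busybox `defconfig` enables the `wget` applet, the
same listing could be fetched with wget; a small sketch, not verified against
this exact setup:

```sh
# Fetch the directory listing from the host http server to stdout.
wget -q -O - http://10.0.2.2:1234/
```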
## Appendix: Workspace

To reproduce the setup and play around with it just grab a copy of the
following files:

- [Dockerfile][dockerfile]
- [Makefile][makefile]
- [build_initramfs.sh][build-initramfs]
- [build_kernel.sh][build-kernel]
- [build_ext2.sh][build-ext2]

Then run the following steps to build everything. The prefixes `[H]` and `[C]`
indicate whether the command is run on the host or inside the container
respectively.

```sh
# To see all the make targets.
[H] make help

# Build docker image, start a container with the current working dir
# mounted. On the first invocation this takes some minutes to build
# the image.
[H] make docker

# Build kernel and initramfs.
[C] make

# Build ext2 fs as virtio blkdev backend.
[H] make ext2

# Start qemu guest.
[H] make run
```

[build-initramfs]: https://git.memzero.de/blog/tree/content/2021-12-02-toying-with-virtio/build_initramfs.sh?h=main
[build-kernel]: https://git.memzero.de/blog/tree/content/2021-12-02-toying-with-virtio/build_kernel.sh?h=main
[build-ext2]: https://git.memzero.de/blog/tree/content/2021-12-02-toying-with-virtio/build_ext2.sh?h=main
[makefile]: https://git.memzero.de/blog/tree/content/2021-12-02-toying-with-virtio/Makefile?h=main
[dockerfile]: https://git.memzero.de/blog/tree/content/2021-12-02-toying-with-virtio/Dockerfile?h=main
[initramfs]: https://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt
[virtio]: http://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.pdf
[qemu-nic-opts]: https://www.qemu.org/docs/master/system/invocation.html#hxtool-5
[qemu-user-net]: https://www.qemu.org/docs/master/system/devices/net.html#using-the-user-mode-network-stack
[libslirp]: https://gitlab.com/qemu-project/libslirp
[osi-2]: https://osi-model.com/data-link-layer
[osi-4]: https://osi-model.com/transport-layer