From 6059b8d8c6085426fce1a6e638af069750c9dd54 Mon Sep 17 00:00:00 2001
From: Johannes Stoelp
Date: Sat, 4 Dec 2021 18:30:54 +0100
Subject: added virtio post

---
 content/2021-12-02-toying-with-virtio.md           | 332 +++++++++++++++++++++
 content/2021-12-02-toying-with-virtio/Dockerfile   |  44 +++
 content/2021-12-02-toying-with-virtio/Makefile     |  57 ++++
 .../build_initramfs.sh                             |  90 ++++++
 .../2021-12-02-toying-with-virtio/build_kernel.sh  |  67 +++++
 templates/shortcodes/include_range.md              |  17 ++
 6 files changed, 607 insertions(+)
 create mode 100644 content/2021-12-02-toying-with-virtio.md
 create mode 100644 content/2021-12-02-toying-with-virtio/Dockerfile
 create mode 100644 content/2021-12-02-toying-with-virtio/Makefile
 create mode 100755 content/2021-12-02-toying-with-virtio/build_initramfs.sh
 create mode 100755 content/2021-12-02-toying-with-virtio/build_kernel.sh
 create mode 100644 templates/shortcodes/include_range.md

diff --git a/content/2021-12-02-toying-with-virtio.md b/content/2021-12-02-toying-with-virtio.md
new file mode 100644
index 0000000..9de9a94
--- /dev/null
+++ b/content/2021-12-02-toying-with-virtio.md
@@ -0,0 +1,332 @@
+++
title = "QEMU virtio configurations"

[taxonomies]
tags = ["linux", "qemu", "virtio"]
+++

For my own reference I wanted to document some minimal [`virtio`][virtio]
device configurations with qemu and the Linux kernel configuration required to
enable those devices.

The devices we will use are `virtio console`, `virtio blk` and `virtio net`.

To make use of the virtio devices in qemu we are going to build and boot into a
busybox based [`initramfs`][initramfs].

## Build initramfs

For the initramfs there is not much magic: we grab a copy of busybox, configure
it with the default config (`defconfig`) and enable static linking, as we will
use it as the rootfs.

For the `init` process we use the one provided by busybox, but we have to
symlink it to `/init`, as during boot the kernel extracts the cpio compressed
initramfs into `rootfs` and looks for the `/init` file. If that's not found,
the kernel falls back to an older mechanism and tries to mount a root
partition (which we don't have).
> Optionally the init binary could be specified with the `rdinit=` kernel boot
> parameter.

We populate `/etc/inittab` and `/etc/init.d/rcS` with a minimal
configuration to mount the `proc`, `sys` and `dev` filesystems and drop into a
shell after the boot is completed. \
Additionally we set up `/etc/passwd` and `/etc/shadow` with an entry for the
`root` user with the password `1234`, so we can log in via the virtio console
later.

```sh
{{ include_range(path="content/2021-12-02-toying-with-virtio/build_initramfs.sh", start=31, end=67) }}
```

The full build script is available under [build_initramfs.sh][build-initramfs].

## Virtio console

To enable support for the virtio console we enable the kernel configs shown
below.
The pci configurations are enabled because in qemu the virtio console front-end
device (the one presented to the guest) is attached to the pci bus.

```sh
{{ include_range(path="content/2021-12-02-toying-with-virtio/build_kernel.sh", start=32, end=38) }}
```

The full build script is available under [build_kernel.sh][build-kernel].
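Since `merge_config.sh` only prints a warning when a requested symbol does not
end up in the final configuration, it can be worth checking that the virtio
console options actually made it into the generated `.config`. A minimal sanity
check, assuming the `linux-5.15.6` source directory used by the build script,
could look like this:

```sh
# Optional sanity check (assumes the linux-5.15.6 source tree used by
# build_kernel.sh): both symbols should be reported as '=y'.
grep -E 'CONFIG_VIRTIO_(PCI|CONSOLE)=' linux-5.15.6/.config
```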
To boot up the guest we use the following qemu configuration.

```sh
qemu-system-x86_64 \
    -nographic \
    -cpu host \
    -enable-kvm \
    -kernel ./linux-$(VER)/arch/x86/boot/bzImage \
    -append "earlyprintk=ttyS0 console=ttyS0 root=/dev/ram0 ro" \
    -initrd ./initramfs.cpio.gz \
    -device virtio-serial-pci \
    -device virtconsole,chardev=vcon,name=console.0 \
    -chardev socket,id=vcon,ipv4=on,host=localhost,port=2222,server,telnet=on,wait=off
```

The important parts in this configuration are the last three lines.

The `virtio-serial-pci` device creates the serial bus the virtio console is
attached to.

The `virtconsole` creates the virtio console device exposed to the guest
(front-end). The `chardev=vcon` option specifies that the chardev with
`id=vcon` is attached as back-end to the virtio console.
The back-end device is the one we will have access to from the host running the
emulation.

We configure the chardev back-end to be a `socket`, running a telnet server
listening on port 2222. The `wait=off` option tells qemu that it can boot
directly without waiting for a client connection.

After booting the guest we are dropped into a shell and can verify that our
device is detected properly.
```sh
root@virtio-box ~ # ls /sys/bus/virtio/devices/
virtio0
root@virtio-box ~ # cat /sys/bus/virtio/devices/virtio0/virtio-ports/vport0p0/name
console.0
```

In `/etc/inittab` we already configured `getty` to be spawned on the first
hypervisor console `/dev/hvc0`, which effectively runs `login(1)` on that
console.

From the host we can therefore connect to the back-end chardev with
`telnet localhost 2222` and are presented with a login prompt, where we can log
in as `root:1234`.

```sh
> telnet -4 localhost 2222
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

virtio-box login: root
Password:
root@virtio-box ~ #
```

## Virtio blk

To enable support for the virtio block device we enable the kernel configs
shown below.
First we enable general support for block devices and then for virtio block
devices. Additionally we enable support for the `ext2` filesystem because we
are creating an ext2 filesystem to back the virtio block device.

```sh
{{ include_range(path="content/2021-12-02-toying-with-virtio/build_kernel.sh", start=40, end=47) }}
```

The full build script is available under [build_kernel.sh][build-kernel].

Next we create the ext2 filesystem image. We do this by creating a `128M` blob
and formatting it with ext2 afterwards. Then we can mount the image via a
`loop` device and populate the filesystem.
```sh
dd if=/dev/zero of=rootfs.ext2 bs=1M count=128
mkfs.ext2 rootfs.ext2
mount -t ext2 -o loop rootfs.ext2 /mnt
echo world > /mnt/hello
umount /mnt
```

Before booting the guest we attach the virtio block device to the VM. To do so,
we add the `-drive` configuration to our previous qemu invocation.

```sh
qemu-system-x86_64 \
  ...
  -drive if=virtio,file=rootfs.ext2,format=raw
```

The `-drive` option is a shortcut for a `-device (front-end) / -blockdev
(back-end)` pair.

The `if=virtio` flag specifies the interface of the front-end device to be
`virtio`.

The `file` and `format` flags configure the back-end to be a disk image.
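For reference, the `-drive` shortcut can also be spelled out as an explicit
front-end / back-end pair. The following is only a sketch (the `disk0` node
name is arbitrary and option details can differ between qemu versions):

```sh
# Explicit back-end (file based blockdev) wired to an explicit virtio-blk
# front-end device; equivalent in spirit to the -drive line above.
qemu-system-x86_64 \
  ...
  -blockdev driver=file,node-name=disk0,filename=rootfs.ext2 \
  -device virtio-blk-pci,drive=disk0
```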
After booting the guest we are dropped into a shell and can verify a few
things. First we check if the virtio block device is detected, then we check if
we have support for the ext2 filesystem and finally we mount the disk.

```sh
root@virtio-box ~ # ls -l /sys/block/
lrwxrwxrwx 1 root 0 0 Dec 3 22:46 vda -> ../devices/pci0000:00/0000:00:05.0/virtio1/block/vda

root@virtio-box ~ # cat /proc/filesystems
...
  ext2

root@virtio-box ~ # mount -t ext2 /dev/vda /mnt
EXT2-fs (vda): warning: mounting unchecked fs, running e2fsck is recommended
ext2 filesystem being mounted at /mnt supports timestamps until 2038 (0x7fffffff)

root@virtio-box ~ # cat /mnt/hello
world
```

## Virtio net

To enable support for the virtio network device we enable the kernel configs
shown below.
First we enable general support for networking and TCP/IP and then enable the
core networking driver and the virtio net driver.

```sh
{{ include_range(path="content/2021-12-02-toying-with-virtio/build_kernel.sh", start=49, end=62) }}
```

The full build script is available under [build_kernel.sh][build-kernel].

For the qemu device emulation we already decided on the front-end device, which
will be our virtio net device. \
On the back-end we choose the [`user`][qemu-user-net] network stack. This runs
a dhcp server in qemu which implements a virtual, NAT'ed sub-network with the
address range `10.0.2.0/24`. By default, the dhcp server assigns the following
IP addresses which are of interest to us:
- `10.0.2.2` host running the qemu emulation
- `10.0.2.3` virtual DNS server
> The netdev options `net=addr/mask`, `host=addr`, `dns=addr` can be used to
> re-configure the sub-network (see [network options][qemu-nic-opts]).

With the details of the sub-network in mind we can add some additional setup to
the initramfs which performs the basic network configuration.

We add the virtual DNS server to `/etc/resolv.conf`, which will be used by the
libc resolver functions.

Additionally we assign a static IP address to the `eth0` network interface,
bring the interface up and define the default route via the host `10.0.2.2`.

```sh
{{ include_range(path="content/2021-12-02-toying-with-virtio/build_initramfs.sh", start=69, end=85) }}
```

The full build script is available under [build_initramfs.sh][build-initramfs].

Before booting the guest we attach the virtio net device and configure it to
use the user network stack. To do so, we add the `-nic` configuration to our
previous qemu invocation.

```sh
qemu-system-x86_64 \
  ...
  -nic user,model=virtio-net-pci
```

After booting the guest we are dropped into a shell and can verify a few
things. First we check if the virtio net device is detected. Then we check if
the interface got configured and brought up correctly.

```sh
root@virtio-box ~ # ls -l /sys/class/net/
lrwxrwxrwx 1 root 0 0 Dec 4 16:56 eth0 -> ../../devices/pci0000:00/0000:00:03.0/virtio0/net/eth0
lrwxrwxrwx 1 root 0 0 Dec 4 16:56 lo -> ../../devices/virtual/net/lo

root@virtio-box ~ # ip -o a
2: eth0 inet 10.0.2.15/24 scope global eth0 ...

root@virtio-box ~ # ip route
default via 10.0.2.2 dev eth0
10.0.2.0/24 dev eth0 scope link src 10.0.2.15
```

We can resolve our domain and see that the virtual DNS server gets contacted.

```sh
root@virtio-box ~ # nslookup memzero.de
Server: 10.0.2.3
Address: 10.0.2.3:53

Non-authoritative answer:
Name: memzero.de
Address: 46.101.148.203
```
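So far all connections were initiated from inside the guest. The `user`
back-end does not allow connections from the host into the guest unless a port
forward is configured on the `-nic` option. A sketch, assuming some service
listening on port `8080` inside the guest (not part of this setup):

```sh
# Forward TCP port 8080 on the host to port 8080 of the guest (10.0.2.15).
# Illustration only; adjust the ports to whatever service runs in the guest.
qemu-system-x86_64 \
  ...
  -nic user,model=virtio-net-pci,hostfwd=tcp::8080-10.0.2.15:8080
```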
Additionally we can try to access a service running on the host. To do so, we
run a simple http server on the host (where we launched qemu) with the
following command `python3 -m http.server --bind 0.0.0.0 1234`. This launches
the server listening on any incoming address at port `1234`.

From within the guest we can manually craft a simple http `GET` request and
send it to the http server running on the host. For that we use the IP address
`10.0.2.2`, under which the qemu user network stack exposes the host to the
guest. The response body, the directory listing of the folder the server was
started in, is abbreviated below.

```sh
root@virtio-box ~ # echo "GET / HTTP/1.0" | nc 10.0.2.2 1234
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.9.7
Date: Sat, 04 Dec 2021 16:58:56 GMT
Content-type: text/html; charset=utf-8
Content-Length: 917

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4.01/strict.dtd">
<html>
<head>
<title>Directory listing for /</title>
</head>
<body>
<h1>Directory listing for /</h1>
...
</body>
</html>
+ + +``` + +## Appendix: Workspace + +To re-produce the setup and play around with it just grab a copy of the +following files: +- [Dockerfile][dockerfile] +- [Makefile][makefile] +- [build_initramfs.sh][build-initramfs] +- [build_kernel.sh][build-kernel] + +Then run the following steps to build everything. The prefix `[H]` and `[C]` +indicate whether this command is run on the host or inside the container +respectively. +```sh +# To see all the make targets. +[H] make help + +# Build docker image, start a container with the current working dir +# mounted. On the first invocation this takes some minutes to build +# the image. +[H]: make docker + +# Build kernel and initramfs. +[C]: make + +# Create the rootfs.ext2 disk image as described in the virtio blk +# section above, or remove the drive from the qemu command line +# in the make `run` target. + +# Start qemu guest. +[H]: make run +``` + +[build-initramfs]: https://git.memzero.de/johannst/blog/src/branch/main/content/2021-12-02-toying-with-virtio/build_initramfs.sh +[build-kernel]: https://git.memzero.de/johannst/blog/src/branch/main/content/2021-12-02-toying-with-virtio/build_kernel.sh +[makefile]: https://git.memzero.de/johannst/blog/src/branch/main/content/2021-12-02-toying-with-virtio/Makefile +[dockerfile]: https://git.memzero.de/johannst/blog/src/branch/main/content/2021-12-02-toying-with-virtio/Dockerfile +[initramfs]: https://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt +[virtio]: http://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.pdf +[qemu-nic-opts]: https://www.qemu.org/docs/master/system/invocation.html#hxtool-5 +[qemu-user-net]: https://www.qemu.org/docs/master/system/devices/net.html#using-the-user-mode-network-stack diff --git a/content/2021-12-02-toying-with-virtio/Dockerfile b/content/2021-12-02-toying-with-virtio/Dockerfile new file mode 100644 index 0000000..f892fef --- /dev/null +++ b/content/2021-12-02-toying-with-virtio/Dockerfile @@ -0,0 +1,44 @@ +FROM ubuntu:20.04 +MAINTAINER Johannes Stoelp + +ARG UID + +RUN apt update \ + && DEBIAN_FRONTEND=noninteractive \ + apt install \ + --yes \ + --no-install-recommends \ + # Download & unpack. + wget \ + ca-certificates \ + xz-utils \ + # Build tools & deps (kernel). + make \ + bc \ + gcc g++ \ + flex bison \ + libelf-dev \ + libncurses-dev \ + # Build tools & deps (initrd). + cpio \ + # Run & debug. + qemu-system-x86 \ + # Convenience. + sudo \ + telnet \ + ripgrep \ + fd-find \ + neovim \ + && rm -rf /var/lib/apt/lists/* \ + && apt-get clean + +# Allow 'user' to use sudo without password. +# Convenience in case we want to install some packages in the container later. +RUN echo "user ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers.d/user + +# Create user with the UID passed during docker build. +RUN useradd --create-home --home-dir /home/user --uid $UID --shell /bin/bash user +# Start container with user. +USER user +# Change default working dir. +WORKDIR /develop diff --git a/content/2021-12-02-toying-with-virtio/Makefile b/content/2021-12-02-toying-with-virtio/Makefile new file mode 100644 index 0000000..4009862 --- /dev/null +++ b/content/2021-12-02-toying-with-virtio/Makefile @@ -0,0 +1,57 @@ +VER := 5.15.6 + +all: kernel init + +help: + @echo "Build targets:" + @echo "* init - Build busybox based initramfs." + @echo "* kernel - Build minimal linux kernel." + @echo " clean - Cleanup downloads & builds." + @echo "" + @echo "Run targets:" + @echo " run - Boot into the initramfs (qemu)." + @echo " vcon - Attach to guest virtio console." 
+ @echo " This needs an already running guest VM and must be" + @echo " executed in the same domain as the 'run' target" + @echo " (docker or host)." + @echo " Login as 'root' user, no passwd required." + @echo "" + @echo "Docker targets:" + @echo " docker - Build and start the docker container." + @echo " attach - Start an additional bash in the container." + @echo " This needs an already running container." + +kernel: + ./build_kernel.sh + +init: + ./build_initramfs.sh + +run: + qemu-system-x86_64 \ + -nographic \ + -cpu host \ + -enable-kvm \ + -kernel ./linux-$(VER)/arch/x86/boot/bzImage \ + -append "earlyprintk=ttyS0 console=ttyS0 root=/dev/ram0 ro" \ + -initrd ./initramfs.cpio.gz \ + -device virtio-serial-pci \ + -device virtconsole,chardev=vcon,name=console.0 \ + -chardev socket,id=vcon,ipv4=on,host=localhost,port=2222,server,telnet=on,wait=off \ + -drive format=raw,if=virtio,file=rootfs.ext2 \ + -nic user,model=virtio-net-pci + +vcon: + telnet -4 localhost 2222 + +clean: + $(RM) initramfs.cpio.gz + $(RM) -r busybox-* + $(RM) -r linux-$(VER)* + +docker: + DOCKER_BUILDKIT=1 docker build --build-arg UID=$(shell id -u) -t virtio-dev . + docker run --name virtio -it --rm -v $(PWD):/develop --device /dev/kvm virtio-dev + +attach: + docker exec -it virtio bash diff --git a/content/2021-12-02-toying-with-virtio/build_initramfs.sh b/content/2021-12-02-toying-with-virtio/build_initramfs.sh new file mode 100755 index 0000000..c30fa5e --- /dev/null +++ b/content/2021-12-02-toying-with-virtio/build_initramfs.sh @@ -0,0 +1,90 @@ +#!/bin/bash + +set -e + +BUSYBOX=busybox-1.34.1 +INITRAMFS=$PWD/initramfs.cpio.gz + +## Build busybox (static). + +test -f $BUSYBOX.tar.bz2 || wget -nc https://busybox.net/downloads/$BUSYBOX.tar.bz2 +test -d $BUSYBOX || tar xf $BUSYBOX.tar.bz2 + +cd $BUSYBOX +make defconfig +sed -i 's/[# ]*CONFIG_STATIC[ =].*/CONFIG_STATIC=y/' .config +make -j$(nproc) busybox +make install + +## Create initramfs. + +cd _install + +# 1. Create initramfs folder structure. + +mkdir -p bin sbin etc/init.d proc sys usr/bin usr/sbin dev mnt + +# 2. Prepare init. + +# By default, initramfs executes /init. +# Optionally change with rdinit= kernel parameter. +ln -sfn sbin/init init + +cat < etc/inittab +# Initialization after boot. +::sysinit:/etc/init.d/rcS + +# Shell on console after user presses key. +::askfirst:/bin/cttyhack /bin/sh -l + +# Spawn getty on first virtio console. +::respawn:/sbin/getty hvc0 9600 vt100 +EOF + +cat < etc/init.d/rcS +#!/bin/sh + +# Mount devtmpfs, which automatically populates /dev with devices nodes. +# So no mknod for our experiments :} +mount -t devtmpfs none /dev + +# Mount procfs and sysfs. +mount -t proc none /proc +mount -t sysfs none /sys + +# Set hostname. +hostname virtio-box +EOF +chmod +x etc/init.d/rcS + +cat < etc/profile +export PS1="\[\e[31m\e[1m\]\u@\h\[\e[0m\] \w # " +EOF + +# 3. Create minimal passwd db with 'root' user and password '1234'. +# Mainly used for login on virtual console in this experiments. +echo "root:x:0:0:root:/:/bin/sh" > etc/passwd +echo "root:$(openssl passwd -crypt 1234):0::::::" > etc/shadow + +# 4. Create minimal setup for basic networking. + +# Virtul DNS from qemu user network. +echo "nameserver 10.0.2.3" > etc/resolv.conf + +# Assign static IP address, bring-up interface and define default route. +cat <> etc/init.d/rcS +# Assign static IP address to eth0 interface. +ip addr add 10.0.2.15/24 dev eth0 + +# Bring up eth0 interface. 
+ip link set dev eth0 up + +# Add default route via the host (qemu user networking exposes host at this +# address by default). +ip route add default via 10.0.2.2 +EOF + +# 5. Created cpio compressed initramfs. +find . -print0 \ + | cpio --null -ov --format=newc \ + | gzip -9 > $INITRAMFS diff --git a/content/2021-12-02-toying-with-virtio/build_kernel.sh b/content/2021-12-02-toying-with-virtio/build_kernel.sh new file mode 100755 index 0000000..b219b57 --- /dev/null +++ b/content/2021-12-02-toying-with-virtio/build_kernel.sh @@ -0,0 +1,67 @@ +#!/bin/bash + +set -e + +LINUX=linux-5.15.6 + +test -f $LINUX.tar.xz || wget -nc https://cdn.kernel.org/pub/linux/kernel/v5.x/$LINUX.tar.xz +test -d $LINUX || tar xf $LINUX.tar.xz + +cd $LINUX + +cat < kernel_fragment.config +# 64bit kernel. +CONFIG_64BIT=y +# Enable support for cpio compressed initramfs (gzip). +CONFIG_BLK_DEV_INITRD=y +CONFIG_RD_GZIP=y +# Support for ELF and #! binary format. +CONFIG_BINFMT_ELF=y +CONFIG_BINFMT_SCRIPT=y +# Enable devtmpfs (can automatically populate /dev). +CONFIG_DEVTMPFS=y +CONFIG_DEVTMPFS_MOUNT=y +# Enable tty & console. +CONFIG_TTY=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +# Enable proc & sys pseudo fs. +CONFIG_PROC_FS=y +CONFIG_SYSFS=y + +# Enable support for virtio pci. +CONFIG_PCI=y +CONFIG_VIRTIO_MENU=y +CONFIG_VIRTIO_PCI=y + +# Enable virtio console driver. +CONFIG_VIRTIO_CONSOLE=y + +# Enable support for block devices. +CONFIG_BLK_DEV=y + +# Enable virtio blk driver. +CONFIG_VIRTIO_BLK=y + +# Enable support for ext2 filesystems. +CONFIG_EXT2_FS=y + +# Enable general networking support. +CONFIG_NET=y + +# Enable support for TCP/IP. +CONFIG_INET=y + +# Enable support for network devices. +CONFIG_NETDEVICES=y + +# Enable networking core drivers. +CONFIG_NET_CORE=y + +# Enable virtio net driver. +CONFIG_VIRTIO_NET=y +EOF + +make tinyconfig +./scripts/kconfig/merge_config.sh -n ./kernel_fragment.config +make -j$(nproc) diff --git a/templates/shortcodes/include_range.md b/templates/shortcodes/include_range.md new file mode 100644 index 0000000..cfcce76 --- /dev/null +++ b/templates/shortcodes/include_range.md @@ -0,0 +1,17 @@ +{# Args: #} +{# path - file to load #} +{# start - start line #} +{# end - end line #} +{# #} +{# Example: #} +{# {{ include_range(path="..", start=1, end=2) }} #} + +{% set data = load_data(path=path) | split(pat="\n") | slice(start=start-1, end=end) %} +{% for line in data -%} + {{ line }} +{% endfor %} + +{# The '-' in the for loop is important, it removes the whitespaces after the stmt. #} +{# See: https://tera.netlify.app/docs/#whitespace-control #} + +{# Note: I wasn't able to pull this off with a join(), maybe I'll debug one day. #} -- cgit v1.2.3