Rethinking My VM Image Pipeline

Today, my pipeline regularly builds images for my disposable VMs. Here's the current process:

  • A dedicated builder VM reads Containerfiles for all VMs, including itself.

  • The builder VM uses podman build to create container images for all VMs.

  • The builder VM then uses bootc-image-builder to create disk images for all VMs.
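
As a rough sketch, the three steps above might look like the shell function below. The VM name, the directory layout (vms/<name>/Containerfile), the image tags, and the output directory are all hypothetical. The bootc-image-builder flags follow the project's README as I understand it, so check them against the version you run. By default the function only prints the commands; set RUN to an empty value to execute them.

```shell
# Sketch only: every path and name below is a hypothetical placeholder.
SRC_DIR="${SRC_DIR:-./vms}"        # assumed layout: $SRC_DIR/<vm>/Containerfile
OUT_DIR="${OUT_DIR:-./output}"     # where disk images would be written
BIB="${BIB:-quay.io/centos-bootc/bootc-image-builder:latest}"
RUN="${RUN-echo}"                  # dry run by default; export RUN= to execute

build_vm() {
    vm="$1"
    tag="localhost/$vm:latest"

    # Steps 1-2: read the Containerfile and build the container image.
    # Unchanged layers are served from podman's cache.
    $RUN podman build -t "$tag" -f "$SRC_DIR/$vm/Containerfile" "$SRC_DIR/$vm"

    # Step 3: turn the container image into a QCOW2 disk image.
    # This step starts from scratch on every run and typically needs root.
    # Some releases may also need --local to read from local storage.
    $RUN mkdir -p "$OUT_DIR/$vm"
    $RUN podman run --rm --privileged \
        --security-opt label=type:unconfined_t \
        -v "$OUT_DIR/$vm:/output" \
        -v /var/lib/containers/storage:/var/lib/containers/storage \
        "$BIB" --type qcow2 "$tag"
}
```

Calling build_vm web prints the commands for a VM named web. Looping over all VM names, including the builder itself, would reproduce the whole pipeline.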

This process works well, but it has a significant issue: the disk images aren't built efficiently. Unlike container images, which benefit from reusable, cacheable layers, disk images are always built from scratch. This leads to long build times and limited opportunities for data deduplication.

To address this, I've been exploring alternative options to improve the pipeline.

Disk Image Formats and Deduplication

My Current Format: QCOW2

I currently use QCOW2 with compression enabled. This format offers several features like snapshots, compression, and sparse files, which are useful when the underlying filesystem doesn't support them. However, if the filesystem does provide these features, QCOW2 doesn't offer many additional benefits over a simple raw disk image, at least for my use case.

Some notes:

  • Raw disk images are more transparent and widely supported by various tools. It's also much easier to deduplicate raw image files than compressed QCOW2 images. A QCOW2 image without compression should theoretically be similar to a raw image, but I haven't verified this.

  • The compression in QCOW2 is "read-only," meaning new writes aren't compressed. This isn't a problem for me because my VMs are immutable, so the images are rarely written to after creation.

  • bootc-image-builder actually builds the raw image first, before converting it into the QCOW2 format.
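
One way to check the unverified point above would be to convert the same image to raw, uncompressed QCOW2, and compressed QCOW2, then compare the space each one occupies. This is a minimal sketch, assuming a hypothetical input file such as vm.qcow2. qemu-img convert writes uncompressed QCOW2 unless -c is given. The function prints the commands by default.

```shell
RUN="${RUN-echo}"   # dry run by default; export RUN= to execute

compare_formats() {
    src="$1"                  # hypothetical input, e.g. vm.qcow2
    base="${src%.qcow2}"

    $RUN qemu-img convert -f qcow2 -O raw "$src" "$base.raw.img"
    $RUN qemu-img convert -f qcow2 -O qcow2 "$src" "$base.plain.qcow2"
    $RUN qemu-img convert -f qcow2 -O qcow2 -c "$src" "$base.compressed.qcow2"

    # du reports allocated blocks, so a sparse raw image is not
    # penalized for its unused regions.
    $RUN du -k "$base.raw.img" "$base.plain.qcow2" "$base.compressed.qcow2"
}
```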

The Power of Deduplication

I expect deduplication to be highly effective in my setup because most of my disk images are very similar. There are a few ways to achieve this:

  • Filesystem Deduplication: This approach can be either online (e.g., ZFS) or offline (e.g., btrfs). The filesystem finds duplicate data blocks within files and removes redundant data from the disk. This is a general solution but doesn't necessarily speed up the initial build process.

  • Proactive Deduplication: This method builds new images by applying small changes to an existing one. For example, you can "fork" an image using cp --reflink a.img b.img (on a filesystem that supports reflinks) or qemu-img create -f qcow2 -b a.qcow2 -F qcow2 b.qcow2. Only the differences between the two images are stored on disk. This approach can significantly speed up the build process because you are not building from scratch, but it requires images to be built incrementally, not from a clean slate.
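
The two ways of forking an image can be wrapped in small helpers. This is a sketch with hypothetical function names; the file names are whatever you pass in. Both helpers print the command by default and run it when RUN is set to an empty value.

```shell
RUN="${RUN-echo}"   # dry run by default; export RUN= to execute

# Fork a raw image. --reflink=always fails instead of silently making a
# full copy when the filesystem cannot share blocks (e.g. ext4).
fork_raw() {
    $RUN cp --reflink=always "$1" "$2"
}

# Fork a QCOW2 image as a copy-on-write overlay. The backing file must
# stay unmodified for as long as the overlay exists.
fork_qcow2() {
    $RUN qemu-img create -f qcow2 -b "$1" -F qcow2 "$2"
}
```

The reflink fork produces an independent file that shares blocks with the original, while the QCOW2 overlay stays dependent on its backing file. That difference matters if the base image is ever rebuilt or deleted.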

Exploring New Approaches

Bootc and In-Place Updates

I'm not currently using bootc images in their intended way. bootc is designed so that you build a single disk image once and then update it in place from a container registry.

I've considered two ways of leveraging this:

  1. I could trust the VMs to update themselves.

  2. I could maintain a "trusted base image" and follow this process:

    • Create a base disk image using bootc. This image is only used for building other images and never for running services.

    • To create the disk image for a specific VM, say VM X, I would first fork the base image using cp --reflink or qemu-img create -b to create X.img.

    • I would then boot a VM using X.img and have it upgrade itself using VM X's specific container image. This container image could either be served from the builder VM via a server or a mounted directory, or it could be built locally within the forked VM, potentially using shared layers from a mounted cache.
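
To make the amount of scripting concrete, here is a rough sketch of the second option for a raw base image. Everything in it is an assumption: the QEMU options, the SSH port forwarding, the guest login, and the image reference are hypothetical placeholders. A real script would also have to wait for SSH to come up and handle failures. It prints the commands by default.

```shell
RUN="${RUN-echo}"   # dry run by default; export RUN= to execute

build_from_base() {
    base="$1"      # e.g. base.img (raw), only used for building
    vm="$2"        # hypothetical VM name, e.g. x
    image="$3"     # hypothetical image reference for that VM
    guest="${4:-root@localhost}"   # assumed SSH target of the temporary VM

    # 1. Fork the base image; only changed blocks consume new space.
    $RUN cp --reflink=always "$base" "$vm.img"

    # 2. Boot a temporary VM from the fork (guest SSH forwarded to 2222).
    $RUN qemu-system-x86_64 -enable-kvm -m 2048 -display none -daemonize \
        -drive "file=$vm.img,format=raw,if=virtio" \
        -nic user,hostfwd=tcp::2222-:22

    # 3. Point the guest at the VM-specific container image, then power
    #    off so the disk image is no longer in use. The new deployment
    #    becomes active on the next boot.
    $RUN ssh -p 2222 "$guest" bootc switch "$image"
    $RUN ssh -p 2222 "$guest" systemctl poweroff
}
```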

This process seems workable, but it's overly complex for my taste. It involves running VMs during the build process, which would require a significant amount of scripting.

Plain Disk Images and In-Place Updates

This is similar to the bootc approach but uses standard raw disk images. Again, I could set up temporary VMs for the build process, but instead of relying on bootc's update mechanism, I would need custom scripts. This starts to resemble tools like cloud-init or Ansible.

A key benefit here is that a VM isn't a strict dependency. I could use something like systemd-nspawn to modify the disk images directly, in place, which would simplify scripting and make the process more reliable. I did attempt this with bootc images, but they don't work well with systemd-nspawn out of the box because the partitions lack the UUIDs that systemd-nspawn requires.
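
A minimal sketch of the systemd-nspawn idea, assuming a plain disk image that systemd-nspawn can open with --image. The helper name and the example command are hypothetical. It prints the command by default and needs root when actually run.

```shell
RUN="${RUN-echo}"   # dry run by default; export RUN= to execute

# Run a command inside a disk image without booting it.
# Needs root, and an image whose partition table systemd-nspawn can read.
in_image() {
    img="$1"
    shift
    $RUN systemd-nspawn --image="$img" -- "$@"
}
```

For example, in_image x.img dnf -y install htop would install a package into x.img, assuming the image contains dnf.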

Final Thoughts

Ultimately, I haven't found a truly satisfying improvement to my current build process. While some of these approaches could theoretically improve build times and reduce disk usage, they also make the build pipeline more complicated and less reliable. At this moment, I don't think the trade-off is worth it.

For now, I'll probably just experiment with deduplication on ZFS and reflink on XFS. I noticed that ZFS doesn't enable reflink support by default (it is controlled by the zfs_bclone_enabled module parameter), so that's a small hurdle.
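
Checking the parameter is straightforward on Linux. This is a small sketch, assuming the usual sysfs location for module parameters; the function name is hypothetical, and an alternative path can be passed in for testing.

```shell
# Report whether OpenZFS block cloning (the mechanism behind reflink
# copies) is switched on.
bclone_status() {
    param="${1:-/sys/module/zfs/parameters/zfs_bclone_enabled}"
    if [ ! -r "$param" ]; then
        echo unknown          # ZFS module not loaded, or parameter absent
        return 0
    fi
    case "$(cat "$param")" in
        1) echo enabled ;;
        0) echo disabled ;;
        *) echo unknown ;;
    esac
}
```

If it reports disabled, writing 1 to that file as root should enable it until the module is reloaded. An "options zfs zfs_bclone_enabled=1" line in a file under /etc/modprobe.d/ is the usual way to make a module parameter persistent.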

This exploration has been an interesting learning experience. Along the way I've revisited or discovered some relevant tools:

  • libvirt

  • incus

  • systemd-nspawn

  • cloud-init

  • ansible

  • systemd-volatile-root.service

Sometimes, when I'm writing my own scripts, I feel like I'm building a slimmed-down version of these tools myself. However, I'm not yet convinced that it's the right time to fully switch to them.
