UID and GID: The New Order

When I have important data on a device, I back it up to my server using dedicated user accounts. The other day, I checked /etc/passwd on my server and found entries like this:

some-backup-user1:x:1003:1004:...

some-backup-user2:x:1004:1007:...

A few inconsistencies immediately bothered me:

UID/GID Mismatches: Many users have UIDs that don't match their primary GIDs. This works technically and may look like a purely aesthetic concern, but UIDs and GIDs are crucial metadata: I need to preserve them accurately across future system migrations to keep file ownership correct.
ID Ambiguity: The same number (e.g., 1004) could represent a User ID for one account and a Group ID for a completely different group. This overlap is a recipe for mistakes during administration tasks if I'm not paying close attention.
Lack of Structure: User and group accounts created for very different purposes – regular logins, backup processes, container management, specific ACLs – were all jumbled together in the same ID range. This made management and auditing more cumbersome than necessary.
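
The first two checks are easy to automate. Below is a minimal Python sketch, assuming passwd- and group-style text as input; the two user lines are the ones shown above, while the group entries and the function names `parse_ids` and `audit` are made up for illustration.

```python
# Sample input. The group names are hypothetical; only the user lines
# come from the /etc/passwd excerpt above.
PASSWD = """\
some-backup-user1:x:1003:1004:...
some-backup-user2:x:1004:1007:...
"""
GROUP = """\
some-backup-user1:x:1004:
some-backup-user2:x:1007:
"""

def parse_ids(passwd_text, group_text):
    """Return ({user: (uid, primary_gid)}, {group: gid})."""
    users, groups = {}, {}
    for line in passwd_text.splitlines():
        if line.strip() and not line.startswith("#"):
            name, _pw, uid, gid = line.split(":")[:4]
            users[name] = (int(uid), int(gid))
    for line in group_text.splitlines():
        if line.strip() and not line.startswith("#"):
            name, _pw, gid = line.split(":")[:3]
            groups[name] = int(gid)
    return users, groups

def audit(users, groups):
    """Find UID/GID mismatches and numbers shared by unrelated accounts."""
    gid_to_group = {gid: name for name, gid in groups.items()}
    # Users whose UID differs from their primary GID.
    mismatches = sorted(n for n, (uid, gid) in users.items() if uid != gid)
    # UIDs that are also the GID of a group other than the user's primary group.
    ambiguous = sorted(
        (uid, name, gid_to_group[uid])
        for name, (uid, gid) in users.items()
        if uid in gid_to_group and uid != gid
    )
    return mismatches, ambiguous
```

Calling `audit(*parse_ids(PASSWD, GROUP))` on this sample flags both users as mismatched and reports 1004 as a number shared between some-backup-user2 and an unrelated group.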

So, I decided it was time for a cleanup. My goal was to reorganize these user and group IDs to achieve clarity and predictability, aiming for a system where I could infer the purpose of an ID even without direct access to /etc/passwd or /etc/group.

My Strategy

To achieve this, I established the following strategy:

Purpose-Based ID Ranges: Users and groups serving similar functions are grouped together by assigning them IDs within dedicated numerical ranges. This makes it easier to understand the role of an account at a glance. For example:

1000-1999: Backup-related users and groups.
2000-2999: Groups specifically for managing ACLs.

UID/GID Correspondence: If a number X is used as both a UID and a GID, then GID X must be the primary group for the user with UID X. Unrelated users and groups never share the same ID number.
Allocation Within Ranges: Within each designated range, users and their primary groups are allocated from the lower end, while secondary groups are allocated from the higher end.

Example: backup-user1 has 1000:1000, while a group backup-users has GID 1999.
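
The three rules can be written down as a small allocator plus a checker. This is only a sketch of the idea in Python: the `RANGES` table mirrors the two example ranges above, while the function names and the second user (backup-user2) are hypothetical.

```python
# Purpose -> (lowest ID, highest ID), taken from the example ranges above.
RANGES = {"backup": (1000, 1999), "acl": (2000, 2999)}

def allocate(purpose, accounts, secondary_groups, ranges=RANGES):
    """Assign IDs inside the range reserved for `purpose`.

    Each account gets the same number as UID and primary GID, counting up
    from the low end; secondary groups count down from the high end.
    """
    low, high = ranges[purpose]
    if len(accounts) + len(secondary_groups) > high - low + 1:
        raise ValueError(f"range for {purpose!r} is exhausted")
    users = {name: low + i for i, name in enumerate(accounts)}
    groups = {name: high - i for i, name in enumerate(secondary_groups)}
    return users, groups

def correspondence_violations(users, groups):
    """Return UIDs that equal some GID which is not that user's primary group.

    users: {name: (uid, primary_gid)}, groups: {name: gid}
    """
    gids = set(groups.values())
    return sorted(uid for uid, gid in users.values() if uid in gids and uid != gid)
```

With these definitions, `allocate("backup", ["backup-user1", "backup-user2"], ["backup-users"])` gives backup-user1 the ID 1000, backup-user2 the ID 1001, and the backup-users group the GID 1999, matching the example above.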

Options for Managing IDs

I considered several approaches to implement this strategy:

Manual Scripting: This would involve carefully crafting scripts using useradd, usermod, groupadd, and groupmod. However, this approach is fraught with risk – a single typo could cause significant problems. It's also labor-intensive to get right and tedious to maintain. I quickly ruled this out.
Virtual Machine with cloud-init: Using a VM combined with cloud-init offered a more structured way to script user/group creation via its built-in directives, executing reliably during boot. This reduces some risks compared to updating an existing system but it introduces the overhead of managing a VM (file sharing, updates, resource consumption), which I also wanted to avoid.
Virtual Machine with NixOS/Guix System: These operating systems offer truly declarative user management, which is very appealing. While I run simple NixOS instances, fully embracing either OS just for UID/GID management felt like overkill and required a significant learning investment. Plus, this option still carried the VM overhead.
Using sysusers.d: While searching for "declarative user management" solutions, I discovered sysusers.d. This systemd mechanism uses simple configuration files (like fstab) to declare users and groups and their properties (UIDs/GIDs). Systemd ensures these users/groups exist on boot. Crucially, it was already supported and installed on my Debian server. This offered a declarative, script-free, VM-free solution integrated directly into my existing OS – making it the clear choice.

The Migration

For each user and group, I added a corresponding line to a configuration file in /etc/sysusers.d, with every ID allocated by hand. Because systemd-sysusers does not touch users or groups that already exist, I then deleted the old accounts one by one and restarted the systemd-sysusers service so that it would recreate them from the new definitions.
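
To give an idea of what such a file contains, here is a minimal Python sketch that renders sysusers.d lines from a table of IDs. The line types (`u` for a user, `g` for a group, `m` for a membership) are from sysusers.d(5); the function name and the description string are invented for illustration. My reading of the man page is that a `u` line with a numeric ID also creates a same-named primary group, so it is worth confirming the resulting GID with `getent` afterwards.

```python
def render_sysusers(users, groups, memberships):
    """Render sysusers.d lines.

    users:       {name: (uid, description)}  -> "u" lines
    groups:      {name: gid}                 -> "g" lines (secondary groups)
    memberships: [(user, group), ...]        -> "m" lines
    """
    lines = []
    for name, (uid, gecos) in sorted(users.items(), key=lambda kv: kv[1][0]):
        lines.append(f'u {name} {uid} "{gecos}"')
    for name, gid in sorted(groups.items(), key=lambda kv: kv[1]):
        lines.append(f"g {name} {gid}")
    for user, group in memberships:
        lines.append(f"m {user} {group}")
    return "\n".join(lines) + "\n"

# Hypothetical content for a file under /etc/sysusers.d.
CONFIG = render_sysusers(
    users={"backup-user1": (1000, "Backup user 1")},
    groups={"backup-users": 1999},
    memberships=[("backup-user1", "backup-users")],
)
```

Here `CONFIG` holds three lines: `u backup-user1 1000 "Backup user 1"`, `g backup-users 1999` and `m backup-user1 backup-users`.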

After that, I needed to migrate the ownership and ACLs of existing files. This proved trickier than I had expected:

chown has the nice `--from=OWNER:GROUP` parameter. The man page says "Either may be omitted, in which case a match is not required for the omitted attribute." However:

`chown --from :group` works
`chown --from user` works
`chown --from user:` doesn't work

`chown user:` also changes the group, setting it to the user's login group (so it behaves like `chown user:user` when the primary group shares the user's name), whereas `chown user` changes only the owner.
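`setfacl` recalculates the ACL mask by default, so `setfacl -m g:group:rx` actually made all my files executable: the recalculated mask, which doubles as the file's group permission bits, gained the x bit. I had to roll back to a ZFS snapshot and use `setfacl --no-mask` instead.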
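
One way to sidestep the `chown` parsing quirks is to remap numeric IDs directly. This is a hedged sketch in Python, not what I actually ran: `plan_chown` and `apply_plan` are hypothetical names, and the old-to-new mappings are supplied by the caller. The plan is computed from a single `lstat` per path before anything changes, so a file is never remapped twice even when new IDs overlap old ones.

```python
import os

def plan_chown(root, uid_map, gid_map):
    """List (path, new_uid, new_gid) for every entry under `root` to change.

    uid_map / gid_map map old numeric IDs to new ones. Owner and group are
    looked up independently; -1 means "leave unchanged", as in os.chown.
    """
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        # dirnames also contains symlinks to directories, which os.walk
        # lists but does not descend into.
        paths.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    plan = []
    for path in paths:
        st = os.lstat(path)  # do not follow symlinks
        new_uid = uid_map.get(st.st_uid, -1)
        new_gid = gid_map.get(st.st_gid, -1)
        if new_uid != -1 or new_gid != -1:
            plan.append((path, new_uid, new_gid))
    return plan

def apply_plan(plan):
    """Apply a plan; needs root. Symlinks themselves are changed, not their targets."""
    for path, uid, gid in plan:
        os.chown(path, uid, gid, follow_symlinks=False)
```

Printing the result of `plan_chown("/some/dataset", {1003: 1000}, {1004: 1000})` first (the path is a placeholder) works as a dry run before calling `apply_plan`. Taking a snapshot beforehand is still a good idea.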

Finally, I also learned about `pwck -s` and `grpck -s`, which sort the entries of /etc/passwd and /etc/group by UID and GID respectively: a small improvement to my quality of life.

What's Next

I plan to apply similar principles to my container setup. Specifically, I want to create dedicated users for each rootless container. And I will have to figure out:

How to manage subuid and subgid ranges in a structured, preferably declarative, way.
Whether to choose Docker or Podman.
Whether to use VMs and, if so, which OS to run in them, e.g. NixOS, Guix System, or CoreOS.

WangLu's Notes
