I want to filter the outgoing network traffic for all of my containers based on a set of rules. For example:
- Some containers should be blocked from accessing the internet entirely.
- Some containers should have unrestricted internet access.
- Some containers should be able to access the internet, but not a specific list of URLs.
- Some containers should only be allowed to access a specific list of URLs.
To manage this, I will define logical policy groups and assign each container to one. As a general rule, only DNS and HTTP/HTTPS traffic will be permitted.
Option 1: A Proxy for Each Policy Group
Imagine Container A is only allowed to access `www.google.com`. Here's how this approach would work (a compose sketch follows the list):
- Create an Nginx (or `socat`) container that listens on port 443 and acts as a reverse proxy for `www.google.com`.
- Place both the Nginx proxy and Container A into an internal container network.
- Within this network, add `www.google.com` as a network alias for the Nginx container.
- Connect the Nginx container to a second network that has internet access.
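Here is a minimal `docker-compose` sketch of this layout (the service names, images, and mounted proxy config are placeholders, not my exact files):

```yaml
# Hypothetical compose file for Option 1: Container A can only reach the proxy,
# which answers to www.google.com inside the internal network.
services:
  proxy:
    image: nginx:alpine
    volumes:
      - ./proxy.conf:/etc/nginx/nginx.conf:ro
    networks:
      internal:
        aliases:
          - www.google.com   # Container A resolves this name to the proxy
      egress: {}
  container-a:
    image: my-app            # placeholder application image
    networks:
      - internal

networks:
  internal:
    internal: true           # no route to the internet
  egress: {}                 # ordinary bridge with internet access
```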
Thoughts
This is my current solution using `docker-compose`, and I believe it should also work with Podman.
It is possible to use a single Nginx container to proxy multiple domains, even for HTTPS traffic. By using the `ngx_stream_ssl_preread_module`, Nginx can inspect the requested domain from the TLS handshake and forward the traffic accordingly, without needing to decrypt it.
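A minimal stream-level config along these lines should do the SNI routing (a sketch; the domain list, resolver, and fallback address are placeholders):

```nginx
# Hypothetical sketch: route TLS connections by SNI without terminating TLS.
stream {
    map $ssl_preread_server_name $upstream {
        www.google.com  www.google.com:443;
        default         127.0.0.1:1;   # unproxied domains fail to connect
    }

    server {
        listen 443;
        ssl_preread on;        # read the server name from the ClientHello
        resolver 1.1.1.1;      # required since $upstream contains a hostname
        proxy_pass $upstream;
    }
}
```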
This option is straightforward to implement, and a key advantage is that I don't need to set up a custom DNS server. It also keeps the firewall rules relatively easy to write.
On the other hand, configuring and managing a separate proxy container for each rule can become tedious. I think using Quadlet files, especially with templates and drop-in overrides, could simplify this process.
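For example, a templated Quadlet unit along these lines could stamp out one proxy per policy group (a rough sketch, assuming a Podman version whose Quadlet supports template units; all paths and network names are hypothetical):

```ini
# proxy@.container — instantiate per group, e.g. systemctl --user start proxy@groupA
[Container]
Image=docker.io/library/nginx:alpine
Volume=/etc/container-proxies/%i.conf:/etc/nginx/nginx.conf:ro,Z
Network=internal-%i.network   # per-group internal network
Network=egress.network        # shared network with internet access

[Install]
WantedBy=default.target
```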
Another significant downside is the inability to log blocked traffic. If a container tries to access a domain that isn't explicitly proxied, the connection will simply fail without a log entry, making troubleshooting difficult.
Option 2: Central Proxy on a Single Network
In this design, we set up a central proxy for both HTTP/S and DNS traffic and then perform the following steps:
- Intercept and redirect all traffic from containers to the central proxy using `nftables` rules (see the sketch after this list). For DNS this is simpler, as I can configure the container network to use my custom DNS server.
- The proxy must identify the source container to determine which policy group it belongs to.
- The proxy must identify the requested destination. This is easy for HTTP (from the URL) and DNS (from the query). For HTTPS, we can again use the SSL preread technique to find the domain in the TLS handshake.
- The proxy applies the policy, then either blocks or forwards the traffic.
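For the redirection step, the `nftables` side could look roughly like this (a sketch; the interface name and proxy ports are placeholders):

```nft
# Hypothetical rules: steer container HTTP/S and DNS to the central proxy.
table ip container_nat {
    chain prerouting {
        type nat hook prerouting priority dstnat; policy accept;
        iifname "cont-gw" tcp dport 443 redirect to :8443   # TLS, SNI-inspected
        iifname "cont-gw" tcp dport 80  redirect to :8080   # plain HTTP
        iifname "cont-gw" udp dport 53  redirect to :5353   # custom DNS server
    }
}
```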
Networking
First, I would create a `veth` pair. On one end, I would create a `macvlan` network in "private" mode and connect the containers to it. The other end would be assigned an IP address on the host to allow routing. This essentially creates a bridge where connected containers are isolated from each other but can reach the gateway.
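Roughly like this (a sketch; interface names and the subnet are placeholders, and the host-side routing/NAT rules are omitted):

```sh
# Hypothetical sketch of the veth + private macvlan layout.
ip link add cont-parent type veth peer name cont-gw
ip link set cont-parent up
ip link set cont-gw up
ip addr add 10.88.0.1/24 dev cont-gw   # host end acts as the gateway

# macvlan network on top of the veth, with containers isolated from each other:
podman network create -d macvlan \
  -o parent=cont-parent -o mode=private \
  --subnet 10.88.0.0/24 --gateway 10.88.0.1 filtered
```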
Podman doesn't seem to support configuring a standard bridge with a mix of isolated and non-isolated ports. Note that the `isolate` option in `podman network create` isolates the entire network from other container networks, not individual ports on the bridge.
In the diagram, the proxies are shown on a separate bridge connected to the internet, mainly for illustration. In practice, it might be easier to connect all containers to the same `macvlan` network and use a firewall to control traffic flow. Although the `macvlan` network is in private mode, the firewall could still permit "hairpin" packets to flow between specific containers.
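If that routed hairpin does work, a forward-chain rule along these lines could permit it selectively (a sketch; the interface name and addresses are placeholders):

```nft
# Hypothetical: let container A (10.88.0.10) reach container B (10.88.0.20)
# via the host, even though the macvlan network itself is in private mode.
table ip container_fw {
    chain forward {
        type filter hook forward priority filter; policy drop;
        # ... rules permitting proxied egress traffic omitted ...
        iifname "cont-gw" oifname "cont-gw" \
            ip saddr 10.88.0.10 ip daddr 10.88.0.20 accept
    }
}
```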
Identifying Containers
We can identify containers by their IP addresses. The tricky part is ensuring these IP addresses are trustworthy and that the setup isn't prone to errors.
Let's review the IPAM drivers supported by `podman network`:
- dhcp: For each container, we can assign a fixed MAC address and create a static reservation in the DHCP server (see the sketch after this list). The firewall can then reliably use the container's IP address to identify it. This assumes that containers are unprivileged and cannot change their own MAC or IP addresses. Ideally, the default address pool of the DHCP server should be disabled to prevent unassigned containers from getting an IP.
- host-local: With this driver, we assign a static IP address during `podman run`. While this sounds simple, it's easy to forget to provide an IP when running a container manually. If that happens, Podman will assign an IP automatically. This could accidentally grant a container internet access or cause an IP conflict. I haven't found a way to disable this automatic IP address allocation.
- none: This driver does not assign an IP address, and you cannot manually provide one either.
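The dhcp approach could look roughly like this, with dnsmasq as the DHCP server (the MAC address, IP, and network name are placeholders):

```sh
# Hypothetical: pin the container's MAC so the DHCP server hands out a fixed lease.
podman run -d --network filtered --mac-address 92:d0:c6:0a:29:33 my-app

# dnsmasq side: static-only mode, so unknown MACs get no lease at all.
#   dhcp-range=10.88.0.0,static
#   dhcp-host=92:d0:c6:0a:29:33,10.88.0.10
```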
In conclusion, only "dhcp" works.
Deciding the Policy Group
Once the container is identified, applying the policy is relatively easy:
- CoreDNS has the `view` plugin, which can apply different rules based on the client's IP address.
- Nginx has the `geo` module, which can be used to map a client's IP address to a variable for use in access rules. You can also use `map $remote_addr` (see the sketch below).
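On the Nginx side, combining `geo` with the SNI preread could look like this (a sketch; the addresses, group names, and allowed domain are placeholders):

```nginx
# Hypothetical: classify container IPs into policy groups, then gate by SNI.
stream {
    geo $remote_addr $policy_group {
        default     blocked;
        10.88.0.10  google_only;   # Container A
        10.88.0.20  full;
    }

    map "$policy_group:$ssl_preread_server_name" $upstream {
        "google_only:www.google.com"  www.google.com:443;
        ~^full:                       $ssl_preread_server_name:443;
        default                       127.0.0.1:1;   # refuse everything else
    }

    server {
        listen 8443;
        ssl_preread on;
        resolver 1.1.1.1;
        proxy_pass $upstream;
    }
}
```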
Option 3: One Network Per Policy Group
This approach extends the "veth+macvlan" technique by creating a separate network for each policy group. We then use `nftables` rules to forward traffic from all networks to a central proxy. This is similar to Option 2, but this time `nftables` can identify the source policy group by the network interface the traffic arrives on.
This approach is more secure if you are concerned about IP or MAC spoofing, since the network interface is a more reliable identifier than an IP address alone.
Identifying the Policy Group
- By IP Address: We can configure a DHCP server for each network, taking care that the IP ranges don't overlap. The proxies can then identify containers by their IP address, just like in Option 2, but with greater trust since the IP is tied to a specific network.
- By Interface: We can identify traffic by the interface it comes from.
  - CoreDNS has the `bind` plugin, which allows it to listen on specific host interfaces. However, this requires CoreDNS to run in the host network, and the proxy would need to be restarted every time a new policy group (and thus a new interface) is added. It's also unclear how this would work with Nginx.
  - A variation is to run CoreDNS with port forwarding (or maybe socket activation) to listen on all interfaces, and then use `redirect` in nftables so that the traffic within each policy group is redirected to the corresponding gateway. However, this setup sounds complicated, and similar to the above, I'm not sure if it would work for Nginx.
  - A more complex option is to use `nftables` to map each incoming interface to a different port on the host (see the sketch after this list). We could then run a proxy instance for each policy group, listening on its assigned port. This essentially moves the identification logic into `nftables` and is useful if a proxy doesn't support IP-based policies, but the rules would be complicated and fragile. For example, we would need rules to prevent a container from accessing a proxy port it's not authorized for.
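That interface-to-port mapping could be sketched like this (interface names and ports are placeholders):

```nft
# Hypothetical: give each policy group's interface its own proxy instance/port.
table ip group_nat {
    chain prerouting {
        type nat hook prerouting priority dstnat; policy accept;
        iifname "veth-grp1" tcp dport 443 redirect to :8441
        iifname "veth-grp2" tcp dport 443 redirect to :8442
        # plus filter rules so group 1 can never reach :8442, and vice versa
    }
}
```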
My Plan
Ultimately, I need to find a balance between two goals:
- Maximum Security: Resisting vulnerabilities and malicious actors.
- Ease of Maintenance: Requiring minimal effort and not being error-prone.
I will most likely implement Option 2 with a few modifications. It offers a good blend of centralized control and flexibility without the complexity of managing dozens of networks or proxy containers.