cross-posted from: https://leminal.space/post/28955576
I learned how to do this recently, and I wanted to share. Once you know what to
do, VPN confinement is easy to set up on NixOS.
The scenario: you want selected processes to run through a VPN, but you want
everything else to bypass the VPN. On Linux you can do this with
a network namespace. That's a kernel feature that defines a network stack that
is isolated from your default network stack. Processes can be configured to run
in a new namespace, and when they do they cannot access the usual network
interfaces that are not protected by the VPN. Network namespaces work along with
other types of namespaces, like PID namespaces, to allow Docker containers to
function almost as though they are separate machines from the host system.
In fact, Docker containers are regular processes that are carefully isolated
using namespaces, cgroups, and private filesystems. Because of that isolation,
Docker containers are a popular choice for VPN confinement. But since all you
really need is network isolation, you can skip the middleman and use network
namespaces directly.
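
To make the idea concrete, here is a minimal sketch of using a network namespace by hand, without any VPN involved. The unit names `netns-demo` and `isolated-example` and the namespace name `demo` are made up for illustration. The sketch assumes that `ip netns add` makes the namespace available at `/run/netns/demo`, and relies on systemd's `NetworkNamespacePath=` setting to place the second service in it.

```nix
{ pkgs, ... }:
{
  # Hypothetical unit that creates an empty network namespace named "demo".
  systemd.services.netns-demo = {
    description = "Create the demo network namespace";
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
      ExecStart = "${pkgs.iproute2}/bin/ip netns add demo";
      # The leading "-" tells systemd to ignore a failure here, for example
      # when the namespace was never created.
      ExecStopPost = "-${pkgs.iproute2}/bin/ip netns delete demo";
    };
  };

  # Hypothetical service that runs inside the namespace and lists the
  # interfaces it can see.
  systemd.services.isolated-example = {
    description = "Show the interfaces visible inside the demo namespace";
    after = [ "netns-demo.service" ];
    bindsTo = [ "netns-demo.service" ];
    serviceConfig = {
      Type = "oneshot";
      NetworkNamespacePath = "/run/netns/demo";
      ExecStart = "${pkgs.iproute2}/bin/ip -brief address";
    };
  };
}
```

A fresh namespace starts with only a loopback interface, so the second service should not see the host's usual interfaces. Making a namespace useful for VPN traffic takes considerably more setup than this.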
There is a third-party NixOS module that automates this, VPN-Confinement. Here's
an example that runs a Borg backup job through a VPN connection. (This example
also uses the third-party sops-nix module to encrypt VPN credentials.)
{ config, ... }:

let
  vpnNamespace = "wg";
in
{
  # Define the network namespace for VPN confinement. Creates a VPN network
  # interface in the namespace; creates a bridge; sets up routing; creates
  # firewall rules to prevent DNS leaking. The VPN-Confinement module requires
  # using WireGuard as the VPN protocol.
  vpnNamespaces.${vpnNamespace} = {
    enable = true;
    wireguardConfigFile = config.sops.secrets.wireguard_config.path;
  };

  # Set up whatever service should run via VPN
  services.borgbackup.jobs.homelab = {
    paths = "/home/jesse";
    encryption.mode = "none";
    environment.BORG_RSH = "ssh -i /home/jesse/.ssh/id_ed25519";
    repo = "ssh://offsite.sitr.us/backups/homelab";
    compression = "auto,zstd";
    startAt = "daily";
  };

  # Modify the systemd unit for your service to run its processes in the VPN
  # namespace.
  #
  # - sets Service.NetworkNamespacePath in the systemd unit
  # - sets Service.InaccessiblePaths = [ "/run/nscd" "/run/resolvconf" ] to prevent DNS leaking
  # - adds a dependency to the unit that brings up the VPN network namespace
  #
  # I found the name of the systemd service that services.borgbackup.jobs
  # creates by looking at the Borg module source. You can find the source for
  # NixOS modules by searching for config options on https://search.nixos.org/options
  systemd.services.borgbackup-job-homelab = {
    vpnConfinement = {
      enable = true;
      inherit vpnNamespace;
      # `inherit vpnNamespace;` has the same effect as `vpnNamespace = vpnNamespace;`
      # I used a variable to be certain that the value here matches the name
      # I used to set up the namespace near the top of this file. If the names
      # don't match then your service won't run through the VPN.
    };
  };

  # Load your WireGuard config file however you want. Your VPN provider probably
  # supports WireGuard, and will likely generate a config file for you.
  sops.secrets.wireguard_config = {
    sopsFile = ./secrets.yaml;
    owner = "root";
    group = "root";
  };
}
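
One way to check that confinement works is to add a throwaway unit that asks an external service for its public IP address from inside the namespace. This is only a sketch: the unit name `vpn-ip-check` is made up, and `https://ifconfig.me` is just one public "what is my IP" service that I'm assuming is reachable. Any similar service should do.

```nix
{ pkgs, ... }:
{
  # Hypothetical one-shot unit, confined the same way as the Borg job above.
  systemd.services.vpn-ip-check = {
    description = "Print the public IP address seen from inside the VPN namespace";
    serviceConfig = {
      Type = "oneshot";
      # The URL is an assumption; substitute any IP echo service you trust.
      ExecStart = "${pkgs.curl}/bin/curl --silent --show-error --max-time 10 https://ifconfig.me";
    };
    vpnConfinement = {
      enable = true;
      vpnNamespace = "wg"; # must match the name used under vpnNamespaces
    };
  };
}
```

After a rebuild you can run `systemctl start vpn-ip-check` and read the result with `journalctl -u vpn-ip-check`. If the address matches your VPN exit rather than your home connection, the unit is going through the tunnel.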
This setup assumes that you use the WireGuard protocol, and that the programs
you want to confine are run by systemd. Most VPN providers support WireGuard.
Tailscale is built on WireGuard too, but my understanding is that Tailscale's
mesh routing requires additional setup beyond creating a WireGuard interface. So
you'd likely want a different setup for confinement with Tailscale. You can run
the Tailscale client in a network namespace (there is a start on such a setup
here);
or you might use Tailscale's subnet router feature to blend VPN and local
network traffic instead of selective confinement.
Normally when you turn on a VPN your VPN client software creates a network
interface that transparently sends traffic through an encrypted tunnel, and
configures a default route to send network traffic through that interface. So
traffic from all programs is routed through the tunnel. VPN-Confinement creates
that network interface in the isolated namespace, and sets that default route in
the namespace, so that only programs running in the namespace are affected.
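
For contrast, here is a rough sketch of the "normal" host-wide setup using NixOS's `networking.wg-quick` module. The interface name `wg0` is an arbitrary choice, and I'm assuming the same sops-managed config file as in the example above. This is not part of the confinement setup, and you would not want to enable both with the same config file at once.

```nix
{ config, ... }:
{
  # Host-wide tunnel, shown only for comparison. wg-quick creates the
  # interface in the default network namespace and adds routes for the
  # AllowedIPs listed in the config file. If that is 0.0.0.0/0, traffic from
  # every program on the machine goes through the tunnel.
  networking.wg-quick.interfaces.wg0 = {
    configFile = config.sops.secrets.wireguard_config.path;
  };
}
```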
There is much more detail in this blog
post. The VPN-Confinement
module differs from the setup in that post in a couple of ways: it has some
extra setup to block DNS requests that aren't properly tunneled; it creates
a network bridge instead of a simple virtual ethernet cable for port forwarding;
and it provides more options for firewall and routing configuration.
VPN-Confinement has an option to forward ports from the default network stack
into the VPN namespace. This is useful if you want all outbound traffic to go
through the VPN, but you also want to accept inbound connections from programs
on the host, from other machines on your local network, or from anywhere else.
For example, you might be running a program on a headless server that provides
a web UI for remote administration. Here's an expanded VPN namespace example:
vpnNamespaces.${vpnNamespace} = {
  enable = true;
  wireguardConfigFile = config.sops.secrets.wireguard_config.path;

  # Forward traffic to specified ports from the default network namespace to
  # the VPN namespace.
  portMappings = [{ from = 8080; to = 8080; }];

  accessibleFrom = [
    # Accept traffic from machines on the local network, and route through the
    # mapped ports.
    "192.168.1.0/24"
  ];
};
Requests to mapped ports from the host machine need to be addressed to the
network bridge that VPN-Confinement sets up. You can configure its addresses
using the bridgeAddress and bridgeAddressIPv6 options. By default the
addresses are 192.168.15.5 and fd93:9701:1d00::1. If you're configuring
addresses elsewhere in your NixOS config you can use an expression like this:
url = "http://${config.vpnNamespaces.${vpnNamespace}.bridgeAddress}:8080/";
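
As one possible use of that expression, here is a sketch of an nginx reverse proxy on the host that forwards to a confined web UI. It assumes the confined program listens on the mapped port 8080 and is reachable at the bridge address as described above. The hostname `admin.example.lan` is a placeholder.

```nix
{ config, ... }:
let
  vpnNamespace = "wg";
  # Look up the bridge address from the module config instead of hard-coding it.
  bridge = config.vpnNamespaces.${vpnNamespace}.bridgeAddress;
in
{
  services.nginx = {
    enable = true;
    # "admin.example.lan" is a placeholder; use whatever name resolves to
    # this host on your network.
    virtualHosts."admin.example.lan".locations."/" = {
      proxyPass = "http://${bridge}:8080/";
    };
  };
}
```

Reading the address from `config` means the proxy keeps working if you later change `bridgeAddress`.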
If you look at the source for VPN-Confinement you'll see that namespace
configuration and routing require a lot of stateful ip commands. I think it
would be nice if there were an alternative, declarative interface to iproute2.
But VPN-Confinement is able to encapsulate the stateful stuff in systemd
ExecStart and ExecStopPost scripts.
I ran into an issue where mDNS stopped working while the VPN network namespace
was active. I fixed that problem by configuring Avahi to ignore
VPN-Confinement's network bridge:
services.avahi.denyInterfaces = [ "${vpnNamespace}-br" ];