solstice-ci/docs/ai/decisions/004-user-mode-networking.md

# ADR-004: User-Mode (SLIRP) Networking for VMs
**Date:** 2026-04-07
**Status:** Accepted
**Deciders:** Till Wegmueller
## Context
The orchestrator needs network access to each VM over SSH to upload the runner binary and execute commands. Two options were considered:
1. **TAP with bridge** — VM gets a real IP on a bridge network (e.g., virbr0). Requires NET_ADMIN capability, host bridge access, and TAP device creation. IP discovery via ARP/DHCP lease parsing.
2. **User-mode (SLIRP)** — QEMU provides NAT via user-space networking. VM gets a private IP (10.0.2.x). SSH access via host port forwarding (`hostfwd=tcp::{port}-:22`). No special capabilities needed.
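
As a rough illustration of option 2, the QEMU arguments for user-mode networking with an SSH port forward could be assembled as below. This is a minimal sketch, not the orchestrator's actual code: the helper name `qemu_user_net_args`, the netdev id `net0`, and the choice of `virtio-net-pci` as the NIC model are assumptions made for the example.

```python
def qemu_user_net_args(ssh_port: int, netdev_id: str = "net0") -> list[str]:
    """Build QEMU arguments for user-mode networking with SSH forwarding.

    Hypothetical helper: the name, the netdev id, and the virtio-net-pci
    device model are illustrative assumptions, not taken from the ADR.
    """
    if not 1 <= ssh_port <= 65535:
        raise ValueError(f"invalid TCP port: {ssh_port}")
    return [
        # User-mode (SLIRP) backend; forward the host port to guest port 22.
        "-netdev", f"user,id={netdev_id},hostfwd=tcp::{ssh_port}-:22",
        # Guest NIC attached to that backend.
        "-device", f"virtio-net-pci,netdev={netdev_id}",
    ]


if __name__ == "__main__":
    print(" ".join(qemu_user_net_args(10022)))
```

No bridge, TAP device, or extra capability appears anywhere in these arguments, which is the property the comparison above turns on.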
## Decision
Use user-mode (SLIRP) networking with deterministic SSH port forwarding.
Port assignment: `10022 + (hash(vm_name) % 100)`, which yields 100 ports in the range 10022-10121.
From the orchestrator's perspective, the guest's SSH endpoint is always `127.0.0.1` on the assigned host port.
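
The port formula can be sketched as follows. The ADR does not say which hash function is used, so CRC32 here is purely an assumption, chosen because it gives the same value across processes (Python's built-in `hash()` for strings is randomized per process by default, which would break the "deterministic" property). The names `ssh_port_for` and `ssh_endpoint` are hypothetical.

```python
import zlib

BASE_PORT = 10022   # first port of the range, from the ADR
PORT_COUNT = 100    # modulus from the ADR, so ports span 10022..10121


def ssh_port_for(vm_name: str) -> int:
    """Map a VM name to a host port, stable across processes.

    Assumption: CRC32 stands in for the unspecified hash(vm_name).
    """
    digest = zlib.crc32(vm_name.encode("utf-8"))
    return BASE_PORT + digest % PORT_COUNT


def ssh_endpoint(vm_name: str) -> tuple[str, int]:
    """Return the (host, port) pair the orchestrator would connect to."""
    return ("127.0.0.1", ssh_port_for(vm_name))


if __name__ == "__main__":
    print(ssh_endpoint("example-vm"))
```

Because the port depends only on the VM name, any component that knows the name can recompute the endpoint without shared state.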
## Consequences
### Positive
- **Container-friendly**: no NET_ADMIN, no bridge access, no host configuration
- **Trivial IP discovery**: always `127.0.0.1` with a known port
- **No host bridge dependency**: works on any host with just `/dev/kvm`
- **Network isolation**: VMs cannot reach each other and have no interface on the host network; outbound traffic leaves only through QEMU's user-space NAT
### Negative
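- **Port collision risk**: with only 100 ports, concurrent VMs can hash to the same port. UUID-based VM names give a roughly uniform hash distribution, which lowers the risk but does not remove it: by the birthday bound, about 12 concurrent VMs already have close to a 50% chance that two of them share a port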
- **No inbound connections**: external services cannot reach the VM directly (not needed for CI)
- **SLIRP performance**: slightly slower than TAP for network-heavy workloads (acceptable for CI)
- **No VM-to-VM communication**: VMs are fully isolated (acceptable for CI)
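
The port collision risk listed above can be estimated with the standard birthday calculation. The sketch below assumes the hash spreads VM names uniformly over the 100 ports, which is an idealization; the function name `collision_probability` is hypothetical.

```python
def collision_probability(concurrent_vms: int, ports: int = 100) -> float:
    """Probability that at least two VMs map to the same port.

    Assumes each VM's port is drawn uniformly and independently
    from `ports` possible values (an idealized hash).
    """
    if concurrent_vms < 0 or ports <= 0:
        raise ValueError("concurrent_vms must be >= 0 and ports must be > 0")
    if concurrent_vms > ports:
        return 1.0  # pigeonhole: more VMs than ports
    no_collision = 1.0
    for i in range(concurrent_vms):
        # The i-th VM must avoid the i ports already taken.
        no_collision *= (ports - i) / ports
    return 1.0 - no_collision


if __name__ == "__main__":
    for n in (2, 5, 12, 20):
        print(f"{n:2d} concurrent VMs: {collision_probability(n):.1%}")
```

Under these assumptions the probability grows quickly with concurrency, so a deployment that runs many VMs at once may want to check that the computed port is free before launching QEMU.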