# Plan: Migrate Orchestrator to vm-manager + Containerize

**Status:** Completed (2026-04-07)
**Planner ID:** `5fc6f5f5-33c1-4e3d-9201-c4c9c4fc43df`

## Summary

Replace the orchestrator's built-in libvirt hypervisor code with the `vm-manager` library crate, then containerize the orchestrator. This eliminates the libvirt dependency and makes deployment straightforward (only `/dev/kvm` needed).

## Motivation

The orchestrator used libvirt (via the `virt` crate), which required:
- Libvirt daemon on the host
- Libvirt sockets mounted into containers
- KVM device access
- Host-level libvirt configuration and networking

This made containerization painful, so the orchestrator ran as a systemd service on the host instead.

## Approach

1. Extended vm-manager with console log tailing (`console` module)
2. Chose user-mode (SLIRP) networking over TAP for container simplicity
3. Created `vm_adapter.rs` bridging orchestrator's Hypervisor trait to vm-manager
4. Replaced scheduler's SSH/IP-discovery/console code with vm-manager APIs
5. Replaced image download with vm-manager's `ImageManager`
6. Removed 712 lines of libvirt-specific code
7. Updated Containerfile: libvirt packages replaced with QEMU + qemu-utils

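The adapter in step 3 can be pictured roughly as follows. This is a minimal sketch, not the contents of `vm_adapter.rs`: the field layouts of `VmSpec` and `VmHandle`, the methods on `Hypervisor`, and the `VmBackend`, `BackendConfig`, `BackendVm`, and `FakeBackend` types are all assumptions made for illustration. `VmBackend` stands in for whatever API vm-manager actually exposes.

```rust
/// Orchestrator-side VM description (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
pub struct VmSpec {
    pub name: String,
    pub cpus: u32,
    pub memory_mb: u64,
    pub image_path: String,
}

/// Orchestrator-side handle to a running VM (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
pub struct VmHandle {
    pub id: String,
    pub ssh_port: u16,
}

/// Assumed shape of the orchestrator's Hypervisor trait.
pub trait Hypervisor {
    fn start(&mut self, spec: &VmSpec) -> Result<VmHandle, String>;
    fn stop(&mut self, handle: &VmHandle) -> Result<(), String>;
}

/// Stand-in for the backend library's own types (not the real vm-manager API).
pub struct BackendConfig {
    pub name: String,
    pub vcpus: u32,
    pub mem_bytes: u64,
    pub disk: String,
}

pub struct BackendVm {
    pub id: String,
    pub forwarded_ssh_port: u16,
}

pub trait VmBackend {
    fn launch(&mut self, cfg: BackendConfig) -> Result<BackendVm, String>;
    fn destroy(&mut self, id: &str) -> Result<(), String>;
}

/// The adapter: converts orchestrator types to backend types and back.
pub struct VmAdapter<B: VmBackend> {
    pub backend: B,
}

impl<B: VmBackend> Hypervisor for VmAdapter<B> {
    fn start(&mut self, spec: &VmSpec) -> Result<VmHandle, String> {
        let cfg = BackendConfig {
            name: spec.name.clone(),
            vcpus: spec.cpus,
            mem_bytes: spec.memory_mb * 1024 * 1024, // MiB -> bytes
            disk: spec.image_path.clone(),
        };
        let vm = self.backend.launch(cfg)?;
        Ok(VmHandle { id: vm.id, ssh_port: vm.forwarded_ssh_port })
    }

    fn stop(&mut self, handle: &VmHandle) -> Result<(), String> {
        self.backend.destroy(&handle.id)
    }
}

/// In-memory backend used only to exercise the adapter.
pub struct FakeBackend {
    pub next_port: u16,
    pub running: Vec<String>,
}

impl VmBackend for FakeBackend {
    fn launch(&mut self, cfg: BackendConfig) -> Result<BackendVm, String> {
        let port = self.next_port;
        self.next_port += 1;
        self.running.push(cfg.name.clone());
        Ok(BackendVm { id: cfg.name, forwarded_ssh_port: port })
    }

    fn destroy(&mut self, id: &str) -> Result<(), String> {
        match self.running.iter().position(|n| n == id) {
            Some(i) => {
                self.running.remove(i);
                Ok(())
            }
            None => Err(format!("unknown VM: {id}")),
        }
    }
}
```

Keeping the conversion in one place means the scheduler only depends on the `Hypervisor` trait, so swapping the backend does not ripple into job-handling code.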
## Tasks completed

| # | Task | Summary |
|---|------|---------|
| 1 | Add serial console tailing to vm-manager | `ConsoleTailer` for async Unix socket streaming |
| 2 | Verify networking | User-mode SLIRP chosen; no bridge needed |
| 3 | Add vm-manager adapter layer | `vm_adapter.rs` with VmSpec/VmHandle conversion |
| 4 | Update scheduler SSH + console | vm-manager SSH/connect_with_retry/upload/exec |
| 5 | Update image config | vm-manager `ImageManager::download()` |
| 6 | Remove libvirt dependencies | -712 lines, removed virt/ssh2/zstd crates |
| 7 | Update Containerfile | Ubuntu 24.04 runtime, QEMU direct, no libvirt |
| 8 | Integration test | End-to-end job via containerized orchestrator |

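Task 4 relies on retrying the SSH connection while the guest boots. The sketch below illustrates the general retry pattern using only the standard library. It is not vm-manager's `connect_with_retry`: the signature here is an assumption, and it only checks that the forwarded TCP port accepts connections rather than performing an SSH handshake.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::Duration;

/// Run `op` until it succeeds or `attempts` tries have been made.
/// `op` receives the 1-based attempt number and is always called at least once.
pub fn retry<T, E>(
    attempts: u32,
    delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                attempt += 1;
            }
        }
    }
}

/// Hypothetical helper: wait for the host-forwarded SSH port to accept TCP
/// connections on localhost.
pub fn connect_with_retry(port: u16, attempts: u32, delay: Duration) -> io::Result<TcpStream> {
    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    retry(attempts, delay, |_| {
        TcpStream::connect_timeout(&addr, Duration::from_secs(2))
    })
}
```

A fixed delay keeps the sketch short; a real implementation might prefer exponential backoff and an overall deadline.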
## Key decisions

- **QEMU direct over libvirt**: vm-manager spawns QEMU processes directly and manages them over a QMP socket. Simpler, no daemon dependency.
- **User-mode networking**: SSH via port forwarding (`hostfwd=tcp::{port}-:22`). No bridge, no NET_ADMIN, no TAP device creation.
- **IDE CDROM for seed ISO**: Ubuntu cloud images expect the root disk to be the first virtio device. The seed ISO is attached as an IDE CDROM to avoid device ordering conflicts.
- **Pre-built binary Containerfile**: vm-manager uses workspace-inherited deps, which makes cross-workspace path deps difficult. A Git dep is used for CI and a local patch for dev.

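The first three decisions translate into a QEMU command line along these lines. This is a sketch of how such arguments could be assembled, not vm-manager's actual code: the `QemuSpec` struct and `qemu_args` function are hypothetical, and the exact option set vm-manager passes is not stated in this plan. The individual QEMU options (`-netdev user` with `hostfwd`, `-drive` with `if=virtio` or `if=ide,media=cdrom`, `-qmp unix:...`) are standard.

```rust
use std::path::PathBuf;

/// Hypothetical description of one VM launch.
pub struct QemuSpec {
    pub cpus: u32,
    pub memory_mb: u64,
    pub root_disk: PathBuf,
    pub seed_iso: PathBuf,
    pub qmp_socket: PathBuf,
    pub ssh_host_port: u16,
}

/// Build the argument list for a `qemu-system-x86_64` invocation.
pub fn qemu_args(spec: &QemuSpec) -> Vec<String> {
    vec![
        // KVM acceleration: the only host device the container needs is /dev/kvm.
        "-enable-kvm".to_string(),
        "-smp".to_string(),
        spec.cpus.to_string(),
        "-m".to_string(),
        spec.memory_mb.to_string(),
        "-display".to_string(),
        "none".to_string(),
        // Root disk as the first (and only) virtio block device.
        "-drive".to_string(),
        format!("file={},if=virtio,format=qcow2", spec.root_disk.display()),
        // Cloud-init seed ISO on IDE so it cannot displace the virtio root disk.
        "-drive".to_string(),
        format!("file={},if=ide,media=cdrom,readonly=on", spec.seed_iso.display()),
        // User-mode (SLIRP) networking with SSH forwarded to a host port.
        "-netdev".to_string(),
        format!("user,id=net0,hostfwd=tcp::{}-:22", spec.ssh_host_port),
        "-device".to_string(),
        "virtio-net-pci,netdev=net0".to_string(),
        // QMP control socket for managing the VM after launch.
        "-qmp".to_string(),
        format!("unix:{},server=on,wait=off", spec.qmp_socket.display()),
    ]
}
```

The resulting vector can be passed to `std::process::Command::new("qemu-system-x86_64").args(...)`. Because networking is user-mode, the process needs no extra capabilities beyond access to `/dev/kvm`.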