# Reddwarf Production Readiness Audit
Last updated: 2026-02-14
Baseline commit: 58171c7 (Add periodic reconciliation, node health checker, and graceful pod termination)
## 1. Zone Runtime (reddwarf-runtime)
| Requirement | Status | Notes |
|---|---|---|
| Pod spec to zonecfg | DONE | zone/config.rs, controller.rs:pod_to_zone_config() |
| Zone lifecycle (zoneadm) | DONE | illumos.rs — create, install, boot, halt, uninstall, delete |
| Container to Zone mapping | DONE | Naming, sanitization, 64-char truncation |
| CPU limits to capped-cpu | DONE | Aggregates across containers, limits preferred over requests |
| Memory limits to capped-memory | DONE | Aggregates across containers, illumos G/M/K suffixes |
| Network to Crossbow VNIC | DONE | dladm create-etherstub, create-vnic, per-pod VNIC+IP |
| Volumes to ZFS datasets | DONE | Create, destroy, clone, quota, snapshot support |
| Image pull / clone | PARTIAL | ZFS clone works; LX tarball install (-s) works. Missing: no image pull/registry, no .zar archive support, no golden image bootstrap |
| Health probes (zlogin) | DONE | exec-in-zone via zlogin, liveness/readiness/startup probes with exec/HTTP/TCP actions, probe tracker state machine integrated into reconcile loop. v1 limitation: probes run at reconcile cadence, not per-probe periodSeconds |
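The exec-in-zone probe path in the table can be sketched as below. This is a minimal illustration, assuming the Kubernetes convention that exit status 0 means healthy; the function names and argv shape are assumptions, not the actual `ZoneRuntime::exec_in_zone` signature.

```rust
// Sketch of the zlogin-based exec probe path. `zlogin <zone> <cmd...>`
// runs the command inside the named zone; names here are illustrative.
fn zlogin_argv(zone: &str, command: &[&str]) -> Vec<String> {
    let mut argv = vec!["zlogin".to_string(), zone.to_string()];
    argv.extend(command.iter().map(|s| s.to_string()));
    argv
}

// Kubernetes convention: exit status 0 => probe success.
fn exec_probe_ok(exit_code: i32) -> bool {
    exit_code == 0
}
```

A mock implementation for testing only needs to substitute a canned exit code for the real `zlogin` invocation.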
## 2. Reconciliation / Controller Loop
| Requirement | Status | Notes |
|---|---|---|
| Event bus / watch | DONE | tokio broadcast channel, SSE watch API, multi-subscriber |
| Pod controller | DONE | Event-driven + full reconcile on lag, provision/deprovision |
| Node controller (NotReady) | DONE | node_health.rs — checks every 15s, marks stale (>40s) nodes NotReady with reason NodeStatusUnknown |
| Continuous reconciliation | DONE | controller.rs — periodic reconcile_all() every 30s via tokio::time::interval in select! loop |
| Graceful termination | DONE | DELETE sets deletion_timestamp + phase=Terminating; controller drives shutdown state machine; POST .../finalize for actual removal |
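The staleness rule from the node health checker row can be captured in a few lines: a heartbeat older than 40s marks the node NotReady with reason NodeStatusUnknown. The function name and return shape here are illustrative, not the node_health.rs API.

```rust
use std::time::Duration;

// Node health check sketch: heartbeats older than the 40s threshold mark
// the node NotReady with reason NodeStatusUnknown (checked every 15s).
fn node_condition(since_heartbeat: Duration) -> (&'static str, Option<&'static str>) {
    if since_heartbeat > Duration::from_secs(40) {
        ("NotReady", Some("NodeStatusUnknown"))
    } else {
        ("Ready", None)
    }
}
```

With a 10-second heartbeat interval, the 40s threshold tolerates three missed beats before declaring the node unknown.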
## 3. Pod Status Tracking
| Requirement | Status | Notes |
|---|---|---|
| Zone state to pod phase | DONE | 8 zone states mapped to pod phases |
| Status subresource (/status) | DONE | PUT endpoint, spec/status separation, fires MODIFIED events |
| ShuttingDown mapping | DONE | Fixed in 58171c7 — maps to "Terminating" |
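The zone-state mapping can be sketched as below. The audit confirms only the shutting_down => "Terminating" entry (fixed in 58171c7); the other arms are plausible assumptions standing in for the real 8-state table.

```rust
// Illustrative zone-state => pod-phase mapping. Only the shutting_down
// entry is confirmed by the audit; the rest are assumptions.
fn zone_state_to_phase(zone_state: &str) -> &'static str {
    match zone_state {
        "running" => "Running",
        "shutting_down" => "Terminating",
        // Assumed: not-yet-booted states surface as Pending.
        "configured" | "installed" | "ready" => "Pending",
        // Assumed: a halted zone under a live pod surfaces as Failed.
        "down" => "Failed",
        _ => "Unknown",
    }
}
```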
## 4. Node Agent / Heartbeat
| Requirement | Status | Notes |
|---|---|---|
| Self-registration | DONE | Creates Node resource with allocatable CPU/memory |
| Periodic heartbeat | DONE | 10-second interval, Ready condition |
| Report zone states | NOT DONE | Heartbeat doesn't query actual zone states |
| Dynamic resource reporting | DONE | sysinfo.rs — detects CPU/memory via sys-info, capacity vs allocatable split with configurable reservations (--system-reserved-cpu, --system-reserved-memory, --max-pods). Done in d3eb0b2 |
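The capacity/allocatable split described in the last row reduces to a subtraction: allocatable is detected capacity minus the operator-configured reservation (`--system-reserved-cpu`, `--system-reserved-memory`). Units and the function name below are illustrative.

```rust
// Capacity vs allocatable sketch. Saturating subtraction pins allocatable
// at zero when the configured reservation exceeds detected capacity.
fn allocatable(capacity: u64, system_reserved: u64) -> u64 {
    capacity.saturating_sub(system_reserved)
}
```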
## 5. Main Binary
| Requirement | Status | Notes |
|---|---|---|
| API + scheduler + runtime wired | DONE | All 4 components spawned as tokio tasks |
| CLI via clap | DONE | serve and agent subcommands |
| Graceful shutdown | DONE | SIGINT + CancellationToken + 5s timeout |
| TLS (rustls) | DONE | Auto-generated self-signed CA + server cert, or user-provided PEM. Added in cb6ca8c |
| SMF service manifest | DONE | SMF manifest + method script in smf/. Added in cb6ca8c |
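The bounded shutdown wait can be sketched with std primitives in place of the actual tokio CancellationToken: after signalling tasks to stop, wait at most the timeout (5s per the audit) for an acknowledgement before forcing exit. The channel-based shape is an assumption for illustration only.

```rust
use std::sync::mpsc;
use std::time::Duration;

// Bounded-wait shutdown sketch: returns true if the worker acknowledged
// shutdown within the timeout, false if the deadline expired.
fn wait_for_ack(done: &mpsc::Receiver<()>, timeout: Duration) -> bool {
    done.recv_timeout(timeout).is_ok()
}
```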
## 6. Networking
| Requirement | Status | Notes |
|---|---|---|
| Etherstub creation | DONE | dladm create-etherstub |
| VNIC per zone | DONE | dladm create-vnic -l etherstub |
| ipadm IP assignment | PARTIAL | IP set in zonecfg allowed-address but no explicit ipadm create-addr call |
| IPAM | DONE | Sequential alloc, idempotent, persistent, pool exhaustion handling |
| Service ClusterIP / NAT | NOT DONE | Services stored at API level but no backend controller, no ipnat rules, no proxy, no DNS |
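The sequential IPAM row can be sketched as a lowest-free-slot allocator. The real allocator is also idempotent per pod, persists its state, and surfaces pool exhaustion as an error; the u8 host type and bounds here are illustrative.

```rust
use std::collections::BTreeSet;

// Sequential IPAM sketch: hand out the lowest free host number in the pool.
fn allocate(in_use: &mut BTreeSet<u8>, lo: u8, hi: u8) -> Option<u8> {
    for host in lo..=hi {
        if in_use.insert(host) {
            return Some(host); // insert returns true only for a free slot
        }
    }
    None // pool exhausted
}
```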
## 7. Scheduler
| Requirement | Status | Notes |
|---|---|---|
| Versioned bind_pod() | DONE | Fixed in c50ecb2 — creates versioned commits |
| Zone brand constraints | DONE | ZoneBrandMatch filter checks reddwarf.io/zone-brand annotation vs reddwarf.io/zone-brands node label. Done in 4c7f50a |
| Actual resource usage | NOT DONE | Only compares requests vs static allocatable — no runtime metrics |
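The two scheduler filters above can be sketched as pure predicates: a requests-vs-static-allocatable fit check (no runtime usage metrics, per the last row) and the brand match between a pod's reddwarf.io/zone-brand annotation and the node's supported brands. Function names and parameter shapes are assumptions.

```rust
// Fit check sketch: declared requests against static allocatable only.
fn fits(req_millicpu: u64, alloc_millicpu: u64, req_mem: u64, alloc_mem: u64) -> bool {
    req_millicpu <= alloc_millicpu && req_mem <= alloc_mem
}

// Zone-brand filter sketch. Assumed: unannotated pods can land anywhere.
fn brand_matches(pod_brand: Option<&str>, node_brands: &[&str]) -> bool {
    match pod_brand {
        None => true,
        Some(brand) => node_brands.contains(&brand),
    }
}
```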
## Priority Order

### Critical (blocks production)
- TLS — done in cb6ca8c
- SMF manifest — done in cb6ca8c

### High (limits reliability)
- Node health checker — done in 58171c7
- Periodic reconciliation — done in 58171c7
- Graceful pod termination — done in 58171c7

### Medium (limits functionality)
- Service networking — no ClusterIP, no NAT/proxy, no DNS
- Health probes — done: exec/HTTP/TCP liveness/readiness/startup probes via zlogin
- Image management — no pull/registry, no .zar support, no golden image bootstrap
- Dynamic node resources — done in d3eb0b2

### Low (nice to have)
- Zone brand scheduling filter — done in 4c7f50a
- ShuttingDown to Terminating mapping fix — done in 58171c7
- bhyve brand — type exists but no implementation