Implement Kubernetes-style health probes that run during the reconcile loop
to detect unhealthy applications inside running zones. Previously the pod
controller only checked zone liveness via get_zone_state(), missing cases
where the zone is running but the application inside has crashed.
- Add exec_in_zone() to ZoneRuntime trait, implemented via zlogin on illumos
and with configurable mock results for testing
- Add a probe type system (ProbeKind, ProbeAction, ContainerProbeConfig),
  decoupled from k8s_openapi, that extracts probes from pod container specs
  with proper k8s defaults (period=10s, timeout=1s, failureThreshold=3,
  successThreshold=1)
- Add ProbeExecutor for exec/HTTP/TCP checks with tokio timeout support
  (HTTPS probes fall back to a TCP-only check, with a warning)
- Add ProbeTracker state machine that tracks per-pod/container/probe-kind
state, respects initial delays and periods, gates liveness on startup
probes, and aggregates results into PodProbeStatus
- Integrate into PodController reconcile loop: on liveness failure set
phase=Failed with reason LivenessProbeFailure; on readiness failure set
Ready=False; on all-pass restore Ready=True
- Add ProbeFailed error variant with miette diagnostic
Known v1 limitation: probes execute at reconcile cadence (~30s), not at
their configured periodSeconds.
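A minimal std-only sketch of the threshold counting the tracker applies. The names (Thresholds, ProbeState, observe) are hypothetical, and the real ProbeTracker additionally handles initial delays, periods, and startup gating:

```rust
// Hypothetical sketch of k8s-style probe thresholds: a probe flips unhealthy
// only after `failure` consecutive failures, and back to healthy after
// `success` consecutive successes; a pass resets the failure streak.
#[derive(Clone, Copy)]
pub struct Thresholds {
    pub failure: u32, // k8s default: 3
    pub success: u32, // k8s default: 1
}

pub struct ProbeState {
    fails: u32,
    passes: u32,
    healthy: bool,
}

impl ProbeState {
    pub fn new() -> Self {
        // Probes start out healthy until the failure threshold is reached.
        ProbeState { fails: 0, passes: 0, healthy: true }
    }

    /// Record one probe result and return the current health verdict.
    pub fn observe(&mut self, ok: bool, t: Thresholds) -> bool {
        if ok {
            self.passes += 1;
            self.fails = 0;
            if self.passes >= t.success {
                self.healthy = true;
            }
        } else {
            self.fails += 1;
            self.passes = 0;
            if self.fails >= t.failure {
                self.healthy = false;
            }
        }
        self.healthy
    }
}

fn main() {
    let t = Thresholds { failure: 3, success: 1 };
    let mut s = ProbeState::new();
    assert!(s.observe(false, t));  // 1 failure: still healthy
    assert!(s.observe(false, t));  // 2 failures: still healthy
    assert!(!s.observe(false, t)); // 3rd consecutive failure: unhealthy
    assert!(s.observe(true, t));   // 1 success restores health
    println!("threshold logic ok");
}
```

The consecutive-counting detail is what makes a single flaky check harmless: one pass resets the failure streak, so only a sustained outage crosses the threshold.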
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the node agent's hardcoded memory (8Gi) and pod-count (110) limits with
actual system detection via the sys-info crate. CPU and memory are detected
once at NodeAgent construction and reused on every heartbeat. Capacity
reports raw hardware values while allocatable subtracts configurable
reservations (--system-reserved-cpu, --system-reserved-memory, --max-pods),
giving the scheduler accurate data for filtering and scoring.
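The capacity/allocatable split can be sketched as below. The Resources shape and millicpu/byte units are assumptions for illustration, not the crate's actual types:

```rust
// Hypothetical sketch: capacity reports the raw detected hardware;
// allocatable subtracts operator-configured reservations, clamped at zero
// so over-reservation can never underflow.
#[derive(Debug, PartialEq)]
pub struct Resources {
    pub cpu_millis: u64,
    pub memory_bytes: u64,
    pub pods: u32,
}

pub fn allocatable(
    capacity: &Resources,
    reserved_cpu_millis: u64,   // --system-reserved-cpu
    reserved_memory_bytes: u64, // --system-reserved-memory
    max_pods: u32,              // --max-pods
) -> Resources {
    Resources {
        cpu_millis: capacity.cpu_millis.saturating_sub(reserved_cpu_millis),
        memory_bytes: capacity.memory_bytes.saturating_sub(reserved_memory_bytes),
        pods: max_pods,
    }
}

fn main() {
    // e.g. 8 cores / 16 GiB detected, with 500m CPU and 1 GiB reserved
    let cap = Resources { cpu_millis: 8000, memory_bytes: 16 << 30, pods: 110 };
    let alloc = allocatable(&cap, 500, 1 << 30, 110);
    assert_eq!(alloc.cpu_millis, 7500);
    assert_eq!(alloc.memory_bytes, 15 << 30);
    println!("{:?}", alloc);
}
```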
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three high-priority reliability features that close gaps identified in AUDIT.md:
1. Periodic reconciliation: PodController now runs reconcile_all() every 30s
via a tokio::time::interval branch in the select! loop, detecting zone
crashes between events.
2. Node health checker: New NodeHealthChecker polls node heartbeats every 15s
and marks nodes with stale heartbeats (>40s) as NotReady with reason
NodeStatusUnknown, preserving last_transition_time correctly.
3. Graceful pod termination: DELETE sets deletion_timestamp and phase=Terminating
instead of immediate removal. Controller drives a state machine (shutdown →
halt on grace expiry → deprovision → finalize) with periodic reconcile
advancing it. New POST .../finalize endpoint performs actual storage removal.
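The termination state machine in item 3 can be sketched as a pure transition function, with `grace_expired` standing in for the comparison of now against deletion_timestamp plus the grace period. Names and the exact step set are illustrative:

```rust
// Hypothetical sketch of the graceful-termination steps the controller
// advances on each periodic reconcile pass.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TerminationStep {
    Shutdown,    // polite in-zone shutdown requested
    Halt,        // forced halt once the grace period expires
    Deprovision, // zone configuration torn down
    Finalized,   // storage removed via the finalize endpoint
}

pub fn advance(step: TerminationStep, grace_expired: bool) -> TerminationStep {
    use TerminationStep::*;
    match step {
        Shutdown if grace_expired => Halt,
        Shutdown => Shutdown, // keep waiting inside the grace period
        Halt => Deprovision,
        Deprovision => Finalized,
        Finalized => Finalized, // terminal: safe to call repeatedly
    }
}

fn main() {
    use TerminationStep::*;
    let mut s = Shutdown;
    s = advance(s, false);
    assert_eq!(s, Shutdown); // still within the grace period
    s = advance(s, true);    // grace expired: force halt
    assert_eq!(s, Halt);
    s = advance(s, true);
    assert_eq!(s, Deprovision);
    s = advance(s, true);
    assert_eq!(s, Finalized);
    println!("termination sequence ok");
}
```

Making each transition idempotent at the terminal state means repeated reconcile passes after completion are harmless no-ops.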
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Decouple storage from the ZoneRuntime trait into a dedicated StorageEngine
trait with ZfsStorageEngine (illumos) and MockStorageEngine (testing)
implementations. Replace the per-zone ZfsConfig with a global
StoragePoolConfig that derives dataset hierarchy from a single --storage-pool
flag, with optional per-dataset overrides. This enables persistent volumes,
auto-created base datasets on startup, and a clean extension point for
future storage backends.
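A sketch of the shape this decoupling takes. The trait methods and the derived dataset names (e.g. `<pool>/reddwarf/zones`) are assumptions for illustration, not the crate's actual layout:

```rust
use std::collections::HashSet;

// Hypothetical global pool config: the dataset hierarchy is derived from a
// single --storage-pool value, with optional per-dataset overrides.
pub struct StoragePoolConfig {
    pub pool: String,
    pub zones_override: Option<String>,
    pub volumes_override: Option<String>,
}

impl StoragePoolConfig {
    pub fn zones(&self) -> String {
        self.zones_override
            .clone()
            .unwrap_or_else(|| format!("{}/reddwarf/zones", self.pool))
    }
    pub fn volumes(&self) -> String {
        self.volumes_override
            .clone()
            .unwrap_or_else(|| format!("{}/reddwarf/volumes", self.pool))
    }
}

// Hypothetical trait surface: the illumos impl would shell out to zfs(8),
// while the mock just records dataset names for tests.
pub trait StorageEngine {
    fn create_dataset(&mut self, name: &str) -> Result<(), String>;
    fn dataset_exists(&self, name: &str) -> bool;
}

#[derive(Default)]
pub struct MockStorageEngine {
    datasets: HashSet<String>,
}

impl StorageEngine for MockStorageEngine {
    fn create_dataset(&mut self, name: &str) -> Result<(), String> {
        self.datasets.insert(name.to_string());
        Ok(())
    }
    fn dataset_exists(&self, name: &str) -> bool {
        self.datasets.contains(name)
    }
}

fn main() {
    let cfg = StoragePoolConfig {
        pool: "rpool".into(),
        zones_override: None,
        volumes_override: None,
    };
    let mut eng = MockStorageEngine::default();
    eng.create_dataset(&cfg.zones()).unwrap();
    assert!(eng.dataset_exists("rpool/reddwarf/zones"));
    assert_eq!(cfg.volumes(), "rpool/reddwarf/volumes");
    println!("storage sketch ok");
}
```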
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each pod now gets a unique VNIC name and IP address from a configurable
CIDR pool, with IPs released on pod deletion. This replaces the
hardcoded single VNIC/IP that prevented multiple pods from running.
- Add a redb-backed IPAM module with idempotent allocate/release semantics
- Add prefix_len to EtherstubConfig and DirectNicConfig
- Generate allowed-address and defrouter in zonecfg net blocks
- Wire vnic_name_for_pod() into controller for unique VNIC names
- Add --pod-cidr and --etherstub-name CLI flags to agent subcommand
- Add StorageError and IpamPoolExhausted error variants with diagnostics
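The allocation semantics can be sketched with an in-memory map standing in for the redb table. Names are hypothetical; the point is the idempotency contract: allocating twice for the same pod returns the same address, and release is a no-op for unknown pods:

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

// Hypothetical in-memory sketch of the IPAM semantics (the real module
// persists allocations in redb and draws from a --pod-cidr range).
pub struct IpPool {
    base: u32, // first usable address, as a host-order integer
    size: u32, // number of usable addresses in the pool
    by_pod: HashMap<String, Ipv4Addr>,
}

impl IpPool {
    pub fn new(base: Ipv4Addr, size: u32) -> Self {
        IpPool { base: u32::from(base), size, by_pod: HashMap::new() }
    }

    pub fn allocate(&mut self, pod: &str) -> Option<Ipv4Addr> {
        if let Some(ip) = self.by_pod.get(pod) {
            return Some(*ip); // idempotent: same pod, same address
        }
        let in_use: Vec<u32> = self.by_pod.values().map(|ip| u32::from(*ip)).collect();
        for off in 0..self.size {
            let candidate = self.base + off;
            if !in_use.contains(&candidate) {
                let ip = Ipv4Addr::from(candidate);
                self.by_pod.insert(pod.to_string(), ip);
                return Some(ip);
            }
        }
        None // would surface as IpamPoolExhausted in the real module
    }

    pub fn release(&mut self, pod: &str) {
        self.by_pod.remove(pod); // releasing an unknown pod is a no-op
    }
}

fn main() {
    let mut pool = IpPool::new(Ipv4Addr::new(10, 88, 0, 2), 2);
    let a = pool.allocate("pod-a").unwrap();
    assert_eq!(pool.allocate("pod-a").unwrap(), a); // idempotent re-allocation
    let b = pool.allocate("pod-b").unwrap();
    assert_ne!(a, b);
    assert!(pool.allocate("pod-c").is_none()); // pool exhausted
    pool.release("pod-a");
    assert_eq!(pool.allocate("pod-c").unwrap(), a); // freed address reused
}
```

Idempotency is what makes the reconcile loop safe to re-run: a crashed controller can replay provisioning without leaking or double-assigning addresses.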
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement the core reconciliation loop that connects Pod events to zone
lifecycle. Status subresource endpoints allow updating pod/node status
without triggering spec-level changes. The main binary now provides
`serve` (API server only) and `agent` (full node: API + scheduler +
controller + heartbeat) subcommands via clap.
- Status subresource: generic update_status in common.rs, PUT endpoints
for /pods/{name}/status and /nodes/{name}/status
- Pod controller: polls pods assigned to this node, provisions zones via
ZoneRuntime, updates status to Running/Failed, monitors zone health
- Node agent: registers host as a Node, sends periodic heartbeats with
Ready condition
- API client: lightweight reqwest-based HTTP client for controller and
node agent to talk to the API server
- Main binary: clap CLI with serve/agent commands, wires all components
together with graceful shutdown via ctrl-c
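The status-subresource rule above can be illustrated with a toy record; the flat field names here are hypothetical stand-ins for the real nested spec/status types:

```rust
// Hypothetical sketch of the status-subresource contract: a PUT to
// /pods/{name}/status replaces only the status portion, so spec fields
// (and anything keyed off spec edits) are left untouched.
#[derive(Clone, Debug, PartialEq)]
pub struct Pod {
    pub spec_generation: u64, // bumped only by spec-level changes
    pub spec_image: String,
    pub status_phase: String,
}

pub fn update_status(stored: &mut Pod, new_phase: &str) {
    // Only status changes; the spec and its generation stay as they are,
    // so a status write never re-triggers spec-driven reconciliation.
    stored.status_phase = new_phase.to_string();
}

fn main() {
    let mut pod = Pod {
        spec_generation: 4,
        spec_image: "nginx:1.27".into(),
        status_phase: "Pending".into(),
    };
    update_status(&mut pod, "Running");
    assert_eq!(pod.status_phase, "Running");
    assert_eq!(pod.spec_generation, 4); // spec untouched
    println!("status subresource sketch ok");
}
```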
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement an in-process broadcast event bus for resource mutations
(ADDED/MODIFIED/DELETED) with SSE watch endpoints on all list handlers,
following the Kubernetes watch protocol. Add the reddwarf-runtime crate
with a trait-based zone runtime abstraction targeting illumos zones,
including LX and custom reddwarf brand support, etherstub/direct VNIC
networking, ZFS dataset management, and a MockRuntime for testing on
non-illumos platforms.
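The fan-out semantics can be sketched with std channels (the real bus is an in-process broadcast channel whose events the SSE handlers stream to watchers). Names here are illustrative:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

// Kubernetes-style watch events, as in the ADDED/MODIFIED/DELETED protocol.
#[derive(Clone, Debug, PartialEq)]
pub enum WatchEvent {
    Added(String),
    Modified(String),
    Deleted(String),
}

// Hypothetical std-only sketch of the broadcast bus: every subscriber gets
// its own copy of each published event.
#[derive(Default)]
pub struct EventBus {
    subscribers: Vec<Sender<WatchEvent>>,
}

impl EventBus {
    pub fn subscribe(&mut self) -> Receiver<WatchEvent> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Fan the event out to every live subscriber; receivers that have been
    /// dropped (e.g. a disconnected SSE client) are pruned on send failure.
    pub fn publish(&mut self, ev: WatchEvent) {
        self.subscribers.retain(|tx| tx.send(ev.clone()).is_ok());
    }
}

fn main() {
    let mut bus = EventBus::default();
    let a = bus.subscribe();
    let b = bus.subscribe();
    bus.publish(WatchEvent::Added("pod/web-0".into()));
    assert_eq!(a.recv().unwrap(), WatchEvent::Added("pod/web-0".into()));
    assert_eq!(b.recv().unwrap(), WatchEvent::Added("pod/web-0".into()));
    println!("event bus sketch ok");
}
```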
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>