mirror of
https://github.com/CloudNebulaProject/reddwarf.git
synced 2026-04-10 13:20:40 +00:00
Implement Kubernetes-style health probes that run during the reconcile loop to detect unhealthy applications inside running zones. Previously the pod controller only checked zone liveness via get_zone_state(), missing cases where the zone is running but the application inside has crashed. - Add exec_in_zone() to ZoneRuntime trait, implemented via zlogin on illumos and with configurable mock results for testing - Add probe type system (ProbeKind, ProbeAction, ContainerProbeConfig) that decouples from k8s_openapi and extracts probes from pod container specs with proper k8s defaults (period=10s, timeout=1s, failure=3, success=1) - Add ProbeExecutor for exec/HTTP/TCP checks with tokio timeout support (HTTPS falls back to TCP-only with warning) - Add ProbeTracker state machine that tracks per-pod/container/probe-kind state, respects initial delays and periods, gates liveness on startup probes, and aggregates results into PodProbeStatus - Integrate into PodController reconcile loop: on liveness failure set phase=Failed with reason LivenessProbeFailure; on readiness failure set Ready=False; on all-pass restore Ready=True - Add ProbeFailed error variant with miette diagnostic Known v1 limitation: probes execute at reconcile cadence (~30s), not at their configured periodSeconds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| src | ||
| Cargo.toml | ||