reddwarf/crates
Till Wegmueller d79f8ce011
Add health probes (liveness/readiness/startup) with exec, HTTP, and TCP checks
Implement Kubernetes-style health probes that run during the reconcile loop
to detect unhealthy applications inside running zones. Previously the pod
controller only checked zone liveness via get_zone_state(), missing cases
where the zone is running but the application inside has crashed.

- Add exec_in_zone() to ZoneRuntime trait, implemented via zlogin on illumos
  and with configurable mock results for testing
- Add probe type system (ProbeKind, ProbeAction, ContainerProbeConfig) that
  decouples from k8s_openapi and extracts probes from pod container specs
  with proper k8s defaults (period=10s, timeout=1s, failure=3, success=1)
- Add ProbeExecutor for exec/HTTP/TCP checks with tokio timeout support
  (HTTPS falls back to a TCP-only check with a warning)
- Add ProbeTracker state machine that tracks per-pod/container/probe-kind
  state, respects initial delays and periods, gates liveness on startup
  probes, and aggregates results into PodProbeStatus
- Integrate into PodController reconcile loop: on liveness failure set
  phase=Failed with reason LivenessProbeFailure; on readiness failure set
  Ready=False; on all-pass restore Ready=True
- Add ProbeFailed error variant with miette diagnostic
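
The defaults and threshold handling described above can be sketched roughly as
follows. This is an illustration only, not the code in reddwarf-runtime:
ProbeTiming, ProbeHealth, and ProbeCounter are hypothetical names, and the
zero initial delay is assumed from the Kubernetes default rather than taken
from this commit.

```rust
use std::time::Duration;

/// Hypothetical timing settings for one probe (names are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct ProbeTiming {
    pub initial_delay: Duration,
    pub period: Duration,
    pub timeout: Duration,
    pub failure_threshold: u32,
    pub success_threshold: u32,
}

impl Default for ProbeTiming {
    /// Defaults named in the commit message: period=10s, timeout=1s,
    /// failure=3, success=1. The zero initial delay is an assumption.
    fn default() -> Self {
        Self {
            initial_delay: Duration::ZERO,
            period: Duration::from_secs(10),
            timeout: Duration::from_secs(1),
            failure_threshold: 3,
            success_threshold: 1,
        }
    }
}

/// Health as seen by the tracker; Unknown until a threshold is reached.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeHealth {
    Unknown,
    Healthy,
    Unhealthy,
}

/// Hypothetical per-probe counter of consecutive results.
pub struct ProbeCounter {
    timing: ProbeTiming,
    successes: u32,
    failures: u32,
    health: ProbeHealth,
}

impl ProbeCounter {
    pub fn new(timing: ProbeTiming) -> Self {
        Self { timing, successes: 0, failures: 0, health: ProbeHealth::Unknown }
    }

    /// Record one probe result and return the resulting health.
    /// A result of the opposite kind resets the other counter.
    pub fn record(&mut self, passed: bool) -> ProbeHealth {
        if passed {
            self.failures = 0;
            self.successes = self.successes.saturating_add(1);
            if self.successes >= self.timing.success_threshold {
                self.health = ProbeHealth::Healthy;
            }
        } else {
            self.successes = 0;
            self.failures = self.failures.saturating_add(1);
            if self.failures >= self.timing.failure_threshold {
                self.health = ProbeHealth::Unhealthy;
            }
        }
        self.health
    }
}
```

With the default thresholds, two consecutive failures leave the health
unchanged, the third flips it to Unhealthy, and a single success afterwards
restores Healthy.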

Known v1 limitation: probes execute at reconcile cadence (~30s), not at
their configured periodSeconds.
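
To make the effect of this limitation concrete, here is a rough sketch of the
arithmetic. It assumes a probe runs on the first reconcile tick at which its
period has elapsed and that the reconcile interval is exactly 30s; both are
assumptions for illustration, and effective_interval and time_to_threshold
are hypothetical helpers, not functions from this commit.

```rust
use std::time::Duration;

/// Interval at which a probe actually runs when it can only fire on
/// reconcile ticks: the smallest whole number of ticks covering `period`.
pub fn effective_interval(period: Duration, reconcile: Duration) -> Duration {
    assert!(!reconcile.is_zero(), "reconcile interval must be non-zero");
    let p = period.as_millis();
    let r = reconcile.as_millis();
    // Ceiling division, with at least one tick between runs.
    let ticks = ((p + r - 1) / r).max(1);
    reconcile * ticks as u32
}

/// Rough upper bound on the time needed to accumulate `failure_threshold`
/// consecutive failures, ignoring probe timeouts and scheduling jitter.
pub fn time_to_threshold(
    period: Duration,
    reconcile: Duration,
    failure_threshold: u32,
) -> Duration {
    effective_interval(period, reconcile) * failure_threshold
}
```

Under these assumptions, a probe with the default 10s period runs every 30s,
so reaching the default failure threshold of 3 takes on the order of 90s
rather than 30s.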

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 22:41:30 +01:00
reddwarf Add ZoneBrandMatch scheduler filter to reject brand-incompatible nodes 2026-02-14 21:45:51 +01:00
reddwarf-apiserver Add periodic reconciliation, node health checker, and graceful pod termination 2026-02-14 20:39:36 +01:00
reddwarf-core Add container resource limits to zone caps: extract, aggregate, and convert 2026-02-14 17:34:39 +01:00
reddwarf-runtime Add health probes (liveness/readiness/startup) with exec, HTTP, and TCP checks 2026-02-14 22:41:30 +01:00
reddwarf-scheduler Add ZoneBrandMatch scheduler filter to reject brand-incompatible nodes 2026-02-14 21:45:51 +01:00
reddwarf-storage Format code 2026-01-28 23:17:19 +01:00
reddwarf-versioning Format code 2026-01-28 23:17:19 +01:00