Kubernetes but better and for illumos - mirror
Till Wegmueller d79f8ce011
Add health probes (liveness/readiness/startup) with exec, HTTP, and TCP checks
Implement Kubernetes-style health probes that run during the reconcile loop
to detect unhealthy applications inside running zones. Previously the pod
controller only checked zone liveness via get_zone_state(), missing cases
where the zone is running but the application inside has crashed.

- Add exec_in_zone() to ZoneRuntime trait, implemented via zlogin on illumos
  and with configurable mock results for testing
- Add probe type system (ProbeKind, ProbeAction, ContainerProbeConfig) that
  decouples from k8s_openapi and extracts probes from pod container specs
  with proper k8s defaults (period=10s, timeout=1s, failure=3, success=1)
- Add ProbeExecutor for exec/HTTP/TCP checks with tokio timeout support
  (HTTPS falls back to TCP-only with warning)
- Add ProbeTracker state machine that tracks per-pod/container/probe-kind
  state, respects initial delays and periods, gates liveness on startup
  probes, and aggregates results into PodProbeStatus
- Integrate into PodController reconcile loop: on liveness failure set
  phase=Failed with reason LivenessProbeFailure; on readiness failure set
  Ready=False; on all-pass restore Ready=True
- Add ProbeFailed error variant with miette diagnostic

Known v1 limitation: probes execute at reconcile cadence (~30s), not at
their configured periodSeconds.
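
The threshold defaults listed above (period=10s, timeout=1s, failure=3, success=1) suggest a small counter-based state machine. The following is a minimal sketch of how consecutive results might be folded into a health verdict; `ProbeThresholds`, `ProbeState`, and `Health` are hypothetical names for illustration and are not taken from the crate.

```rust
/// Hypothetical threshold settings; the defaults mirror the values in the commit message.
#[derive(Debug, Clone, Copy)]
pub struct ProbeThresholds {
    pub period_seconds: u32,
    pub timeout_seconds: u32,
    pub failure_threshold: u32,
    pub success_threshold: u32,
}

impl Default for ProbeThresholds {
    fn default() -> Self {
        Self {
            period_seconds: 10,
            timeout_seconds: 1,
            failure_threshold: 3,
            success_threshold: 1,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Unknown,
    Healthy,
    Unhealthy,
}

/// Tracks consecutive results for one probe (assumed shape, not the real ProbeTracker).
#[derive(Debug)]
pub struct ProbeState {
    thresholds: ProbeThresholds,
    consecutive_successes: u32,
    consecutive_failures: u32,
    health: Health,
}

impl ProbeState {
    pub fn new(thresholds: ProbeThresholds) -> Self {
        Self {
            thresholds,
            consecutive_successes: 0,
            consecutive_failures: 0,
            health: Health::Unknown,
        }
    }

    /// Record one probe result and return the verdict after applying the thresholds.
    pub fn record(&mut self, success: bool) -> Health {
        if success {
            self.consecutive_failures = 0;
            self.consecutive_successes += 1;
            if self.consecutive_successes >= self.thresholds.success_threshold {
                self.health = Health::Healthy;
            }
        } else {
            self.consecutive_successes = 0;
            self.consecutive_failures += 1;
            if self.consecutive_failures >= self.thresholds.failure_threshold {
                self.health = Health::Unhealthy;
            }
        }
        self.health
    }
}
```

With the defaults, a probe flips to `Unhealthy` only on the third consecutive failure, and a single success restores `Healthy`.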

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 22:41:30 +01:00
Name               Last commit                                                                     Date
crates             Add health probes (liveness/readiness/startup) with exec, HTTP, and TCP checks  2026-02-14 22:41:30 +01:00
docs/ai/research   Implement first 3 phases of implementation plan                                 2026-01-28 22:51:26 +01:00
smf                Add optional TLS support and SMF service integration                            2026-02-14 18:45:20 +01:00
.gitignore         Initial commit                                                                  2026-01-28 22:50:31 +01:00
AUDIT.md           Add health probes (liveness/readiness/startup) with exec, HTTP, and TCP checks  2026-02-14 22:41:30 +01:00
Cargo.lock         Add dynamic node resource detection with configurable system reservations       2026-02-14 21:17:43 +01:00
Cargo.toml         Add dynamic node resource detection with configurable system reservations       2026-02-14 21:17:43 +01:00
DEVELOPMENT.md     Implement first 3 phases of implementation plan                                 2026-01-28 22:51:26 +01:00
LICENSE            Initial commit                                                                  2026-01-28 22:50:31 +01:00
PHASE4_SUMMARY.md  Implement phase 4                                                               2026-01-28 23:06:06 +01:00
README.md          Implement phase 4                                                               2026-01-28 23:06:06 +01:00

Reddwarf: Rust-Based Single-Binary Kubernetes Control Plane

A pure Rust implementation of a Kubernetes control plane with DAG-based resource versioning.

Project Status

Current Phase: Phase 4 Complete (API Server)

Completed Phases

Phase 1: Foundation & Core Types

  • Workspace structure created
  • Core Kubernetes types and traits (Pod, Node, Service, Namespace)
  • Error handling with miette diagnostics
  • ResourceKey and GroupVersionKind types
  • JSON/YAML serialization helpers
  • 9 tests passing
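
As an illustration of the ResourceKey and GroupVersionKind types named above, here is a minimal sketch. The field names, the `api_version` helper, and the `Display` layout are assumptions made for this example; the types in reddwarf-core may look different.

```rust
use std::fmt;

/// Identifies an API type, e.g. group "" (core), version "v1", kind "Pod".
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct GroupVersionKind {
    pub group: String,
    pub version: String,
    pub kind: String,
}

impl GroupVersionKind {
    pub fn new(group: &str, version: &str, kind: &str) -> Self {
        Self {
            group: group.to_string(),
            version: version.to_string(),
            kind: kind.to_string(),
        }
    }

    /// The `apiVersion` string used in manifests: "v1" for the core group,
    /// "group/version" otherwise.
    pub fn api_version(&self) -> String {
        if self.group.is_empty() {
            self.version.clone()
        } else {
            format!("{}/{}", self.group, self.version)
        }
    }
}

/// Identifies one object; `namespace` is `None` for cluster-scoped resources.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ResourceKey {
    pub gvk: GroupVersionKind,
    pub namespace: Option<String>,
    pub name: String,
}

impl fmt::Display for ResourceKey {
    // Hypothetical textual layout: apiVersion/kind[/namespace]/name.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let gvk = &self.gvk;
        match &self.namespace {
            Some(ns) => write!(f, "{}/{}/{}/{}", gvk.api_version(), gvk.kind, ns, self.name),
            None => write!(f, "{}/{}/{}", gvk.api_version(), gvk.kind, self.name),
        }
    }
}
```

Deriving `Hash` and `Eq` lets such a key serve directly as a map key, which is convenient for caches and indexes.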

Phase 2: Storage Layer with redb

  • KVStore trait abstraction
  • redb backend implementation (100% pure Rust)
  • Key encoding for resources
  • Transaction support
  • Prefix scanning and indexing
  • 9 tests passing
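
To make the storage abstraction concrete, the sketch below shows a key-value trait with prefix scanning, backed by an in-memory `BTreeMap` instead of redb. `SimpleKvStore`, `MemoryStore`, and `encode_key` are hypothetical names; the real KVStore trait and key encoding in reddwarf-storage are not shown in this README and may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage trait for this sketch (not the crate's KVStore signature).
pub trait SimpleKvStore {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn delete(&mut self, key: &[u8]) -> bool;
    /// Return all entries whose key starts with `prefix`, in key order.
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
}

/// In-memory stand-in for a persistent backend; a sorted map keeps keys ordered.
#[derive(Default)]
pub struct MemoryStore {
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl SimpleKvStore for MemoryStore {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.entries.insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.entries.get(key).cloned()
    }

    fn delete(&mut self, key: &[u8]) -> bool {
        self.entries.remove(key).is_some()
    }

    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        // Start at the first key >= prefix and stop once keys no longer match.
        self.entries
            .range(prefix.to_vec()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

/// Assumed key layout: "<resource>/<namespace>/<name>".
pub fn encode_key(resource: &str, namespace: &str, name: &str) -> Vec<u8> {
    format!("{resource}/{namespace}/{name}").into_bytes()
}
```

Because keys sort lexicographically, listing every pod in a namespace becomes a single range scan over the `pods/<namespace>/` prefix.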

Phase 3: Versioning Layer

  • VersionStore for DAG-based versioning
  • Commit operations (create, get, list)
  • Conflict detection between concurrent modifications
  • DAG traversal for history
  • Common ancestor finding
  • 7 tests passing
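
The DAG operations listed above (history traversal, common ancestor finding, conflict detection) can be sketched with a map of commits and their parent links. `CommitGraph` and its methods are hypothetical and only illustrate the idea; they are not the VersionStore API.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

#[derive(Debug, Clone)]
pub struct Commit {
    pub id: String,
    pub parents: Vec<String>,
}

/// Hypothetical in-memory commit DAG used only for this sketch.
#[derive(Default)]
pub struct CommitGraph {
    commits: HashMap<String, Commit>,
}

impl CommitGraph {
    pub fn add(&mut self, id: &str, parents: &[&str]) {
        let commit = Commit {
            id: id.to_string(),
            parents: parents.iter().map(|p| p.to_string()).collect(),
        };
        self.commits.insert(commit.id.clone(), commit);
    }

    /// Known commits reachable from `start` (inclusive), in breadth-first order.
    pub fn ancestors(&self, start: &str) -> Vec<String> {
        let mut seen = HashSet::new();
        let mut order = Vec::new();
        let mut queue = VecDeque::from([start.to_string()]);
        while let Some(id) = queue.pop_front() {
            if !seen.insert(id.clone()) {
                continue; // already visited via another path
            }
            if let Some(commit) = self.commits.get(&id) {
                queue.extend(commit.parents.iter().cloned());
                order.push(id);
            }
        }
        order
    }

    /// First commit found walking back from `b` that is also reachable from `a`.
    /// This is a simple heuristic, not a guaranteed lowest common ancestor.
    pub fn common_ancestor(&self, a: &str, b: &str) -> Option<String> {
        let from_a: HashSet<String> = self.ancestors(a).into_iter().collect();
        self.ancestors(b).into_iter().find(|id| from_a.contains(id))
    }

    /// Two heads conflict when neither one is an ancestor of the other.
    pub fn conflicts(&self, a: &str, b: &str) -> bool {
        let a_before_b = self.ancestors(b).iter().any(|id| id.as_str() == a);
        let b_before_a = self.ancestors(a).iter().any(|id| id.as_str() == b);
        !a_before_b && !b_before_a
    }
}
```

In this model a linear update (one head descends from the other) is a fast-forward, while two heads that merely share a base are treated as concurrent modifications.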

Phase 4: API Server

  • Axum-based REST API server
  • HTTP verb handlers (GET, POST, PUT, PATCH, DELETE)
  • Pod, Node, Service, Namespace endpoints
  • LIST operations with prefix filtering
  • Resource validation
  • Kubernetes-compatible error responses
  • Health check endpoints (/healthz, /livez, /readyz)
  • 7 tests passing
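
Kubernetes clients expect failures to arrive as a `Status` object carrying `status`, `reason`, `code`, and `message`. The sketch below maps an internal error to those fields using only the standard library. `ApiError` and its variants are hypothetical, and the message wording imitates upstream Kubernetes rather than quoting reddwarf's output.

```rust
/// Hypothetical error type for this sketch; not the crate's actual error enum.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ApiError {
    NotFound { resource: String, name: String },
    AlreadyExists { resource: String, name: String },
    Invalid { kind: String, name: String, cause: String },
}

/// The fields of a Kubernetes `Status` body that clients usually inspect.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Status {
    pub status: &'static str,
    pub reason: &'static str,
    pub code: u16,
    pub message: String,
}

impl From<&ApiError> for Status {
    fn from(err: &ApiError) -> Self {
        // Pair each error with the reason string and HTTP code Kubernetes uses for it.
        let (reason, code, message) = match err {
            ApiError::NotFound { resource, name } => {
                ("NotFound", 404, format!("{resource} \"{name}\" not found"))
            }
            ApiError::AlreadyExists { resource, name } => {
                ("AlreadyExists", 409, format!("{resource} \"{name}\" already exists"))
            }
            ApiError::Invalid { kind, name, cause } => {
                ("Invalid", 422, format!("{kind} \"{name}\" is invalid: {cause}"))
            }
        };
        Status {
            status: "Failure",
            reason,
            code,
            message,
        }
    }
}
```

A handler would serialize this struct as the JSON response body (with `kind: "Status"` and `apiVersion: "v1"`) and use `code` as the HTTP status.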

Total: 32 tests passing

Architecture

reddwarf/
├── crates/
│   ├── reddwarf-core/          # ✅ Core K8s types & traits
│   ├── reddwarf-storage/       # ✅ redb storage backend
│   ├── reddwarf-versioning/    # ✅ DAG-based versioning
│   ├── reddwarf-apiserver/     # ✅ Axum REST API server
│   ├── reddwarf-scheduler/     # 🔄 Pod scheduler (pending)
│   └── reddwarf/               # 🔄 Main binary (pending)
└── tests/                      # 🔄 Integration tests (pending)

Building

# Build all crates
cargo build --workspace

# Run all tests
cargo test --workspace

# Run clippy
cargo clippy --workspace -- -D warnings

# Build release binary
cargo build --release

Next Phases

Phase 5: Basic Scheduler (Week 6)

  • Pod scheduling to nodes
  • Resource-based filtering
  • Simple scoring algorithm
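
One way the planned filter-then-score flow could look is sketched below: nodes that cannot fit the pod's requests are filtered out, and the rest are ranked by how much free capacity would remain. All types and the scoring formula are assumptions for illustration, since the scheduler crate is still pending.

```rust
#[derive(Debug, Clone)]
pub struct NodeInfo {
    pub name: String,
    pub free_cpu_millis: u64,
    pub free_memory_bytes: u64,
}

#[derive(Debug, Clone, Copy)]
pub struct PodRequest {
    pub cpu_millis: u64,
    pub memory_bytes: u64,
}

/// Filter step: the node must have enough free CPU and memory.
pub fn fits(node: &NodeInfo, req: PodRequest) -> bool {
    req.cpu_millis <= node.free_cpu_millis && req.memory_bytes <= node.free_memory_bytes
}

/// Percentage (0..=100) of a free resource that would remain after placement.
fn remaining_percent(free: u64, requested: u64) -> u64 {
    if free == 0 {
        0
    } else {
        free.saturating_sub(requested) * 100 / free
    }
}

/// Score step: average of the remaining CPU and memory percentages.
pub fn score(node: &NodeInfo, req: PodRequest) -> u64 {
    let cpu = remaining_percent(node.free_cpu_millis, req.cpu_millis);
    let mem = remaining_percent(node.free_memory_bytes, req.memory_bytes);
    (cpu + mem) / 2
}

/// Pick the highest-scoring node that fits, or `None` if no node fits.
pub fn schedule(nodes: &[NodeInfo], req: PodRequest) -> Option<&str> {
    nodes
        .iter()
        .filter(|n| fits(n, req))
        .max_by_key(|n| score(n, req))
        .map(|n| n.name.as_str())
}
```

This formula favors spreading pods onto the emptiest node; a bin-packing policy would invert the score instead.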

Phase 6: Main Binary Integration (Week 7)

  • Single binary combining all components
  • Configuration and CLI
  • TLS support
  • Graceful shutdown
  • Observability (logging, metrics)

Phase 7: Testing & Documentation (Week 8)

  • Integration tests
  • End-to-end tests with kubectl
  • User documentation
  • API documentation

Key Features

  • Pure Rust: 100% Rust implementation, no C++ dependencies
  • Portable: Supports the x86_64 and ARM64 architectures and the illumos operating system
  • redb Storage: Fast, ACID-compliant storage with MVCC
  • DAG Versioning: Advanced resource versioning with conflict detection
  • Type-Safe: Leverages Rust's type system for correctness
  • Rich Errors: miette diagnostics for user-friendly error messages

License

MIT OR Apache-2.0