Commit graph

5 commits

Author SHA1 Message Date
Claude
d8425ad85d
Add service networking, bhyve brand, ipadm IP config, and zone state reporting
Service networking:
- ClusterIP IPAM allocation on service create/delete via reusable Ipam with_prefix()
- ServiceController watches Pod/Service events + periodic reconcile to track endpoints
- NatManager generates ipnat rdr rules for ClusterIP -> pod IP forwarding
- Embedded DNS server resolves {svc}.{ns}.svc.cluster.local to ClusterIP
- New CLI flags: --service-cidr (default 10.96.0.0/12), --cluster-dns (default 0.0.0.0:10053)

Quick wins:
- ipadm IP assignment: configure_zone_ip() runs ipadm/route inside zone via zlogin after boot
- Node heartbeat zone state reporting: reddwarf.io/zone-count and zone-summary annotations
- bhyve brand support: ZoneBrand::Bhyve, install args, zonecfg device generation, controller integration

189 tests passing, clippy clean.

https://claude.ai/code/session_016QLFjAyYGzMPbBjEGMe75j
2026-03-19 20:28:40 +00:00
Till Wegmueller
4bfcc39a69
Add container resource limits to zone caps: extract, aggregate, and convert
Move ResourceQuantities from reddwarf-scheduler to reddwarf-core so both
scheduler and runtime share K8s CPU/memory parsing. Add cpu_as_zone_cap()
and memory_as_zone_cap() conversions for illumos zonecfg format. Wire
pod_to_zone_config() to aggregate container limits (with requests fallback)
and pass capped-cpu/capped-memory to the zone, closing the resource pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 17:34:39 +01:00
Till Wegmueller
c50ecb2664
Close the control loop: versioned bind, event-driven controller, graceful shutdown
- Move WatchEventType and ResourceEvent to reddwarf-core so scheduler
  and runtime can use them without depending on the apiserver crate
- Fix scheduler bind_pod to create versioned commits and publish
  MODIFIED events so the pod controller learns about scheduled pods
- Replace polling loop in pod controller with event bus subscription,
  wire handle_delete for DELETED events, keep reconcile_all for
  startup sync and lag recovery
- Add allocatable/capacity resources (cpu, memory, pods) to node agent
  build_node so the scheduler's resource filter accepts nodes
- Bootstrap "default" namespace on startup to prevent pod creation
  failures in the default namespace
- Replace .abort() shutdown with CancellationToken-based graceful
  shutdown across scheduler, controller, and node agent

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 23:21:53 +01:00
Till Wegmueller
c15e5282ff
Format code
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-28 23:17:19 +01:00
Till Wegmueller
3a03400c1f
Implement first 3 phases of implementation plan
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-28 22:51:26 +01:00