- Add `logs-service` crate as a separate microservice to handle job log storage, retrieval, and categorization.
- Update orchestrator to redirect log endpoints to the new service at `LOGS_BASE_URL`, optionally as permanent redirects.
- Enhance log persistence by introducing structured fields such as category, level, and error flags.
- Implement migration to add new columns and indexes for job logs.
- Add ANSI escape sequence stripping and structured logging for cleaner log storage.
- Improve SSH log handling with interleaved stdout/stderr processing and PTY request support.
- Revise Docker files and compose setup to include logs-service, with support for PostgreSQL and secure connections.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
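The ANSI stripping step mentioned above can be illustrated with a minimal, std-only sketch. The function name `strip_ansi` and the exact set of sequences handled (CSI, OSC, and two-character escapes) are assumptions for illustration; the commit does not say how logs-service implements it.

```rust
/// Remove ANSI escape sequences from a log line before storing it.
/// Hypothetical helper: handles CSI (colors, cursor moves), OSC (window
/// titles), and simple two-character escapes.
pub fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\u{1b}' {
            out.push(c);
            continue;
        }
        match chars.peek().copied() {
            // CSI: ESC [ parameters, ended by a final byte in 0x40..=0x7E.
            Some('[') => {
                chars.next();
                for n in chars.by_ref() {
                    if ('\u{40}'..='\u{7e}').contains(&n) {
                        break;
                    }
                }
            }
            // OSC: ESC ] text, ended by BEL or by ESC \.
            Some(']') => {
                chars.next();
                while let Some(n) = chars.next() {
                    if n == '\u{07}' {
                        break;
                    }
                    if n == '\u{1b}' {
                        if chars.peek() == Some(&'\\') {
                            chars.next();
                        }
                        break;
                    }
                }
            }
            // Any other escape: drop ESC and the character after it.
            Some(_) => {
                chars.next();
            }
            // A trailing lone ESC is dropped.
            None => {}
        }
    }
    out
}
```

Stripping before persistence keeps the stored text searchable and lets category and level detection work on plain text rather than on color codes.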
- Refactor runner upload logic to use temporary files and atomic renaming for safer updates.
- Improve file permission handling during temporary file creation.
- Increment orchestrator version to 0.1.11.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
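A minimal sketch of the temporary-file-plus-rename pattern described above, using only the standard library. The function name `write_atomically`, the `.tmp.<pid>` suffix, and the `0o755` mode are assumptions; the commit does not show the actual upload code.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Write `data` to a temporary sibling of `dest`, then rename it into place.
/// Because the temporary file lives in the same directory, the rename stays on
/// one filesystem, so readers see either the old file or the complete new one.
pub fn write_atomically(dest: &Path, data: &[u8]) -> io::Result<()> {
    let mut tmp_name = dest
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "destination has no file name"))?
        .to_os_string();
    tmp_name.push(format!(".tmp.{}", std::process::id()));
    let tmp = dest.with_file_name(tmp_name);

    let mut opts = OpenOptions::new();
    // create_new fails if the temporary file already exists.
    opts.write(true).create_new(true);
    #[cfg(unix)]
    {
        use std::os::unix::fs::OpenOptionsExt;
        // Assumed mode for an executable runner binary; the umask still applies.
        opts.mode(0o755);
    }

    let written = opts.open(&tmp).and_then(|mut file| {
        file.write_all(data)?;
        file.sync_all()
    });
    let result = written.and_then(|()| fs::rename(&tmp, dest));
    if result.is_err() {
        // Best-effort cleanup so a failed upload leaves no stray temp file.
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

Setting the mode at creation time, rather than changing it after the rename, avoids a window in which the destination exists with the wrong permissions.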
- Add validation for `RUNNER_LINUX_PATH` and `RUNNER_ILLUMOS_PATH` with detailed warnings and diagnostics for misconfigurations.
- Log fallback to default paths and warn if binaries are missing.
- Increment orchestrator version to 0.1.10.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
- Introduce `libvirt_uri` and `libvirt_network` in configuration structs, replacing reliance on environment variables.
- Update all `virsh`-related logic to use explicit parameters for libvirt connection and network settings.
- Align codebase with new guidelines rejecting runtime environment variable mutations.
- Document breaking changes in `.junie/guidelines.md`.
- Increment orchestrator version to 0.1.9.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
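To illustrate passing libvirt settings explicitly instead of mutating the process environment, here is a minimal sketch. The field names `libvirt_uri` and `libvirt_network` come from the commit message; the struct name `LibvirtConfig` and the helper functions are hypothetical. The sketch relies on `virsh -c <uri>` and the `net-dhcp-leases <network>` subcommand.

```rust
use std::process::Command;

/// Hypothetical subset of the configuration struct.
pub struct LibvirtConfig {
    pub libvirt_uri: String,
    pub libvirt_network: String,
}

/// Build a `virsh` invocation with the connection URI passed via `-c`,
/// so nothing depends on a LIBVIRT_URI set at runtime.
pub fn virsh(cfg: &LibvirtConfig, args: &[&str]) -> Command {
    let mut cmd = Command::new("virsh");
    cmd.arg("-c").arg(&cfg.libvirt_uri).args(args);
    cmd
}

/// Command that lists DHCP leases on the configured network.
pub fn dhcp_leases_cmd(cfg: &LibvirtConfig) -> Command {
    virsh(cfg, &["net-dhcp-leases", cfg.libvirt_network.as_str()])
}
```

Returning a `Command` without running it keeps the argument construction easy to test without a libvirt host.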
- Update IP selection logic to prefer the latest lease based on epoch timestamp.
- Remove redundant IP discovery logic in `net-dhcp-leases`.
- Increment orchestrator version to 0.1.8 for release.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
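The "prefer the latest lease" rule can be sketched as follows. The `Lease` struct, its field names, and the MAC-based filter are assumptions for illustration; the commit only states that selection is based on an epoch timestamp.

```rust
use std::net::IpAddr;

/// Hypothetical parsed DHCP lease; `epoch` is the lease timestamp in seconds.
#[derive(Debug, Clone, PartialEq)]
pub struct Lease {
    pub epoch: u64,
    pub mac: String,
    pub ip: IpAddr,
}

/// Return the IP of the lease with the highest epoch for the given MAC address.
pub fn latest_ip(leases: &[Lease], mac: &str) -> Option<IpAddr> {
    leases
        .iter()
        .filter(|lease| lease.mac.eq_ignore_ascii_case(mac))
        .max_by_key(|lease| lease.epoch)
        .map(|lease| lease.ip)
}
```

Picking by timestamp matters when a recreated VM reuses a MAC address and an older lease for it is still listed.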
- Add default `LIBVIRT_URI`, `HOME`, and `XDG_CACHE_HOME` environment variable handling for `virsh` commands.
- Ensure writable cache directories for the service user in packaging scripts.
- Update systemd service to include libvirt-related environment defaults.
- Bump orchestrator version to 0.1.7.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
- Introduce `boot_wait_secs` configuration to delay IP discovery/SSH after VM startup.
- Capture console logs when no SSH logs are available for better debugging during failures.
- Add a utility function to snapshot and persist console logs into job logs.
- Update CLI and environment variable support for the `boot_wait_secs` parameter.
- Bump orchestrator version to 0.1.5.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
- Introduce `results_queue` and `results_routing_key` to MQ configuration.
- Update message publishing and queue declaration logic to leverage new fields.
- Increment orchestrator version to 0.1.4.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
- Add `--features libvirt` to orchestrator's Debian package build process.
- Update orchestrator version to 0.1.2 in `Cargo.toml`.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
- Introduce Debian package build script using `cargo-deb` for orchestrator releases.
- Add systemd unit file and post-installation script for automatic service setup.
- Update `compose.yml` with host-only port bindings for Postgres and RabbitMQ.
- Introduce NGINX-based log proxy for orchestrator logs with Traefik support.
- Bump orchestrator version to 0.1.1 and update related Cargo metadata for packaging.
- Add example environment file for orchestrator configuration.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
- Implement SSH execution retries with exponential backoff and timeout handling.
- Replace `virsh domifaddr` with a multi-strategy IP discovery approach.
- Introduce `OrchestratorError` for consistent, structured error reporting.
- Improve runner deployment and SSH session utilities for readability and reliability.
- Add `thiserror` and `anyhow` as dependencies for streamlined error handling.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
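A minimal sketch of retries with exponential backoff and an overall timeout, as described above. The function name, the parameters, and the doubling factor are assumptions; the orchestrator's actual SSH retry code is not shown in the commit.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Call `op` until it succeeds, doubling the delay after each failure
/// (capped at `max_delay`). Gives up after `max_attempts` tries or when the
/// next wait would exceed `deadline`; `op` always runs at least once.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
    deadline: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let start = Instant::now();
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(err) => {
                // Return the last error once the attempt or time budget is spent.
                if attempt >= max_attempts || start.elapsed() + delay > deadline {
                    return Err(err);
                }
            }
        }
        sleep(delay);
        delay = (delay * 2).min(max_delay);
        attempt += 1;
    }
}
```

The closure receives the attempt number, which is convenient for logging which SSH connection attempt failed.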
- Introduce fields in `JobContext` for per-job SSH configuration, including user, key paths, and PEM contents.
- Update the scheduler to support SSH-based execution of jobs, including VM lifecycle management and SSH session handling.
- Add utility functions for SSH execution, guest IP discovery, and runner deployment.
- Remove the unused `/runners/{name}` HTTP endpoint and its associated logic.
- Simplify router creation by refactoring out disabled runner directory handling.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
- Extend `.env.sample` with libvirt configuration, Forgejo secrets, and image mapping defaults.
- Update `compose.yml` to enable libvirt integration, including required mounts, devices, and environment variables.
- Add Forgejo webhook configuration and commit status reporting with optional HMAC validation.
- Enhance the orchestrator container with libvirt dependencies and optional features for VM management.
- Document host preparation for libvirt/KVM and image directories in the README.
- Set default fallback values for Traefik ACME CA server.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
This commit introduces:
- Log persistence feature with a new `job_logs` table and related APIs for recording and retrieving job logs.
- An HTTP server for serving log endpoints and job results.
- Updates to the CI pipeline to enable persistence by default and ensure PostgreSQL readiness.
- Docker Compose updates with a Postgres service and MinIO integration for object storage.
- Packaging scripts for Arch Linux, including systemd service units for deployment.
This commit updates multiple dependencies, including:
- `axum` upgraded to 0.8 for HTTP and webhook functionality.
- `tonic` upgraded to 0.14 for gRPC support.
- `prost` upgraded to 0.14 for protobuf processing.
- Addition of `tonic-prost` and `tonic-prost-build` for updated gRPC build configurations.
Relevant Cargo.toml entries and `build.rs` are adjusted to reflect these updates.
This commit simplifies the parsing logic by replacing `.and_then(|e| e.value().as_string())` calls with `.and_then(|v| v.as_string())`. It also upgrades several crate dependencies, including `thiserror`, `sea-orm`, `lapin`, `virt`, and `kdl`, to their latest compatible versions.
This commit introduces gRPC-based log streaming between the VM runner (`solstice-runner`) and orchestrator. Key updates include:
- Implemented gRPC server in the orchestrator for receiving and processing runner logs.
- Added log streaming and job result reporting in the `solstice-runner` client.
- Defined `runner.proto` with messages (`LogItem`, `JobEnd`) and the `Runner` service.
- Updated orchestrator to accept gRPC settings and start the server.
- Modified cloud-init user data to include gRPC endpoint and request ID for runners.
- Enhanced message queue logic to handle job results via `publish_job_result`.
- Configured `Cross.toml` for cross-compilation of the runner.
This commit improves the hypervisor by:
- Adding support for detecting base image formats using `qemu-img info`.
- Dynamically setting the base image format for overlay creation.
- Automatically converting non-raw images to raw format for bhyve compatibility.
- Updating `Cargo.toml` to include `serde_json` for JSON parsing.
- Modifying default working directory logic for `ZonesHypervisor`.
This commit replaces the `libvirt` crate with the `virt` crate for managing the libvirt backend on Linux. Key changes include:
- Updated `Cargo.toml` dependencies and feature configuration.
- Refactored hypervisor implementation to align with `virt` crate API.
- Improved error handling and lifecycle management for VMs and networks.
This commit introduces a persistence layer to the Orchestrator, enabling it to optionally connect to a Postgres database for recording job and VM states. It includes:
- SeaORM integration with support for migrations from the migration crate.
- `Persist` module with methods for job and VM state upserts.
- No-op fallback when persistence is disabled or unavailable.
- Documentation updates and test coverage for persistence functionality.
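The no-op fallback can be illustrated with a small sketch in which the persistence handle wraps an optional store. The `JobStore` trait, the in-memory `MemoryStore`, and all method names are hypothetical stand-ins; the real `Persist` module uses SeaORM against Postgres and may be shaped differently.

```rust
use std::collections::HashMap;

/// Hypothetical storage backend; stands in for the database connection.
pub trait JobStore {
    fn upsert_job_state(&mut self, job_id: &str, state: &str) -> Result<(), String>;
}

/// In-memory backend used only for this illustration.
#[derive(Default)]
pub struct MemoryStore {
    pub jobs: HashMap<String, String>,
}

impl JobStore for MemoryStore {
    fn upsert_job_state(&mut self, job_id: &str, state: &str) -> Result<(), String> {
        // Insert or overwrite: one row per job, holding the latest state.
        self.jobs.insert(job_id.to_string(), state.to_string());
        Ok(())
    }
}

/// Persistence handle; `None` means persistence is disabled or unavailable.
pub struct Persist<S> {
    store: Option<S>,
}

impl<S: JobStore> Persist<S> {
    pub fn enabled(store: S) -> Self {
        Self { store: Some(store) }
    }

    pub fn disabled() -> Self {
        Self { store: None }
    }

    pub fn is_enabled(&self) -> bool {
        self.store.is_some()
    }

    pub fn store(&self) -> Option<&S> {
        self.store.as_ref()
    }

    /// Upsert the job state, or succeed without doing anything when disabled.
    pub fn record_job_state(&mut self, job_id: &str, state: &str) -> Result<(), String> {
        match self.store.as_mut() {
            Some(store) => store.upsert_job_state(job_id, state),
            None => Ok(()),
        }
    }
}
```

With this shape, callers record state unconditionally and never branch on whether a database is configured.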