Commit graph

100 commits

Author SHA256 Message Date
Till Wegmueller
d5faf319ab
Add boot wait configuration and improve VM startup logging, bump version to 0.1.5
- Introduce `boot_wait_secs` configuration to delay IP discovery/SSH after VM startup.
- Capture console logs when no SSH logs are available for better debugging during failures.
- Add a utility function to snapshot and persist console logs into job logs.
- Update CLI and environment variable support for the `boot_wait_secs` parameter.
- Bump orchestrator version to 0.1.5.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 21:12:54 +01:00
Till Wegmueller
5d8e79c8d4
Add support for results queue and routing key in MQ configuration, bump version to 0.1.4
- Introduce `results_queue` and `results_routing_key` to MQ configuration.
- Update message publishing and queue declaration logic to leverage new fields.
- Increment orchestrator version to 0.1.4.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 20:51:57 +01:00
Till Wegmueller
8e21c2ba47
Remove unused systemd unit file hardening options, bump version to 0.1.3
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 20:05:21 +01:00
Till Wegmueller
0724a4c526
Enable libvirt feature for orchestrator and bump version to 0.1.2
- Add `--features libvirt` to orchestrator's Debian package build process.
- Update orchestrator version to 0.1.2 in `Cargo.toml`.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 20:01:06 +01:00
Till Wegmueller
fad8e60ec1
Add Debian packaging support and network configuration enhancements
- Introduce Debian package build script using `cargo-deb` for orchestrator releases.
- Add systemd unit file and post-installation script for automatic service setup.
- Update `compose.yml` with host-only port bindings for Postgres and RabbitMQ.
- Introduce NGINX-based log proxy for orchestrator logs with Traefik support.
- Bump orchestrator version to 0.1.1 and update related Cargo metadata for packaging.
- Add example environment file for orchestrator configuration.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 19:57:19 +01:00
Till Wegmueller
9dfa9c4b95
Enhance SSH handling with retries and robust error management, refactor guest IP discovery
- Implement SSH execution retries with exponential backoff and timeout handling.
- Replace `virsh domifaddr` with a multi-strategy IP discovery approach.
- Introduce `OrchestratorError` for consistent, structured error reporting.
- Improve runner deployment and SSH session utilities for readability and reliability.
- Add dependencies: `thiserror`, `anyhow` for streamlined error handling.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-15 21:46:54 +01:00
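For readers unfamiliar with the retry pattern named in this commit, the following is a minimal sketch of retries with exponential backoff and an overall timeout. It is not the orchestrator's code: `RetryPolicy`, `backoff_delay`, and `retry_with_backoff` are hypothetical names, and the doubling factor and delay cap are assumptions chosen for illustration.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical retry policy; the field names are illustrative only.
#[derive(Debug, Clone, Copy)]
pub struct RetryPolicy {
    pub max_attempts: u32,
    pub initial_delay: Duration,
    pub max_delay: Duration,
    pub overall_timeout: Duration,
}

/// Delay to wait after the given 1-based failed attempt:
/// initial_delay * 2^(attempt - 1), capped at max_delay.
pub fn backoff_delay(policy: &RetryPolicy, attempt: u32) -> Duration {
    let exp = attempt.saturating_sub(1).min(16);
    let factor = 1u32 << exp;
    policy.initial_delay.saturating_mul(factor).min(policy.max_delay)
}

/// Calls `op` until it succeeds, the attempts run out, or the next
/// wait would exceed the overall timeout. Returns the last error on failure.
pub fn retry_with_backoff<T, E>(
    policy: &RetryPolicy,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let started = Instant::now();
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(err) => {
                let delay = backoff_delay(policy, attempt);
                let out_of_attempts = attempt >= policy.max_attempts;
                let out_of_time = started.elapsed() + delay > policy.overall_timeout;
                if out_of_attempts || out_of_time {
                    return Err(err);
                }
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

In an SSH setting, `op` would wrap the connection or command attempt; capping the delay keeps a long outage from stretching the wait between attempts indefinitely.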
Till Wegmueller
038d1161a6
Refactor Orchestrator with enhanced SSH handling, error management, and IP discovery support
- Implement retries for SSH-based job execution with configurable timeouts.
- Introduce `OrchestratorError` for consistent error handling across modules.
- Replace `virsh domifaddr` based guest IP discovery with a robust, multi-strategy approach.
- Refactor runner deployment and SSH-related utility functions for clarity.
- Add `thiserror` and `anyhow` dependencies for error management.
- Update persistence layer with improved error handling for database operations.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-15 21:46:19 +01:00
Till Wegmueller
c2fefb5167
Add per-job SSH key support, refactor scheduler for SSH-based job execution, and remove unused runner endpoint
- Introduce fields in `JobContext` for per-job SSH configuration, including user, key paths, and PEM contents.
- Update the scheduler to support SSH-based execution of jobs, including VM lifecycle management and SSH session handling.
- Add utility functions for SSH execution, guest IP discovery, and runner deployment.
- Remove the unused `/runners/{name}` HTTP endpoint and its associated logic.
- Simplify router creation by refactoring out disabled runner directory handling.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-15 18:37:30 +01:00
Till Wegmueller
930efe547f
Add public runner URL configuration and enhance log streaming support
- Introduce options for specifying public runner base URLs (`SOLSTICE_RUNNER_BASE_URL`) and orchestrator contact addresses (`ORCH_CONTACT_ADDR`).
- Update `.env.sample` and `compose.yml` with new configuration fields for external log streaming and runner binary serving.
- Refactor runner URL handling and generation logic for improved flexibility.
- Enhance `cloud-init` templates with updated runner URL environment variables (`RUNNER_SINGLE` and `RUNNER_URLS`).
- Add unit tests for runner URL generation to verify various input cases.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-11 20:24:20 +01:00
Till Wegmueller
1e48b1de66
Update database connection details to production configuration
- Change data source name and JDBC URL for production environment.
- Add new data source mapping file for SQL console.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-11 20:23:48 +01:00
Till Wegmueller
248885bdf8
Add runner binary serving via orchestrator, update configurations and documentation
- Extend `.env.sample` with `RUNNER_DIR_HOST` for serving workflow runner binaries.
- Update `compose.yml` with `RUNNER_DIR` and corresponding volume mount.
- Add instructions for runner binary setup and serving in `README.md`.
- Enhance `mise.toml` with new tooling dependencies for building runners.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-09 19:02:42 +01:00
Till Wegmueller
f904cb88b2
Relax filesystem permissions for VM directories, overlays, and logs to support host libvirt/qemu access. Introduce dead-letter queue support with enriched error messages for failed jobs.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-09 17:59:04 +01:00
Till Wegmueller
888aa26388
Add libvirt/KVM integration and Forgejo webhook support to Podman stack
- Extend `.env.sample` with libvirt configuration, Forgejo secrets, and image mapping defaults.
- Update `compose.yml` to enable libvirt integration, including required mounts, devices, and environment variables.
- Add Forgejo webhook configuration and commit status reporting with optional HMAC validation.
- Enhance the orchestrator container with libvirt dependencies and optional features for VM management.
- Document host preparation for libvirt/KVM and image directories in the README.
- Set default fallback values for Traefik ACME CA server.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-09 17:58:36 +01:00
Till Wegmueller
fe7b4b9ce0
Update Podman deployment for rootless support and DNS fixes
- Document rootless Podman port binding limitations and workarounds in README.
- Update `.env.sample` with notes and default high ports for rootless runs.
- Adjust `compose.yml` for network configuration and privileged port handling.
- Introduce fixes for Traefik DNS timeouts using explicit public resolvers and network tweaks.
- Switch MinIO and MinIO setup to use the latest images for better compatibility.
2025-11-08 21:55:27 +00:00
Till Wegmueller
4228c7ae6c
Add .env to Podman deployment .gitignore
2025-11-08 20:26:19 +00:00
Till Wegmueller
1c5dc338f5
Add Podman Compose deployment stack with Traefik and services integration
This commit introduces:
- A production-ready Podman Compose stack using Traefik as a reverse proxy with Let's Encrypt integration.
- Per-environment logical separation for Postgres, RabbitMQ, and MinIO services.
- New deployment utilities, including a `.env.sample` template, `compose.yml`, and setup scripts for MinIO and Postgres.
- Updates to `github-integration` HTTP server with basic webhook handling using `axum` and configurable paths.
- Adjustments to packaging tasks for better tarball generation via `git archive`.
- Expanded dependencies for `PKGBUILD` to support SQLite and PostgreSQL libraries.
- Containerfiles for orchestrator and integration services to enable Rust multi-stage builds without sccache.

This enables simplified and secure CI deployments with automatic routing, TLS, and volume persistence.
2025-11-08 20:21:57 +00:00
Till Wegmueller
31a88343cb
Make packaging tasks executable
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-08 15:54:16 +00:00
Till Wegmueller
11ce9cc881
Introduce centralized configuration handling via KDL and environment variables
This commit adds:
- A unified configuration system (`AppConfig`) that aggregates KDL files and environment variables with precedence handling.
- Example KDL configuration files for the orchestrator and forge-integration modules.
- Updates to orchestrator and forge-integration to load and apply configurations from `AppConfig`.
- Improved AMQP and database configuration with overlays from CLI, environment, or KDL.
- Deprecated `TODO.txt` as it's now represented in the configuration examples.
2025-11-06 23:48:03 +01:00
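The precedence handling described in this commit can be pictured as layered overlays. The sketch below is an assumption-laden illustration, not the actual `AppConfig`: the `Layer` type, its two fields, and the ordering (CLI over environment over KDL file) are hypothetical, since the commit message does not state the precedence order.

```rust
/// Hypothetical settings layer; each source only fills in what it knows.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Layer {
    pub amqp_url: Option<String>,
    pub database_url: Option<String>,
}

impl Layer {
    /// Returns `self` with every unset field filled from `lower`.
    pub fn or(self, lower: Layer) -> Layer {
        Layer {
            amqp_url: self.amqp_url.or(lower.amqp_url),
            database_url: self.database_url.or(lower.database_url),
        }
    }
}

/// Assumed precedence for illustration: CLI, then environment, then KDL file.
pub fn resolve(cli: Layer, env: Layer, kdl: Layer) -> Layer {
    cli.or(env).or(kdl)
}
```

With this shape, a value set in the environment wins over the same key in the KDL file, while keys the environment leaves unset fall through to the file.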
Till Wegmueller
0dabdf2bb2
Auto-detect orchestrator contact address and enhance platform-specific configurations
This commit introduces:
- Automatic detection of the orchestrator contact address when not explicitly provided.
- Platform-specific logic for determining reachable IPs, including libvirt network parsing (Linux) and external IP detection.
- Updates to gRPC address processing to handle both specific and unspecified hosts.
- Additional utility functions for parsing and detecting IPs in libvirt configurations.
2025-11-06 21:56:57 +01:00
Till Wegmueller
97599eb48d
Move runner logs to debug level and enable runner binary serving via orchestrator
This commit includes:
- Adjusted runner logs from `info` to `debug` for reduced deployment log verbosity while retaining visibility in CI.
- Added functionality to serve runner binaries directly from the orchestrator via HTTP.
- Introduced new `RUNNER_DIR` configuration to specify the binary directory, with default paths and URL composition.
- Updated HTTP routing to include runner file serving with validation and logging.
- Improved AMQP body logging with a utility for better error debugging.
- Updated task scripts for runner cross-building and serving, consolidating configurations and removing redundant files.
2025-11-06 21:44:06 +01:00
Till Wegmueller
7ea24af24f
Update TODO.txt with testing note about repo commit status
2025-11-03 23:46:36 +01:00
Till Wegmueller
06ae079b14
Add repository owner/name parsing and integrate with commit status updates
This commit introduces:
- A utility function to parse repository owner and name from URLs, supporting HTTPS, SSH, and Git formats.
- Enhancements to job messages and results with optional `repo_owner` and `repo_name` fields for downstream integrations.
- Updated orchestrator and forge-integration workflows to leverage parsed repository details for status updates and accurate routing.
2025-11-03 23:36:25 +01:00
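As an illustration of the owner/name parsing this commit describes, here is a minimal sketch covering HTTPS, SSH (both `ssh://` and scp-like), and `git://` URLs. The function name `parse_repo_owner_name` and the rule of taking the last two path segments are assumptions made for this example; the project's actual utility may behave differently.

```rust
/// Hypothetical helper: extracts `(owner, name)` from a repository URL.
/// Returns `None` when the URL has fewer than two path segments.
pub fn parse_repo_owner_name(url: &str) -> Option<(String, String)> {
    // Drop surrounding whitespace, a trailing slash, and a `.git` suffix.
    let trimmed = url.trim().trim_end_matches('/');
    let trimmed = trimmed.strip_suffix(".git").unwrap_or(trimmed);

    // Reduce every supported form to the path that follows the host.
    let path = if let Some((_scheme, rest)) = trimmed.split_once("://") {
        // https://host/owner/repo, ssh://git@host:22/owner/repo, git://host/owner/repo
        rest.split_once('/')?.1
    } else {
        // scp-like SSH form: git@host:owner/repo
        trimmed.split_once(':')?.1
    };

    // The last two non-empty path segments are taken as owner and name.
    let mut segments = path.rsplit('/').filter(|s| !s.is_empty());
    let name = segments.next()?;
    let owner = segments.next()?;
    Some((owner.to_string(), name.to_string()))
}
```

Returning `Option` matches the commit's description of `repo_owner` and `repo_name` as optional fields: a URL that cannot be parsed simply leaves them unset.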
Till Wegmueller
c00ce54112
Add heuristic failure detection and improve runner URL configuration
This commit introduces:
- A heuristic to mark jobs as failed if VMs stop quickly without generating logs.
- Improved configuration for runner URLs, including auto-detection of host IPs and default multi-OS runner URLs.
- Updates to the orchestrator's HTTP routing for consistency.
- New task scripts for Forge integration and updates to environment defaults for local development.
2025-11-03 22:36:31 +01:00
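The failure heuristic in this commit can be stated as a single predicate. The sketch below is hypothetical: the function name, the parameters, and the idea of a configurable minimum runtime are assumptions, as the commit message does not say what counts as stopping "quickly".

```rust
use std::time::Duration;

/// Hypothetical check: a VM that stopped before `min_runtime` elapsed and
/// produced no log lines is treated as a suspected failure.
pub fn is_suspected_failure(
    vm_runtime: Duration,
    log_lines: usize,
    min_runtime: Duration,
) -> bool {
    log_lines == 0 && vm_runtime < min_runtime
}
```

Requiring both conditions avoids flagging short jobs that did report output and long-running VMs whose logs were merely lost.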
Till Wegmueller
81a93ef1a7
Enable job log persistence, HTTP server, and extend CI/packaging support
This commit introduces:
- Log persistence feature with a new `job_logs` table and related APIs for recording and retrieving job logs.
- An HTTP server for serving log endpoints and job results.
- Updates to the CI pipeline to enable persistence by default and ensure PostgreSQL readiness.
- Docker Compose updates with a Postgres service and MinIO integration for object storage.
- Packaging scripts for Arch Linux, including systemd service units for deployment.
2025-11-02 23:37:11 +01:00
Till Wegmueller
6631ce4d6e
Add insecure TLS support, CA bundle handling, and package update for SunOS environments
This commit introduces the following updates:
- Adds an environment variable (`SOLSTICE_ALLOW_INSECURE`) to enable insecure TLS as a fallback for curl.
- Improves CA certificate handling and automatic installation on SunOS using IPS or pkgin.
- Extends fallback logic for repository fetching to cover scenarios with missing CA bundles.
- Updates Solstice job script dependencies to include `cmake`.
2025-11-02 20:48:05 +01:00
Till Wegmueller
b84e97e513
Enhance runner with log streaming details, fallback repository fetch, and improved error handling
This commit improves the runner's functionality by adding:
- Detailed log streaming with request ID, stdout, and stderr line counts.
- Fallback mechanisms for repository fetch using HTTP archive when git commands fail.
- Enhanced error reporting for missing job scripts and reading errors.
- Updates to ensure compatibility with SunOS environments and non-interactive shells.
2025-11-02 20:36:13 +01:00
Till Wegmueller
5cfde45e4c
Update default illumos image to omnios-bloody and enhance image configuration
This commit replaces `openindiana-hipster` with `omnios-bloody` as the default illumos image in the orchestrator. It adds detailed configuration for the new image, including source URL, local path, and resource defaults, while retaining `openindiana-hipster` as a reference. Corresponding test cases and YAML updates are included.
2025-11-02 18:58:46 +01:00
Till Wegmueller
c1380b1095
Add local file:// source support for orchestrator image preparation
This commit enhances image handling in the orchestrator by adding support for `file://` sources. It introduces logic to handle both local file copying and decompression options, complementing the existing `http(s)://` download functionality.
2025-11-02 18:38:56 +01:00
Till Wegmueller
9597bbf64d
Add VM suspend handling, persistence updates, and orchestrator enhancements
This commit introduces:
- VM suspend support for timeout scenarios, allowing investigation of frozen states.
- Enhanced orchestrator persistence initialization with skip option for faster startup.
- Improvements to orchestrator logging, job state tracking, and VM runtime monitoring.
- Updates to CI tasks for capturing job request IDs and tracking completion statuses.
- Extended hypervisor capabilities, including libvirt console logging configuration.
2025-11-01 18:38:17 +01:00
Till Wegmueller
f753265a79
Remove default-features override for lapin dependency
2025-11-01 16:46:06 +01:00
Till Wegmueller
821cfdf458
Add bindgen-related configuration and dependencies for cross-compilation
This commit introduces `bindgen-cli` installation and sets up `BINDGEN_EXTRA_CLANG_ARGS` passthrough in `Cross.toml`. It also updates the pre-build process with necessary packages (`clang`, `libclang-dev`), alternative compilers, and adds `aws-lc-rs` with `bindgen` support to `workflow-runner` dependencies.
2025-11-01 16:22:37 +01:00
Till Wegmueller
952262ede4
Upgrade dependencies for Axum, Tonic, Prost, and related build tools across crates
This commit updates multiple dependencies, including:
- `axum` upgraded to 0.8 for HTTP and webhook functionality.
- `tonic` upgraded to 0.14 for gRPC support.
- `prost` upgraded to 0.14 for protobuf processing.
- Addition of `tonic-prost` and `tonic-prost-build` for updated gRPC build configurations.

Relevant Cargo.toml entries and `build.rs` are adjusted to reflect these updates.
2025-11-01 15:24:09 +01:00
Till Wegmueller
7ca7966916
Add Rust toolchain support for Solaris-based environments
This commit updates the Solstice job script to install the Rust toolchain (`developer/rustc`) via the IPS package manager on Solaris-based systems (SunOS). It also adjusts the `ensure_rust` function to prioritize system package installation before falling back to `rustup`.
2025-11-01 15:03:09 +01:00
Till Wegmueller
033f9b5ab0
Format
2025-11-01 14:56:46 +01:00
Till Wegmueller
374dff5c04
Simplify variable initialization and remove unused imports across multiple crates
2025-11-01 14:44:42 +01:00
Till Wegmueller
1b7b2dd91b
Update parsing logic and upgrade dependencies across crates
This commit updates parsing logic by simplifying `.and_then(|e| e.value().as_string())` calls to `.and_then(|v| v.as_string())`. Additionally, it upgrades several crate dependencies, including `thiserror`, `sea-orm`, `lapin`, `virt`, and `kdl`, to their latest compatible versions for improved functionality and stability.
2025-11-01 14:44:16 +01:00
Till Wegmueller
0b54881558
Add support for multi-OS VM builds with cross-built runners and improved local development tooling
This commit introduces:
- Flexible runner URL configuration via `SOLSTICE_RUNNER_URL(S)` for cloud-init.
- Automated detection of OS-specific runner binaries during VM boot.
- Tasks for cross-building, serving, and orchestrating Solstice runners.
- End-to-end VM build flows for Linux and Illumos environments.
- Enhanced orchestration with multi-runner HTTP serving and log streaming.
2025-11-01 14:31:48 +01:00
Till Wegmueller
9bac2382fd
Remove unused imports across multiple crates
2025-11-01 12:16:07 +01:00
Till Wegmueller
855aecbb10
Add gRPC support for VM runner log streaming and orchestrator integration
This commit introduces gRPC-based log streaming between the VM runner (`solstice-runner`) and orchestrator. Key updates include:
- Implemented gRPC server in the orchestrator for receiving and processing runner logs.
- Added log streaming and job result reporting in the `solstice-runner` client.
- Defined `runner.proto` with messages (`LogItem`, `JobEnd`) and the `Runner` service.
- Updated orchestrator to accept gRPC settings and start the server.
- Modified cloud-init user data to include gRPC endpoint and request ID for runners.
- Enhanced message queue logic to handle job results via `publish_job_result`.
- Configured `Cross.toml` for cross-compilation of the runner.
2025-11-01 12:14:50 +01:00
Till Wegmueller
e73b6ff49f
Refactor Solstice bootstrapping logic into standalone script
This commit replaces inline workflow preparation logic with a dedicated `solstice-bootstrap.sh` script, simplifying workspace setup, job execution, and shutdown processes. The change ensures cleaner orchestration and improves maintainability by centralizing the bootstrapping logic.
2025-10-26 22:09:37 +01:00
Till Wegmueller
4ca78144f2
Add VM state monitoring and graceful shutdown enhancements
This commit enhances the `Scheduler` to monitor VM states for completion, enabling more accurate termination detection. It introduces periodic polling combined with shutdown signals to halt operations gracefully. Additionally, VM lifecycle management in the hypervisor is updated with `state` retrieval for precise status assessments. The VM domain configuration now includes serial console support.
2025-10-26 21:59:55 +01:00
Till Wegmueller
bddd36b16f
Add cooperative shutdown support for Scheduler and AMQP consumer
This commit updates the `Scheduler` to support cooperative shutdown using `Notify`, allowing graceful termination of tasks and cleanup of placeholder VMs. Additionally, the AMQP consumer is enhanced with an explicit shutdown mechanism, ensuring proper resource cleanup, including closing channels and connections.
2025-10-26 21:13:56 +01:00
Till Wegmueller
6ff88529e6
Add configurable placeholder VM runtime and graceful shutdown logic
This commit introduces the ability to configure placeholder VM run time via an environment variable (`VM_PLACEHOLDER_RUN_SECS`) and updates the `Scheduler` to accept this duration. Additionally, it implements a graceful shutdown mechanism for the orchestrator, allowing cooperative shutdown of consumers and cleanup of resources.
2025-10-26 19:06:32 +01:00
Till Wegmueller
7918db3468
Enhance hypervisor image handling with dynamic format detection and raw conversion
This commit improves the hypervisor by:
- Adding support for detecting base image formats using `qemu-img info`.
- Dynamically setting the base image format for overlay creation.
- Automatically converting non-raw images to raw format for bhyve compatibility.
- Updating `Cargo.toml` to include `serde_json` for JSON parsing.
- Modifying default working directory logic for `ZonesHypervisor`.
2025-10-26 18:17:02 +01:00
Till Wegmueller
f3831dac4a
Add Solstice CI workflow definition for Linux and Illumos builds
2025-10-26 16:37:16 +01:00
Till Wegmueller
d05121b378
Switch orchestrator from libvirt crate to virt crate for Linux hypervisor backend
This commit replaces the `libvirt` crate with the `virt` crate for managing the libvirt backend on Linux. Key changes include:

- Updated `Cargo.toml` dependencies and feature configuration.
- Refactored hypervisor implementation to align with `virt` crate API.
- Improved error handling and lifecycle management for VMs and networks.
2025-10-26 16:08:36 +01:00
Till Wegmueller
6568183d86
Add orchestrator persistence using SeaORM for initial database support
This commit introduces a persistence layer to the Orchestrator, enabling it to optionally connect to a Postgres database for recording job and VM states. It includes:

- SeaORM integration with support for migrations from the migration crate.
- `Persist` module with methods for job and VM state upserts.
- No-op fallback when persistence is disabled or unavailable.
- Documentation updates and test coverage for persistence functionality.
2025-10-26 15:38:54 +01:00
Till Wegmueller
6ddfa9a0b0
Update orchestrator documentation for libvirt lifecycle functionality and add migration crate to project files
2025-10-25 20:04:47 +02:00
Till Wegmueller
a71f9cc7d1
Initial Commit
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-10-25 20:01:08 +02:00
Till Wegmüller
38230f2787
Initial commit
2025-10-25 18:31:50 +02:00