Commit graph

78 commits

Author SHA256 Message Date
Till Wegmueller
b5ccd4e2aa Pass work dir to vm-manager for container volume compatibility
Configure vm-manager's QEMU backend to use /var/lib/solstice-ci as the
data directory (matching the compose.yml volume mount) instead of the
default ~/.local/share/vmctl/vms/ path.
2026-04-07 17:45:00 +02:00
Till Wegmueller
b5c7078adc Switch vm-manager to git dep + multi-stage Containerfile
- Use HTTPS git dep for vm-manager (works in CI and container builds)
- Add .cargo/ to .gitignore (local dev patch override)
- Restore multi-stage Containerfile: Rust build stage fetches vm-manager
  from GitHub, Ubuntu 24.04 runtime with QEMU
- Host orchestrator stopped and disabled (container-only from now on)
2026-04-07 17:24:17 +02:00
Till Wegmueller
c9fc05a00e Remove libvirt dependencies and clean up orchestrator
- Remove `virt` crate dependency and libvirt feature flag
- Remove `ssh2` crate dependency (vm-manager handles SSH)
- Remove `zstd` crate dependency (vm-manager handles decompression)
- Remove LibvirtHypervisor, ZonesHypervisor, RouterHypervisor from hypervisor.rs
- Remove libvirt error types from error.rs
- Remove libvirt_uri/libvirt_network CLI options, add network_bridge
- Replace RouterHypervisor::build() with VmManagerAdapter::build()
- Update deb package depends: libvirt → qemu-system-x86
- Keep Noop backend for development/testing
- Leave dead legacy SSH/console functions in place for future cleanup
2026-04-07 15:56:10 +02:00
Till Wegmueller
2d971ef500 Replace image download with vm-manager ImageManager
Use vm-manager's ImageManager::download() for streaming image downloads
with automatic zstd decompression, replacing the hand-rolled reqwest +
zstd code. Supports http(s), file://, and OCI artifact URLs.
2026-04-07 15:52:02 +02:00
Till Wegmueller
190eb5532f Replace scheduler SSH/console code with vm-manager APIs
- IP discovery: use hv.guest_ip() with timeout loop instead of
  discover_guest_ip_virsh() (500+ lines removed from hot path)
- SSH: use vm_manager::ssh::connect_with_retry() + upload() + exec()
  instead of hand-rolled TCP/ssh2/SFTP code
- Console: use vm_manager::console::ConsoleTailer over Unix socket
  instead of file-based tail_console_to_joblog()
- Add guest_ip() to orchestrator Hypervisor trait with default impl
- Remove #[cfg(linux, libvirt)] gates from is_illumos_label, expand_tilde
- Keep orchestrator-specific: DB persistence, log recording, MQ publish,
  runner binary selection, env var injection
2026-04-07 15:50:54 +02:00
Till Wegmueller
a60053f030 Add vm-manager adapter layer to orchestrator
- Add vm-manager as dependency of orchestrator
- Create vm_adapter.rs that bridges orchestrator's Hypervisor trait
  to vm-manager's RouterHypervisor (QEMU/Propolis/Noop backends)
- Add Qemu and Propolis variants to BackendTag
- Add console_socket, ssh_host_port, mac_addr fields to VmHandle
- Adapter uses user-mode networking by default for containerization
- Maps orchestrator VmSpec + JobContext → vm-manager VmSpec with
  CloudInitConfig and SshConfig
2026-04-07 15:46:20 +02:00
Till Wegmueller
ceaac25a7e Send incremental UpdateTask with step states during log streaming
Streamer now sends UpdateTask alongside UpdateLog on each poll so
Forgejo maps log lines to steps in real time, not just at completion.
This prevents "Set up job" from accumulating all streamed logs.
2026-04-07 00:44:02 +02:00
Till Wegmueller
49c3ab03c4 Map per-step log ranges to YAML steps using KDL step order
- Streamer sorts step categories in KDL workflow order (not alphabetical)
- Reporter emits one StepState per KDL step, each mapped by position
  to the corresponding YAML step ID
- Setup logs auto-map to "Set up job", per-step logs to their steps
2026-04-07 00:41:26 +02:00
Till Wegmueller
ea3a249918 Fix step mapping: only real YAML steps get StepState entries
Forgejo's "Set up job" and "Complete job" are virtual UI steps that
auto-collect logs outside any real step's range. Only the actual YAML
step (id=0) needs a StepState. Setup logs before its log_index go to
"Set up job" automatically.
2026-04-07 00:36:33 +02:00
Till Wegmueller
f61588e68b Fix streamer category ordering to match step boundaries
Streamer now rebuilds the full sorted log (setup categories first,
then work categories) on each poll and only sends new lines. This
ensures log indices align with the reporter's step boundary
calculation regardless of when categories appear in the DB.
2026-04-07 00:32:54 +02:00
Till Wegmueller
61fca2673d Fix log streaming: no duplicates, proper step boundaries
- Streamer sends only new lines per category (tracks cursor per category)
- Reporter no longer re-uploads logs — only sets step state boundaries
  and sends the no_more marker
- Remove ::group:: markers that cluttered the Forgejo log viewer
- Step 0 (Set up job) gets setup categories (boot, env, tool_check)
- Step 1 (main step) gets workflow step output
2026-04-07 00:23:00 +02:00
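The per-category cursor mentioned in 61fca2673d can be illustrated with a short sketch. The `CategoryCursors` type and `take_new` method are hypothetical names chosen for illustration, not taken from the repository. The sketch assumes each poll returns the full list of lines for a category, so only the tail past the stored cursor needs to be sent.

```rust
use std::collections::HashMap;

/// Tracks how many lines of each log category have already been sent.
/// (Hypothetical type; the real streamer may store cursors differently.)
#[derive(Default)]
pub struct CategoryCursors {
    sent: HashMap<String, usize>,
}

impl CategoryCursors {
    /// Returns only the lines not yet sent for `category` and advances the cursor.
    /// `lines` is assumed to be the complete line list for that category.
    pub fn take_new<'a>(&mut self, category: &str, lines: &'a [String]) -> &'a [String] {
        let cursor = self.sent.entry(category.to_string()).or_insert(0);
        // Clamp in case the backing list is shorter than on the previous poll.
        let start = (*cursor).min(lines.len());
        *cursor = lines.len();
        &lines[start..]
    }
}
```

Polling the same category twice without new lines yields an empty slice the second time, which is what prevents duplicate uploads.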
Till Wegmueller
3a261b3f2e Add log streaming and fix Forgejo step mapping
- Stream logs to Forgejo in real-time during job execution (polls
  logs-service every 3s)
- Map setup logs (boot, env, tool_check) to "Set up job" step
- Map KDL workflow step logs to the main Actions step
- Add summary line to "Complete job" step
- Use ::group::/::endgroup:: markers for log category sections
2026-04-07 00:13:54 +02:00
Till Wegmueller
d8ef6ef236 Add log delivery and step state reporting to Forgejo runner
Fetches logs from logs-service per category, uploads them to Forgejo
via UpdateLog, and reports per-step StepState entries so the Forgejo
UI shows individual step results and log output.
2026-04-06 23:59:26 +02:00
Till Wegmueller
5dfd9c367b Fix Forgejo runner auth: use x-runner-token/x-runner-uuid headers
Forgejo's connect-rpc API uses custom headers for authentication, not
Authorization: Bearer. Registration uses x-runner-token only, while
post-registration calls require both x-runner-token and x-runner-uuid.
2026-04-06 23:43:07 +02:00
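The header rule from 5dfd9c367b (token only during registration, token plus UUID afterwards) might look like the following sketch. `RunnerAuth` and `auth_headers` are hypothetical names used for illustration. Only the header names `x-runner-token` and `x-runner-uuid` come from the commit message.

```rust
/// Credentials held by the runner (hypothetical struct for illustration).
pub struct RunnerAuth {
    /// Runner token, sent on every request.
    pub token: String,
    /// Runner UUID; assumed to be `None` until registration has completed.
    pub uuid: Option<String>,
}

/// Builds the authentication headers for a connect-rpc request.
/// Registration sends only `x-runner-token`; later calls send both headers.
pub fn auth_headers(auth: &RunnerAuth) -> Vec<(&'static str, String)> {
    let mut headers = vec![("x-runner-token", auth.token.clone())];
    if let Some(uuid) = &auth.uuid {
        headers.push(("x-runner-uuid", uuid.clone()));
    }
    headers
}
```

Note that no `Authorization: Bearer` header is produced, matching the fix described in the commit.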
Till Wegmueller
70605a3c3a Add Forgejo Runner integration service
New crate that registers as a Forgejo Actions Runner, polls for tasks
via connect-rpc, translates them into Solstice JobRequests (with 3-tier
fallback: KDL workflow → Actions YAML run steps → unsupported error),
and reports results back to Forgejo.

Includes Containerfile and compose.yml service definition.
2026-04-06 23:34:53 +02:00
Till Wegmueller
ac81dedf82
Improve case-insensitive comparison for X-Hookdeck-Will-Retry-After header
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-25 23:28:36 +01:00
Till Wegmueller
d3841462cf
chore(format): Format code
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-25 23:16:36 +01:00
Till Wegmueller
c0a7f7e3f2
Add ANSI escape sequence parsing for enhanced log rendering
- Replace plain text rendering with `ansi_to_text` for displaying logs with styled ANSI sequences in TUI.
- Implement parsing logic for SGR parameters to apply text styling (e.g., bold, italic, colors).
- Extend TUI functionality to support dynamic styling based on ANSI codes.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-25 23:15:12 +01:00
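As a rough illustration of the SGR parsing described in c0a7f7e3f2, the sketch below splits a log line into styled spans. It is not the repository's `ansi_to_text`. The `Style` struct, `apply_sgr`, and `parse_spans` are hypothetical, and only a handful of SGR codes (reset, bold, italic, the eight basic foreground colors) are handled.

```rust
/// Minimal text style (hypothetical; a real TUI would map this to its own style type).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Style {
    pub bold: bool,
    pub italic: bool,
    /// Basic foreground color index 0..=7, if set.
    pub fg: Option<u8>,
}

/// Applies the `;`-separated SGR parameters (the part between `ESC[` and `m`).
pub fn apply_sgr(style: &mut Style, params: &str) {
    for p in params.split(';') {
        // An empty parameter is treated as 0 (reset).
        let code = if p.is_empty() { Ok(0) } else { p.parse::<u8>() };
        match code {
            Ok(0) => *style = Style::default(),
            Ok(1) => style.bold = true,
            Ok(3) => style.italic = true,
            Ok(22) => style.bold = false,
            Ok(23) => style.italic = false,
            Ok(n @ 30..=37) => style.fg = Some(n - 30),
            Ok(39) => style.fg = None,
            _ => {} // unsupported codes are ignored in this sketch
        }
    }
}

/// Splits `input` into (style, text) spans, consuming CSI escape sequences.
pub fn parse_spans(input: &str) -> Vec<(Style, String)> {
    let mut spans = Vec::new();
    let mut style = Style::default();
    let mut rest = input;
    while let Some(start) = rest.find("\x1b[") {
        if start > 0 {
            spans.push((style.clone(), rest[..start].to_string()));
        }
        let after = &rest[start + 2..];
        // A CSI sequence ends at the first character in the range 0x40..=0x7E.
        match after.find(|c: char| ('\x40'..='\x7e').contains(&c)) {
            Some(end) => {
                if after[end..].starts_with('m') {
                    apply_sgr(&mut style, &after[..end]);
                }
                rest = &after[end + 1..];
            }
            None => rest = "", // truncated sequence: drop the remainder
        }
    }
    if !rest.is_empty() {
        spans.push((style, rest.to_string()));
    }
    spans
}
```

Non-SGR CSI sequences (cursor movement and similar) are consumed but otherwise ignored, which keeps them out of the rendered text.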
Till Wegmueller
e76f4c0278
Add configuration support and enhance logs handling in TUI
- Introduce `Config` command to manage local `ciadm` settings, including `set-base-url` for persisting logs-service URLs.
- Improve TUI with log category selection and navigation using the Tab key.
- Refactor logs retrieval to support category-based display and enhance error handling.
- Add local configuration file utilities for storing and loading settings.
- Update dependencies to include the `kdl` crate for configuration management.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-25 22:54:35 +01:00
Till Wegmueller
9306de0acf
Add TUI support and logs-client crate for enhanced job and log management
- Introduce a Terminal User Interface (TUI) to enable interactive browsing of jobs and logs.
- Add a new `logs-client` crate to handle communication with the logs service, including job listing and log retrieval.
- Extend `ciadm` to include new commands: `jobs`, `logs`, and `tui`, for interacting with the logs service.
- Enhance the CLI to support repository filtering, job status retrieval, and detailed log viewing.
- Refactor dependencies and organize logs-related functionality for modularity and reusability.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-25 22:31:11 +01:00
Till Wegmueller
4c5a8567a4
Add webhook crate for extensible signature validation and integration
- Introduce a new `webhook` crate to centralize signature validation for GitHub, Hookdeck, and Forgejo webhooks.
- Enable `github-integration` to perform unified webhook signature verification using the `webhook` crate.
- Refactor `github-integration`: replace legacy HMAC verification with the reusable `webhook` structure.
- Extend Podman configuration for Hookdeck webhook signature handling and improve documentation.
- Clean up unused dependencies by migrating to the new implementation.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-25 22:16:11 +01:00
Till Wegmueller
a1592cd6c9
Add GitHub App support, AMQP integration, and webhook enhancements
- Extend GitHub webhook handler with signature validation, push, and pull request event handling.
- Add GitHub App authentication via JWT and installation token retrieval.
- Parse `.solstice/workflow.kdl` for job queuing with `runs_on`, `script`, and job grouping support.
- Integrate AMQP consumer for orchestrator results and structured job enqueueing.
- Add S3-compatible storage configuration for log uploads.
- Refactor CLI options and internal state for improved configuration management.
- Enhance dependencies for signature, JSON, and AMQP handling.
- Document GitHub integration

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2026-01-25 16:50:52 +01:00
Till Wegmueller
b53ccfb4e2
Chore cargo fmt
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 16:14:57 +01:00
Till Wegmueller
102f4a1c52
Add step slugification and category-based NDJSON logging
- Introduce `slugify_step_name` to generate URL-friendly step name slugs.
- Attach per-step categories to NDJSON logs for better traceability.
- Update stdout and stderr logging with step-specific categories.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 16:00:51 +01:00
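One plausible shape for the `slugify_step_name` helper named in 102f4a1c52 is sketched below. The exact rules used in the repository are not shown in the commit message, so the behavior here is an assumption: lowercase ASCII alphanumerics, collapse every other run of characters into a single `-`, and produce no leading or trailing dash.

```rust
/// Turns a step name into a URL-friendly slug.
/// Assumed rules (not confirmed by the source): keep ASCII letters and digits
/// in lowercase, and collapse any other run of characters into one '-'.
pub fn slugify_step_name(name: &str) -> String {
    let mut slug = String::with_capacity(name.len());
    let mut pending_dash = false;
    for ch in name.chars() {
        if ch.is_ascii_alphanumeric() {
            // Emit a separator only between two alphanumeric runs.
            if pending_dash && !slug.is_empty() {
                slug.push('-');
            }
            pending_dash = false;
            slug.push(ch.to_ascii_lowercase());
        } else {
            pending_dash = true;
        }
    }
    slug
}
```

Under these assumptions, a step named `Build & Test (release)` would map to the category `build-test-release`.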
Till Wegmueller
633f658639
chore: format code with cargo fmt
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 15:43:18 +01:00
Till Wegmueller
0a9d46a455
Add grouped job listing endpoint to logs-service and define related models
- Introduce new `/jobs` endpoint for listing jobs grouped by `(repo_url, commit_sha)`, ordered by update timestamp.
- Add models `JobGroup`, `JobSummary`, and `JobLinks` to structure grouped job details.
- Implement grouping logic using `BTreeMap` for structured output.
- Extend router with the new endpoint and integrate ORM-backed query for fetching job data.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 15:39:22 +01:00
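The `BTreeMap` grouping described in 0a9d46a455 could be sketched roughly as follows. The field names on `JobSummary` and `JobGroup` are hypothetical, `JobLinks` is omitted, and sorting groups newest-first is an assumption, since the commit only says "ordered by update timestamp".

```rust
use std::collections::BTreeMap;

/// Per-job summary (hypothetical fields for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct JobSummary {
    pub job_id: String,
    pub status: String,
    /// Last update time as a Unix timestamp.
    pub updated_at: u64,
}

/// All jobs that share one `(repo_url, commit_sha)` pair.
#[derive(Debug, PartialEq)]
pub struct JobGroup {
    pub repo_url: String,
    pub commit_sha: String,
    pub last_updated: u64,
    pub jobs: Vec<JobSummary>,
}

/// Groups rows of `(repo_url, commit_sha, job)` and orders groups newest-first.
pub fn group_jobs(rows: Vec<(String, String, JobSummary)>) -> Vec<JobGroup> {
    let mut map: BTreeMap<(String, String), Vec<JobSummary>> = BTreeMap::new();
    for (repo_url, commit_sha, job) in rows {
        map.entry((repo_url, commit_sha)).or_default().push(job);
    }
    let mut groups: Vec<JobGroup> = map
        .into_iter()
        .map(|((repo_url, commit_sha), jobs)| {
            let last_updated = jobs.iter().map(|j| j.updated_at).max().unwrap_or(0);
            JobGroup { repo_url, commit_sha, last_updated, jobs }
        })
        .collect();
    // Stable sort: groups with equal timestamps keep the BTreeMap key order.
    groups.sort_by(|a, b| b.last_updated.cmp(&a.last_updated));
    groups
}
```

Using a `BTreeMap` keyed by the pair gives deterministic iteration order before the final sort, which keeps the endpoint's output stable between requests.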
Till Wegmueller
08eb82d7f7
Introduce GNU tar (gtar) support and workflow setup enhancements; bump version to 0.1.16
- Add detection and usage of GNU `tar` for platforms where BSD `tar` is incompatible with required options.
- Refactor `job.sh` to delegate all environment setup to newly introduced per-OS setup scripts.
- Add initial support for workflow setups via `workflow.kdl`, running pre-defined setup scripts before executing workflow steps.
- Integrate step-wise execution and logging for workflows, with structured NDJSON output for detailed traceability.
- Increment orchestrator version to 0.1.16.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 15:17:03 +01:00
Till Wegmueller
8f909c0105
Update default SSH user to 'sol' and enhance cloud-init config; bump version to 0.1.15
- Change the default SSH username from 'ubuntu' to 'sol' for consistency with Solstice CI environment.
- Modify cloud-init user configuration to align with the new default, adding enhanced permissions and settings for 'sol' user.
- Increment orchestrator version to 0.1.15.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 14:48:20 +01:00
Till Wegmueller
2c73c80619
Introduce workflow.jobs support and script path overrides; bump version to 0.1.14
- Add parsing and execution support for `.solstice/workflow.kdl` with job-specific configurations, including `runs_on`, `script path`, and `workflow_job_id`.
- Enable job grouping via `group_id` for cohesive workflow processing.
- Update orchestrator to pass workflow-specific parameters to `cloud-init` for finer control over execution.
- Refactor enqueue logic to handle multiple jobs per workflow with fallback to single job when no workflow is defined.
- Enhance dependencies for workflow parsing by integrating `base64`, `regex`, and `uuid`.
- Increment orchestrator version to 0.1.14 for release.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 14:37:35 +01:00
Till Wegmueller
7cc6ff856b
Refactor runner setup and diagnostics; improve tooling support and NDJSON logging
- Enhance runner logging with status-specific messages and structured JSON fields for better traceability.
- Add SHA256 object format detection and initialization for Git repos when applicable.
- Improve shell script execution by adding verbose mode and safe commands handling.
- Extend package installation scripts to support Clang and related tooling across multiple environments.
- Increment orchestrator version for release.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 12:50:20 +01:00
Till Wegmueller
df1a3126b1
Refactor tool checks and log categorization for improved clarity and backend portability; bump version to 0.1.14
- Enhance runner tool check diagnostics with more descriptive output and JSON fields for better observability.
- Replace raw SQL queries in `logs-service` with ORM-based logic for portable and backend-agnostic log categorization.
- Add error category aggregation and structured summary reporting in logs-service.
- Improve environment variable fallback mechanics for runner workdir selection.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 12:28:22 +01:00
Till Wegmueller
7fc4e8edb7
Introduce logs-service for structured job logs management; bump version to 0.1.13
- Add `logs-service` crate as a separate microservice to handle job log storage, retrieval, and categorization.
- Update orchestrator to redirect log endpoints to the new service with optional permanent redirects using `LOGS_BASE_URL`.
- Enhance log persistence by introducing structured fields such as category, level, and error flags.
- Implement migration to add new columns and indexes for job logs.
- Add ANSI escape sequence stripping and structured logging for cleaner log storage.
- Improve SSH log handling with interleaved stdout/stderr processing and pty request support.
- Revise Docker files and compose setup to include logs-service, with support for PostgreSQL and secure connections.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-18 11:48:09 +01:00
Till Wegmueller
20a0efd116
Atomically upload runner via SFTP to ensure safe file replacement; bump version to 0.1.11
- Refactor runner upload logic to use temporary files and atomic renaming for safer updates.
- Improve file permission handling during temporary file creation.
- Increment orchestrator version to 0.1.11.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 23:18:55 +01:00
Till Wegmueller
b36e5c70a8
Validate runner paths at startup and improve diagnostics; bump version to 0.1.10
- Add validation for `RUNNER_LINUX_PATH` and `RUNNER_ILLUMOS_PATH` with detailed warnings and diagnostics for misconfigurations.
- Log fallback to default paths and warn if binaries are missing.
- Increment orchestrator version to 0.1.10.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 22:48:33 +01:00
Till Wegmueller
931e5ac81a
Add explicit libvirt configuration support; remove environment variable reliance; bump version to 0.1.9
- Introduce `libvirt_uri` and `libvirt_network` in configuration structs, replacing reliance on environment variables.
- Update all `virsh`-related logic to use explicit parameters for libvirt connection and network settings.
- Align codebase with new guidelines rejecting runtime environment variable mutations.
- Document breaking changes in `.junie/guidelines.md`.
- Increment orchestrator version to 0.1.9.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 22:40:50 +01:00
Till Wegmueller
f1d161655f
Refactor dnsmasq leases-based guest IP discovery and bump version to 0.1.8
- Update IP selection logic to prefer the latest lease based on epoch timestamp.
- Remove redundant IP discovery logic in `net-dhcp-leases`.
- Increment orchestrator version to 0.1.8 for release.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 22:00:46 +01:00
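The "prefer the latest lease" rule from f1d161655f can be sketched as below. The function name `latest_lease_ip` is hypothetical. The sketch assumes the usual dnsmasq leases layout, in which each line starts with an epoch timestamp followed by the MAC address and the IP address, and it matches the MAC case-insensitively.

```rust
/// Returns the IP of the lease with the highest epoch timestamp for `mac`.
/// Assumes whitespace-separated lines of the form:
///   <epoch> <mac> <ip> <hostname> <client-id>
pub fn latest_lease_ip(leases: &str, mac: &str) -> Option<String> {
    leases
        .lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            // Malformed lines (missing fields, non-numeric epoch) are skipped.
            let epoch: u64 = fields.next()?.parse().ok()?;
            let lease_mac = fields.next()?;
            let ip = fields.next()?;
            if lease_mac.eq_ignore_ascii_case(mac) {
                Some((epoch, ip.to_string()))
            } else {
                None
            }
        })
        .max_by_key(|(epoch, _)| *epoch)
        .map(|(_, ip)| ip)
}
```

Picking the maximum timestamp avoids returning a stale address when a reused MAC still has an older lease entry in the file.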
Till Wegmueller
a6ed0f0c69
Add libvirt-related environment handling, directory preparation, and bump version to 0.1.7
- Add default `LIBVIRT_URI`, `HOME`, and `XDG_CACHE_HOME` environment variable handling for `virsh` commands.
- Ensure writable cache directories for the service user in packaging scripts.
- Update systemd service to include libvirt-related environment defaults.
- Bump orchestrator version to 0.1.7.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 21:50:17 +01:00
Till Wegmueller
bf94664a30
Refactor VM lifecycle handling and improve guest IP discovery, bump version to 0.1.6
- Adjust stopping, destroying, and persisting VM lifecycle events to ensure better sequencing and avoid races.
- Enhance `discover_guest_ip_virsh` with detailed logging, structured attempt tracking, and robust fallback mechanisms.
- Introduce `Attempt` struct to capture detailed command execution context for debugging.
- Update console log handling to snapshot logs early, minimizing race conditions.
- Bump orchestrator version to 0.1.6.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 21:34:19 +01:00
Till Wegmueller
d5faf319ab
Add boot wait configuration and improve VM startup logging, bump version to 0.1.5
- Introduce `boot_wait_secs` configuration to delay IP discovery/SSH after VM startup.
- Capture console logs when no SSH logs are available for better debugging during failures.
- Add a utility function to snapshot and persist console logs into job logs.
- Update CLI and environment variable support for the `boot_wait_secs` parameter.
- Bump orchestrator version to 0.1.5.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 21:12:54 +01:00
Till Wegmueller
5d8e79c8d4
Add support for results queue and routing key in MQ configuration, bump version to 0.1.4
- Introduce `results_queue` and `results_routing_key` to MQ configuration.
- Update message publishing and queue declaration logic to leverage new fields.
- Increment orchestrator version to 0.1.4.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 20:51:57 +01:00
Till Wegmueller
8e21c2ba47
Remove unused systemd unit file hardening options, bump version to 0.1.3
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 20:05:21 +01:00
Till Wegmueller
0724a4c526
Enable libvirt feature for orchestrator and bump version to 0.1.2
- Add `--features libvirt` to orchestrator's Debian package build process.
- Update orchestrator version to 0.1.2 in `Cargo.toml`.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 20:01:06 +01:00
Till Wegmueller
fad8e60ec1
Add Debian packaging support and network configuration enhancements
- Introduce Debian package build script using `cargo-deb` for orchestrator releases.
- Add systemd unit file and post-installation script for automatic service setup.
- Update `compose.yml` with host-only port bindings for Postgres and RabbitMQ.
- Introduce NGINX-based log proxy for orchestrator logs with Traefik support.
- Bump orchestrator version to 0.1.1 and update related Cargo metadata for packaging.
- Add example environment file for orchestrator configuration.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-17 19:57:19 +01:00
Till Wegmueller
9dfa9c4b95
Enhance SSH handling with retries and robust error management, refactor guest IP discovery
- Implement SSH execution retries with exponential backoff and timeout handling.
- Replace `virsh domifaddr` with a multi-strategy IP discovery approach.
- Introduce `OrchestratorError` for consistent, structured error reporting.
- Improve runner deployment and SSH session utilities for readability and reliability.
- Add dependencies: `thiserror`, `anyhow` for streamlined error handling.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-15 21:46:54 +01:00
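The retry-with-exponential-backoff pattern mentioned in 9dfa9c4b95 might be structured like this generic sketch. It is not the orchestrator's SSH code: `backoff_delay` and `retry_with_backoff` are hypothetical helpers, the doubling factor and delay cap are assumptions, and the operation is an arbitrary closure rather than a real SSH call.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `cap`.
pub fn backoff_delay(base: Duration, cap: Duration, attempt: u32) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(cap)
}

/// Runs `op` until it succeeds or `max_attempts` calls have failed.
/// `op` receives the 0-based attempt number and is always called at least once.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    cap: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(base, cap, attempt));
                attempt += 1;
            }
        }
    }
}
```

Capping the delay keeps a long-booting guest from pushing the wait between connection attempts past a useful bound.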
Till Wegmueller
038d1161a6
Refactor Orchestrator with enhanced SSH handling, error management, and IP discovery support
- Implement retries for SSH-based job execution with configurable timeouts.
- Introduce `OrchestratorError` for consistent error handling across modules.
- Replace `virsh domifaddr` based guest IP discovery with a robust, multi-strategy approach.
- Refactor runner deployment and SSH-related utility functions for clarity.
- Add `thiserror` and `anyhow` dependencies for error management.
- Update persistence layer with improved error handling for database operations.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-15 21:46:19 +01:00
Till Wegmueller
c2fefb5167
Add per-job SSH key support, refactor scheduler for SSH-based job execution, and remove unused runner endpoint
- Introduce fields in `JobContext` for per-job SSH configuration, including user, key paths, and PEM contents.
- Update the scheduler to support SSH-based execution of jobs, including VM lifecycle management and SSH session handling.
- Add utility functions for SSH execution, guest IP discovery, and runner deployment.
- Remove the unused `/runners/{name}` HTTP endpoint and its associated logic.
- Simplify router creation by refactoring out disabled runner directory handling.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-15 18:37:30 +01:00
Till Wegmueller
930efe547f
Add public runner URL configuration and enhance log streaming support
- Introduce options for specifying public runner base URLs (`SOLSTICE_RUNNER_BASE_URL`) and orchestrator contact addresses (`ORCH_CONTACT_ADDR`).
- Update `.env.sample` and `compose.yml` with new configuration fields for external log streaming and runner binary serving.
- Refactor runner URL handling and generation logic for improved flexibility.
- Enhance `cloud-init` templates with updated runner URL environment variables (`RUNNER_SINGLE` and `RUNNER_URLS`).
- Add unit tests for runner URL generation to verify various input cases.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-11 20:24:20 +01:00
Till Wegmueller
f904cb88b2
Relax filesystem permissions for VM directories, overlays, and logs to support host libvirt/qemu access. Introduce dead-letter queue support with enriched error messages for failed jobs.
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-09 17:59:04 +01:00
Till Wegmueller
888aa26388
Add libvirt/KVM integration and Forgejo webhook support to Podman stack
- Extend `.env.sample` with libvirt configuration, Forgejo secrets, and image mapping defaults.
- Update `compose.yml` to enable libvirt integration, including required mounts, devices, and environment variables.
- Add Forgejo webhook configuration and commit status reporting with optional HMAC validation.
- Enhance the orchestrator container with libvirt dependencies and optional features for VM management.
- Document host preparation for libvirt/KVM and image directories in the README.
- Set default fallback values for Traefik ACME CA server.

Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2025-11-09 17:58:36 +01:00
Till Wegmueller
1c5dc338f5
Add Podman Compose deployment stack with Traefik and services integration
This commit introduces:
- A production-ready Podman Compose stack using Traefik as a reverse proxy with Let's Encrypt integration.
- Per-environment logical separation for Postgres, RabbitMQ, and MinIO services.
- New deployment utilities, including a `.env.sample` template, `compose.yml`, and setup scripts for MinIO and Postgres.
- Updates to `github-integration` HTTP server with basic webhook handling using `axum` and configurable paths.
- Adjustments to packaging tasks for better tarball generation via `git archive`.
- Expanded dependencies for `PKGBUILD` to support SQLite and PostgreSQL libraries.
- Containerfiles for orchestrator and integration services to enable Rust multi-stage builds without sccache.

This enables simplified and secure CI deployments with automatic routing, TLS, and volume persistence.
2025-11-08 20:21:57 +00:00