feat: Preserve manifest text through install pipeline, add architecture plans

Manifest text is now carried through the solver's ResolvedPkg and written
directly to disk during install, eliminating the redundant re-fetch from
the repository that could silently fail. save_manifest() is now mandatory
(fatal on error) since the .p5m file on disk is the authoritative record
for pkg verify and pkg fix.

Add ADRs for libips API layer (GUI sharing), OpenID Connect auth, and
SQLite catalog as query engine (including normalized installed_actions
table). Add phase plans for code hygiene, client completion, catalog
expansion, and OIDC authentication.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Till Wegmueller 2026-02-25 22:47:39 +01:00
parent d295a6e219
commit 9814635a32
10 changed files with 957 additions and 51 deletions


@@ -0,0 +1,38 @@
# ADR-001: Elevate Business Logic into libips API Layer
**Date:** 2026-02-25
**Status:** Accepted
## Context
pkg6 (CLI client) implements significant business logic inline in `main.rs` — install orchestration, solver advice generation, action plan creation, progress reporting, and package pattern parsing. A future GUI client must replicate all of this, leading to duplication or divergence.
The `libips::api` module already exists as a high-level facade (ManifestBuilder, Repository, PublisherClient, Lint) but covers only publishing workflows. Client-side operations (install, uninstall, update, search, verify, info, contents) have no library-level orchestration.
## Decision
All client-side business logic moves into `libips::api` as composable operation types. The pattern:
```
libips::api::install::InstallOptions -> InstallPlan -> ActionPlan -> apply()
libips::api::search::SearchOptions -> SearchResults
libips::api::info::InfoQuery -> PackageDetails
```
pkg6 becomes a thin CLI adapter: parse args -> build options struct -> call libips -> format output.
A future GUI client imports the same `libips::api` types and calls the same methods.
## Key Design Rules
1. **Options structs** carry all configuration (dry_run, accept_licenses, concurrency, etc.)
2. **Progress reporting** via trait object callbacks — CLI prints text, GUI updates progress bars
3. **No formatting in libips** — return structured data, let consumers format
4. **Error types** carry miette diagnostics — both CLI and GUI can render them
## Consequences
- pkg6/src/main.rs shrinks dramatically
- GUI client shares 100% of logic with CLI
- Testing improves — library operations are unit-testable without CLI harness
- Breaking change for anyone importing libips directly (unlikely given current adoption)


@@ -0,0 +1,36 @@
# ADR-002: OpenID Connect Authentication for REST API
**Date:** 2026-02-25
**Status:** Accepted
## Context
The REST API (pkg6depotd) and REST client (RestBackend) currently have no authentication. The depot's `auth_check` handler is a stub that only checks for Bearer token presence. The legacy pkg5 used client certificates (x509), which is operationally expensive.
## Decision
Use OpenID Connect (OIDC) for authentication:
- **Depot server** validates JWT access tokens against an OIDC provider's JWKS endpoint
- **REST client** obtains tokens via OIDC flows (device code flow for CLI, authorization code flow for GUI)
- **Token refresh** handled transparently by the client credential manager
### Server Side (pkg6depotd)
- Configure OIDC issuer URL and required scopes in depot config
- Fetch JWKS from `{issuer}/.well-known/openid-configuration` -> `jwks_uri`
- Validate Bearer tokens on protected endpoints (publish, index rebuild)
- Read-only endpoints (catalog, manifest, file, search) remain unauthenticated by default
- Optional: per-publisher access control via JWT claims
### Client Side (libips RestBackend)
- Add `CredentialProvider` trait to RestBackend
- Implement OIDC device code flow for CLI (user opens browser, enters code)
- Token storage in image metadata directory (encrypted at rest)
- Automatic refresh before expiry
## Consequences
- Modern auth infrastructure, compatible with Keycloak/Dex/Auth0/etc.
- No client certificate management burden
- Publisher-level access control possible via scopes/claims
- Requires OIDC provider deployment for secured repos (optional — unsecured repos still work)


@@ -0,0 +1,124 @@
# ADR-003: SQLite Catalog as Primary Query Engine
**Date:** 2026-02-25
**Status:** Accepted
## Context
The SQLite catalog system (active.db, obsolete.db, fts.db) already stores package metadata, dependencies, and FTS indices. However, many client commands that could use this data are unimplemented, and some operations load everything into memory unnecessarily.
### Current Schema Capabilities (Underutilized)
| Table | Used By | Could Also Serve |
|-------|---------|-----------------|
| `packages` (stem, version, publisher) | Solver init, list | info, search-by-name, update candidates |
| `dependencies` (dep_type, dep_stem, dep_version) | Solver (require only) | reverse-dep queries, optional deps, info --dependencies |
| `incorporate_locks` (stem, release) | Solver | freeze/unfreeze display |
| `package_search` (FTS5: stem, publisher, summary, description) | **Unused** | `pkg search`, GUI search bar |
| `obsolete_packages` | List (partially) | info --obsolete, update path analysis |
| `installed` (fmri, manifest blob) | Install recording | verify, fix, contents, info, uninstall |
### Key Gaps
1. **FTS is built but never queried** — `fts.db` exists, `pkg search` is a stub
2. **No file inventory table** — can't answer "what package owns /usr/bin/vim?" without loading every manifest
3. **No package metadata table** — category, license, homepage not indexed; requires manifest fetch for `pkg info`
4. **Dependencies table only queried for `require` type** — incorporate/optional/conditional ignored
## Decision
Expand the SQLite catalog schema to serve as the primary query engine for all client operations. Avoid loading manifests from repository when the catalog can answer the query.
### Schema Additions
```sql
-- Package metadata (populated during shard build)
CREATE TABLE IF NOT EXISTS package_metadata (
stem TEXT NOT NULL,
version TEXT NOT NULL,
publisher TEXT NOT NULL,
summary TEXT,
description TEXT,
category TEXT,
license TEXT,
pkg_size INTEGER,
PRIMARY KEY (stem, version, publisher)
);
-- File inventory (enables reverse lookups and contents queries)
CREATE TABLE IF NOT EXISTS file_inventory (
stem TEXT NOT NULL,
version TEXT NOT NULL,
publisher TEXT NOT NULL,
action_type TEXT NOT NULL,
path TEXT NOT NULL,
hash TEXT,
PRIMARY KEY (stem, version, publisher, path)
);
CREATE INDEX IF NOT EXISTS idx_files_path ON file_inventory(path);
CREATE INDEX IF NOT EXISTS idx_files_hash ON file_inventory(hash);
```
### Query Strategy
| Command | Data Source | Avoids |
|---------|-----------|--------|
| `pkg search` | fts.db `package_search MATCH ?` | Loading manifests |
| `pkg info` | package_metadata + dependencies | Loading manifests (for basic info) |
| `pkg contents` | file_inventory | Loading manifests |
| `pkg search -f /usr/bin/vim` | file_inventory WHERE path = ? | Scanning all manifests |
| `pkg verify` | file_inventory + installed.manifest | Only loads installed manifests |
| Solver | packages + dependencies (existing) | No change |
### Installed Actions Table (client-side, installed.db)
The same normalization applies to installed packages. Currently `installed.db` stores a
single JSON blob per package. Cross-package queries (reverse file lookup, bulk verify)
require deserializing every blob — O(n * m).
Add a normalized actions table alongside the existing blob:
```sql
-- Keep existing table for complete manifest reconstruction
-- installed (fmri TEXT PK, manifest BLOB)
-- Normalized action index for fast queries
CREATE TABLE IF NOT EXISTS installed_actions (
fmri TEXT NOT NULL,
action_type TEXT NOT NULL,
path TEXT,
hash TEXT,
mode TEXT,
owner TEXT,
grp TEXT,
target TEXT,
FOREIGN KEY (fmri) REFERENCES installed(fmri) ON DELETE CASCADE
);
CREATE INDEX IF NOT EXISTS idx_ia_path ON installed_actions(path);
CREATE INDEX IF NOT EXISTS idx_ia_hash ON installed_actions(hash);
CREATE INDEX IF NOT EXISTS idx_ia_fmri ON installed_actions(fmri);
```
**Performance comparison:**
| Operation | JSON Blob | Normalized | Speedup |
|-----------|----------|------------|---------|
| "What owns /usr/bin/vim?" | O(n*m) scan + deser | O(log n) index | 1000x+ |
| `pkg verify` (all packages) | Deserialize all blobs | SELECT by fmri | 10-50x |
| `pkg contents pkg-name` | Deserialize 1 blob | SELECT WHERE fmri = ? | ~same |
| `pkg info pkg-name` | Deserialize 1 blob | SELECT WHERE fmri = ? | ~same |
| Install recording | 1 INSERT | N INSERTs (one txn) | ~5ms overhead |
| "All files in /usr/lib/" | Full scan | WHERE path LIKE ? | 100x+ |
Populated during `install_package()` by iterating the manifest's files/dirs/links.
Cleaned up via `ON DELETE CASCADE` during uninstall.
## Consequences
- Shard size increases (file_inventory can be large for repos with many packages)
- Shard build time increases (must parse all manifest actions)
- Client queries become O(1) indexed lookups instead of O(n) manifest scans
- `pkg search`, `pkg info`, `pkg contents` become fast without network access after catalog refresh
- `pkg verify` can check file presence/hash/mode via indexed queries instead of blob deserialization
- "What package owns this file?" becomes instant for both installed and catalog packages
- Backward compatible — old shards without new tables still work, just fall back to manifest fetch


@@ -0,0 +1,136 @@
# Phase 1: Code Hygiene and Architecture Refactoring
**Date:** 2026-02-25
**Status:** Active
**Estimated scope:** Foundation work before feature development
## Goals
1. Clean up all dead code, clippy warnings, and formatting issues
2. Split pkg6 monolithic main.rs into modules
3. Extract business logic from pkg6 into libips::api
4. Establish the pattern for GUI-shareable operations
## Step 1: Dead Code Cleanup
Remove or annotate all `#[allow(dead_code)]` items. Rules:
- **Delete** if it's legacy/superseded code (e.g., `catalog.rs:parse_action`)
- **Keep with `// WIRED: <plan reference>`** if it will be used in a known upcoming phase
- **Delete** `#[allow(unused_imports)]` and fix the import
- **Delete** `#[allow(unused_assignments)]` and fix the assignment
### Items to address:
| File | Item | Action |
|------|------|--------|
| `libips/src/digest/mod.rs:20` | `DEFAULT_ALGORITHM` | Keep — add `// WIRED: Phase 2 verify command` |
| `libips/src/image/catalog.rs:366` | `parse_action()` | Delete — legacy, superseded by SQLite catalog |
| `libips/src/solver/mod.rs:40` | `PkgCand.id` field | Remove allow — field is used by resolvo internally |
| `libips/src/repository/file_backend.rs:153,166,329` | SearchIndex `new()`, `add_term()`, `save()` | Keep — add `// WIRED: Phase 2 search index rebuild` |
| `libips/src/repository/file_backend.rs:372` | `Transaction.id` | Remove allow — expose as pub for logging/debugging |
| `libips/src/repository/obsoleted.rs:768,834` | `search_entries()`, `is_empty()` | Keep — add `// WIRED: Phase 2 search and info commands` |
| `ports/src/workspace.rs:90,95,100` | `expand_source_path`, `get_proto_dir`, `get_build_dir` | Evaluate — if ports crate uses them, keep; otherwise delete |
### TODOs to fix now:
| File | TODO | Action |
|------|------|--------|
| `actions/mod.rs:102,158` | `mode` as bitmask | Create `FileMode` newtype wrapping u32, parse octal string on construction |
| `actions/mod.rs:284` | `require-any` multi-FMRI | Leave TODO — Phase 2 solver work |
| `actions/mod.rs:291` | `dependency_type` as enum | Create `DependencyType` enum: Require, Incorporate, Optional, RequireAny, Conditional, Exclude, Group, Parent, Origin |
| `actions/mod.rs:293` | `root_image` as boolean | Change to `bool`, parse "true"/"false" in parser |
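The `FileMode` newtype and `DependencyType` enum called for in the table can be sketched as follows. This is an illustration only — the real fields live in `actions/mod.rs` and their construction details may differ:

```rust
use std::str::FromStr;

/// Newtype wrapping the parsed octal mode string from manifest actions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FileMode(pub u32);

impl FromStr for FileMode {
    type Err = std::num::ParseIntError;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Manifest modes are octal strings like "0755" or "644".
        u32::from_str_radix(s, 8).map(FileMode)
    }
}

/// All IPS dependency types, replacing the stringly-typed field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DependencyType {
    Require, Incorporate, Optional, RequireAny,
    Conditional, Exclude, Group, Parent, Origin,
}

impl FromStr for DependencyType {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(match s {
            "require" => Self::Require,
            "incorporate" => Self::Incorporate,
            "optional" => Self::Optional,
            "require-any" => Self::RequireAny,
            "conditional" => Self::Conditional,
            "exclude" => Self::Exclude,
            "group" => Self::Group,
            "parent" => Self::Parent,
            "origin" => Self::Origin,
            other => return Err(format!("unknown dependency type: {other}")),
        })
    }
}

fn main() {
    let mode: FileMode = "0755".parse().unwrap();
    assert_eq!(mode, FileMode(0o755));
}
```

Parsing at construction means downstream code never re-validates the string, and unknown dependency types surface as errors at manifest parse time instead of deep in the solver.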
## Step 2: Run clippy and rustfmt
```bash
cargo clippy --workspace -- -D warnings 2>&1 | head -100
cargo fmt --all --check
```
Fix all warnings. Establish CI-level enforcement.
## Step 3: Split pkg6/src/main.rs
Create module structure:
```
pkg6/src/
main.rs -- CLI entry point, arg parsing only
commands/
mod.rs
install.rs -- install + exact-install
uninstall.rs
update.rs
list.rs
info.rs
search.rs
verify.rs
fix.rs
history.rs
contents.rs
publisher.rs -- set-publisher, unset-publisher, publisher
image.rs -- image-create
debug.rs -- debug-db
output.rs -- formatting helpers (table, json, tsv)
error.rs -- (existing)
```
Each command module has a single `pub async fn run(args: &Args) -> Result<()>`.
## Step 4: Extract operations into libips::api
Create new API modules:
```
libips/src/api/
mod.rs -- re-exports
install.rs -- InstallOptions, install_packages()
uninstall.rs -- UninstallOptions, uninstall_packages()
update.rs -- UpdateOptions, update_packages()
search.rs -- SearchOptions, search_packages()
info.rs -- InfoQuery, get_package_info()
contents.rs -- get_package_contents()
verify.rs -- verify_packages() -> VerificationReport
progress.rs -- ProgressReporter trait (shared by CLI/GUI)
options.rs -- Common option types
```
### Key trait:
```rust
/// Progress reporting for long-running operations.
/// CLI implements with text spinners, GUI with progress bars.
pub trait ProgressReporter: Send + Sync {
fn on_phase(&self, phase: &str, total: Option<usize>);
fn on_item(&self, index: usize, name: &str);
fn on_complete(&self);
fn on_error(&self, msg: &str);
}
```
### Key options pattern:
```rust
pub struct InstallOptions {
pub dry_run: bool,
pub accept_licenses: bool,
pub concurrency: usize,
pub refresh_before: bool,
pub additional_repos: Vec<String>,
pub progress: Option<Arc<dyn ProgressReporter>>,
}
pub fn install_packages(
image: &mut Image,
patterns: &[String],
options: &InstallOptions,
) -> Result<InstallPlan> { ... }
```
## Verification
- `cargo clippy --workspace -- -D warnings` passes clean
- `cargo fmt --all --check` passes clean
- `cargo nextest run` passes (excluding known slow tests)
- No `#[allow(dead_code)]` without `// WIRED:` annotation
- pkg6 commands still work identically after refactor


@@ -0,0 +1,239 @@
# Phase 2: pkg6 Client Command Completion
**Date:** 2026-02-25
**Status:** Active
**Depends on:** Phase 1 (architecture refactoring)
**Goal:** Make pkg6 a usable package management client
## Priority Order (by user impact)
### P0 — Install Actually Works
These block everything else. Without working install, nothing downstream matters.
#### 2.0: Manifest Text Preservation (DONE)
**Problem:** `save_manifest()` re-fetched manifest text from the repository instead of
using the text already obtained during solving. If the repo was unreachable after install,
the save silently failed, leaving `pkg verify` / `pkg fix` without a reference manifest.
**Fix (Option B — implemented):**
- Added `manifest_text: String` field to `ResolvedPkg` in solver
- Solver now fetches raw text via `fetch_manifest_text_from_repository()` and parses it,
keeping both the parsed struct and original text
- Falls back to catalog cache + JSON serialization when repo is unreachable (tests, offline)
- `save_manifest()` now takes `manifest_text: &str` instead of re-fetching
- Save is now **mandatory** (fatal error) — the `.p5m` file on disk is the authoritative
record for `pkg verify` and `pkg fix`
- Added `Image::fetch_manifest_text_from_repository()` public method
**Files changed:** `libips/src/solver/mod.rs`, `libips/src/image/mod.rs`, `pkg6/src/main.rs`
#### 2.0b: Normalized Installed Actions Table
**Problem:** `installed.db` stores one JSON blob per package. Cross-package queries
(`pkg verify --all`, "what owns this file?", `pkg contents`) require deserializing every
blob — O(n * m) where n = packages and m = actions per package.
**Fix:** Add `installed_actions` table alongside the existing blob, populated during
`install_package()`. This gives O(log n) indexed lookups for path, hash, and fmri queries.
**Schema (in `libips/src/repository/sqlite_catalog.rs` INSTALLED_SCHEMA):**
```sql
CREATE TABLE IF NOT EXISTS installed_actions (
fmri TEXT NOT NULL,
action_type TEXT NOT NULL, -- file, dir, link, hardlink
path TEXT,
hash TEXT,
mode TEXT,
owner TEXT,
grp TEXT,
target TEXT, -- link target, NULL for file/dir
FOREIGN KEY (fmri) REFERENCES installed(fmri) ON DELETE CASCADE
);
CREATE INDEX IF NOT EXISTS idx_ia_path ON installed_actions(path);
CREATE INDEX IF NOT EXISTS idx_ia_hash ON installed_actions(hash);
CREATE INDEX IF NOT EXISTS idx_ia_fmri ON installed_actions(fmri);
```
**Implementation:**
1. Add schema to `INSTALLED_SCHEMA` constant
2. In `InstalledPackages::add_package()`, after inserting the blob, iterate
`manifest.files`, `.directories`, `.links` and INSERT each action row
3. Removal handled automatically via `ON DELETE CASCADE` from `remove_package()`
4. Migration: detect missing table on open, create if absent (existing installs
won't have rows until next install/rebuild)
5. Add `rebuild_installed_actions()` method that re-populates from existing blobs
for migration of pre-existing images
**Consumers:**
- `pkg verify`: `SELECT path, hash, mode, owner, grp FROM installed_actions WHERE fmri = ?`
- `pkg contents`: `SELECT action_type, path, hash, target FROM installed_actions WHERE fmri = ?`
- Reverse lookup: `SELECT fmri, action_type FROM installed_actions WHERE path = ?`
- `pkg uninstall`: `SELECT path, action_type FROM installed_actions WHERE fmri = ? ORDER BY path DESC`
#### 2.1: File Payload Writing
**Problem:** `apply_file()` in `actions/executors.rs` creates empty files.
**Fix:**
1. `ActionPlan` must carry a reference to the source repository
2. During file action execution, fetch payload via `repo.fetch_file(hash)`
3. Decompress (gzip/lz4) and write to target path
4. Verify digest after write
5. Apply mode via `std::fs::set_permissions()`
6. Apply owner:group via `nix::unistd::chown()`
**Key types to add to libips:**
```rust
pub struct ActionContext {
pub image_root: PathBuf,
pub source_repo: Arc<dyn ReadableRepository>,
pub dry_run: bool,
pub progress: Option<Arc<dyn ProgressReporter>>,
}
```
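Steps 1–6 can be sketched end to end. This is a simplified illustration, not the libips implementation: `PayloadSource` stands in for the real repository trait, and decompression, digest verification, and `chown` are elided to comments:

```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

/// Simplified stand-in for the repository backend (assumption: the real
/// trait returns compressed payloads that must be decompressed and
/// digest-verified).
pub trait PayloadSource {
    fn fetch_file(&self, hash: &str) -> std::io::Result<Vec<u8>>;
}

/// Fetch a payload, write it atomically under the image root, apply mode.
pub fn apply_file(
    repo: &dyn PayloadSource,
    image_root: &Path,
    rel_path: &str,
    hash: &str,
    mode: u32,
    dry_run: bool,
) -> std::io::Result<PathBuf> {
    let target = image_root.join(rel_path);
    if dry_run {
        return Ok(target);
    }
    if let Some(parent) = target.parent() {
        fs::create_dir_all(parent)?;
    }
    // Real code: decompress (gzip/lz4) and verify the digest here.
    let payload = repo.fetch_file(hash)?;
    let tmp = target.with_extension("tmp");
    fs::File::create(&tmp)?.write_all(&payload)?;
    #[cfg(unix)]
    {
        use std::os::unix::fs::PermissionsExt;
        fs::set_permissions(&tmp, fs::Permissions::from_mode(mode))?;
    }
    // Atomic within one filesystem; real code would chown before renaming.
    fs::rename(&tmp, &target)?;
    Ok(target)
}

fn main() {}
```

The tmp-then-rename pattern mirrors what `save_manifest()` already does for `.p5m` files, so a crash mid-install never leaves a truncated payload at the final path.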
#### 2.2: Owner/Group Application
**Problem:** `chown` calls are TODOs.
**Fix:** Use `nix::unistd::chown()` with UID/GID lookup via `nix::unistd::User::from_name()` / `Group::from_name()`. Skip on dry_run. Warn (don't fail) if running as non-root.
#### 2.3: Facet/Variant Filtering
**Problem:** All actions delivered regardless of `variant.arch` or `facet.*` tags.
**Fix:** Before building ActionPlan, filter manifest actions:
- Check `variant.*` attributes against image variants
- Check `facet.*` attributes against image facets
- Only include matching actions in the plan
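The filtering rules above reduce to a per-action predicate. A sketch with simplified types (the real action and image types live in libips; both maps here are keyed by the full attribute name, e.g. `variant.arch`, `facet.doc`):

```rust
use std::collections::HashMap;

/// Decide whether one manifest action should be delivered to this image.
pub fn action_applies(
    action_attrs: &HashMap<String, String>,
    image_variants: &HashMap<String, String>,
    image_facets: &HashMap<String, bool>,
) -> bool {
    for (key, value) in action_attrs {
        if key.starts_with("variant.") {
            // A variant-tagged action is delivered only when the image's
            // setting matches exactly.
            match image_variants.get(key) {
                Some(v) if v == value => {}
                _ => return false,
            }
        } else if key.starts_with("facet.") {
            // Facets default to enabled when the image has no explicit setting.
            if !image_facets.get(key).copied().unwrap_or(true) {
                return false;
            }
        }
    }
    true
}

fn main() {}
```

Filtering before the ActionPlan is built keeps the executors simple: they never see actions that should not land on disk.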
### P1 — Uninstall and Update
#### 2.4: Implement `uninstall`
1. Parse FMRI patterns
2. Query `installed` table for matching packages
3. Check reverse dependencies (what depends on packages being removed?)
4. Query `installed_actions` for file list: `SELECT path, action_type FROM installed_actions WHERE fmri = ? ORDER BY path DESC`
5. Build removal ActionPlan (delete files, then dirs in reverse path order)
6. `DELETE FROM installed WHERE fmri = ?` — `installed_actions` rows cleaned via CASCADE
7. Remove cached `.p5m` manifest from disk
**Reverse dependency query** — needs new function:
```rust
/// Find all installed packages that depend on `stem`
pub fn reverse_dependencies(installed: &InstalledPackages, stem: &str) -> Result<Vec<Fmri>>
```
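The `ORDER BY path DESC` in step 4 is what makes step 5 safe: reverse-lexicographic order always sorts a child path after its parent (a child is the parent plus `"/"` plus a suffix), so descending order deletes files and subdirectories before the directories that contain them. A minimal sketch:

```rust
/// Order removal so children are deleted before their parent directories,
/// mirroring the `ORDER BY path DESC` query above.
pub fn removal_order(mut paths: Vec<String>) -> Vec<String> {
    // Reverse-lexicographic: "usr/bin/vim" before "usr/bin" before "usr",
    // so a directory delete never hits a non-empty directory.
    paths.sort_by(|a, b| b.cmp(a));
    paths
}

fn main() {
    let order = removal_order(vec![
        "usr".into(), "usr/bin/vim".into(), "usr/bin".into(),
    ]);
    assert_eq!(order, vec!["usr/bin/vim", "usr/bin", "usr"]);
}
```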
#### 2.5: Implement `update`
1. For each installed package (or specified patterns), query catalog for newer versions
2. Run solver with installed packages as constraints + newest available
3. Build ActionPlan with remove-old + install-new pairs
4. Execute plan (ordered: remove files, install new files)
5. Update installed.db
### P2 — Query Commands (leverage SQLite catalog)
#### 2.6: Implement `search`
Wire up the FTS5 index that already exists in fts.db:
```rust
// In libips::api::search
pub fn search_packages(image: &Image, query: &str, options: &SearchOptions) -> Result<Vec<SearchResult>> {
let fts_path = image.fts_db_path();
let conn = Connection::open_with_flags(&fts_path, OpenFlags::SQLITE_OPEN_READ_ONLY)?;
let mut stmt = conn.prepare(
"SELECT stem, publisher, summary FROM package_search WHERE package_search MATCH ?1"
)?;
// ...
}
```
Also wire the REST search for remote queries (server-side already fixed in previous commit).
#### 2.7: Implement `info`
For installed packages: parse manifest blob from installed.db, extract metadata.
For catalog packages: query package_metadata table (after ADR-003 schema expansion), fall back to manifest fetch.
Display: name, version, publisher, summary, description, category, dependencies, size, install date.
#### 2.8: Implement `contents`
For installed packages: query `installed_actions` table (fast indexed lookup, no blob deser).
For catalog packages: query `file_inventory` table (after Phase 3), fall back to manifest fetch.
#### 2.9: Implement `history`
Add operation history table to image metadata:
```sql
CREATE TABLE IF NOT EXISTS operation_history (
id INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp TEXT NOT NULL,
operation TEXT NOT NULL, -- install, uninstall, update
packages TEXT NOT NULL, -- JSON array of FMRIs
user TEXT,
result TEXT NOT NULL -- success, failure
);
```
Record entries during install/uninstall/update. Display with `pkg history`.
### P3 — Integrity Commands
#### 2.10: Implement `verify`
For each installed package, query `installed_actions` (no blob deserialization needed):
1. `SELECT path, hash, mode, owner, grp, action_type FROM installed_actions WHERE fmri = ?`
2. For each file action: check exists, check size, verify SHA hash
3. For each dir action: check exists, check mode
4. For each link action: check exists, check target
5. Report: OK, MISSING, CORRUPT, WRONG_PERMISSIONS
6. Fall back to `.p5m` manifest text on disk (from 2.0) if `installed_actions` is empty (migration)
```rust
pub struct VerificationResult {
pub fmri: Fmri,
pub issues: Vec<VerificationIssue>,
}
pub enum VerificationIssue {
Missing { path: PathBuf, action_type: String },
HashMismatch { path: PathBuf, expected: String, actual: String },
PermissionMismatch { path: PathBuf, expected: String, actual: String },
OwnerMismatch { path: PathBuf, expected: String, actual: String },
}
```
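The per-action comparison in steps 2–4 reduces to classifying expected-vs-observed state. A sketch with simplified types — in the real flow, the caller stats and hashes the file first, then this pure classification produces the issues (the `Observed` struct and field names are assumptions):

```rust
use std::path::PathBuf;

/// Mirrors a subset of the report types above.
#[derive(Debug, PartialEq)]
pub enum VerificationIssue {
    Missing { path: PathBuf },
    HashMismatch { path: PathBuf, expected: String, actual: String },
    PermissionMismatch { path: PathBuf, expected: String, actual: String },
}

/// Observed on-disk state for one path; `None` means "not checked".
pub struct Observed {
    pub exists: bool,
    pub hash: Option<String>,
    pub mode: Option<String>,
}

/// Compare one installed_actions row against observed filesystem state.
pub fn check_action(
    path: &str,
    expected_hash: Option<&str>,
    expected_mode: Option<&str>,
    observed: &Observed,
) -> Vec<VerificationIssue> {
    let p = PathBuf::from(path);
    if !observed.exists {
        return vec![VerificationIssue::Missing { path: p }];
    }
    let mut issues = Vec::new();
    if let (Some(exp), Some(act)) = (expected_hash, observed.hash.as_deref()) {
        if exp != act {
            issues.push(VerificationIssue::HashMismatch {
                path: p.clone(), expected: exp.into(), actual: act.into(),
            });
        }
    }
    if let (Some(exp), Some(act)) = (expected_mode, observed.mode.as_deref()) {
        if exp != act {
            issues.push(VerificationIssue::PermissionMismatch {
                path: p, expected: exp.into(), actual: act.into(),
            });
        }
    }
    issues
}

fn main() {}
```

Keeping the classification pure makes it unit-testable without a filesystem, in line with the Phase 1 goal of library-level testability.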
#### 2.11: Implement `fix`
1. Run verify
2. For each MISSING/CORRUPT file: re-download payload from repository
3. For each WRONG_PERMISSIONS: re-apply mode/owner/group
4. Report what was fixed
### P4 — Remaining Action Executors
#### 2.12: User/Group executors
Use `nix` crate for user/group creation. On non-illumos systems, log a warning and skip.
#### 2.13: Driver executor
illumos-specific: call `add_drv` / `update_drv`. On other systems, skip with warning.
#### 2.14: Service executor (SMF)
illumos-specific: call `svcadm` / `svccfg`. On other systems, skip with warning.
## Verification
After each sub-step:
- `cargo nextest run` passes
- `cargo clippy --workspace -- -D warnings` clean
- Manual test of the implemented command against a test repository


@@ -0,0 +1,151 @@
# Phase 3: SQLite Catalog Expansion (ADR-003 Implementation)
**Date:** 2026-02-25
**Status:** Active
**Depends on:** Phase 1 (architecture), Phase 2 P2 (query commands)
**Implements:** ADR-003
## Goal
Expand the SQLite catalog shards so that `pkg info`, `pkg contents`, `pkg search -f`, and reverse file lookups work without fetching manifests from the repository. This makes offline queries fast and enables the GUI search bar.
## Step 1: Expand shard build schema
**File:** `libips/src/repository/sqlite_catalog.rs`
### 1.1: Add `package_metadata` table
During `build_shards()`, when parsing catalog parts, extract `set` actions for:
- `pkg.summary` (already extracted for FTS)
- `pkg.description` (already extracted for FTS)
- `info.classification` -> category
- `pkg.human-version` -> human_version
- `info.upstream-url` -> homepage
```sql
CREATE TABLE IF NOT EXISTS package_metadata (
stem TEXT NOT NULL,
version TEXT NOT NULL,
publisher TEXT NOT NULL,
summary TEXT,
description TEXT,
category TEXT,
human_version TEXT,
homepage TEXT,
PRIMARY KEY (stem, version, publisher)
);
```
### 1.2: Add `file_inventory` table
Parse all `file`, `dir`, `link`, `hardlink` actions from manifests during shard build:
```sql
CREATE TABLE IF NOT EXISTS file_inventory (
stem TEXT NOT NULL,
version TEXT NOT NULL,
publisher TEXT NOT NULL,
action_type TEXT NOT NULL, -- file, dir, link, hardlink
path TEXT NOT NULL,
hash TEXT, -- SHA hash for files, NULL for dirs/links
target TEXT, -- link target for links, NULL otherwise
mode TEXT,
owner TEXT,
grp TEXT,
PRIMARY KEY (stem, version, publisher, path)
);
CREATE INDEX IF NOT EXISTS idx_fi_path ON file_inventory(path);
CREATE INDEX IF NOT EXISTS idx_fi_hash ON file_inventory(hash);
```
### 1.3: Index all dependency types
Currently only `require` is stored. Expand to store all types:
```sql
-- Add column or track in existing dependencies table
-- dep_type already exists, just populate for all types:
-- require, incorporate, optional, conditional, group, require-any, exclude, parent, origin
```
## Step 2: Update shard sync
**File:** `libips/src/repository/shard_sync.rs`
The `catalog.attrs` index already tracks shard hashes. New tables go into `active.db`, so no new shard files are needed. The SHA256 hash of `active.db` changes when content changes, triggering re-download.
Backward compatibility: if the client sees a server without the new tables, queries fall back to manifest fetch. Add version field to `catalog.attrs`:
```json
{
"version": 3,
"schema_features": ["package_metadata", "file_inventory"],
...
}
```
## Step 3: Update query methods
### 3.1: `pkg info` fast path
```rust
// In libips::api::info
pub fn get_package_info(image: &Image, fmri: &Fmri) -> Result<PackageDetails> {
// Try catalog metadata first (fast, no manifest parse)
if let Some(meta) = query_package_metadata(image, fmri)? {
return Ok(meta);
}
// Fall back to manifest fetch (slow, needs network)
let manifest = image.get_manifest(fmri)?;
Ok(PackageDetails::from_manifest(&manifest))
}
```
### 3.2: `pkg contents` fast path
```rust
pub fn get_package_contents(image: &Image, fmri: &Fmri) -> Result<Vec<FileEntry>> {
// Try file_inventory first
    if let Some(entries) = query_file_inventory(image, fmri)? {
return Ok(entries);
}
// Fall back to manifest parse
...
}
```
### 3.3: Reverse file lookup
```rust
/// What package owns this file path?
pub fn search_by_path(image: &Image, path: &str) -> Result<Vec<(Fmri, String)>> {
// SELECT stem, version, publisher, action_type FROM file_inventory WHERE path = ?1
}
```
### 3.4: FTS expansion
Add file paths to FTS index so `pkg search /usr/bin/vim` works:
```sql
CREATE VIRTUAL TABLE IF NOT EXISTS package_search
USING fts5(stem, publisher, summary, description, file_paths,
content='', tokenize='unicode61');
```
Where `file_paths` is a space-separated list of all paths in the package.
## Step 4: Depot server serves new shard format
**File:** `pkg6depotd/src/http/handlers/shard.rs`
No changes needed — shards are served as opaque blobs by SHA256 hash. The depot already serves whatever `catalog.attrs` points to.
## Verification
- Rebuild a test repository: `cargo run -p pkg6repo -- rebuild --publisher test`
- Verify new tables exist: `sqlite3 catalog2/active.db ".tables"`
- Verify `pkg info` returns data without manifest fetch
- Verify `pkg contents` returns file list without manifest fetch
- Verify `pkg search -f /usr/bin/something` returns owning package
- Verify backward compat: old client against new server, new client against old server


@@ -0,0 +1,139 @@
# Phase 4: OpenID Connect Authentication
**Date:** 2026-02-25
**Status:** Active
**Depends on:** Phase 1 (architecture)
**Implements:** ADR-002
## Goal
Secure the REST API with OIDC JWT validation. Allow CLI and GUI clients to authenticate via standard OIDC flows.
## Step 1: Server-side JWT validation (pkg6depotd)
### 1.1: Add dependencies
```toml
# pkg6depotd/Cargo.toml
jsonwebtoken = "9"
reqwest = { version = "0.12", features = ["json"] } # for JWKS fetch
serde_json = "1"
```
### 1.2: Configuration
```kdl
// depot.kdl
auth {
enabled true
oidc-issuer "https://keycloak.example.com/realms/ips"
required-scopes "ips:read" "ips:write"
// Optional: per-publisher access via JWT claims
publisher-claim "ips_publishers"
}
```
### 1.3: JWKS fetcher
Background task that fetches and caches JWKS from the OIDC provider:
```rust
pub struct JwksCache {
keys: RwLock<jwk::JwkSet>,
issuer: String,
jwks_uri: String,
}
impl JwksCache {
pub async fn new(issuer: &str) -> Result<Self> { /* fetch .well-known/openid-configuration */ }
pub async fn refresh(&self) -> Result<()> { /* re-fetch JWKS */ }
pub fn validate_token(&self, token: &str) -> Result<Claims> { /* decode + verify */ }
}
```
### 1.4: Auth middleware
Axum middleware that validates Bearer tokens on protected routes:
```rust
pub async fn require_auth(
State(jwks): State<Arc<JwksCache>>,
req: Request,
next: Next,
) -> Result<Response, DepotError> {
let token = extract_bearer_token(&req)?;
let claims = jwks.validate_token(&token)?;
// Check required scopes
// Inject claims into request extensions
next.run(req).await
}
```
Apply to: POST routes (publish, index rebuild). Leave GET routes (catalog, manifest, file, search) unauthenticated by default. Add optional `auth.require-read true` config to protect everything.
## Step 2: Client-side OIDC (libips RestBackend)
### 2.1: Add dependencies
```toml
# libips/Cargo.toml
openidconnect = "4"
```
### 2.2: CredentialProvider trait
```rust
pub trait CredentialProvider: Send + Sync {
fn get_token(&self) -> Result<String>;
fn refresh_if_needed(&self) -> Result<()>;
}
```
### 2.3: Device Code Flow (CLI)
```rust
pub struct DeviceCodeProvider {
issuer: String,
client_id: String,
token_path: PathBuf, // cached token on disk
}
```
Flow:
1. Call device authorization endpoint
2. Print "Open https://... and enter code: ABCD-EFGH"
3. Poll token endpoint until user completes
4. Cache token + refresh token to `{image}/.pkg/auth/{publisher}.json`
5. On subsequent calls, use refresh token if access token expired
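Steps 4–5 imply a small decision procedure before every request. A sketch, assuming the cached JSON deserializes into a struct with `expires_at` as unix seconds (field names here are assumptions, not the final on-disk format):

```rust
use std::time::Duration;

/// Cached token as stored under `{image}/.pkg/auth/` (sketch).
pub struct CachedToken {
    pub access_token: String,
    pub refresh_token: Option<String>,
    pub expires_at: u64, // unix seconds
}

#[derive(Debug, PartialEq)]
pub enum TokenAction { UseCached, Refresh, ReRunDeviceFlow }

/// Decide what to do before attaching a Bearer token. Refresh slightly
/// early (leeway) so a token cannot expire mid-request.
pub fn next_token_action(token: &CachedToken, now_unix: u64, leeway: Duration) -> TokenAction {
    if now_unix + leeway.as_secs() < token.expires_at {
        TokenAction::UseCached
    } else if token.refresh_token.is_some() {
        TokenAction::Refresh
    } else {
        TokenAction::ReRunDeviceFlow
    }
}

fn main() {}
```

Centralizing this in the credential manager keeps RestBackend ignorant of OIDC details: it only ever asks for a currently valid token.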
### 2.4: Wire into RestBackend
```rust
impl RestBackend {
pub fn with_credentials(mut self, provider: Arc<dyn CredentialProvider>) -> Self {
self.credential_provider = Some(provider);
self
}
}
```
All HTTP requests check for credential provider and add `Authorization: Bearer {token}` header.
## Step 3: Token storage
Tokens stored in image metadata:
```
{image_root}/.pkg/auth/
{publisher}.json -- { "access_token": "...", "refresh_token": "...", "expires_at": "..." }
```
File permissions: `0600` (owner read/write only).
## Verification
- Start Keycloak/Dex in Docker for testing
- Verify unauthenticated GET requests still work
- Verify protected POST requires valid Bearer token
- Verify expired tokens are rejected
- Verify CLI device code flow obtains and caches token
- Verify token refresh works transparently


@@ -480,12 +480,16 @@ impl Image {
/// and stored under a flattened path:
/// manifests/<publisher>/<encoded_stem>@<encoded_version>.p5m
/// Missing publisher will fall back to the image default publisher, then "unknown".
/// Save raw manifest text to disk for offline use by `pkg verify` and `pkg fix`.
///
/// The caller must provide the original manifest text as fetched from the
/// repository. This avoids a redundant network round-trip and guarantees the
/// saved text matches what was actually used to install.
pub fn save_manifest(
&self,
fmri: &crate::fmri::Fmri,
_manifest: &crate::actions::Manifest,
manifest_text: &str,
) -> Result<std::path::PathBuf> {
// Determine publisher name
let pub_name = if let Some(p) = &fmri.publisher {
p.clone()
} else if let Ok(def) = self.default_publisher() {
@@ -494,11 +498,9 @@ impl Image {
"unknown".to_string()
};
// Build directory path manifests/<publisher> (flattened, no stem subfolders)
let dir_path = self.manifest_dir().join(&pub_name);
std::fs::create_dir_all(&dir_path)?;
// Encode helpers for filename parts
fn url_encode(s: &str) -> String {
let mut out = String::new();
for b in s.bytes() {
@@ -521,29 +523,9 @@ impl Image {
let encoded_version = url_encode(&version);
let file_path = dir_path.join(format!("{}@{}.p5m", encoded_stem, encoded_version));
// Fetch raw manifest text from repository
let publisher_name = pub_name.clone();
let raw_text = {
// Look up publisher configuration
let publisher = self.get_publisher(&publisher_name)?;
let origin = &publisher.origin;
if origin.starts_with("file://") {
let path_str = origin.trim_start_matches("file://");
let path = std::path::PathBuf::from(path_str);
let repo = crate::repository::FileBackend::open(&path)?;
repo.fetch_manifest_text(&publisher_name, fmri)?
} else {
let mut repo = crate::repository::RestBackend::open(origin)?;
// Set cache path for completeness
let publisher_catalog_dir = self.catalog_dir().join(&publisher.name);
repo.set_local_cache_path(&publisher_catalog_dir)?;
repo.fetch_manifest_text(&publisher_name, fmri)?
}
};
// Write atomically
// Write atomically via tmp + rename
let tmp_path = file_path.with_extension("p5m.tmp");
std::fs::write(&tmp_path, raw_text.as_bytes())?;
std::fs::write(&tmp_path, manifest_text.as_bytes())?;
std::fs::rename(&tmp_path, &file_path)?;
Ok(file_path)
@@ -827,6 +809,45 @@ impl Image {
}
}
/// Fetch raw manifest text for the given FMRI from its repository origin.
///
/// Unlike [`get_manifest_from_repository`] which returns a parsed [`Manifest`],
/// this returns the original text so it can be cached on disk for offline use
/// by `pkg verify` and `pkg fix`.
pub fn fetch_manifest_text_from_repository(
&self,
fmri: &crate::fmri::Fmri,
) -> Result<String> {
let publisher_name = if let Some(p) = &fmri.publisher {
p.clone()
} else {
self.default_publisher()?.name.clone()
};
let publisher = self.get_publisher(&publisher_name)?;
let origin = &publisher.origin;
if fmri.version().is_empty() {
return Err(ImageError::Repository(RepositoryError::Other(
"FMRI must include a version to fetch manifest".to_string(),
)));
}
if origin.starts_with("file://") {
let path_str = origin.trim_start_matches("file://");
let path = PathBuf::from(path_str);
let repo = FileBackend::open(&path)?;
repo.fetch_manifest_text(&publisher_name, fmri)
.map_err(Into::into)
} else {
let mut repo = RestBackend::open(origin)?;
let publisher_catalog_dir = self.catalog_dir().join(&publisher.name);
repo.set_local_cache_path(&publisher_catalog_dir)?;
repo.fetch_manifest_text(&publisher_name, fmri)
.map_err(Into::into)
}
}
/// Download catalog for a specific publisher
pub fn download_publisher_catalog(&self, publisher_name: &str) -> Result<()> {
// Get the publisher
@@ -582,6 +582,9 @@ impl SolverError {
pub struct ResolvedPkg {
pub fmri: Fmri,
pub manifest: Manifest,
/// Original manifest text as fetched from the repository, preserved for
/// on-disk caching so that `pkg verify` and `pkg fix` can work offline.
pub manifest_text: String,
}
#[derive(Debug, Default, Clone)]
@@ -818,21 +821,47 @@ pub fn resolve_install(
let mut plan = InstallPlan::default();
for sid in solution_ids {
if let Some(fmri) = sid_to_fmri.get(&sid).cloned() {
// Fetch manifest from repository or catalog cache
let manifest = match image_ref.get_manifest_from_repository(&fmri) {
Ok(m) => m,
Err(repo_err) => match image_ref.get_manifest_from_catalog(&fmri) {
Ok(Some(m)) => m,
_ => {
return Err(SolverError::new(format!(
"failed to obtain manifest for {}: {}",
fmri, repo_err
)));
// Try to fetch raw manifest text from the repository first.
// We keep both parsed manifest and original text so the text can
// be saved to disk for offline use by pkg verify / pkg fix.
let (manifest, manifest_text) =
match image_ref.fetch_manifest_text_from_repository(&fmri) {
Ok(text) => {
let m = crate::actions::Manifest::parse_string(text.clone()).map_err(
|e| {
SolverError::new(format!(
"failed to parse manifest for {}: {}",
fmri, e
))
},
)?;
(m, text)
}
},
};
Err(_repo_err) => {
// Fall back to catalog cache (covers tests with mock publishers
// and offline scenarios where the repo is unreachable).
let m = match image_ref.get_manifest_from_catalog(&fmri) {
Ok(Some(m)) => m,
_ => match image_ref.get_manifest_from_repository(&fmri) {
Ok(m) => m,
Err(e) => {
return Err(SolverError::new(format!(
"failed to obtain manifest for {}: {}",
fmri, e
)));
}
},
};
let text = serde_json::to_string(&m).unwrap_or_default();
(m, text)
}
};
plan.reasons.push(format!("selected {} via solver", fmri));
plan.add.push(ResolvedPkg { fmri, manifest });
plan.add.push(ResolvedPkg {
fmri,
manifest,
manifest_text,
});
}
}
Ok(plan)

View file

@@ -762,22 +762,15 @@ fn main() -> Result<()> {
let mut idx = 0usize;
for rp in &plan.add {
image.install_package(&rp.fmri, &rp.manifest)?;
// Save original manifest text to disk — required for pkg verify/fix
let path = image.save_manifest(&rp.fmri, &rp.manifest_text)?;
if *verbose && !quiet {
eprintln!("Saved manifest for {} to {}", rp.fmri, path.display());
}
idx += 1;
if !quiet && (idx % 5 == 0 || idx == total_pkgs) {
println!("Recorded {}/{} packages", idx, total_pkgs);
}
// Save full manifest into manifests directory for reproducibility
match image.save_manifest(&rp.fmri, &rp.manifest) {
Ok(path) => {
if *verbose && !*quiet {
eprintln!("Saved manifest for {} to {}", rp.fmri, path.display());
}
}
Err(e) => {
// Non-fatal: log error but continue install
error!("Failed to save manifest for {}: {}", rp.fmri, e);
}
}
}
if !quiet {
println!("Installed {} package(s)", plan.add.len());