ips/docs/ai/plans/2026-02-25-phase3-sqlite-catalog-expansion.md
Till Wegmueller 9814635a32
2026-03-23 17:28:10 +01:00

Phase 3: SQLite Catalog Expansion (ADR-003 Implementation)

Date: 2026-02-25
Status: Active
Depends on: Phase 1 (architecture), Phase 2 P2 (query commands)
Implements: ADR-003

Goal

Expand the SQLite catalog shards so that pkg info, pkg contents, pkg search -f, and reverse file lookups work without fetching manifests from the repository. This makes offline queries fast and enables the GUI search bar.

Step 1: Expand shard build schema

File: libips/src/repository/sqlite_catalog.rs

1.1: Add package_metadata table

During build_shards(), when parsing catalog parts, extract set actions for:

  • pkg.summary (already extracted for FTS)
  • pkg.description (already extracted for FTS)
  • info.classification -> category
  • pkg.human-version -> human_version
  • info.upstream-url -> homepage

CREATE TABLE IF NOT EXISTS package_metadata (
    stem TEXT NOT NULL,
    version TEXT NOT NULL,
    publisher TEXT NOT NULL,
    summary TEXT,
    description TEXT,
    category TEXT,
    human_version TEXT,
    homepage TEXT,
    PRIMARY KEY (stem, version, publisher)
);
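
The attribute-to-column mapping listed above can be sketched in plain Rust. This is a minimal illustration, not the libips implementation: `PackageMetadataRow` and `apply_set` are hypothetical names, and the "first value wins" rule for repeated attributes is an assumption the plan does not specify.

```rust
/// Hypothetical row type mirroring the nullable package_metadata columns.
/// (stem, version, and publisher would come from the package FMRI.)
#[derive(Debug, Default, Clone, PartialEq)]
pub struct PackageMetadataRow {
    pub summary: Option<String>,
    pub description: Option<String>,
    pub category: Option<String>,
    pub human_version: Option<String>,
    pub homepage: Option<String>,
}

impl PackageMetadataRow {
    /// Record one `set` action. Returns true if the attribute name maps to
    /// a column. Assumption: the first value seen for a column wins and
    /// later duplicates are ignored.
    pub fn apply_set(&mut self, name: &str, value: &str) -> bool {
        let slot = match name {
            "pkg.summary" => &mut self.summary,
            "pkg.description" => &mut self.description,
            "info.classification" => &mut self.category,
            "pkg.human-version" => &mut self.human_version,
            "info.upstream-url" => &mut self.homepage,
            // Any other set action is not stored in package_metadata.
            _ => return false,
        };
        if slot.is_none() {
            *slot = Some(value.to_string());
        }
        true
    }
}
```

A shard builder could call `apply_set` once per `set` action and bind the resulting fields to an INSERT into package_metadata; columns with no matching action stay NULL.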

1.2: Add file_inventory table

Parse all file, dir, link, hardlink actions from manifests during shard build:

CREATE TABLE IF NOT EXISTS file_inventory (
    stem TEXT NOT NULL,
    version TEXT NOT NULL,
    publisher TEXT NOT NULL,
    action_type TEXT NOT NULL,  -- file, dir, link, hardlink
    path TEXT NOT NULL,
    hash TEXT,                   -- SHA hash for files, NULL for dirs/links
    target TEXT,                 -- link target for link and hardlink actions, NULL otherwise
    mode TEXT,
    owner TEXT,
    grp TEXT,
    PRIMARY KEY (stem, version, publisher, path)
);
CREATE INDEX IF NOT EXISTS idx_fi_path ON file_inventory(path);
CREATE INDEX IF NOT EXISTS idx_fi_hash ON file_inventory(hash);
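
As a rough illustration of how one manifest line could become a file_inventory row, the sketch below uses a deliberately simplified parser. `FileInventoryRow` and `parse_inventory_line` are hypothetical names; the real shard build would presumably reuse the existing libips manifest parser. The sketch assumes whitespace-separated `key=value` attributes and does not handle quoted values containing spaces.

```rust
/// Hypothetical row type mirroring the per-action columns of file_inventory.
#[derive(Debug, Clone, PartialEq)]
pub struct FileInventoryRow {
    pub action_type: String,
    pub path: String,
    pub hash: Option<String>,
    pub target: Option<String>,
    pub mode: Option<String>,
    pub owner: Option<String>,
    pub grp: Option<String>,
}

/// Parse one manifest line into a row. Returns None if the line is not a
/// file/dir/link/hardlink action or carries no path attribute.
pub fn parse_inventory_line(line: &str) -> Option<FileInventoryRow> {
    let mut tokens = line.split_whitespace();
    let action_type = tokens.next()?;
    if !matches!(action_type, "file" | "dir" | "link" | "hardlink") {
        return None;
    }
    let mut row = FileInventoryRow {
        action_type: action_type.to_string(),
        path: String::new(),
        hash: None,
        target: None,
        mode: None,
        owner: None,
        grp: None,
    };
    for tok in tokens {
        match tok.split_once('=') {
            Some(("path", v)) => row.path = v.to_string(),
            Some(("target", v)) => row.target = Some(v.to_string()),
            Some(("mode", v)) => row.mode = Some(v.to_string()),
            Some(("owner", v)) => row.owner = Some(v.to_string()),
            // The manifest attribute is `group`; the column is named grp.
            Some(("group", v)) => row.grp = Some(v.to_string()),
            // Other attributes are not stored in this table.
            Some(_) => {}
            // A bare token on a file action is treated as the payload hash.
            None if row.action_type == "file" && row.hash.is_none() => {
                row.hash = Some(tok.to_string())
            }
            None => {}
        }
    }
    if row.path.is_empty() {
        None
    } else {
        Some(row)
    }
}
```

Lines for other action types (for example `set` or `depend`) return None, so the caller can feed every manifest line through the same function.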

1.3: Index all dependency types

Currently only require is stored. Expand to store all types:

-- No new column is needed: the existing dependencies table already has dep_type.
-- Populate it for every dependency type instead of only require:
-- require, incorporate, optional, conditional, group, require-any, exclude, parent, origin
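
One way to keep the dep_type strings consistent between the shard builder and the query code is a small enum that round-trips to the names listed above. This is a sketch only; `DepType` and its methods are hypothetical and may not match whatever type libips already uses for dependencies.

```rust
/// Hypothetical enum covering the nine dependency types named in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DepType {
    Require,
    Incorporate,
    Optional,
    Conditional,
    Group,
    RequireAny,
    Exclude,
    Parent,
    Origin,
}

impl DepType {
    /// Every variant, in the order the plan lists them.
    pub const ALL: [DepType; 9] = [
        DepType::Require,
        DepType::Incorporate,
        DepType::Optional,
        DepType::Conditional,
        DepType::Group,
        DepType::RequireAny,
        DepType::Exclude,
        DepType::Parent,
        DepType::Origin,
    ];

    /// The string stored in the dep_type column.
    pub fn as_str(self) -> &'static str {
        match self {
            DepType::Require => "require",
            DepType::Incorporate => "incorporate",
            DepType::Optional => "optional",
            DepType::Conditional => "conditional",
            DepType::Group => "group",
            DepType::RequireAny => "require-any",
            DepType::Exclude => "exclude",
            DepType::Parent => "parent",
            DepType::Origin => "origin",
        }
    }

    /// Inverse of as_str; returns None for unknown strings.
    pub fn parse(s: &str) -> Option<DepType> {
        Self::ALL.iter().copied().find(|t| t.as_str() == s)
    }
}
```

Returning None for unknown strings lets a newer shard introduce a dependency type without breaking an older reader, which can skip rows it does not understand.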

Step 2: Update shard sync

File: libips/src/repository/shard_sync.rs

The catalog.attrs index already tracks shard hashes. New tables go into active.db, so no new shard files are needed. The SHA256 hash of active.db changes when content changes, triggering re-download.

Backward compatibility: if the client sees a server whose shards lack the new tables, queries fall back to fetching the manifest. Add a version field and a schema_features list to catalog.attrs so the client can detect this:

{
  "version": 3,
  "schema_features": ["package_metadata", "file_inventory"],
  ...
}
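
The fallback decision can be sketched as a feature check against the schema_features list shown above. The names `CatalogAttrs`, `QueryPath`, and `path_for` are hypothetical, and JSON parsing is left out; the sketch assumes a catalog.attrs without the field parses to an empty list.

```rust
/// Hypothetical in-memory view of the two catalog.attrs fields shown above.
/// Assumption: an older server's catalog.attrs yields an empty feature list.
#[derive(Debug, Default, Clone)]
pub struct CatalogAttrs {
    pub version: u32,
    pub schema_features: Vec<String>,
}

/// Where a query should be answered from.
#[derive(Debug, PartialEq, Eq)]
pub enum QueryPath {
    /// The local SQLite shard has the table; no network access is needed.
    Catalog,
    /// The table is not advertised; fetch and parse the manifest instead.
    ManifestFetch,
}

impl CatalogAttrs {
    /// True if the server advertises the named schema feature.
    pub fn has_feature(&self, name: &str) -> bool {
        self.schema_features.iter().any(|f| f == name)
    }

    /// Choose the query path for a table such as "file_inventory".
    pub fn path_for(&self, table: &str) -> QueryPath {
        if self.has_feature(table) {
            QueryPath::Catalog
        } else {
            QueryPath::ManifestFetch
        }
    }
}
```

Checking by feature name rather than by version number alone means a server that ships only one of the new tables still gets the fast path for that table.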

Step 3: Update query methods

3.1: pkg info fast path

// In libips::api::info
pub fn get_package_info(image: &Image, fmri: &Fmri) -> Result<PackageDetails> {
    // Try catalog metadata first (fast, no manifest parse)
    if let Some(meta) = query_package_metadata(image, fmri)? {
        return Ok(meta);
    }
    // Fall back to manifest fetch (slow, needs network)
    let manifest = image.get_manifest(fmri)?;
    Ok(PackageDetails::from_manifest(&manifest))
}

3.2: pkg contents fast path

pub fn get_package_contents(image: &Image, fmri: &Fmri) -> Result<Vec<FileEntry>> {
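    // Try file_inventory first. As in get_package_info, the query is expected
    // to yield None when the shard has no data for this package.
    if let Some(entries) = query_file_inventory(image, fmri)? {
        return Ok(entries);
    }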
    // Fall back to manifest parse
    ...
}

3.3: Reverse file lookup

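/// What package owns this file path?
/// Returns one (package FMRI, action_type) pair per matching row.
pub fn search_by_path(image: &Image, path: &str) -> Result<Vec<(Fmri, String)>> {
    // SELECT stem, version, publisher, action_type FROM file_inventory WHERE path = ?1
    // Build an Fmri from (stem, version, publisher) for each row.
    todo!()
}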
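
Because the lookup is an exact match on path, the user's input has to be brought into the same form as the stored value. The sketch below assumes file_inventory stores paths as they appear in manifests, relative to the image root and without a leading slash; that is an assumption, and `normalize_lookup_path` is a hypothetical helper.

```rust
/// Normalize a user-supplied path for an exact-match lookup.
/// Assumption: stored paths look like "usr/bin/vim" (no leading slash).
/// Leading, trailing, and repeated slashes are removed; "." and ".."
/// components are left untouched.
pub fn normalize_lookup_path(input: &str) -> String {
    input
        .trim()
        .split('/')
        .filter(|component| !component.is_empty())
        .collect::<Vec<_>>()
        .join("/")
}
```

With this in place, `pkg search -f /usr/bin/vim` and `pkg search -f usr/bin/vim` would bind the same value to `?1`.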

3.4: FTS expansion

Add file paths to FTS index so pkg search /usr/bin/vim works:

CREATE VIRTUAL TABLE IF NOT EXISTS package_search
    USING fts5(stem, publisher, summary, description, file_paths,
               content='', tokenize='unicode61');

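Where file_paths is a space-separated list of all paths in the package. Note that the unicode61 tokenizer treats / and . as separators, so usr/bin/vim is indexed as the tokens usr, bin, and vim, and the search term has to be passed to MATCH as a quoted phrase. An existing package_search table also has to be dropped and rebuilt, since CREATE VIRTUAL TABLE IF NOT EXISTS does not add the new column to a table that is already there.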
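
The two string operations involved can be sketched as follows. Both function names are hypothetical. The quoting follows FTS5 string syntax, where a term is wrapped in double quotes and any embedded double quote is doubled; this keeps a raw path such as /usr/bin/vim from being parsed as query syntax.

```rust
/// Join a package's paths into the single space-separated file_paths value.
pub fn join_file_paths<'a, I>(paths: I) -> String
where
    I: IntoIterator<Item = &'a str>,
{
    paths.into_iter().collect::<Vec<_>>().join(" ")
}

/// Wrap a user-supplied search term as an FTS5 string (a phrase).
/// Embedded double quotes are escaped by doubling them.
pub fn fts5_phrase(term: &str) -> String {
    format!("\"{}\"", term.replace('"', "\"\""))
}
```

The result of `fts5_phrase` would be bound as the MATCH parameter rather than spliced into the SQL text, so the quoting only has to satisfy the FTS5 query parser.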

Step 4: Depot server serves new shard format

File: pkg6depotd/src/http/handlers/shard.rs

No changes needed — shards are served as opaque blobs by SHA256 hash. The depot already serves whatever catalog.attrs points to.

Verification

  • Rebuild a test repository: cargo run -p pkg6repo -- rebuild --publisher test
  • Verify new tables exist: sqlite3 catalog2/active.db ".tables"
  • Verify pkg info returns data without manifest fetch
  • Verify pkg contents returns file list without manifest fetch
  • Verify pkg search -f /usr/bin/something returns owning package
  • Verify backward compat: old client against new server, new client against old server