A Lobster-themed document management system
  • Rust 97.8%
  • Go Template 1.2%
  • Dockerfile 1%
Till Wegmueller 9d263752b6
chore: re-trigger CI after queue cleared
Previous trigger commits #1978/#1979 were cancelled by the runner queue
cleanup. Push a new bump now that other jobs are unblocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 22:44:36 +02:00
.forgejo/workflows Make Dockerfile classic-Docker safe; add diagnostic dump. 2026-04-26 17:09:03 +02:00
config Scrub homelab-specific details from the public repo. 2026-04-23 23:42:16 +02:00
crates Fix CI: cargo fmt + clippy clean, gate dependent jobs on success. 2026-04-25 12:03:32 +02:00
deploy/helm/papercrab Add Solstice CI workflow for image and chart publishing. 2026-04-25 11:21:03 +02:00
docs Initial walking skeleton. 2026-04-23 23:38:41 +02:00
.ci-trigger chore: re-trigger CI after queue cleared 2026-04-29 22:44:36 +02:00
.dockerignore Initial walking skeleton. 2026-04-23 23:38:41 +02:00
.editorconfig Initial walking skeleton. 2026-04-23 23:38:41 +02:00
.gitignore Initial walking skeleton. 2026-04-23 23:38:41 +02:00
Cargo.lock Initial walking skeleton. 2026-04-23 23:38:41 +02:00
Cargo.toml Scrub homelab-specific details from the public repo. 2026-04-23 23:42:16 +02:00
clippy.toml Initial walking skeleton. 2026-04-23 23:38:41 +02:00
Dockerfile Make Dockerfile classic-Docker safe; add diagnostic dump. 2026-04-26 17:09:03 +02:00
LICENSE Initial commit 2026-04-21 12:59:39 +00:00
README.md Add Solstice CI workflow for image and chart publishing. 2026-04-25 11:21:03 +02:00
rust-toolchain.toml Initial walking skeleton. 2026-04-23 23:38:41 +02:00
rustfmt.toml Initial walking skeleton. 2026-04-23 23:38:41 +02:00

papercrab

A Lobster-themed Rust document management system. Pluggable read-only ingest stores, an S3/Object-Lock-backed write store, Postgres + Apache AGE for metadata and graph, Barycenter for identity and authorization, and Firecrawl (plus a local OCR fallback) for content extraction.

Status: walking skeleton. Many subsystems are wired at the trait level with stub implementations. See docs/architecture.md for the shape and docs/roadmap.md for what's deferred.

Layout

crates/
  papercrab-core       domain types
  papercrab-db         SeaORM entities + repositories (+ AGE helpers, FTS)
  papercrab-db/migration  SeaORM migrations
  papercrab-storage    StorageBackend trait + OpenDAL-backed LocalFs/WebDAV/S3
  papercrab-auth       OIDC RP (authn) + Barycenter /v1/check client (authz) + KDL policy
  papercrab-extract    Extractor trait + Firecrawl client + local fallback
  papercrab-formats    ODF / OOXML metadata read/patch (skeleton)
  papercrab-ai         rig agent + OpenRouter provider
  papercrab-search     Postgres-backed search trait (skeleton)
  papercrab-jobs       apalis job types
  papercrab-api        axum HTTP API
  papercrab-cli        clap admin CLI

deploy/
  helm/papercrab/      Helm chart
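
The split between read-only ingest stores and a writable store could be expressed with a single trait whose write method refuses by default. The following is only a sketch under that assumption: the real StorageBackend trait lives in papercrab-storage and is OpenDAL-backed, and every name below except StorageBackend (StorageError, IngestStore, WriteStore, from_pairs) is hypothetical. The in-memory maps stand in for real backends.

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum StorageError {
    NotFound(String),
    ReadOnly,
}

/// Assumed trait shape; the actual trait in papercrab-storage may differ.
pub trait StorageBackend {
    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError>;

    /// Read-only ingest stores keep this default, which refuses every write.
    fn write(&mut self, _path: &str, _bytes: &[u8]) -> Result<(), StorageError> {
        Err(StorageError::ReadOnly)
    }
}

/// In-memory stand-in for a read-only ingest store.
pub struct IngestStore {
    objects: HashMap<String, Vec<u8>>,
}

impl IngestStore {
    pub fn from_pairs(pairs: &[(&str, &str)]) -> Self {
        let objects = pairs
            .iter()
            .map(|(path, body)| (path.to_string(), body.as_bytes().to_vec()))
            .collect();
        Self { objects }
    }
}

impl StorageBackend for IngestStore {
    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError> {
        self.objects
            .get(path)
            .cloned()
            .ok_or_else(|| StorageError::NotFound(path.to_string()))
    }
}

/// In-memory stand-in for the writable store.
#[derive(Default)]
pub struct WriteStore {
    objects: HashMap<String, Vec<u8>>,
}

impl StorageBackend for WriteStore {
    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError> {
        self.objects
            .get(path)
            .cloned()
            .ok_or_else(|| StorageError::NotFound(path.to_string()))
    }

    fn write(&mut self, path: &str, bytes: &[u8]) -> Result<(), StorageError> {
        self.objects.insert(path.to_string(), bytes.to_vec());
        Ok(())
    }
}
```

With this shape, calling write on an IngestStore returns StorageError::ReadOnly, while a WriteStore accepts the same call.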

Building

cargo build --workspace
cargo run -p papercrab-cli -- --help

Running

Papercrab needs: a Postgres 16 database, a Barycenter instance (OIDC + optional /v1/check), an S3-compatible blob store for the read/write backend, and (optionally) API keys for OpenRouter and Firecrawl.

Point PAPERCRAB_CONFIG at a filled-out copy of config/papercrab.example.toml:

cp config/papercrab.example.toml config/papercrab.toml
# edit …
cargo run -p papercrab-cli -- migrate
cargo run -p papercrab-cli -- serve
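
How the CLI resolves PAPERCRAB_CONFIG is not shown here. A minimal sketch of one plausible lookup follows; the function names and the fallback path are assumptions, not papercrab's actual behaviour.

```rust
use std::env;
use std::path::PathBuf;

/// Assumed fallback; the README only documents the PAPERCRAB_CONFIG variable.
const DEFAULT_CONFIG_PATH: &str = "config/papercrab.toml";

/// Pick the config path from an optional environment value.
/// Kept separate from the environment read so it is easy to test.
pub fn resolve_config_path(env_value: Option<String>) -> PathBuf {
    match env_value {
        // A non-blank value wins.
        Some(value) if !value.trim().is_empty() => PathBuf::from(value),
        // Unset or blank: fall back to the assumed default.
        _ => PathBuf::from(DEFAULT_CONFIG_PATH),
    }
}

/// Hypothetical entry point: read PAPERCRAB_CONFIG from the process environment.
pub fn config_path_from_env() -> PathBuf {
    resolve_config_path(env::var("PAPERCRAB_CONFIG").ok())
}
```

Under these assumptions, PAPERCRAB_CONFIG=/etc/papercrab/papercrab.toml selects that file, and an unset or empty variable selects the fallback.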

Container + Helm

The Dockerfile produces a single-binary image that runs papercrab serve by default. deploy/helm/papercrab/ is a generic Helm chart: Deployment + Service + ConfigMap + PVC + a pre-install/pre-upgrade migration Job. All sensitive values are secretKeyRefs: the chart expects five Secrets in the release namespace (papercrab-postgres-url, papercrab-oidc, papercrab-openrouter, papercrab-firecrawl, papercrab-s3).

Override the deploy-specific bits (image tag, domain, storage class, backend list, Barycenter URLs, resource requests) from your Flux / Argo / helm install values file.

CI

.forgejo/workflows/build.yml runs on Solstice CI:

Trigger              Action
pull_request → main  cargo fmt --check, cargo clippy -- -D warnings, cargo test --workspace
push → main          the above, then build and push the container image as <sha> and latest
push → tag v*        the above, then helm package and push the chart to the Forgejo OCI registry, with appVersion synced to the tag
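
The cumulative gating in the table can be modelled as a small function from trigger to stages. This is an illustrative sketch only: the real logic lives in .forgejo/workflows/build.yml, the type and function names are hypothetical, and it assumes "the above" on the tag row includes the image build.

```rust
/// Pipeline stages named after the table rows (names are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Check,     // cargo fmt, clippy, test
    PushImage, // build and push the container image
    PushChart, // helm package and push the chart
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Trigger {
    PullRequestToMain,
    PushToMain,
    PushTag(String),
}

/// Each row runs everything from the rows above it, then its own step.
pub fn stages_for(trigger: &Trigger) -> Vec<Stage> {
    match trigger {
        Trigger::PullRequestToMain => vec![Stage::Check],
        Trigger::PushToMain => vec![Stage::Check, Stage::PushImage],
        // Only tags matching v* publish the chart.
        Trigger::PushTag(tag) if tag.starts_with('v') => {
            vec![Stage::Check, Stage::PushImage, Stage::PushChart]
        }
        // Assumption: other tags trigger nothing.
        Trigger::PushTag(_) => Vec::new(),
    }
}
```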

The runner needs a REGISTRY_TOKEN secret with write:package scope on the forge. When that secret is present, Solstice CI auto-configures Docker and Helm against the forge's registry. The image is published at code.aopc.cloud/toasterson/papercrab and the chart at oci://code.aopc.cloud/toasterson/charts/papercrab.

Extension notes

  • pg_trgm ships with Postgres as a contrib module, so it is available on standard images.
  • pgvector ships with recent CloudNativePG operand images and standard pgvector/pgvector images.
  • Apache AGE is not in stock Postgres. Migration 0002_age_graph checks pg_available_extensions and skips cleanly when AGE is missing; graph-only features stay dormant. To turn them on, use a Postgres image with AGE compiled in and shared_preload_libraries = 'age'.
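
The skip-when-missing behaviour described for migration 0002_age_graph can be sketched as a pure decision over the names returned by pg_available_extensions. The actual migration is a SeaORM migration and is not shown here. The type and function names below are hypothetical, and the sketch assumes the install path needs nothing beyond CREATE EXTENSION.

```rust
/// Query a migration could run to see whether AGE can be installed.
pub const AGE_AVAILABLE_SQL: &str =
    "SELECT 1 FROM pg_available_extensions WHERE name = 'age'";

/// Statement to run when AGE is available.
pub const CREATE_AGE_SQL: &str = "CREATE EXTENSION IF NOT EXISTS age";

/// Outcome of the availability check (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum AgeStep {
    /// AGE is not installed on the server: do nothing and report success.
    Skip,
    /// AGE is available: run these statements in order.
    Install(Vec<&'static str>),
}

/// Decide what the migration should do, given the extension names the
/// server reports in pg_available_extensions.
pub fn plan_age_migration(available_extensions: &[&str]) -> AgeStep {
    if available_extensions.iter().any(|name| *name == "age") {
        AgeStep::Install(vec![CREATE_AGE_SQL])
    } else {
        AgeStep::Skip
    }
}
```

Keeping the decision separate from the database call means the skip path can be tested without a Postgres instance.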

Auth

Papercrab is a Barycenter relying party. Register it once (RFC 7591 dynamic client registration):

papercrab register-oidc \
  --issuer https://<your-barycenter-host> \
  --redirect-uri https://<your-papercrab-host>/auth/callback

Paste the returned client_secret into the papercrab-oidc Secret.

Authorization defaults to an in-process MockAuthz (useful for CI, dev, and deployments where Barycenter's /v1/check service isn't yet reachable). Set authz.backend = "barycenter" and point check_endpoint / expand_endpoint at your Barycenter authz listener to flip to real enforcement.
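
The backend switch can be pictured as a small selection step over the authz settings. The sketch below is an assumption about how such a selection might look, not papercrab-auth's actual code: AuthzBackend and select_backend are hypothetical names, "mock" as an explicit value is assumed, and requiring both endpoints is a design choice made for this example.

```rust
/// Which authorization backend to use (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AuthzBackend {
    /// In-process MockAuthz, the documented default.
    Mock,
    /// Real enforcement against Barycenter's authz listener.
    Barycenter {
        check_endpoint: String,
        expand_endpoint: String,
    },
}

/// Map the authz.backend, check_endpoint and expand_endpoint settings
/// to a backend, failing early on incomplete configuration.
pub fn select_backend(
    backend: Option<&str>,
    check_endpoint: Option<&str>,
    expand_endpoint: Option<&str>,
) -> Result<AuthzBackend, String> {
    match backend {
        // Unset falls back to the mock; "mock" as a value is an assumption.
        None | Some("mock") => Ok(AuthzBackend::Mock),
        Some("barycenter") => match (check_endpoint, expand_endpoint) {
            (Some(check), Some(expand)) => Ok(AuthzBackend::Barycenter {
                check_endpoint: check.to_string(),
                expand_endpoint: expand.to_string(),
            }),
            _ => Err("authz.backend = \"barycenter\" needs check_endpoint and expand_endpoint".to_string()),
        },
        Some(other) => Err(format!("unknown authz backend: {other}")),
    }
}
```

Failing at startup when an endpoint is missing avoids silently falling back to the mock in a deployment that expects real enforcement.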

License

MPL-2.0. See LICENSE.