BUG: Provenance explosion — 18:1 record-to-triple ratio (56K records for 3K triples) #200

Open
opened 2026-05-24 06:07:43 +00:00 by toasterson · 0 comments
Owner

Severity: P0 — Compounding daily, threatens workspace readability

Root Cause

engine.infer() is called once per active goal in the OODA orient phase (src/agent/ooda.rs:249-266). Each call runs a full spreading activation, and the provenance records from every call are persisted via store_batch(). With 7 active goals, this creates 7x the provenance records per cycle.
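To make the multiplication concrete, here is a toy model of the per-goal loop. It is only a sketch: `Record`, `infer_for_goal`, and `orient_cycle` are hypothetical stand-ins, not the code in src/agent/ooda.rs, and it assumes the worst case in which every goal activates the same neighbourhood.

```rust
/// Hypothetical, simplified provenance record. The real type in
/// src/provenance.rs is not shown in this issue.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Record {
    pub derived_id: u64,
    pub source_id: u64,
}

/// Stand-in for one `engine.infer()` call. Assumption: every goal reaches
/// the same derived triples, so each call returns identical records.
pub fn infer_for_goal(_goal: u32, derived: &[u64]) -> Vec<Record> {
    derived
        .iter()
        .map(|&d| Record { derived_id: d, source_id: 0 })
        .collect()
}

/// Models the per-goal loop: the records from every call are appended to
/// the ledger with no deduplication.
pub fn orient_cycle(goals: &[u32], derived: &[u64]) -> Vec<Record> {
    let mut ledger = Vec::new();
    for &goal in goals {
        ledger.extend(infer_for_goal(goal, derived));
    }
    ledger
}
```

Under these assumptions, calling `orient_cycle` with 7 goals and 3 derived triples leaves 21 records in the ledger, although only 3 of them are distinct.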

The ProvenanceLedger::store() and store_batch() methods in src/provenance.rs perform no semantic deduplication: every record is treated as unique, regardless of its (derived_id, kind, source) tuple.

Additionally, src/autonomous/rule_engine.rs stores provenance per-derived-triple in individual transactions instead of batching.
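The difference between the two write patterns can be illustrated with a counting store. This is a sketch only: `CountingStore` and its methods are hypothetical and merely count commits, they do not reflect the real storage backend or the signature of store_provenance().

```rust
/// Hypothetical in-memory store that counts committed transactions.
#[derive(Default)]
pub struct CountingStore {
    /// Stored records as (derived_id, source_id) pairs.
    pub records: Vec<(u64, u64)>,
    /// Number of transactions committed so far.
    pub transactions: usize,
}

impl CountingStore {
    /// Per-triple pattern: one transaction for every record.
    pub fn store_one(&mut self, record: (u64, u64)) {
        self.records.push(record);
        self.transactions += 1;
    }

    /// Batched pattern: one transaction for the whole slice.
    /// An empty batch commits nothing.
    pub fn store_batch(&mut self, batch: &[(u64, u64)]) {
        if batch.is_empty() {
            return;
        }
        self.records.extend_from_slice(batch);
        self.transactions += 1;
    }
}
```

Writing 100 derived triples through `store_one` costs 100 transactions in this model, while a single `store_batch` call stores the same records in one.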

Files

  • src/agent/ooda.rs:249-266 — per-goal infer loop
  • src/provenance.rs:958-980 — store() no dedup check
  • src/provenance.rs:1006-1040 — store_batch() no dedup check
  • src/autonomous/rule_engine.rs — per-tuple store_provenance() instead of batch
  • src/infer/engine.rs — InferContext.expanded is per-call, discarded after

Suggested Fix

Add an is_duplicate() check to ProvenanceLedger::store() and store_batch() that looks up the existing (derived_id, kind_tag, source_ids) entry in DERIVED_INDEX before inserting. Batch the rule engine's provenance writes.
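A minimal sketch of what such a check might look like follows. Everything here is an assumption made for illustration: `DedupKey` and `Ledger` are hypothetical types, the field types are guesses, and an in-memory `HashSet` stands in for DERIVED_INDEX, whose real layout is not shown in this issue.

```rust
use std::collections::HashSet;

/// Hypothetical dedup key built from (derived_id, kind_tag, source_ids).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DedupKey {
    derived_id: u64,
    kind_tag: u8,
    source_ids: Vec<u64>,
}

impl DedupKey {
    /// Sorts and dedups the source ids so that ordering differences
    /// between inference calls do not defeat the duplicate check.
    pub fn new(derived_id: u64, kind_tag: u8, mut source_ids: Vec<u64>) -> Self {
        source_ids.sort_unstable();
        source_ids.dedup();
        Self { derived_id, kind_tag, source_ids }
    }
}

/// Hypothetical ledger; `index` plays the role of DERIVED_INDEX.
#[derive(Default)]
pub struct Ledger {
    index: HashSet<DedupKey>,
    stored: Vec<DedupKey>,
}

impl Ledger {
    /// True if an equivalent record has already been stored.
    pub fn is_duplicate(&self, key: &DedupKey) -> bool {
        self.index.contains(key)
    }

    /// Stores the record unless it is a duplicate; returns whether it was written.
    pub fn store(&mut self, key: DedupKey) -> bool {
        if self.is_duplicate(&key) {
            return false;
        }
        self.index.insert(key.clone());
        self.stored.push(key);
        true
    }

    /// Stores a batch, skipping duplicates both against earlier writes and
    /// within the batch itself; returns the number of records written.
    pub fn store_batch(&mut self, batch: Vec<DedupKey>) -> usize {
        let mut written = 0;
        for key in batch {
            if self.store(key) {
                written += 1;
            }
        }
        written
    }

    /// Number of records currently stored.
    pub fn len(&self) -> usize {
        self.stored.len()
    }
}
```

Normalizing `source_ids` inside the key is a design choice of this sketch: it treats two records with the same sources in a different order as the same derivation. Whether that is the right notion of equivalence for the real ledger is a decision for the fix.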

Reference
toasterson/akh-medu#200