akh-medu: assert_batch is non-functional — SIGKILL on write operations #132

Closed
opened 2026-05-22 13:31:59 +00:00 by toasterson · 0 comments
Owner

Problem

assert_batch is the designated bulk knowledge seeding tool and the ONLY safe alternative to ingest_text. It is currently broken: every call, even one containing a single triple, is killed with SIGKILL. Daemon lock contention on the redb store is the suspected cause, but this is unconfirmed (see Evidence).

This leaves the caretaking pipeline with NO safe ingestion path:

  • ingest_text → contaminates the KG (see #131)
  • assert_batch → SIGKILL
  • Manual assert_triple → too slow for batch seeding

Evidence

From surgical reports May 15-22: "assert_batch was non-functional — all calls (even single triples) were killed with SIGKILL, likely OOM or deadlock on the large workspace."

Note: the SIGKILL may be a timeout mismatch between mcporter client and akhomed daemon, not an actual OOM. Investigation needed.

Proposed Fix

In akhomed daemon:

  1. Profile assert_batch with a small workspace to isolate root cause
  2. If lock contention: batch transactions, yield between batches
  3. If timeout: increase mcporter client timeout or return partial success
  4. Add test: assert_batch with 100 triples → all confirmed present
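A minimal sketch of steps 2 to 4, assuming the batch can be split into independent chunks that each map to one short write transaction. Everything here is hypothetical and not taken from the akh-medu codebase: the `Triple` alias, the `TripleStore` trait, the `MemStore` stand-in, and `assert_batch_chunked`. It only illustrates committing in chunks, yielding between them, and reporting partial success instead of failing the whole call.

```rust
use std::collections::HashSet;
use std::thread;

/// Placeholder triple type; the real akh-medu type is not shown in this issue.
pub type Triple = (String, String, String);

/// Hypothetical storage interface: one `commit_chunk` call is assumed
/// to correspond to one short write transaction in the daemon.
pub trait TripleStore {
    fn commit_chunk(&mut self, chunk: &[Triple]) -> Result<(), String>;
    fn contains(&self, triple: &Triple) -> bool;
}

/// In-memory stand-in, used only to exercise the batching logic.
#[derive(Default)]
pub struct MemStore {
    triples: HashSet<Triple>,
    pub commits: usize,
}

impl TripleStore for MemStore {
    fn commit_chunk(&mut self, chunk: &[Triple]) -> Result<(), String> {
        self.triples.extend(chunk.iter().cloned());
        self.commits += 1;
        Ok(())
    }

    fn contains(&self, triple: &Triple) -> bool {
        self.triples.contains(triple)
    }
}

/// Reports how many triples were committed, plus the first error if any,
/// so a caller can resume from `committed` instead of starting over.
#[derive(Debug, PartialEq)]
pub struct BatchOutcome {
    pub committed: usize,
    pub error: Option<String>,
}

pub fn assert_batch_chunked<S: TripleStore>(
    store: &mut S,
    triples: &[Triple],
    chunk_size: usize,
) -> BatchOutcome {
    let mut committed = 0;
    // `chunks` panics on a size of 0, so clamp to at least 1.
    for chunk in triples.chunks(chunk_size.max(1)) {
        if let Err(e) = store.commit_chunk(chunk) {
            return BatchOutcome { committed, error: Some(e) };
        }
        committed += chunk.len();
        // Let other threads run (and take the store lock) between chunks.
        thread::yield_now();
    }
    BatchOutcome { committed, error: None }
}

/// Generates `n` distinct sample triples for tests.
pub fn sample_triples(n: usize) -> Vec<Triple> {
    (0..n)
        .map(|i| (format!("s{i}"), "rel".to_string(), format!("o{i}")))
        .collect()
}
```

With a chunk size of 25, a 100-triple batch becomes four short commits. Whether this helps depends on the root cause from step 1: it only addresses long-held write locks, not a client-side timeout.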

Also in anima-server: add retry with exponential backoff on the assert_batch MCP call.
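The retry could look roughly like the sketch below. It assumes anima-server is Rust and that re-sending the same batch is safe (that is, asserting an already-present triple is a no-op), which this issue does not confirm. `retry_with_backoff` and `backoff_delay` are hypothetical helpers, and the closure stands in for the actual MCP call.

```rust
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    // On overflow fall back to the cap, then clamp to the cap in all cases.
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Calls `op` at least once and at most `max_attempts` times,
/// sleeping with exponential backoff between failed attempts.
/// `op` receives the 0-based attempt number.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                thread::sleep(backoff_delay(base, max, attempt));
                attempt += 1;
            }
        }
    }
}
```

Retries only mask transient contention. If the daemon is killed on every call, as reported above, the retry loop will exhaust its attempts and return the last error, so this complements the daemon-side fix and does not replace it.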

Acceptance

  • assert_batch succeeds for 100 triples on the tecton workspace
  • No SIGKILL within the default 30s timeout
  • Surgical caretaking cron switches from ingest_text to assert_batch

Complexity: M

Requires profiling/root-causing a daemon-side bug. 1-3 days.

Reference
toasterson/Anima#132