Token consolidation:
- docker-compose web service now reads ${CLAUDE_CODE_OAUTH_TOKEN} directly;
drop the W1-F8 CLAUDE_CODE_OAUTH_TOKEN_FOR_WEB indirection (user feedback:
one var name, no _FOR_WEB suffix).
investigator-runtime claude.ts:
- --system-prompt silently dropped by CLI v2.1.150 for multi-KB prompts;
inline the system content into the user prompt with a separator
(mirrors scripts/reextract/run.py pattern).
- Multi-line prompts via positional -- broke ("Input must be provided …");
pipe via stdin instead.
- --allowedTools "" is rejected; when no tools are wanted, omit the flag and
explicitly pass --disallowedTools with the writer/reader set so the model
can't reach for any.
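The three claude.ts changes above can be sketched as one pure function that decides what goes on the command line and what goes to stdin. This is a minimal illustration, not the code in claude.ts: the names buildInvocation, PromptParts and SEPARATOR, the separator text, the tool names in the usage note, and joining tool names with commas are all assumptions.

```typescript
// Sketch only. Names, the separator, and the comma-joined tool list are
// illustrative assumptions, not taken from claude.ts.
interface PromptParts {
  system: string; // content that used to go through --system-prompt
  user: string;   // the actual request, possibly multi-line
}

interface Invocation {
  args: string[]; // CLI arguments; prompt text is never placed here
  stdin: string;  // full prompt, meant to be piped to the child process
}

const SEPARATOR = "\n\n---\n\n"; // hypothetical separator

function buildInvocation(
  parts: PromptParts,
  allowedTools: string[],
  toolsToBlock: string[],
): Invocation {
  const args: string[] = [];
  if (allowedTools.length > 0) {
    args.push("--allowedTools", allowedTools.join(","));
  } else if (toolsToBlock.length > 0) {
    // An empty --allowedTools value is rejected, so the flag is omitted
    // and the unwanted tools are blocked explicitly instead.
    args.push("--disallowedTools", toolsToBlock.join(","));
  }
  // Inline the system content ahead of the user prompt rather than
  // relying on --system-prompt.
  const stdin = parts.system
    ? parts.system + SEPARATOR + parts.user
    : parts.user;
  return { args, stdin };
}
```

A caller would spawn the CLI with `args` and write `stdin` to the child's standard input, for example `buildInvocation({ system, user }, [], ["Write", "Read"])`, where the two tool names are placeholders for the writer/reader set.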
investigator-runtime locard.ts:
- Log the raw response (first 600 chars) to container stderr; this saved hours
of debugging when the writer rejected the output.
- Grade fallback: when Locard omits `grade` but provides custody_steps,
infer the highest grade the step count supports (≥3 steps → A, ≥2 → B,
≥1 → C).
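The grade fallback above amounts to a short threshold check. The sketch below is illustrative only: resolveGrade and the LocardEvidence shape are hypothetical names, and returning undefined when there are no custody steps is an assumption, since the commit does not say what happens in that case.

```typescript
// Sketch only. resolveGrade and LocardEvidence are hypothetical names.
interface LocardEvidence {
  grade?: string | null;
  custody_steps?: unknown[] | null;
}

function resolveGrade(e: LocardEvidence): string | undefined {
  // A grade supplied by Locard always wins; the fallback never overrides it.
  if (e.grade) return e.grade;
  const steps = e.custody_steps?.length ?? 0;
  if (steps >= 3) return "A";
  if (steps >= 2) return "B";
  if (steps >= 1) return "C";
  // Assumption: with no custody steps there is nothing to infer from.
  return undefined;
}
```

Checking the thresholds from highest to lowest is what makes the result "the highest grade that fits".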
investigator-runtime write_evidence.ts:
- Silently drop related_hypotheses entries with an empty/null hypothesis_id
(Locard sometimes emits [{}] when it knows no link yet) instead of
failing the whole write.
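A minimal sketch of that filter, assuming hypothesis_id is a string when present. The function name, the RelatedHypothesis shape, and treating whitespace-only ids as empty are assumptions, not the code in write_evidence.ts.

```typescript
// Sketch only. Names and the string-typed id are assumptions.
interface RelatedHypothesis {
  hypothesis_id?: string | null;
  [key: string]: unknown;
}

function filterRelatedHypotheses(
  entries: RelatedHypothesis[] | null | undefined,
): RelatedHypothesis[] {
  return (entries ?? []).filter((h) => {
    const id = h.hypothesis_id;
    // Keep only entries that carry a usable id; {} and null ids are dropped.
    return typeof id === "string" && id.trim() !== "";
  });
}
```

With this shape, an input of `[{}]` yields an empty array and the write proceeds with no hypothesis links.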
Migration 0006_investigator_serial_sequences.sql:
- BIGSERIAL on the 7 investigation tables created auto-sequences
(evidence_evidence_pk_seq etc.) that 0004 never granted to the
investigator role. Without those grants every INSERT failed with
"permission denied for sequence …". Grant USAGE/SELECT/UPDATE on each
auto-sequence.
Verified live: Locard wrote E-0002 + E-0003 from real Sandia chunks
(green fireball Feb 1949; cobalt particle analysis). Grade B, confidence
high, custody chain of 3 steps with honest gaps. Cost $0.09 for both,
~70s wall.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root-cause fix for "search returns garbage for absent terms". The hybrid RPC's
dense branch always returned its k nearest vectors regardless of distance, so a
query for a term not in the corpus (e.g. "varginha") surfaced unrelated chunks.
The cross-encoder reranker would filter these but costs 18-62s on CPU, which
is unusable for interactive search.
Add max_dense_dist (default 0.40) to hybrid_search_chunks: dense neighbours
beyond that cosine distance are dropped server-side. Calibrated from measured
distances: strong semantic match ~0.12-0.20, no real match ~0.46-0.53. BM25
full-text still matches literal terms; the reranker becomes opt-in refinement.
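The cutoff itself is a plain distance filter. The real filtering happens server-side inside hybrid_search_chunks; the TypeScript below only illustrates the logic, and DenseHit, cosineDistance and filterDenseHits are hypothetical names. Treating a hit at exactly the threshold as kept is an assumption.

```typescript
// Sketch only. Illustrates the threshold logic, not the RPC itself.
interface DenseHit {
  chunkId: string;
  distance: number; // cosine distance to the query embedding
}

// Cosine distance = 1 - cosine similarity.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Drop neighbours beyond the cutoff instead of always returning k results.
function filterDenseHits(hits: DenseHit[], maxDenseDist = 0.4): DenseHit[] {
  return hits.filter((h) => h.distance <= maxDenseDist);
}
```

With the calibration quoted above, a hit at ~0.15 survives and a hit at ~0.48 is dropped, so a query with no real match can return nothing from the dense branch.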
Verified live: varginha/abducao → 0 results, disco voador/roswell → relevant
results, all <1s.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>