/c/[slug] returned 404 even after the W3.9 web rebuild because the web
container's volume list didn't include the case/ directory the
investigator-runtime writes to. The BureauSnapshot file listing for
Case reports fell back gracefully to an empty list, but /c/[slug]
cannot fall back: it has to read the markdown.
Fix:
- Mount ${CASE_ROOT:-/data/disclosure/case}:/data/ufo/case:ro (read-only,
same pattern as wiki/processing/raw).
- Set CASE_ROOT=/data/ufo/case env in the web container so the
/c/[slug] page and BureauSnapshot resolve the same path.
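A minimal sketch of why the missing mount surfaced as a 404, written in Python for brevity (the web app itself is not Python). The `<slug>.md` file naming and the function name are assumptions; only CASE_ROOT and the /data/ufo/case default come from this commit.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def case_markdown_path(slug: str, env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Resolve the markdown file for a case slug, or None when it is not readable."""
    # CASE_ROOT is the variable set on the web container in this commit.
    root = Path(env.get("CASE_ROOT", "/data/ufo/case"))
    # Assumption: one markdown file per slug, named <slug>.md.
    candidate = root / f"{slug}.md"
    # Without the volume mount the file is absent, so the page can only 404.
    return candidate if candidate.is_file() else None
```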
Verified live: /c/green-fireballs-sandia now serves HTTP 200 with the
Watson narrative parsed + rendered via MarkdownBody.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Token consolidation:
- docker-compose web service now reads ${CLAUDE_CODE_OAUTH_TOKEN} directly,
dropping the W1-F8 CLAUDE_CODE_OAUTH_TOKEN_FOR_WEB indirection (user
feedback: one var name, no _FOR_WEB suffix).
investigator-runtime claude.ts:
- --system-prompt is silently dropped by CLI v2.1.150 for multi-KB prompts;
inline the system content into the user prompt with a separator
(mirrors the scripts/reextract/run.py pattern).
- Multi-line prompts passed positionally after -- broke ("Input must be
provided …"); pipe them via stdin instead.
- --allowedTools "" is rejected; when no tools are wanted, omit it and pass
--disallowedTools with the writer/reader set so the model can't reach for any.
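The three workarounds above combine into one invocation shape. The sketch below is Python rather than the TypeScript of claude.ts, and it is illustrative only: the separator string, the tool names in DISALLOWED_TOOLS, and the comma-joined list format are assumptions, not values taken from this commit.

```python
import subprocess
from typing import List, Optional, Tuple

# Hypothetical separator and tool set; the commit names neither.
SEPARATOR = "\n\n---\n\n"
DISALLOWED_TOOLS = ["Write", "Edit", "Read", "Bash"]


def build_invocation(system: str, user: str,
                     allow_tools: Optional[List[str]] = None) -> Tuple[List[str], str]:
    """Return (argv, stdin_text) for a non-interactive `claude -p` call."""
    # Inline the system content ahead of the user prompt instead of
    # relying on --system-prompt.
    stdin_text = f"{system.strip()}{SEPARATOR}{user.strip()}\n"
    argv = ["claude", "-p"]
    if allow_tools:
        argv += ["--allowedTools", ",".join(allow_tools)]
    else:
        # Never pass --allowedTools ""; deny the tools explicitly instead.
        argv += ["--disallowedTools", ",".join(DISALLOWED_TOOLS)]
    return argv, stdin_text


def run_claude(system: str, user: str) -> str:
    """Run the CLI with the prompt on stdin and return its stdout."""
    argv, stdin_text = build_invocation(system, user)
    # The prompt travels on stdin, so multi-line content needs no quoting.
    result = subprocess.run(argv, input=stdin_text, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

Keeping argv construction separate from the subprocess call makes the flag logic checkable without invoking the CLI.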
investigator-runtime locard.ts:
- Log the raw response (first 600 chars) to container stderr; this saved
hours of debugging when the writer rejected Locard's output.
- Grade fallback: when Locard omits `grade` but provides custody_steps,
infer the highest grade that fits (≥3 → A, ≥2 → B, ≥1 → C).
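The grade fallback can be sketched as follows (Python for brevity; locard.ts is TypeScript). This assumes the thresholds count entries in custody_steps, and that an empty list yields no inferred grade; the function name is hypothetical.

```python
from typing import Optional, Sequence


def infer_grade(grade: Optional[str], custody_steps: Sequence[object]) -> Optional[str]:
    """Keep an explicit grade; otherwise infer one from the custody chain length."""
    if grade:
        return grade
    steps = len(custody_steps)
    # Check the highest threshold first so the best fitting grade wins.
    if steps >= 3:
        return "A"
    if steps >= 2:
        return "B"
    if steps >= 1:
        return "C"
    # Assumption: with no custody steps there is nothing to infer from.
    return None
```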
investigator-runtime write_evidence.ts:
- Silently drop related_hypotheses entries with an empty/null
hypothesis_id (Locard sometimes emits [{}] when it knows no link yet)
instead of failing the whole write.
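A minimal sketch of that filter, in Python rather than the TypeScript of write_evidence.ts. The function name and the example id format are hypothetical; only the related_hypotheses and hypothesis_id field names come from this commit.

```python
from typing import Any, Dict, List, Optional


def clean_related_hypotheses(entries: Optional[List[Any]]) -> List[Dict[str, Any]]:
    """Drop entries whose hypothesis_id is missing, null, or blank."""
    kept = []
    for entry in entries or []:
        if not isinstance(entry, dict):
            continue
        hypothesis_id = entry.get("hypothesis_id")
        # Covers {} (key absent), null, and empty or whitespace-only ids.
        if hypothesis_id is None or str(hypothesis_id).strip() == "":
            continue
        kept.append(entry)
    return kept
```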
Migration 0006_investigator_serial_sequences.sql:
- BIGSERIAL columns on the 7 investigation tables created auto-sequences
(evidence_evidence_pk_seq etc.) that 0004 forgot to GRANT to the
investigator role. Without those grants, every INSERT failed with
"permission denied for sequence …". Grant USAGE/SELECT/UPDATE on each
auto-sequence.
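The grants have this shape. The helper below only generates the statements; it is a sketch, and the role name `investigator` is assumed to match the actual database role. Only the one sequence name given above is used.

```python
import re
from typing import Iterable, List

# Plain unquoted identifiers only, so nothing unexpected is spliced into SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def grant_statements(sequences: Iterable[str], role: str = "investigator") -> List[str]:
    """Build one GRANT per auto-created sequence for the given role."""
    names = list(sequences)
    for name in [role, *names]:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        f"GRANT USAGE, SELECT, UPDATE ON SEQUENCE {seq} TO {role};"
        for seq in names
    ]
```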
Verified live: Locard wrote E-0002 + E-0003 from real Sandia chunks
(green fireball Feb 1949; cobalt particle analysis). Grade B, confidence
high, custody chain of 3 steps with honest gaps. Cost $0.09 for both,
~70s wall.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root-cause fix for "search returns garbage for absent terms". The hybrid RPC's
dense branch always returned its k nearest vectors regardless of distance, so a
query for a term not in the corpus (e.g. "varginha") surfaced unrelated chunks.
The cross-encoder reranker would filter these out, but it costs 18-62s on
CPU, which is unusable for interactive search.
Add max_dense_dist (default 0.40) to hybrid_search_chunks: dense neighbours
beyond that cosine distance are dropped server-side. Calibrated from measured
distances — strong semantic match ~0.12-0.20, no real match ~0.46-0.53. BM25
full-text still matches literal terms; the reranker becomes opt-in refinement.
Verified live: varginha/abducao → 0 results, disco voador/roswell → relevant
hits, all <1s.
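The effect of the cutoff can be illustrated in plain Python. This is not the RPC itself (hybrid_search_chunks runs server-side); the function names and the k default are hypothetical, and only max_dense_dist and its 0.40 default come from this commit.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def dense_neighbours(query_vec: Sequence[float],
                     chunks: Sequence[Tuple[str, Sequence[float]]],
                     k: int = 10,
                     max_dense_dist: float = 0.40) -> List[Tuple[str, float]]:
    """Return up to k nearest chunks, dropping any beyond max_dense_dist."""
    scored = sorted((cosine_distance(query_vec, vec), chunk_id)
                    for chunk_id, vec in chunks)
    # Without the distance check, the k nearest come back however far away.
    return [(chunk_id, dist) for dist, chunk_id in scored[:k]
            if dist <= max_dense_dist]
```

With the cutoff in place, a query whose nearest vectors are all distant returns an empty dense set, leaving only literal BM25 matches.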
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 3, wave 2 (entity synthesis at scale):
- scripts/synthesize/20_entity_summary.py: queries DB for entities with
total_mentions ≥ threshold + top-K verbatim chunk snippets via
entity_mentions JOIN, prompts Sonnet (Holmes-Watson voice, bilingual),
writes narrative_summary EN+PT-BR + summary_status=synthesized.
Ran on 187 candidates (mentions ≥ 20) → 158 OK · 1 err · 29 skipped (no
snippets). Combined with anchor curation: 20 curated + 158 synthesized
= 178 entities with real narrative (vs 0 a day ago).
Phase 4 (chat with typed artifacts + persistence):
- lib/chat/agui.ts: AG-UI v1 typed Artifact union (citation, crop_image,
entity_card, evidence_card, hypothesis_card, case_card, navigation_offer)
alongside the existing event types.
- lib/chat/tools.ts + openrouter.ts: hybrid_search emits up to 6
citation + crop_image artifacts per query. Provider collects them and
returns in done.artifacts so the route can persist.
- api/sessions/[id]/messages: persist artifacts to messages.citations.
- components/chat-bubble.tsx: ArtifactCard renders inline cards (citation,
crop_image, entity_card, navigation_offer) for streamed and persisted
messages. activeId is now persisted in localStorage, so navigating between
pages keeps the same conversation. New sessions are created lazily (only
when the user has none). loadMessages hydrates tools + artifacts from the
server. CRUD UI: rename (✎) + archive (🗑) buttons per session in the list.
Home search:
- doc-list-filters: input now fires hybrid_search (rerank=0 for speed)
in parallel with the local title filter; chunk hits render above the doc
grid with snippet + score + classification.
- api/search/hybrid: accept ?rerank=0 to skip the cross-encoder (1.3s vs 60s).
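A small sketch of the rerank switch, in Python rather than the route's own language. The function name is hypothetical, and treating a missing parameter as "rerank on" is an assumption about the route's default at the time of this commit.

```python
from urllib.parse import parse_qs, urlparse


def wants_rerank(url: str) -> bool:
    """Return False only when the query string carries rerank=0."""
    params = parse_qs(urlparse(url).query)
    # Assumption: absent parameter means the cross-encoder still runs.
    return params.get("rerank", ["1"])[0] != "0"
```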
Auth flow:
- infra: SMTP_HOST=mail.spacemail.com:587 + DMARC published; mail now lands
in inbox. GOTRUE_MAILER_AUTOCONFIRM=false (real email verification).
- kong.yml: proxy /auth/callback on api.disclosure.top → web:3000 so PKCE
email links don't 404 at the gateway.
- web/app/auth/callback: handle both ?code= (OAuth) and ?token=&type=
(PKCE); redirect to the public site host before verifyOtp so the session
cookie lands on the right domain.
Audit deliverables:
- .nirvana/outputs/disclosure-bureau/.../systems-atelier/: 5 docs (code
analysis, tech debt, discovery brief, system arch, 5 ADRs) authored by
sa-principal, from the audit that produced this roadmap. Kept in-tree for
traceability.
- scripts/03-dedup-entities.py: stop emitting placeholder narrative ("Stub. Will
be enriched in Phase 7"); write summary_status=none + null fields instead.
- scripts/maintain/41_strip_stubs.py: idempotent migration that cleaned the
22,096 entity .md files (now zero stub strings in wiki/).
- scripts/synthesize/01_anchor_events.py: curated 20 anchor UAP events
(Roswell, Nimitz Tic-Tac, Phoenix Lights, Operação Prato, AATIP, etc.) with
bilingual Holmes-Watson narrative via claude -p --model sonnet
(CLAUDE_CODE_OAUTH_TOKEN). All summary_status=curated, confidence=high.
- web/api/timeline + timeline-view: hide narrative-less events by default,
render a "curado" badge for hand-vetted ones, and drop the date-only display.
- CLAUDE-schema-full.md: document the summary_status enum and the four states.
- docker-compose.yml: SMTP_HOST=mail.spacemail.com configured;
GOTRUE_MAILER_AUTOCONFIRM flipped to false (real email confirmation working).
- .nirvana/outputs/.../systems-atelier/: 5 deliverables of the architecture
audit that produced this roadmap.
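The idempotent stub cleanup described for scripts/maintain/41_strip_stubs.py can be sketched as below. This is not the script itself: dropping the whole line that carries the stub string is an assumption about how the placeholder appears in the entity files, and the function names are hypothetical.

```python
from pathlib import Path

# Placeholder text quoted in the commit message.
STUB = "Stub. Will be enriched in Phase 7"


def strip_stub(text: str) -> str:
    """Remove every line containing the stub string, keeping line endings."""
    return "".join(line for line in text.splitlines(keepends=True)
                   if STUB not in line)


def strip_stubs_in_tree(root: Path) -> int:
    """Rewrite only the .md files that still contain a stub; return how many changed."""
    changed = 0
    for path in sorted(root.rglob("*.md")):
        original = path.read_text(encoding="utf-8")
        cleaned = strip_stub(original)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Because a file is written only when its content changes, a second run touches nothing and reports zero.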