Builds on top of W3.9 to turn the homepage Bureau from a read-only
dashboard into a working command center.
UI improvements (web/components/bureau-snapshot.tsx):
- Detective tiles are now <Link>s — each navigates to its primary
artefact section in /bureau (Holmes→#hypotheses, Locard→#evidence,
Dupin→#contradictions, Schneier→#hypotheses, Poirot→#witnesses,
Taleb→#outliers, Tetlock→#hypotheses, Case-Writer→#reports). The hover
background matches the detective's tone color.
- <QuickLaunch /> form inserted right under the tiles.
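
The tile-to-section mapping can be read as a plain lookup table. A minimal sketch, assuming names that are not taken from bureau-snapshot.tsx (`TILE_ANCHORS` and `tileHref` are hypothetical); only the detective names and anchors come from the list above:

```typescript
// Hypothetical lookup table; detective names and anchors are from the list above.
const TILE_ANCHORS: Record<string, string> = {
  Holmes: "hypotheses",
  Locard: "evidence",
  Dupin: "contradictions",
  Schneier: "hypotheses",
  Poirot: "witnesses",
  Taleb: "outliers",
  Tetlock: "hypotheses",
  "Case-Writer": "reports",
};

// Builds the href a tile's <Link> could point at. Falling back to plain
// /bureau for an unknown detective is an assumption, not documented behavior.
function tileHref(detective: string): string {
  const anchor = TILE_ANCHORS[detective];
  return anchor ? `/bureau#${anchor}` : "/bureau";
}
```

For example, `tileHref("Dupin")` yields `/bureau#contradictions`.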
New <QuickLaunch /> client component:
- Detective dropdown (7 active kinds; evidence_chain is not exposed here
yet because it needs a doc_id, which is easier to pick from the doc page).
- Single input swaps placeholder + aria-label by kind: question for
Holmes, topic for Dupin/Taleb/Case-Writer, hypothesis_id for
Schneier/Tetlock, person_id for Poirot.
- Submits to POST /api/bureau/launch and redirects to /jobs/[id]
via the Next.js router.
- Loading state ("queueing…") + error display inline.
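
How the single input changes meaning per detective can be sketched as follows. This is illustrative only: `INPUT_FIELD` and `buildLaunchBody` are hypothetical names, the table is keyed by detective name because the kind identifiers are not listed here, and the request body shape is an assumption rather than the route's documented contract.

```typescript
type InputField = "question" | "topic" | "hypothesis_id" | "person_id";

// Which field the single input feeds, per detective (from the list above).
const INPUT_FIELD: Record<string, InputField> = {
  Holmes: "question",
  Dupin: "topic",
  Taleb: "topic",
  "Case-Writer": "topic",
  Schneier: "hypothesis_id",
  Tetlock: "hypothesis_id",
  Poirot: "person_id",
};

// Assumed body shape: { detective, <field>: value }. The real payload may differ.
function buildLaunchBody(detective: string, value: string): Record<string, string> {
  const field = INPUT_FIELD[detective];
  if (!field) {
    throw new Error(`detective not exposed in QuickLaunch: ${detective}`);
  }
  const trimmed = value.trim();
  if (trimmed === "") {
    throw new Error(`${field} must not be empty`);
  }
  return { detective, [field]: trimmed };
}
```

The same table could drive the placeholder and aria-label, so the form and the payload cannot drift apart.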
POST /api/bureau/launch (web/app/api/bureau/launch/route.ts):
- Same 8-kind validator as the chat tool's request_investigation.
- Auth required when Supabase is configured (triggered_by = user:email).
- Returns { job_id, kind, detective, status_url, eta_seconds }.
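
A client could guard the response before redirecting. A minimal sketch, assuming the field types (strings, with eta_seconds as a number); the field names come from the line above, while `isLaunchResponse` and `jobPath` are hypothetical helpers:

```typescript
// Field names are from the commit text; the types are assumptions.
interface LaunchResponse {
  job_id: string;
  kind: string;
  detective: string;
  status_url: string;
  eta_seconds: number;
}

// Runtime check that an unknown JSON value has the expected shape.
function isLaunchResponse(value: unknown): value is LaunchResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.job_id === "string" &&
    typeof v.kind === "string" &&
    typeof v.detective === "string" &&
    typeof v.status_url === "string" &&
    typeof v.eta_seconds === "number"
  );
}

// Path the form redirects to after a successful launch (/jobs/[id]).
function jobPath(res: LaunchResponse): string {
  return `/jobs/${encodeURIComponent(res.job_id)}`;
}
```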
DocBureauPanel on /d/[docId] (web/components/doc-bureau-panel.tsx):
- Server component inserted between the doc header and
AnomalyHighlights.
- Surfaces every bureau artefact that touches the doc:
· Evidence whose source_page_id starts with docId/p
· Hypotheses citing any of those evidence_ids
· Contradictions whose chunks[] has any item with this doc_id
· Gaps/outliers with scope.doc_id == docId
· Case reports whose markdown body references docId (filesystem scan)
- Empty state shows "Investigation Bureau — untouched" with a CTA
linking back to the homepage to launch the first investigation.
- When non-empty, header counts total artefacts + links to /bureau
for the full view.
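
The matching rules above can be sketched as pure filters. The interfaces below are assumptions made for illustration (in particular the `evidence_id` field and the `BureauData` container are hypothetical); the field names source_page_id, evidence_ids, chunks, doc_id and scope.doc_id come from the list above. The filesystem scan for case reports is left out.

```typescript
interface Evidence { evidence_id: string; source_page_id: string }
interface Hypothesis { evidence_ids: string[] }
interface Contradiction { chunks: { doc_id: string }[] }
interface Scoped { scope: { doc_id?: string } }

interface BureauData {
  evidence: Evidence[];
  hypotheses: Hypothesis[];
  contradictions: Contradiction[];
  gaps: Scoped[];
  outliers: Scoped[];
}

function artefactsForDoc(docId: string, data: BureauData) {
  // The "/p" suffix keeps "doc1" from also matching "doc10/p3".
  const prefix = `${docId}/p`;
  const evidence = data.evidence.filter((e) => e.source_page_id.startsWith(prefix));
  const ids = new Set(evidence.map((e) => e.evidence_id));
  // A hypothesis touches the doc if it cites any of that evidence.
  const hypotheses = data.hypotheses.filter((h) => h.evidence_ids.some((id) => ids.has(id)));
  const contradictions = data.contradictions.filter((c) =>
    c.chunks.some((chunk) => chunk.doc_id === docId),
  );
  const gaps = data.gaps.filter((g) => g.scope.doc_id === docId);
  const outliers = data.outliers.filter((o) => o.scope.doc_id === docId);
  const total =
    evidence.length + hypotheses.length + contradictions.length + gaps.length + outliers.length;
  return { evidence, hypotheses, contradictions, gaps, outliers, total };
}
```

A `total` of zero would correspond to the empty state; any other value would feed the header count.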
Metadata (web/app/layout.tsx):
- description rewritten from "Investigative wiki of the US Department
of War UAP/UFO archive (war.gov/ufo)" to one that names the bureau
+ the 8 detectives. Affects SERP previews + social-card defaults.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scanned docs are messy — duplicate transcriptions (typed + handwritten),
two classification variants of the same narrative, OCR noise, repeated
banners. The doc page showed raw chunks, so everything appeared twice.
40_reading_version.py uses Sonnet to generate ONE clean, deduplicated,
well-structured bilingual Markdown reading version per doc: it merges duplicate
versions without losing unique lines, drops page furniture, and formats
transcripts as dialogue. It stays faithful: nothing is invented, and redactions
are kept as markers.
/d/[docId] now defaults to a "📖 leitura" (reading) tab that renders this clean
version, while the "🔍 trechos · scan original" (excerpts · original scan) tab
preserves the faithful per-chunk + per-page scan view. reading.md lives in
raw/<doc>--subagent/ alongside the chunks.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Doc page (/d/[docId]/[page]) gains prev/next navigation bars (top + bottom):
within a doc it steps page-by-page; at the first/last page it jumps to the
previous/next document. Replaces the disabled-at-boundary links.
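
The boundary behavior can be sketched as a pure function over an ordered doc list. Assumptions, none confirmed by this change: pages are 1-indexed, stepping back across a boundary lands on the previous doc's last page, stepping forward lands on the next doc's first page, and the names `DocInfo`, `PageRef` and `neighbors` are hypothetical.

```typescript
interface DocInfo { docId: string; pageCount: number }
interface PageRef { docId: string; page: number }

function neighbors(
  docs: DocInfo[],
  current: PageRef,
): { prev: PageRef | null; next: PageRef | null } {
  const i = docs.findIndex((d) => d.docId === current.docId);
  const doc = docs[i];
  if (!doc) return { prev: null, next: null }; // unknown doc: no links

  const before = docs[i - 1]; // undefined for the first doc
  const after = docs[i + 1]; // undefined for the last doc

  // Step within the doc when possible, otherwise cross into the adjacent doc.
  const prev: PageRef | null =
    current.page > 1
      ? { docId: doc.docId, page: current.page - 1 }
      : before
        ? { docId: before.docId, page: before.pageCount }
        : null;
  const next: PageRef | null =
    current.page < doc.pageCount
      ? { docId: doc.docId, page: current.page + 1 }
      : after
        ? { docId: after.docId, page: 1 }
        : null;
  return { prev, next };
}
```

Only the very first page of the first doc and the very last page of the last doc end up with a missing link.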
Indexer tooling for the VPS repopulation:
- 30-index-chunks-to-db.py: add --no-embed (fast BM25-only index; vectors
are backfilled separately) so the app is usable within minutes instead of
waiting hours for CPU embedding.
- 57_load_relations_from_json.py: load typed relations into public.relations
from the structured fields of the reextract output (deterministic ids, no fuzzy
guessing).
- 58_backfill_embeddings.py: async pass to fill chunks.embedding (NULL rows)
via the embed-service.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add reextract pipeline (scripts/reextract/) that rebuilds doc-level entity
JSON from Sonnet-vision chunks via Opus, replacing the noisy per-page
extraction. Add synthesize scripts to regenerate wiki/entities from the 116
_reextract.json files (30), aggregate missing page.md files from chunks (31),
and reprocess the 805 pages that the doc-rebuilder agent dropped on context
overflow (32). Add
maintain scripts 43-56 for chunk-page sync, dedup, generic-entity marking, and
typed relation extraction.
Web: wire relations API + entity-relations component; entity/timeline/doc
pages consume the rebuilt layer.
Note: raw/, processing/, wiki/ remain gitignored (bulk data managed
separately); the 116 reextract JSONs and 7,798 rebuilt entity files live on
disk only. The 27 curated anchor events under wiki/entities/events/ are
preserved.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>