Today /sightings, /witnesses, /objects, /locations and /operations show
a name + mention count and nothing else. After this change, each row
carries a 60-100 word bilingual narrative summary written from the
chunks where the entity actually appears.
Migration 0008 (apply as supabase_admin):
public.entities +summary_en TEXT
+summary_pt_br TEXT
+summary_generated_at TIMESTAMPTZ
+summary_model TEXT
+summary_status TEXT
CHECK ('pending'|'ai_generated'|'curated'|'refused')
+ index on summary_status
+ GRANT UPDATE (summary_*) ON entities TO investigator
+ new policy entities_investigator_update_summary (RLS UPDATE for
investigator role)
Enrichment script (investigator-runtime/scripts/enrich_entity_summaries.ts):
- Per-class config (chunk_k, min_mentions, max_per_class)
- Path A: entity_mentions JOIN chunks (high-precision linker)
- Path B (fallback): hybridSearch on canonical_name + aliases when
entity_mentions returns zero. This is what unlocked Kenneth Arnold
and similar entities — their wiki YAML has high total_mentions
counted from frontmatter mentioned_in[], but the entity_mentions
extractor was silent because the matches came from the wiki text,
not the OCR chunks.
- Sonnet 4.6 via OAuth Max, ~$0.04 per entity, ~$10 for the full
260-entity bulk run.
- INSUFFICIENT skip when chunks can't sustain a 60-word summary —
refused entries get summary_status='refused' so they're not retried.
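The Path A / Path B fallback and the INSUFFICIENT skip can be sketched roughly as below. This is a minimal illustration, not the actual script: the `Chunk` and `Gathered` types, the `gatherChunks` name, and the 60-word source threshold are all assumptions, and the two chunk sources are passed in as plain synchronous functions where the real script would presumably await database queries.

```typescript
// Hypothetical shapes; the real script's types are not shown in this commit.
type Chunk = { id: string; text: string };

type Gathered =
  | { status: "ok"; path: "A" | "B"; chunks: Chunk[] }
  | { status: "refused" };

// Assumed proxy for "chunks can't sustain a 60-word summary".
const MIN_SOURCE_WORDS = 60;

function countWords(chunks: Chunk[]): number {
  return chunks.reduce(
    (total, c) => total + c.text.split(/\s+/).filter(Boolean).length,
    0,
  );
}

function gatherChunks(
  chunkK: number,
  fromMentions: () => Chunk[], // Path A: entity_mentions JOIN chunks
  fromHybridSearch: () => Chunk[], // Path B: search on canonical_name + aliases
): Gathered {
  let path: "A" | "B" = "A";
  let chunks = fromMentions().slice(0, chunkK);

  // Fall back to Path B only when the linker returned nothing.
  if (chunks.length === 0) {
    path = "B";
    chunks = fromHybridSearch().slice(0, chunkK);
  }

  // Too little source text: mark as refused so the entity is not retried.
  if (countWords(chunks) < MIN_SOURCE_WORDS) {
    return { status: "refused" };
  }
  return { status: "ok", path, chunks };
}
```

A caller would map `status: "refused"` to `summary_status='refused'` and only send `ok` results to the model.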
UI uplift:
- lib/retrieval/entity-pages.ts: getEntityCore now prefers the DB
summary (ai_generated or curated) over wiki YAML narrative.
- components/entity-list-page.tsx:
* SELECT now pulls summary_en, summary_pt_br, summary_status
* Sorted with summary-enriched rows first (so the magazine grid
lands on quality content immediately)
* MagazineGrid: 4-line summary preview replaces aliases line
* CompactGrid: enriched rows render as full editorial cards,
bare rows fall back to a compact table below
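As a rough sketch of the two UI rules above (prefer the DB summary, list enriched rows first), assuming a row shape built from the columns named in this commit. The helper names and the mention-count tie-breaker are hypothetical and not taken from the actual components.

```typescript
type SummaryStatus = "pending" | "ai_generated" | "curated" | "refused";

// Assumed row shape; only the summary_* columns are confirmed by the migration.
interface EntityRow {
  canonical_name: string;
  total_mentions: number;
  summary_en: string | null;
  summary_pt_br: string | null;
  summary_status: SummaryStatus | null;
}

// A summary counts as usable only when it is ai_generated or curated and non-empty.
function hasUsableSummary(row: EntityRow): boolean {
  const statusOk =
    row.summary_status === "ai_generated" || row.summary_status === "curated";
  return statusOk && Boolean(row.summary_en?.trim());
}

// Enriched rows first; within each group, higher mention counts first (assumed).
function sortEnrichedFirst(rows: EntityRow[]): EntityRow[] {
  return [...rows].sort((a, b) => {
    const byEnriched =
      Number(hasUsableSummary(b)) - Number(hasUsableSummary(a));
    return byEnriched !== 0 ? byEnriched : b.total_mentions - a.total_mentions;
  });
}

// Prefer the DB summary; otherwise fall back to the wiki YAML narrative.
function pickNarrative(
  row: EntityRow,
  wikiNarrative: string | null,
): string | null {
  return hasUsableSummary(row) ? row.summary_en : wikiNarrative;
}
```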
Smoke results:
- Kenneth Arnold sighting: "On June 24, 1947, pilot Kenneth Arnold
reported sighting unidentified objects over the Pacific Northwest,
and the account spread worldwide. It set off a run of similar
reports: County Commissioner Crankes saw comparable objects after
Arnold's account reached the press, and United Airlines pilot
Emil H. Smith spotted flying discs on July 4 during a routine
flight out of Boise, Idaho..."
- Roswell Incident: includes Colonel Corso's 1997 book + the 1995
GAO finding that radio messages from Oct 46–Feb 47 were destroyed
+ Senator Strom Thurmond's foreword. Real magazine-grade content.
Background bulk run kicked off across all 5 classes (event,
uap_object, person, location, organization) — populating live as
the homepage rebuilds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
web — Disclosure Bureau Next.js app
Next.js 15 + React 19 + Tailwind + Supabase + assistant-ui.
Quick start (local dev)
# 1. Install deps
npm install
# 2. (Optional) Start local Supabase
# Requires Docker. Skip if pointing at remote Supabase.
npx supabase init # first time only — creates supabase/ folder
npx supabase start # spins up Postgres/GoTrue/Storage on :54321
# 3. Configure env
cp .env.local.example .env.local
# Edit .env.local — paste local Supabase keys (printed by `supabase start`)
# 4. Apply migrations
psql postgresql://postgres:postgres@localhost:54322/postgres \
-f ../infra/supabase/migrations/0001_chat_schema.sql
# 5. Start dev
npm run dev
# http://localhost:3030
Without Supabase
The app degrades gracefully if Supabase env vars are unset:
- Wiki browsing works (read-only from filesystem)
- Auth bar shows "auth: disabled (dev)"
- Chat bubble shows "Auth not configured"
Useful for quick UI work without spinning up Docker.
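One way such a degradation check could look is sketched below. The environment variable names follow the usual Supabase + Next.js convention but are an assumption here (check .env.local.example for the real ones), and both helper names are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Assumed names; the actual keys live in .env.local.example.
const REQUIRED_VARS = [
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
];

// Supabase counts as configured only when every required var is non-blank.
function isSupabaseConfigured(env: Env): boolean {
  return REQUIRED_VARS.every((name) => Boolean(env[name]?.trim()));
}

// Label shown in the auth bar; the disabled text matches the README above.
function authBarLabel(env: Env): string {
  return isSupabaseConfigured(env) ? "sign in" : "auth: disabled (dev)";
}
```

In the app this would be called with `process.env`, so components can skip creating a Supabase client when the check fails.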
Production (Coolify on VPS)
See ../infra/coolify/. Stack:
- Coolify orchestrates everything
- Supabase self-hosted: db.disclosure.top, studio.disclosure.top
- Next.js: disclosure.top
- Meilisearch (shared): search.disclosure.top
- Imgproxy (shared): img.disclosure.top
- Caddy: TLS + reverse proxy (built into Coolify)
Architecture
app/
├── page.tsx # home — 116 docs grouped by collection
├── auth/
│ ├── signin/page.tsx # magic-link form
│ ├── callback/route.ts # exchanges code for session
│ └── signout/route.ts
├── d/[docId]/
│ ├── page.tsx # doc detail
│ └── [page]/page.tsx # page reader (OCR + entity highlights + crops + sidebar PNG)
├── api/
│ ├── me/route.ts # GET current profile
│ ├── sessions/route.ts # GET list, POST new
│ ├── sessions/[id]/route.ts # GET detail, PATCH, DELETE
│ ├── sessions/[id]/messages/route.ts # POST send → assistant reply
│ ├── documents/, pages/, entities/, tables/ # read-only data
│ └── static/[...path]/route.ts # sandboxed file serve
components/
├── chat-bubble.tsx # floating Sherlock — auth-aware, session list
├── entity-modal.tsx # opens on entity click
├── reader-content.tsx # OCR + highlights + crops
└── auth-bar.tsx # sign in / out + budget tracker
lib/
├── wiki.ts # markdown reader (gray-matter)
├── entity-index.ts # match loader + text segmentation
└── supabase/{server,client}.ts # SSR helpers
middleware.ts # session refresh on every request
Tech notes
- No RAG: chat agent reads markdown directly. Wiki-link traversal substitutes for vector search.
- RLS-first: Supabase Row Level Security enforces "user sees only own sessions" at the DB layer.
- Magic-link auth: no passwords. GoTrue handles email delivery.
- Anti-abuse: per-user budget cap (default $5) + daily message quota (default 100) enforced via the check_budget RPC before each Claude call.
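The anti-abuse rule can be illustrated with a small pure function. This is only a sketch of the decision logic in TypeScript: the real check runs inside the check_budget RPC in the database, whose signature is not shown here, so the `Usage`, `Limits`, and `BudgetDecision` types and the reason strings are assumptions.

```typescript
interface Usage {
  spentUsd: number; // total spend recorded for the user
  messagesToday: number; // messages sent since the start of the day
}

interface Limits {
  budgetCapUsd: number;
  dailyMessageQuota: number;
}

// Defaults taken from the README: $5 budget cap, 100 messages per day.
const DEFAULT_LIMITS: Limits = { budgetCapUsd: 5, dailyMessageQuota: 100 };

type BudgetDecision =
  | { allowed: true }
  | { allowed: false; reason: "budget_exceeded" | "daily_quota_exceeded" };

// Decide whether another Claude call may proceed for this user.
function checkBudget(
  usage: Usage,
  limits: Limits = DEFAULT_LIMITS,
): BudgetDecision {
  if (usage.spentUsd >= limits.budgetCapUsd) {
    return { allowed: false, reason: "budget_exceeded" };
  }
  if (usage.messagesToday >= limits.dailyMessageQuota) {
    return { allowed: false, reason: "daily_quota_exceeded" };
  }
  return { allowed: true };
}
```

Keeping the authoritative check in the database means a client cannot bypass it; a function like this would at most mirror it for UI hints.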
Cost
Each chat turn costs ~$0.005-0.05 depending on context size (mostly Haiku $1/M input, $5/M output).
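The per-turn range follows from the quoted Haiku prices. A quick sketch of the arithmetic, where the token counts in the usage note are illustrative assumptions rather than measured values:

```typescript
// Prices quoted above: $1 per million input tokens, $5 per million output tokens.
const INPUT_USD_PER_MTOK = 1;
const OUTPUT_USD_PER_MTOK = 5;

// Cost of one chat turn in USD for the given token counts.
function turnCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens * INPUT_USD_PER_MTOK + outputTokens * OUTPUT_USD_PER_MTOK) /
    1e6
  );
}
```

Under these prices, a small turn of 4,000 input and 200 output tokens costs about $0.005, and a large one of 40,000 input and 2,000 output tokens costs about $0.05, which brackets the range stated above.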