Commit graph

19 commits

Author SHA1 Message Date
Luiz Gustavo
189a771cbe W3.1-W3.4: Investigation Bureau foundation — migrations, runtime, Locard
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 38s
CI / Scripts — Python smoke (push) Failing after 3s
CI / Web — npm audit (push) Failing after 33s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 4s
Migrations:
- 0004_investigation_bureau.sql: 7 new tables (investigation_jobs + evidence,
  hypotheses, contradictions, witnesses, gaps, residual_uncertainties), id
  sequences, pg_notify trigger on investigation_jobs, RLS read-only public,
  investigator role with least-privilege grants (no service_role).
- 0005_investigator_write_policies.sql: fixup adding RLS INSERT/UPDATE
  policies bound to investigator + service_role + postgres (RLS with only a
  SELECT policy was silently blocking the worker's claim UPDATE).

investigator-runtime/ (new Bun + TS container):
- src/main.ts: LISTEN/NOTIFY poller, claim-with-SKIP-LOCKED, drain pool,
  healthcheck file, graceful SIGTERM shutdown.
- src/orchestrator.ts: chief-detective dispatch (evidence_chain → Locard).
  Marks job failed when all per-item outputs error; surfaces first errors.
- src/lib/{env,pg,audit,ids,claude}.ts: typed config (gate #8), pool +
  dedicated LISTEN client, NDJSON audit, sequence allocator (E-NNNN etc),
  claude -p subprocess with quota detection (api_error_status=429).
- src/tools/write_evidence.ts: schema-validate (grade A/B/C custody steps),
  resolve chunk_pk via FK, verify verbatim_excerpt actually appears in
  chunk content, INSERT + render case/evidence/E-NNNN.md + audit.
- src/detectives/locard.ts: load chunk → call Claude with locard.md system
  prompt → parse strict JSON → call writeEvidence locally.
- Dockerfile installs `claude` CLI (OAuth) at build time.

Compose:
- new `investigator` service builds from investigator-runtime/, connects
  with low-privilege role, mounts case/ RW and wiki/+raw/ RO, 512m mem cap.

Web:
- /api/admin/investigate/test (POST+GET) gated by middleware (W0-F1).
  POST creates a job, GET polls status. For W3.6 it becomes the chat tool.

End-to-end smoke: INSERT job → pg_notify → claim → Locard dispatch →
claude subprocess invoked. Auth works (CLI v2.1.150). Currently quota
exhausted (weekly limit · resets 3pm UTC) — pipeline catches the typed
isQuota error, marks job failed with surfaced reason. Architecture proven;
quota reset enables real evidence creation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:49:33 -03:00
Luiz Gustavo
eaf282c535 W2: rerank opt-in, analyze_image_region tool, RAG eval, graph cleanup, ADRs
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 40s
CI / Scripts — Python smoke (push) Failing after 3s
CI / Web — npm audit (push) Failing after 29s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 3s
- TD#8 hybrid.ts: rerank_strategy {always|when_top_k_gt|never} + threshold
  (default skips rerank for top_k ≤ 15; chat tool uses threshold 10)
- O11 vision.ts + tools.ts: analyze_image_region tool — sharp-crops the
  bbox, claude CLI reads the temp PNG via Read tool, Sonnet vision answers
- TD#12 /graph: SigmaGraph replaces ForceGraphCanvas; react-force-graph-2d
  uninstalled (-37 transitive deps); force-graph-canvas.tsx deleted
- TD#27 messages/route.ts gatherContext slice sizes via CTX_* env vars
- TD#22 tests/rag/: golden.yaml (15 queries) + run.py (Recall@k + MRR +
  negative-pass rate) + baseline.json + CI job in .forgejo/workflows/ci.yml
- docs/adrs/: ADR-001..005 published from systems-atelier deliverables

Verified live on disclosure.top: top_k=5 path skips rerank (6.7s embed-only,
was 12-15s with rerank); rerank=always still available on demand.
First RAG baseline: Recall@5 = 0.2083, MRR = 0.25, Negative pass = 1.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:20:09 -03:00
Luiz Gustavo
55cac8a395 W0+W1+W1.2: security hardening, observability, autocomplete, glitchtip, forgejo CI
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 1m30s
CI / Scripts — Python smoke (push) Failing after 32s
CI / Web — npm audit (push) Failing after 37s
W0 — security hardening (5 fixes verified live on disclosure.top)
- middleware: gate /api/admin/* same as /admin/* (F1)
- imgproxy: tighten LOCAL_FILESYSTEM_ROOT from / to /var/lib/storage (F2)
- studio: real basic-auth label (bcrypt hash, middleware reference) (F3)
- relations: ENABLE ROW LEVEL SECURITY + public SELECT policy (F4)
- migration 0003: fold is_searchable + hybrid_search update into canonical (TD#2)

W1 — observability + resilience + autocomplete
- studio: HOSTNAME=0.0.0.0 so Next.js binds on loopback for healthcheck
- compose: PG_POOL_MAX=20, CLAUDE_CODE_OAUTH_TOKEN gated by separate env
- claude-code.ts: subprocess timeout configurable (CLAUDE_CODE_TIMEOUT_MS)
- openrouter.ts: retry with exponential backoff + Retry-After + in-memory
  circuit breaker (promotes FALLBACK after CB_THRESHOLD failures)
- lib/logger.ts: pino logger (NDJSON prod / pretty dev) + withRequest helper
- middleware: mints correlation_id, stamps x-correlation-id response header,
  emits structured http_request log per /api/* call
- messages/route.ts: switch to structured logger
- 60_meili_index.py: push documents + chunks into Meilisearch
- /api/search/autocomplete: parallel meili search (docs + chunks), 5-8ms p50
- search-autocomplete.tsx: debounced dropdown wired into search-panel

W1.2 — Glitchtip + Forgejo self-hosted
- compose: glitchtip-redis + glitchtip-web + glitchtip-worker (v4.2)
- compose: forgejo + forgejo-runner (server v9, runner v6) with group_add=988
- @sentry/nextjs SDK wired (instrumentation.ts + sentry.{client,server}.config.ts)
- /api/admin/throw smoke endpoint (gated by W0-F1 middleware)
- Synthetic event ingestion verified at glitchtip.disclosure.top
- forgejo.disclosure.top up, repo discadmin/disclosure-bureau created,
  runner registered (labels: ubuntu-latest, docker)
- .forgejo/workflows/ci.yml: typecheck + lint + build + npm audit + python
  syntax + compose validation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 18:18:42 -03:00
Luiz Gustavo
e75ca5eda2 add clean LLM reading version of documents (the core goal)
Scanned docs are messy — duplicate transcriptions (typed + handwritten),
two classification variants of the same narrative, OCR noise, repeated
banners. The doc page showed raw chunks, so everything appeared twice.

40_reading_version.py generates ONE clean, deduplicated, well-structured
bilingual Markdown reading version per doc (Sonnet): merges duplicate versions
without losing unique lines, drops page furniture, formats transcripts as
dialogue. Faithful — invents nothing; redactions kept as markers.

/d/[docId] now defaults to a "📖 leitura" tab rendering this clean version,
with "🔍 trechos · scan original" preserving the faithful per-chunk + per-page
scan view. reading.md lives in raw/<doc>--subagent/ alongside the chunks.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 17:23:36 -03:00
Luiz Gustavo
5b62d0a3fe fix: UAP flag renders cleanly when type/rationale absent
~414 chunks have ufo_anomaly_detected=true but no type/rationale (extraction
left them null), so the flag rendered "UAP flag: anomaly —" with a dangling
em-dash. Build the label from the parts that exist: fall back to "anomalia"
for a missing type and omit the "—" when there's no rationale. The flag still
shows (the chunk genuinely contains UAP content), just without the noise.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 16:42:37 -03:00
Luiz Gustavo
b94f4869de fix: render image description when chunk has no bbox (no broken crop)
29 image chunks have an empty bbox {}, and `fm.bbox ?? default` doesn't catch
an empty object, so the crop URL got w=undefined → /api/crop 400 → broken-image
icon. Now validate bbox (w/h > 0); without it, render the image's textual
description instead of requesting an impossible crop.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 16:40:51 -03:00
Luiz Gustavo
504b20fa5c search: gate dense recall by cosine-distance threshold in the RPC
Root-cause fix for "search returns garbage for absent terms". The hybrid RPC's
dense branch always returned its k nearest vectors regardless of distance, so a
query for a term not in the corpus (e.g. "varginha") surfaced unrelated chunks.
The cross-encoder reranker would filter these but costs 18-62s on CPU —
unusable for interactive search.

Add max_dense_dist (default 0.40) to hybrid_search_chunks: dense neighbours
beyond that cosine distance are dropped server-side. Calibrated from measured
distances — strong semantic match ~0.12-0.20, no real match ~0.46-0.53. BM25
full-text still matches literal terms; the reranker becomes opt-in refinement.

Verified live: varginha/abducao → 0, disco voador/roswell → relevant, all <1s.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 16:36:56 -03:00
Luiz Gustavo
4865f974b6 fix search: rerank-gate results so absent terms return nothing
The hybrid_search RPC always returns up to recall_k dense neighbours, so a
query for a term absent from the corpus (e.g. "varginha") returned its 12
nearest vectors — irrelevant chunks like PAGE_NUMBER "1". Two bugs:
the reranker was skipped whenever results <= top_k, and there was no relevance
floor.

Now always run the cross-encoder reranker (BGE-reranker-v2-m3, normalized
sigmoid) and drop hits below 0.02. Verified: "varginha" → 0 results;
"roswell"/"tic tac"/"disco voador" → relevant hits on top (reranker cleanly
separates 0.0001 garbage from 0.03-0.27 matches).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 14:46:49 -03:00
Luiz Gustavo
ebc6fa41e9 fix: keep _index.json total_pages in sync after recovering pages
The reprocess pass added chunks for pages beyond the original total_pages but
never updated the field, so doc-page navigation thought docs ended early
(jumped to next document mid-doc) and the page counter was wrong. Now bump
total_pages to the real max chunk page on each integration.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 14:32:55 -03:00
Luiz Gustavo
fe19bb9c57 add page↔document navigation + DB repopulation tooling
Doc page (/d/[docId]/[page]) gains prev/next navigation bars (top + bottom):
within a doc it steps page-by-page; at the first/last page it jumps to the
previous/next document. Replaces the disabled-at-boundary links.

Indexer tooling for the VPS repopulation:
- 30-index-chunks-to-db.py: add --no-embed (fast BM25-only index; vectors
  backfilled separately) so the app is usable in minutes, not hours of CPU
  embedding.
- 57_load_relations_from_json.py: load typed relations into public.relations
  from reextract structured fields (deterministic ids, no fuzzy guessing).
- 58_backfill_embeddings.py: async pass to fill chunks.embedding (NULL rows)
  via the embed-service.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 14:28:14 -03:00
Luiz Gustavo
a7e9dce6d2 rebuild entity layer from Sonnet-vision reextract pipeline
Add reextract pipeline (scripts/reextract/) that rebuilds doc-level entity
JSON from Sonnet-vision chunks via Opus, replacing the noisy per-page
extraction. Add synthesize scripts to regenerate wiki/entities from the 116
_reextract.json (30), aggregate missing page.md from chunks (31), and reprocess
805 pages the doc-rebuilder agent dropped on context overflow (32). Add
maintain scripts 43-56 for chunk-page sync, dedup, generic-entity marking, and
typed relation extraction.

Web: wire relations API + entity-relations component; entity/timeline/doc
pages consume the rebuilt layer.

Note: raw/, processing/, wiki/ remain gitignored (bulk data managed
separately); the 116 reextract JSONs and 7,798 rebuilt entity files live on
disk only. The 27 curated anchor events under wiki/entities/events/ are
preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 12:20:24 -03:00
guto
291748df63 sanitize entities: single YAML source of truth, signal_strength badge
The corpus had two parallel reverse-reference signals: the wiki/pages
entities_extracted blocks (Haiku page-level) and public.entity_mentions
(Sonnet chunk-level, ILIKE-matched). The entity page only consulted the
DB, so it showed "0 menções" for thousands of entities that were anchored
in pages or in cross-entity links the DB never indexed.

Resolved by collapsing all signals into the YAML frontmatter, which is
now the single runtime source for entity metadata.

scripts/maintain/42_sync_entity_stats.py walks every entity and writes:

  mentioned_in:        [...]        # consolidated page refs
  total_mentions:      max(db, pages)
  documents_count:     max(db_docs, distinct page docs)
  signal_sources:
    db_chunks:         int
    page_refs:         int
    cross_refs:        int
  signal_strength:     strong | weak | orphan | unverified
  referenced_by:       [[class/id]]  # cross-entity backlinks

Outgoing wikilinks (e.g. OBJ.observed_in_event → EV) count toward the
entity's own cross_refs so anchored-but-not-mentioned entities don't
register as orphan.

OBJ canonical names like "7m long, 1.3m high, two rocket motors,
smooth flow, rotary drive null UAP (OBJ-EV1945-PEYERLSHOTDOWN-01)"
are rewritten to "Peyerl shot down UAP" derived from observed_in_event,
preserving the full description as an alias. --fix-obj-names did this
for every OBJ-* with >80 char canonical_name.

Default behaviour is conservative: --archive-only-junk archives only
single/double-char names and pure-numeric noise. Everything else stays
on disk with signal_strength marked, so the user can review later.

web/lib/retrieval/entity-pages.ts swapped from db-first to yaml-first.
The /e/[cls]/[id] page now reads counts straight from YAML and renders
a "força do sinal" badge with the per-source breakdown. Orphan entities
get a banner explaining they have no cross-references.

DB is still queried for ONE thing: the chunk text for preview cards on
the entity page, so we don't re-parse 21k markdown files on every render.

First-pass result: 9020 strong / 14520 weak / 10814 orphan; OBJ-EV1945-
PEYERLSHOTDOWN-01 now reads "Peyerl shot down UAP · fraca · 1 backlink"
in the live UI.
2026-05-18 19:49:31 -03:00
guto
c0c6652dd5 guard /admin/* by role + filter chat artifacts to cited chunks
middleware.ts now checks profiles.role on every /admin/* request and
returns a plain 404 (not a redirect) for anonymous users and any
authenticated user without role='admin'. The 404 wording matches a
non-existent route, so we don't leak the existence of an admin area.
gutomec@gmail.com promoted to admin in the live DB.

openrouter.ts chat now collects artifacts silently during the tool-call
loop instead of streaming them to the SSE. After the model writes its
final prose (and after the forced-synthesis pass if needed), we scan
the assembled text for [[doc-id/p007#cNNNN]] citations and emit ONLY the
artifacts referenced. Duplicates are deduped by chunk_id. The persisted
citations column on messages now stores the filtered set too, so old
sessions reload with the same focused card grid.

Before: every hybrid_search hit (up to 6 per call × 5 calls = 30+ citation
cards plus crop images) flooded the chat regardless of what the model
ended up using. After: only the chunks actually woven into the answer.
2026-05-18 17:41:35 -03:00
guto
9889308bf4 fix chat: force synthesis pass + fix ambiguous-column trigger
Two bugs combined to make the chat reply with only cards and no prose:

1. SQL trigger rollup_session_stats was failing with "column reference
   total_cost_usd is ambiguous" because the UPDATE on public.profiles had
   a FROM public.chat_sessions clause and both tables expose that column.
   Persistence of every user message died at this point — sessions were
   created in the DB but had message_count=0 forever. Applied SQL fix
   that qualifies columns with p./s. aliases (production DB updated;
   ALTER FUNCTION run live, not yet codified in a migration file).

2. The free-tier model (nemotron-3-super:free) spent all 5 tool-loop
   turns on hybrid_search calls and never wrote any prose, returning
   content_len=0. Added a forced-synthesis pass in openrouter.ts: when
   the loop exits with empty assembledText but the model did call tools,
   we send ONE final turn with tools omitted from the request payload
   and a user message instructing the model to answer in 3-8 sentences
   citing chunks. openrouterStreamCall now accepts a `withTools` opt
   so the synthesis call can disable tool calling entirely.

Verified end-to-end with the actual user query "O que os astronautas
viram? Quem foi que viu?" on /d/nasa-uap-d6-apollo-17-...:
- content_len: 0 → 947 chars (real synthesis citing Schmitt)
- artifacts: 44 preserved
- assistant message persisted with tool_calls + citations columns
2026-05-18 15:39:46 -03:00
guto
d5f6e6030a fix png-numbering: re-convert 34 zero-based docs + crop fallback
34 of 116 docs were generated with 0-based PNG numbering (p-000.png …
p-008.png) but the Sonnet chunks reference 1-based page numbers in their
YAML frontmatter (page: 9 means the 9th sheet of paper). The /api/crop
handler built p-009.png and got a 500, the browser's Next/Image surfaced
400, and the chunk rendered as a black box on screen.

Fixes:
- web/app/api/crop/route.ts: try p-NNN.png first, fall back to
  p-(NNN-1).png if the 1-based file is missing. Cheap insurance for any
  doc that comes in with the old convention.
- scripts/01-convert-pdfs.sh: previously printf '%03d' "$num" with $num
  starting at 0 (e.g. "008") raised "invalid number" because Bash
  parsed it as octal. Wrap with $((10#$num)) to force decimal — this
  was silently corrupting page sequences and producing gaps like p-001
  ... p-008, p-011 (missing p-009/p-010).
- All 34 affected docs re-converted from PDFs with the patched script;
  every directory now has continuous 1-based PNGs.
- /processing/png/ rsync'd to VPS, web redeployed.

Smoke: /api/crop?doc=doc-341-…&page=9&… now returns 200 image/webp
instead of 500. Tested in browser: chunk c0026 (diagram, p9) renders
the real engineering diagram.
2026-05-18 11:45:40 -03:00
guto
7d13f93393 ship: synthesize 158 entities, AG-UI artifacts, chat persistence, auth flow
Fase 3 onda 2 — entity synthesis at scale:
- scripts/synthesize/20_entity_summary.py: queries DB for entities with
  total_mentions ≥ threshold + top-K verbatim chunk snippets via
  entity_mentions JOIN, prompts Sonnet (Holmes-Watson voice, bilingual),
  writes narrative_summary EN+PT-BR + summary_status=synthesized.
  Ran on 187 candidates (mentions ≥ 20) → 158 OK · 1 err · 29 skipped (no
  snippets). Combined with anchor curation: 20 curated + 158 synthesized
  = 178 entities with real narrative (vs 0 a day ago).

Fase 4 — chat with typed artifacts + persistence:
- lib/chat/agui.ts: AG-UI v1 typed Artifact union (citation, crop_image,
  entity_card, evidence_card, hypothesis_card, case_card, navigation_offer)
  alongside the existing event types.
- lib/chat/tools.ts + openrouter.ts: hybrid_search emits up to 6
  citation + crop_image artifacts per query. Provider collects them and
  returns in done.artifacts so the route can persist.
- api/sessions/[id]/messages: persist artifacts to messages.citations.
- components/chat-bubble.tsx: ArtifactCard renders inline cards (citation,
  crop_image, entity_card, navigation_offer) for streamed and persisted
  messages. activeId now persisted in localStorage so navigation between
  pages keeps the same conversation. New sessions are lazy (only when user
  has zero). loadMessages hydrates tools + artifacts from server. CRUD UI:
  rename (✎) + archive (🗑) buttons per session in the list.

Home search:
- doc-list-filters: input now fires hybrid_search (rerank=0 for speed)
  in parallel with the local title filter; chunk hits render above the doc
  grid with snippet + score + classification.
- api/search/hybrid: accept ?rerank=0 to skip the cross-encoder (1.3s vs 60s).

Auth flow:
- infra: SMTP_HOST=mail.spacemail.com:587 + DMARC published; mail now lands
  in inbox. GOTRUE_MAILER_AUTOCONFIRM=false (real email verification).
- kong.yml: proxy /auth/callback on api.disclosure.top → web:3000 so PKCE
  email links don't 404 at the gateway.
- web/app/auth/callback: handle both ?code= (OAuth) and ?token=&type=
  (PKCE); redirect to the public site host before verifyOtp so the session
  cookie lands on the right domain.

Audit deliverables:
- .nirvana/outputs/disclosure-bureau/.../systems-atelier/: 5 docs (code
  analysis, tech debt, discovery brief, system arch, 5 ADRs) authored by
  sa-principal that produced this roadmap. Kept in-tree for traceability.
2026-05-18 03:52:59 -03:00
guto
a35e1115fb ban gemini: quarantine 10 SDK-using scripts under scripts/_archived-gemini/
User reported a ~ Google Gemini bill against an expected ~ budget.
Permanent ban on Gemini for this project — all LLM inference goes through
claude -p --model sonnet (CLAUDE_CODE_OAUTH_TOKEN) or OPENROUTER_API_KEY as
fallback. The 10 scripts in scripts/ that import google-genai/-generativeai
are moved under scripts/_archived-gemini/ with a DO-NOT-RUN README. Project
memory updated (feedback-no-gemini-ever.md) and the older reference note that
recommended gemini-3.1-pro-preview is revoked.
2026-05-18 02:23:13 -03:00
guto
4459bd17e4 phase-0: kill stubs, ship 20 curated anchor events, configure SMTP
- scripts/03-dedup-entities.py: stop emitting placeholder narrative ("Stub. Will
  be enriched in Phase 7"); write summary_status=none + null fields instead.
- scripts/maintain/41_strip_stubs.py: idempotent migration that cleaned the
  22,096 entity .md files (now zero stub strings in wiki/).
- scripts/synthesize/01_anchor_events.py: curated 20 anchor UAP events
  (Roswell, Nimitz Tic-Tac, Phoenix Lights, Operação Prato, AATIP, etc.) with
  bilingual Holmes-Watson narrative via claude -p --model sonnet
  (CLAUDE_CODE_OAUTH_TOKEN). All summary_status=curated, confidence=high.
- web/api/timeline + timeline-view: filter narrative-less events by default,
  render "curado" badge for hand-vetted ones, drop the date display alone.
- CLAUDE-schema-full.md: document the summary_status enum and the four states.
- docker-compose.yml: SMTP_HOST=mail.spacemail.com configured;
  GOTRUE_MAILER_AUTOCONFIRM flipped to false (real email confirmation working).
- .nirvana/outputs/.../systems-atelier/: 5 deliverables of the architecture
  audit that produced this roadmap.
2026-05-18 00:44:17 -03:00
guto
19d0678e55 baseline: Disclosure Bureau pipeline + Next.js UI + Supabase stack 2026-05-17 22:44:36 -03:00