disclosure-bureau

Author	SHA1	Message	Date
Luiz Gustavo	54a26f8db8	W3 followup: drop _FOR_WEB token, fix claude CLI args + writer guards, BIGSERIAL grants	2026-05-23 21:05:35 -03:00

    Some checks failed:
    - CI / Web — typecheck + lint + build (push): Failing after 46s
    - CI / Scripts — Python smoke (push): Failing after 4s
    - CI / Web — npm audit (push): Failing after 34s
    - CI / Retrieval — golden set (Recall@5 + MRR) (push): Failing after 4s

    Token consolidation:
    - docker-compose web service now reads ${CLAUDE_CODE_OAUTH_TOKEN} directly, drop the W1-F8 CLAUDE_CODE_OAUTH_TOKEN_FOR_WEB indirection (user feedback: one var name, no _FOR_WEB suffix).

    investigator-runtime claude.ts:
    - --system-prompt silently dropped by CLI v2.1.150 for multi-KB prompts; inline the system content into the user prompt with a separator (mirrors scripts/reextract/run.py pattern).
    - Multi-line prompts via positional -- broke ("Input must be provided …"); pipe via stdin instead.
    - --allowedTools "" is rejected; when no tools wanted, omit it and explicitly --disallowedTools the writer/reader set so the model can't reach for any.

    investigator-runtime locard.ts:
    - Log the raw response (first 600 chars) to container stderr — saved hours of debugging when the writer rejected.
    - Grade fallback: when Locard omits `grade` but provides custody_steps, infer the highest grade that fits (≥3 → A, ≥2 → B, ≥1 → C).

    investigator-runtime write_evidence.ts:
    - Filter related_hypotheses entries with empty/null hypothesis_id silently (Locard sometimes emits [{}] when it knows no link yet) instead of failing the whole write.

    Migration 0006_investigator_serial_sequences.sql:
    - BIGSERIAL on the 7 investigation tables created auto-sequences (evidence_evidence_pk_seq etc) that 0004 forgot to GRANT to the investigator role. Without those grants every INSERT failed with "permission denied for sequence …". Grant USAGE/SELECT/UPDATE on each auto-seq.

    Verified live: Locard wrote E-0002 + E-0003 from real Sandia chunks (green fireball Feb 1949; cobalt particle analysis). Grade B, confidence high, custody chain of 3 steps with honest gaps. Cost $0.09 for both, ~70s wall.

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
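The two writer guards described in 54a26f8db8 (the grade fallback and the related_hypotheses filter) can be sketched roughly as below. This is a minimal illustration, not the code in locard.ts or write_evidence.ts: the names `Grade`, `RelatedHypothesis`, `inferGrade`, `resolveGrade`, and `filterRelatedHypotheses` are hypothetical, and returning `null` for zero custody steps is an assumption the commit message does not cover.

```typescript
// Hypothetical shapes; the real types are not shown in the commit log.
type Grade = "A" | "B" | "C";

interface RelatedHypothesis {
  hypothesis_id?: string | null;
}

// Highest grade that fits the number of custody steps: >=3 -> A, >=2 -> B, >=1 -> C.
// Assumption: with no custody steps there is nothing to infer, so return null.
function inferGrade(custodySteps: readonly unknown[]): Grade | null {
  const n = custodySteps.length;
  if (n >= 3) return "A";
  if (n >= 2) return "B";
  if (n >= 1) return "C";
  return null;
}

// The fallback only applies when the model omitted `grade`.
function resolveGrade(
  grade: Grade | undefined,
  custodySteps: readonly unknown[],
): Grade | null {
  return grade ?? inferGrade(custodySteps);
}

// Drop entries whose hypothesis_id is missing, null, or empty (for example [{}])
// instead of rejecting the whole write.
function filterRelatedHypotheses(
  entries: readonly RelatedHypothesis[],
): RelatedHypothesis[] {
  return entries.filter(
    (e) => typeof e.hypothesis_id === "string" && e.hypothesis_id.length > 0,
  );
}
```

An explicit grade always wins over the fallback, which is consistent with the verified run above: a grade B record with a 3-step custody chain.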
Luiz Gustavo	189a771cbe	W3.1-W3.4: Investigation Bureau foundation — migrations, runtime, Locard	2026-05-23 19:49:33 -03:00

    Some checks failed:
    - CI / Web — typecheck + lint + build (push): Failing after 38s
    - CI / Scripts — Python smoke (push): Failing after 3s
    - CI / Web — npm audit (push): Failing after 33s
    - CI / Retrieval — golden set (Recall@5 + MRR) (push): Failing after 4s

    Migrations:
    - 0004_investigation_bureau.sql: 7 new tables (investigation_jobs + evidence, hypotheses, contradictions, witnesses, gaps, residual_uncertainties), id sequences, pg_notify trigger on investigation_jobs, RLS read-only public, investigator role with least-privilege grants (no service_role).
    - 0005_investigator_write_policies.sql: fixup adding RLS INSERT/UPDATE policies bound to investigator + service_role + postgres (RLS with only a SELECT policy was silently blocking the worker's claim UPDATE).

    investigator-runtime/ (new Bun + TS container):
    - src/main.ts: LISTEN/NOTIFY poller, claim-with-SKIP-LOCKED, drain pool, healthcheck file, graceful SIGTERM shutdown.
    - src/orchestrator.ts: chief-detective dispatch (evidence_chain → Locard). Marks job failed when all per-item outputs error; surfaces first errors.
    - src/lib/{env,pg,audit,ids,claude}.ts: typed config (gate #8), pool + dedicated LISTEN client, NDJSON audit, sequence allocator (E-NNNN etc), claude -p subprocess with quota detection (api_error_status=429).
    - src/tools/write_evidence.ts: schema-validate (grade A/B/C custody steps), resolve chunk_pk via FK, verify verbatim_excerpt actually appears in chunk content, INSERT + render case/evidence/E-NNNN.md + audit.
    - src/detectives/locard.ts: load chunk → call Claude with locard.md system prompt → parse strict JSON → call writeEvidence locally.
    - Dockerfile installs `claude` CLI (OAuth) at build time.

    Compose:
    - new `investigator` service builds from investigator-runtime/, connects with low-privilege role, mounts case/ RW and wiki/+raw/ RO, 512m mem cap.

    Web:
    - /api/admin/investigate/test (POST+GET) gated by middleware (W0-F1). POST creates a job, GET polls status. For W3.6 it becomes the chat tool.

    End-to-end smoke: INSERT job → pg_notify → claim → Locard dispatch → claude subprocess invoked. Auth works (CLI v2.1.150). Currently quota exhausted (weekly limit · resets 3pm UTC) — pipeline catches the typed isQuota error, marks job failed with surfaced reason. Architecture proven; quota reset enables real evidence creation.

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
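Two of the smaller pieces listed in 189a771cbe, the verbatim-excerpt check in write_evidence.ts and the E-NNNN identifier format, might look something like the sketch below. The function names are hypothetical, the exact-substring comparison is an assumption (the real check may normalize whitespace or casing), and the real allocator draws its numbers from database sequences rather than taking them as arguments.

```typescript
// Assumption: an exact, case-sensitive substring match. An empty excerpt is
// rejected because every string trivially contains the empty string.
function excerptAppearsIn(chunkContent: string, verbatimExcerpt: string): boolean {
  if (verbatimExcerpt.length === 0) return false;
  return chunkContent.includes(verbatimExcerpt);
}

// Format a sequence number as PREFIX-NNNN (zero-padded to at least four digits).
// Numbers above 9999 keep all their digits rather than being truncated.
function formatSequenceId(prefix: string, n: number): string {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("sequence number must be a positive integer");
  }
  return `${prefix}-${String(n).padStart(4, "0")}`;
}

// Example: formatSequenceId("E", 2) returns "E-0002".
```

Rejecting a record whose excerpt is not found in the source chunk keeps the model from paraphrasing or inventing a quotation, which is presumably why the check runs before the INSERT.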
Luiz Gustavo	55cac8a395	W0+W1+W1.2: security hardening, observability, autocomplete, glitchtip, forgejo CI	2026-05-23 18:18:42 -03:00

    Some checks failed:
    - CI / Web — typecheck + lint + build (push): Failing after 1m30s
    - CI / Scripts — Python smoke (push): Failing after 32s
    - CI / Web — npm audit (push): Failing after 37s

    W0 — security hardening (5 fixes verified live on disclosure.top)
    - middleware: gate /api/admin/* same as /admin/* (F1)
    - imgproxy: tighten LOCAL_FILESYSTEM_ROOT from / to /var/lib/storage (F2)
    - studio: real basic-auth label (bcrypt hash, middleware reference) (F3)
    - relations: ENABLE ROW LEVEL SECURITY + public SELECT policy (F4)
    - migration 0003: fold is_searchable + hybrid_search update into canonical (TD#2)

    W1 — observability + resilience + autocomplete
    - studio: HOSTNAME=0.0.0.0 so Next.js binds on loopback for healthcheck
    - compose: PG_POOL_MAX=20, CLAUDE_CODE_OAUTH_TOKEN gated by separate env
    - claude-code.ts: subprocess timeout configurable (CLAUDE_CODE_TIMEOUT_MS)
    - openrouter.ts: retry with exponential backoff + Retry-After + in-memory circuit breaker (promotes FALLBACK after CB_THRESHOLD failures)
    - lib/logger.ts: pino logger (NDJSON prod / pretty dev) + withRequest helper
    - middleware: mints correlation_id, stamps x-correlation-id response header, emits structured http_request log per /api/* call
    - messages/route.ts: switch to structured logger
    - 60_meili_index.py: push documents + chunks into Meilisearch
    - /api/search/autocomplete: parallel meili search (docs + chunks), 5-8ms p50
    - search-autocomplete.tsx: debounced dropdown wired into search-panel

    W1.2 — Glitchtip + Forgejo self-hosted
    - compose: glitchtip-redis + glitchtip-web + glitchtip-worker (v4.2)
    - compose: forgejo + forgejo-runner (server v9, runner v6) with group_add=988
    - @sentry/nextjs SDK wired (instrumentation.ts + sentry.{client,server}.config.ts)
    - /api/admin/throw smoke endpoint (gated by W0-F1 middleware)
    - Synthetic event ingestion verified at glitchtip.disclosure.top
    - forgejo.disclosure.top up, repo discadmin/disclosure-bureau created, runner registered (labels: ubuntu-latest, docker)
    - .forgejo/workflows/ci.yml: typecheck + lint + build + npm audit + python syntax + compose validation

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
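The openrouter.ts resilience item in 55cac8a395 combines three ideas: an exponential backoff schedule, honoring a server-supplied Retry-After, and a circuit breaker that switches to a fallback after repeated failures. A rough sketch under stated assumptions follows. The names `retryDelayMs`, `CircuitBreaker`, and `pickModel`, the 500 ms base, and the 30 s cap are all hypothetical; only the numeric-seconds form of Retry-After is handled; and counting consecutive failures (reset on success) is one plausible reading of "after CB_THRESHOLD failures".

```typescript
// Delay before retry number `attempt` (0-based). A numeric Retry-After value
// (in seconds) takes precedence over the exponential schedule.
function retryDelayMs(
  attempt: number,
  retryAfter: string | null,
  baseMs: number = 500,
  capMs: number = 30_000,
): number {
  if (retryAfter !== null && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds) && seconds >= 0) {
      return Math.min(seconds * 1000, capMs);
    }
  }
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// In-memory breaker: opens once `threshold` consecutive failures are recorded.
class CircuitBreaker {
  private failures = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  recordFailure(): void {
    this.failures += 1;
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  isOpen(): boolean {
    return this.failures >= this.threshold;
  }
}

// While the breaker is open, route requests to the fallback model.
function pickModel(breaker: CircuitBreaker, primary: string, fallback: string): string {
  return breaker.isOpen() ? fallback : primary;
}
```

Because the breaker state lives in process memory, it resets on restart and is not shared between replicas, which is usually acceptable for a single web container.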
Luiz Gustavo	504b20fa5c	search: gate dense recall by cosine-distance threshold in the RPC	2026-05-21 16:36:56 -03:00

    Root-cause fix for "search returns garbage for absent terms". The hybrid RPC's dense branch always returned its k nearest vectors regardless of distance, so a query for a term not in the corpus (e.g. "varginha") surfaced unrelated chunks. The cross-encoder reranker would filter these but costs 18-62s on CPU — unusable for interactive search.

    Add max_dense_dist (default 0.40) to hybrid_search_chunks: dense neighbours beyond that cosine distance are dropped server-side. Calibrated from measured distances — strong semantic match ~0.12-0.20, no real match ~0.46-0.53. BM25 full-text still matches literal terms; the reranker becomes opt-in refinement.

    Verified live: varginha/abducao → 0, disco voador/roswell → relevant, all <1s.

    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
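The gating rule from 504b20fa5c can be illustrated outside the database as follows. This is only a sketch of the logic: the actual filter runs server-side inside the hybrid_search_chunks RPC, and the `DenseHit` shape and `gateDenseHits` name are hypothetical. The 0.40 default and the sample distances come from the commit message.

```typescript
// Hypothetical shape for one dense (vector) neighbour.
interface DenseHit {
  chunkId: string;
  distance: number; // cosine distance to the query embedding; smaller is closer
}

// Keep only neighbours within maxDenseDist. Without this gate, a k-nearest
// search always returns k rows, even when nothing in the corpus is relevant.
function gateDenseHits(
  hits: readonly DenseHit[],
  maxDenseDist: number = 0.4,
): DenseHit[] {
  return hits.filter((h) => h.distance <= maxDenseDist);
}

// Distances in the ranges reported by the commit: a strong match (~0.12-0.20)
// survives, a non-match (~0.46-0.53) is dropped.
const sample: DenseHit[] = [
  { chunkId: "strong-match", distance: 0.15 },
  { chunkId: "no-real-match", distance: 0.5 },
];
const kept = gateDenseHits(sample);
```

The default sits in the gap between the two measured clusters, so moderate shifts in the distance distribution should not flip the outcome for clear matches or clear misses.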
guto	19d0678e55	baseline: Disclosure Bureau pipeline + Next.js UI + Supabase stack	2026-05-17 22:44:36 -03:00

5 commits