- TD#8 hybrid.ts: rerank_strategy {always|when_top_k_gt|never} + threshold
(default skips rerank for top_k ≤ 15; chat tool uses threshold 10)
- O11 vision.ts + tools.ts: analyze_image_region tool — sharp-crops the
bbox, claude CLI reads the temp PNG via Read tool, Sonnet vision answers
- TD#12 /graph: SigmaGraph replaces ForceGraphCanvas; react-force-graph-2d
uninstalled (-37 transitive deps); force-graph-canvas.tsx deleted
- TD#27 messages/route.ts gatherContext slice sizes via CTX_* env vars
- TD#22 tests/rag/: golden.yaml (15 queries) + run.py (Recall@k + MRR +
negative-pass rate) + baseline.json + CI job in .forgejo/workflows/ci.yml
- docs/adrs/: ADR-001..005 published from systems-atelier deliverables
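
Sketch of the rerank gate from TD#8, for illustration only. `RerankConfig`, `DEFAULT_RERANK` and `shouldRerank` are hypothetical names, not the actual hybrid.ts exports, and the sketch assumes `when_top_k_gt` means "rerank only when top_k is strictly greater than the threshold":

```typescript
// Hypothetical names; only the three strategy values and the thresholds
// (15 by default, 10 for the chat tool) come from the commit message.
type RerankStrategy = "always" | "when_top_k_gt" | "never";

interface RerankConfig {
  strategy: RerankStrategy;
  threshold: number; // only consulted by "when_top_k_gt"
}

// Default behaviour: skip rerank for top_k <= 15.
const DEFAULT_RERANK: RerankConfig = { strategy: "when_top_k_gt", threshold: 15 };

// Assumed setting for the chat tool: same strategy, threshold 10.
const CHAT_TOOL_RERANK: RerankConfig = { strategy: "when_top_k_gt", threshold: 10 };

function shouldRerank(topK: number, cfg: RerankConfig = DEFAULT_RERANK): boolean {
  switch (cfg.strategy) {
    case "always":
      return true;
    case "never":
      return false;
    case "when_top_k_gt":
      return topK > cfg.threshold;
  }
}
```

Under these assumptions a top_k=5 query skips rerank with either threshold, and passing `{ strategy: "always", threshold: 0 }` forces it on demand.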
Verified live on disclosure.top: top_k=5 path skips rerank (6.7s embed-only,
was 12-15s with rerank); rerank=always still available on demand.
First RAG baseline: Recall@5 = 0.2083, MRR = 0.25, Negative pass = 1.0.
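
For reference, a minimal TypeScript sketch of the standard Recall@k and MRR definitions. The real harness is run.py (Python) and is not reproduced here; `GoldenQuery`, `recallAtK`, `reciprocalRank` and `mean` are hypothetical names, and the negative-pass rate is left out because its exact definition is not stated in this commit:

```typescript
// Hypothetical shape of one evaluated query: the expected chunk ids and the
// ids the retriever actually returned, in ranked order.
interface GoldenQuery {
  relevant: string[];
  retrieved: string[];
}

// Recall@k: fraction of the relevant ids that appear in the top k results.
function recallAtK(q: GoldenQuery, k: number): number {
  if (q.relevant.length === 0) return 0;
  const topK = new Set(q.retrieved.slice(0, k));
  const hits = q.relevant.filter((id) => topK.has(id)).length;
  return hits / q.relevant.length;
}

// Reciprocal rank: 1 / (1-based position of the first relevant id), or 0 if none.
function reciprocalRank(q: GoldenQuery): number {
  const relevant = new Set(q.relevant);
  const idx = q.retrieved.findIndex((id) => relevant.has(id));
  return idx === -1 ? 0 : 1 / (idx + 1);
}

function mean(xs: number[]): number {
  return xs.length === 0 ? 0 : xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Aggregate over a query set: mean Recall@k and MRR.
function evaluate(queries: GoldenQuery[], k: number): { recall: number; mrr: number } {
  return {
    recall: mean(queries.map((q) => recallAtK(q, k))),
    mrr: mean(queries.map(reciprocalRank)),
  };
}
```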
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
adr: ADR-002
title: Materialize the Investigation Bureau as a background agentic runtime, with 8 detectives as roles
status: accepted
date: 2026-05-23
deciders: sa-principal, sa-architecture-lead, sa-security-engineer (veto power)
project: disclosure-bureau
---

## Context

The "The Disclosure Bureau" branding promises "8 investigative detectives" (Holmes/Poirot/Dupin/Locard/Schneier/Tetlock/Taleb + the collective Investigation Bureau) with chain of custody, a hypothesis tournament, and residual uncertainty calculation. Today the codebase has:

- A `case/` filesystem with 6 folders: 5 empty, 1 holding 2 gap files.
- A chat with 12 read-only tools and a grandiose system prompt.
- AG-UI artifact types `evidence_card`, `hypothesis_card`, `case_card` that are defined but **never emitted**.
- Zero detectives implemented as distinct operational entities.

The brief asks for a "REAL AI detective bureau, not a decorative one". That requires **producing** new data (`case/evidence/*.md`, `case/hypotheses/*.md`, `public.{hypotheses,evidence,contradictions,...}`) through **specialized agents** with **structured, auditable outputs**.

Boundary decision: does the agentic layer live **in parallel** to the synchronous chat, or is it **part of it**?

## Options considered

1. **Part of the synchronous chat.** Extend the system prompt and add write tools. The user waits 30s-5min synchronously.
2. **Background worker.** The chat dispatches a job; the user polls; an asynchronous worker produces the outputs.
3. **No agentic layer.** Keep only the read-only chat. Rework the branding to reflect reality ("AI-assisted wiki").
4. **CronJob batch only.** No user trigger. Investigations run in a daily background batch.

## Decision

**Option 2: background worker, separate from the synchronous chat.**

Specifically:

1. **New `investigator-runtime` container** (Bun + TS) in docker-compose, isolated from Next.js.
2. **8 detectives + chief-detective as distinct roles**: each one is a `claude -p` subprocess with its own `prompts/<detective>.md` and its own toolset (a subset of the common tools + 1-2 role-specific writers).
3. **Postgres LISTEN/NOTIFY** as the queue (`public.investigation_jobs` + a NOTIFY trigger).
4. **Job triggers** (section 6 of agentic-layer-spec): daily cron, ingest event, user via chat (`request_investigation` tool), manual admin.
5. **Gated write tools** (8 gates from sa-security-engineer; see `security-audit-report.md` section 5).
6. **Per-job budget cap:** $1.00 hard ceiling (Sonnet via OAuth Max 20x preferred; paid Anthropic API as fallback).
7. **Outputs validated before commit:** schema check + lint (`04-lint.py --dry-run`) on the generated markdown.

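The role model in item 2 could be captured in a small registry. The sketch below is a minimal illustration, not the runtime's actual code: the tool names (`search_corpus`, `read_document`, `list_cases`, `write_hypothesis`, `write_evidence`), the helpers `defineRole` and `allowedTools`, and the pairing of detectives with writers are all hypothetical. Only the `prompts/<detective>.md` layout and the "1-2 role-specific writers" rule come from the decision above.

```typescript
// Hypothetical common read-only tools shared across roles.
const COMMON_TOOLS = ["search_corpus", "read_document", "list_cases"] as const;
type CommonTool = (typeof COMMON_TOOLS)[number];

interface DetectiveRole {
  name: string;
  promptPath: string; // prompts/<detective>.md, as in the decision
  readTools: CommonTool[]; // subset of the common tools
  writeTools: string[]; // 1-2 role-specific writers
}

// Build a role and enforce the "1-2 writers" rule at definition time.
function defineRole(
  name: string,
  readTools: CommonTool[],
  writeTools: string[],
): DetectiveRole {
  if (writeTools.length < 1 || writeTools.length > 2) {
    throw new Error(`${name}: expected 1-2 writer tools, got ${writeTools.length}`);
  }
  return { name, promptPath: `prompts/${name}.md`, readTools, writeTools };
}

// Full toolset a role's subprocess would be allowed to call.
function allowedTools(role: DetectiveRole): string[] {
  return [...role.readTools, ...role.writeTools];
}

// Illustrative assignments only; the real mapping lives in the spec.
const holmes = defineRole("holmes", ["search_corpus", "read_document"], ["write_hypothesis"]);
const locard = defineRole("locard", ["read_document"], ["write_evidence"]);
```

Keeping the writer list explicit per role makes it easy to audit which role can produce which kind of output.
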
**Not adopted:**

- Option 1 (extend the synchronous chat): a user cannot wait 5 minutes in a chat. It breaks the mental model.
- Option 3 (no agentic layer): departs from the explicit brief. Branding without an engine is dishonest.
- Option 4 (cron only): having no user trigger makes for poor UX. Keep cron as a complement, not the only trigger.

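The $1.00 per-job hard ceiling from the decision could be enforced with a small guard that every billable step goes through. This is a sketch under assumptions: `JobBudget` and `BudgetExceededError` are hypothetical names, and how a step's cost is measured (token usage, API billing) is left to the caller.

```typescript
// Amounts are integer cents so repeated additions do not accumulate float error.
const JOB_BUDGET_CENTS = 100; // the $1.00 hard ceiling

class BudgetExceededError extends Error {
  constructor(attempted: number, cap: number) {
    super(`job budget exceeded: ${attempted} > ${cap} cents`);
    this.name = "BudgetExceededError";
  }
}

class JobBudget {
  private spentCents = 0;
  private readonly capCents: number;

  constructor(capCents: number = JOB_BUDGET_CENTS) {
    this.capCents = capCents;
  }

  // Record a cost; refuse (and leave the total unchanged) if it would cross the cap.
  charge(cents: number): void {
    if (!Number.isInteger(cents) || cents < 0) {
      throw new RangeError("cents must be a non-negative integer");
    }
    const next = this.spentCents + cents;
    if (next > this.capCents) {
      throw new BudgetExceededError(next, this.capCents);
    }
    this.spentCents = next;
  }

  get remainingCents(): number {
    return this.capCents - this.spentCents;
  }
}
```

A worker would create one `JobBudget` per job and treat `BudgetExceededError` as a signal to stop and mark the job as over budget rather than retry.
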
## Consequences

**Positive:**
- The "8 detectives" branding gets a real engine behind it.
- The synchronous chat stays fast (read-only LLM + 12 tools).
- Deep investigations generate new, persistent, auditable data: a "for real" Investigation Bureau.
- Cold-case revival, contradiction detection, residual uncertainty: features with viral potential.

**Negative:**
- A new container means a new operational surface (~150MB extra RAM; orchestrator + state).
- Heavier use of the Claude Max 20x quota (already monitored by `/api/admin/batch`).
- The schema grows by 7 new tables (hypotheses, evidence, contradictions, witnesses, gaps, residual_uncertainties, investigation_jobs).
- Risk of hallucination in the writers, mitigated by the sa-security gates (schema + ref validation).

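One way to read the "schema + ref validation" mitigation: before a writer's output is committed, check its shape and confirm that every source it cites already exists. The sketch below is an assumption-laden illustration; the field names `id`, `claim`, `sourceRefs` and the function `validateEvidenceDraft` are hypothetical, and the actual gates are the ones defined in `security-audit-report.md` section 5.

```typescript
// Returns a list of problems; an empty list means the draft may be committed.
function validateEvidenceDraft(draft: unknown, knownSources: Set<string>): string[] {
  if (typeof draft !== "object" || draft === null) {
    return ["draft is not an object"];
  }
  const d = draft as Record<string, unknown>;
  const errors: string[] = [];

  // Schema check: required string fields must be present and non-empty.
  for (const field of ["id", "claim"]) {
    const value = d[field];
    if (typeof value !== "string" || value === "") {
      errors.push(`${field} must be a non-empty string`);
    }
  }

  // Ref check: every cited source must already exist in the corpus.
  const refs = d["sourceRefs"];
  if (!Array.isArray(refs) || refs.length === 0) {
    errors.push("sourceRefs must be a non-empty array");
  } else {
    for (const ref of refs) {
      if (typeof ref !== "string" || !knownSources.has(ref)) {
        errors.push(`unknown source ref: ${String(ref)}`);
      }
    }
  }
  return errors;
}
```

Rejecting drafts whose references do not resolve is what turns a hallucinated citation into a failed job instead of a persisted record.
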
## Verification

- Full spec in `agentic-layer-spec.md`.
- Incremental bring-up plan in 10 sub-steps, W3.1-W3.10.
- 8 gates documented for the sa-security veto.
- Expected costs of $30-110/month (table in section 11 of the spec).
- Golden hypothesis set as the quality bar (W3.10).

## References

- `agentic-layer-spec.md`
- `ai-opportunity-map.md` O1-O5
- `security-audit-report.md` section 5
- Anthropic Claude Code OAuth pattern (project memory)