W2: rerank opt-in, analyze_image_region tool, RAG eval, graph cleanup, ADRs
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 40s
CI / Scripts — Python smoke (push) Failing after 3s
CI / Web — npm audit (push) Failing after 29s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 3s

- TD#8 hybrid.ts: rerank_strategy {always|when_top_k_gt|never} + threshold
  (default skips rerank for top_k ≤ 15; chat tool uses threshold 10)
- O11 vision.ts + tools.ts: analyze_image_region tool — sharp-crops the
  bbox, claude CLI reads the temp PNG via Read tool, Sonnet vision answers
- TD#12 /graph: SigmaGraph replaces ForceGraphCanvas; react-force-graph-2d
  uninstalled (-37 transitive deps); force-graph-canvas.tsx deleted
- TD#27 messages/route.ts gatherContext slice sizes via CTX_* env vars
- TD#22 tests/rag/: golden.yaml (15 queries) + run.py (Recall@k + MRR +
  negative-pass rate) + baseline.json + CI job in .forgejo/workflows/ci.yml
- docs/adrs/: ADR-001..005 published from systems-atelier deliverables

Verified live on disclosure.top: top_k=5 path skips rerank (6.7s embed-only,
was 12-15s with rerank); rerank=always still available on demand.
First RAG baseline: Recall@5 = 0.2083, MRR = 0.25, Negative pass = 1.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Luiz Gustavo 2026-05-23 19:20:09 -03:00
parent 55cac8a395
commit eaf282c535
20 changed files with 1246 additions and 1025 deletions


@ -68,3 +68,16 @@ jobs:
    steps:
      - uses: actions/checkout@v4
      - run: npm audit --production --omit=dev --audit-level=high || echo "audit findings — see job output"
  rag-eval:
    name: Retrieval — golden set (Recall@5 + MRR)
    runs-on: ubuntu-latest
    container:
      image: python:3.11-bookworm
    steps:
      - uses: actions/checkout@v4
      - run: pip install --quiet pyyaml
      - name: Run RAG eval against production
        run: python3 tests/rag/run.py --url https://disclosure.top --top-k 5 --rerank never
        env:
          MAX_RECALL_DROP: "0.05"


@ -4,6 +4,100 @@ All notable changes to this project go here. Newest on top.
## [Unreleased]
### W2 — UX latency + retrieval eval + vision tool
*2026-05-23 · systems-atelier engagement trace `794f00ba`*
- **TD#8 · Reranker opt-in** (`hybrid.ts`). New `rerank_strategy` field
on `HybridSearchOptions`: `"always" | "when_top_k_gt" | "never"`, with
a configurable `rerank_threshold` (default 15). Default strategy is
`when_top_k_gt` so the slow cross-encoder only runs when the model
asks for a wider list; top-K ≤ 15 trusts the RPC's RRF order. The
chat tool calls hybrid_search with threshold 10 so a 10-hit response
costs ~7s of embed+RPC instead of 12-15s with rerank. `/api/search/hybrid`
exposes the strategy via `?rerank=always|never|when_top_k_gt` plus
`?rerank_threshold=N`. Back-compat `?rerank=0` still means "never".
- **O11 · `analyze_image_region` chat tool** (`vision.ts`, `tools.ts`).
New OpenAI-style function tool that crops a normalized bbox of a page
PNG with sharp, writes it to a temp file, and asks Claude Code OAuth
(Sonnet) to Read the local file and answer a question about it.
Schema: `{doc_id, page, bbox{x,y,w,h}, question, context?}`. Emits a
`crop_image` artifact for the UI alongside the textual answer. Cost
budget: ~$0.005-0.02 per call, paid against the user's Max 20x
quota. Timeout configurable via `VISION_TIMEOUT_MS` (default 120s).
- **TD#12 · `react-force-graph-2d` removed**. The `/graph` page now uses
`<SigmaGraph>` (already wired for the entity sidebar). One graph
library is enough. `web/components/force-graph-canvas.tsx` deleted;
`npm uninstall` removed 37 transitive deps.
- **TD#27 · Context truncation per type configurable**
(`messages/route.ts`). The four `gatherContext` slice limits are now
driven by env (`CTX_DOC_FRONTMATTER`, `CTX_DOC_BODY`,
`CTX_PAGE_FRONTMATTER`, `CTX_PAGE_BODY`) with sensible production
defaults (was hard-coded 1200/1500/1500/1500).
- **TD#22 · Golden RAG eval** (`tests/rag/`). New harness:
`golden.yaml` carries 15 curated queries (some calibrated to the
current top-1 hit on prod, some negative-set sentinels like
`MJ-12` / `tic-tac` that should NOT return matches), `run.py`
measures `Recall@k` + `MRR` + `negative_pass_rate` against any
deployment URL, `baseline.json` is the gate threshold, `last_run.json`
is the working report. Default behaviour: fail the run when Recall@5
drops > 0.05 from baseline. CI workflow runs against
`https://disclosure.top` on every push.
- First baseline (rerank=never): **Recall@5 = 0.2083, MRR = 0.25,
Negative pass = 1.0**. Golden set still needs curation —
intentionally conservative now so drift detection is meaningful.
- **ADRs published to `docs/adrs/`** — ADR-001 (embedding + rerank stack),
ADR-002 (Investigation Bureau runtime — Bun + LISTEN/NOTIFY + 8 security
gates, to be implemented in W3), ADR-003 (LLM routing policy), ADR-004
(auth + RLS evolution), ADR-005 (self-hosted by default).
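The TD#8 rule above can be sketched as a small decision function. This is
an illustrative Python rendering, not the code in `hybrid.ts`: the name
`should_rerank` and its signature are assumptions, while the three
strategy values and the default threshold of 15 come from the entry.

```python
from typing import Literal

RerankStrategy = Literal["always", "when_top_k_gt", "never"]


def should_rerank(
    top_k: int,
    strategy: RerankStrategy = "when_top_k_gt",
    threshold: int = 15,
) -> bool:
    """Decide whether the cross-encoder should run (hypothetical helper)."""
    if strategy == "always":
        return True
    if strategy == "never":
        return False
    # "when_top_k_gt": rerank only when the caller asks for a list wider
    # than the threshold; otherwise trust the RRF order from the RPC.
    return top_k > threshold
```

With the chat tool's threshold of 10, `should_rerank(10, threshold=10)` is
false, which corresponds to the 10-hit response that skips rerank.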
#### Verified on `disclosure.top` (2026-05-23T21:55Z):
- `/api/search/hybrid?q=Roswell&top_k=5` → HTTP 200 in 6.7s (embed-only,
rerank skipped per default strategy)
- `/api/search/hybrid?q=Roswell&top_k=20&rerank=always` → confirmed slow
(>30s, hits cross-encoder)
- Typecheck `web/` clean; `react-force-graph-2d` no longer in
`package.json`
- `tests/rag/run.py` against prod → 15 queries answered, baseline written
- 5 ADRs committed under `docs/adrs/`
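For the `analyze_image_region` entry above, the step from a normalized
bbox to a pixel crop can be sketched as follows. This assumes
"normalized" means fractions of the page width and height in [0, 1],
which the changelog does not spell out; `bbox_to_pixels` is a
hypothetical name. The returned keys mirror the region object that
sharp's `extract` accepts.

```python
def bbox_to_pixels(bbox: dict, page_w: int, page_h: int) -> dict:
    """Convert a normalized {x, y, w, h} bbox into an integer pixel region."""
    # Clamp the origin to the page, then clamp the size to what remains.
    x = min(max(bbox["x"], 0.0), 1.0)
    y = min(max(bbox["y"], 0.0), 1.0)
    w = min(max(bbox["w"], 0.0), 1.0 - x)
    h = min(max(bbox["h"], 0.0), 1.0 - y)

    left = min(round(x * page_w), page_w - 1)
    top = min(round(y * page_h), page_h - 1)
    # Keep at least one pixel and never run past the page edge.
    width = max(1, min(round(w * page_w), page_w - left))
    height = max(1, min(round(h * page_h), page_h - top))
    return {"left": left, "top": top, "width": width, "height": height}
```

Clamping before scaling keeps a slightly out-of-range bbox from the model
from producing a crop region that falls outside the image.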
### W1.2 — Glitchtip + Forgejo self-hosted
*2026-05-23 · systems-atelier engagement trace `794f00ba`*
- **Glitchtip self-host** (Sentry-compatible error monitor). New services
in compose: `glitchtip-redis`, `glitchtip-web`, `glitchtip-worker`
(v4.2, uWSGI on 8080). Database `glitchtip` carved out of
`disclosure-db` as a separate role/DB. Bootstrap done via Django
`manage.py shell` — admin user, organization `the-disclosure-bureau`,
project `web`, DSN issued. SDK wired: `@sentry/nextjs` + `instrumentation.ts`
+ `sentry.{client,server}.config.ts`. `/api/admin/throw` smoke endpoint
is admin-gated. Live at `https://glitchtip.disclosure.top` (TLS issued
by Let's Encrypt via Traefik). Synthetic event verified — POST
`/api/1/store/` → 200 + event id.
- **Forgejo self-host + Actions CI**. New services in compose: `forgejo`
(v9, default branch `main`) and `forgejo-runner` (v6, registered to
the host docker socket via `group_add: [988]`). Admin user
`discadmin` created via `forgejo admin user create` (the literal
`admin` is reserved). Runner bootstrap on first start: registers if
`.runner` absent, then `forgejo-runner daemon`. Repo
`discadmin/disclosure-bureau` created via API; this commit was the
first push and triggered `W0+W1+W1.2: …` workflow at task 1.
- **`.forgejo/workflows/ci.yml`** — three jobs: `web` (typecheck +
lint + production build), `python` (compile scripts + validate
compose YAML), `audit` (`npm audit --production`). Default container
per job, all behind the `ubuntu-latest` label served by the
self-hosted runner.
#### Verified on the stack (2026-05-23T21:19Z):
- `glitchtip.disclosure.top` → HTTP 200, real Let's Encrypt cert,
Glitchtip CSP headers present.
- POST `/api/1/store/` → 200, event_id `cb17d723…` returned.
- `forgejo.disclosure.top` → HTTP 200, Forgejo welcome page.
- Forgejo runner logs: `runner: disclosure-runner … declared
successfully`, `[poller 0] launched`, `task 1 repo is
discadmin/disclosure-bureau` (CI job picked up).
- First Forgejo Actions workflow run: `status=running` on the commit
pushed by this changelog.
### W1 — Observability + resilience + Meili autocomplete
*2026-05-23 · systems-atelier engagement trace `794f00ba`*


@ -0,0 +1,56 @@
---
adr: ADR-001
title: Embedding and reranker stack - keep BGE-M3 self-hosted on CPU; make the reranker opt-in; re-evaluate GPU in 6 months
status: accepted
date: 2026-05-23
deciders: sa-principal, sa-architecture-lead, sa-platform-lead
project: disclosure-bureau
---
## Context
The current retrieval pipeline (BGE-M3 dense + BM25 + RRF + BGE-Reranker-v2-M3 cross-encoder) delivers acceptable `Recall@5` over 20,935 chunks, but the reranker (a cross-encoder on CPU) takes **5-8s** per query with 100 candidates. This is the **dominant UX bottleneck** in synchronous chat.
Known alternatives:
1. **Keep the status quo**: CPU rerank always on.
2. **Skip rerank**: rely only on the RRF order from the `hybrid_search_chunks` RPC. Faster, but loses precision on ambiguous queries.
3. **Switch to ColBERT late interaction** (PyTerrier / RAGatouille): reranking is built into recall, with no second model.
4. **GPU for the reranker**: VPS with a small GPU (Hetzner GPU-1: ~$20/month). BGE-Reranker on GPU drops from 5-8s to <1s.
5. **External managed service (Voyage, Cohere)**: violates the "self-hosted by default" policy.
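For reference, reciprocal rank fusion (RRF) merges the dense and BM25 rankings by summing `1 / (k + rank)` for each chunk across the lists. A minimal sketch follows. The constant `k = 60` is the commonly used default and is an assumption here, since the value used inside `hybrid_search_chunks` is not stated; `rrf_fuse` is a hypothetical name.

```python
from __future__ import annotations

from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk ids with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every id it contains.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


dense = ["a", "b", "c"]
bm25 = ["b", "d", "a"]
fused = rrf_fuse([dense, bm25])
```

Chunks that appear in both lists ("a" and "b") outrank chunks that appear in only one, which is why skipping the cross-encoder still yields a usable order for small `top_k`.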
## Decision
1. **Keep BGE-M3 self-hosted on CPU for embeddings** (no change; 150-300ms warm is fine).
2. **Make the reranker opt-in per call**:
   - Default: skip rerank when `top_k <= 10` (the RPC's RRF order is sufficient for the top results).
   - Apply rerank when `top_k > 10` OR when `rerank=1` is passed explicitly to the API.
3. **Re-evaluate GPU in 6 months** (Q4 2026) against this criterion: rerank p95 latency > 4s or user base > 1000 DAU. If both hold, provision GPU-1.
4. **ColBERT as plan B**: catalogue it in `infra/research/` but do not switch now (risk of quality regression).
5. **Stay on multilingual BGE-M3**: do not switch to an English-only model even if it is faster, because the corpus is bilingual.
## Consequences
**Positive:**
- Chat p95 latency drops from ~10s to ~6s (estimate based on removing rerank for top_k <= 10).
- Cost stays at $0/month beyond the VPS (no GPU upgrade).
- The reranker remains available for complex queries (`top_k=20+`).
**Negative:**
- When a user explicitly asks for "top 20 results", latency goes back to 8s.
- Recall@5 may drop marginally on very ambiguous queries. See the W2 eval harness.
**Accepted trade-off:** average UX improves; worst-case UX stays the same. The W2 eval harness will catch real regressions.
## Verification
- `tests/rag/golden.yaml` measures Recall@5 before/after.
- Sentry timing histogram `chat_query_latency_ms` p95 before/after.
- Manual smoke test: 5 queries cobrindo cada `top_k` bucket.
## References
- `infra/RETRIEVAL.md` (performance budget).
- `web/lib/retrieval/hybrid.ts` (code).
- BGE-M3 paper: arxiv:2402.03216.
- ColBERT-late-interaction: arxiv:2004.12832.


@ -0,0 +1,77 @@
---
adr: ADR-002
title: Materialize the Investigation Bureau - background agentic runtime, 8 detectives as roles
status: accepted
date: 2026-05-23
deciders: sa-principal, sa-architecture-lead, sa-security-engineer (veto power)
project: disclosure-bureau
---
## Context
The "The Disclosure Bureau" branding promises "8 investigative detectives" (Holmes/Poirot/Dupin/Locard/Schneier/Tetlock/Taleb + the collective Investigation Bureau) with chain of custody, hypothesis tournament, and residual uncertainty calculation. Today the codebase has:
- A `case/` filesystem with 6 folders: 5 empty, 1 with 2 gap files.
- Chat with 12 read-only tools and a grandiose system prompt.
- AG-UI artifact types `evidence_card`, `hypothesis_card`, `case_card` defined but **not emitted**.
- Zero detectives implemented as distinct operational entities.
The brief asks for a "REAL AI detective bureau, not a decorative one". That requires **production** of new data (`case/evidence/*.md`, `case/hypotheses/*.md`, `public.{hypotheses,evidence,contradictions,...}`) by **specialized agents** with **structured, auditable outputs**.
Boundary decision: does the agentic layer live **in parallel** with the synchronous chat, or is it **part of it**?
## Options considered
1. **Part of the synchronous chat.** Extend the system prompt and add write tools. The user waits 30s-5min synchronously.
2. **Background worker.** Chat dispatches a job; the user polls; an asynchronous worker produces the outputs.
3. **No agentic layer**: keep only the read-only chat. Rework the branding to reflect reality ("AI-assisted wiki").
4. **CronJob batch only**. No user trigger. Investigations run in the background daily.
## Decision
**Option 2: background worker, separate from the synchronous chat.**
Specifically:
1. **New `investigator-runtime` container** (Bun + TS) in docker-compose, isolated from Next.js.
2. **8 detectives + chief-detective as distinct roles**: each one is a `claude -p` subprocess with its own `prompts/<detective>.md` and a distinct toolset (a subset of the common tools + 1-2 specific writers).
3. **Postgres LISTEN/NOTIFY** as the queue (`public.investigation_jobs` + NOTIFY trigger).
4. **Job triggers** (sec. 6 of agentic-layer-spec): daily cron, ingest event, user via chat (`request_investigation` tool), manual admin.
5. **Gated write tools** (8 gates from sa-security-engineer; see `security-audit-report.md` section 5).
6. **Budget cap per job:** $1.00 hard ceiling (Sonnet via OAuth Max 20x preferred; paid Anthropic API as fallback).
7. **Outputs validated before commit:** schema check + lint (`04-lint.py --dry-run`) on the generated markdown.
**Not adopted:**
- Option 1 (extend the synchronous chat): a user cannot wait 5 min in a chat. It breaks the mental model.
- Option 3 (no agentic layer): departs from the explicit brief. Branding without an engine is dishonest.
- Option 4 (cron only): having no user trigger makes for poor UX. Keep cron as a complement, not the only trigger.
## Consequences
**Positive:**
- The "8 detectives" branding gains a real engine.
- Synchronous chat stays fast (read-only LLM + 12 tools).
- Deep investigations generate new, persistent, auditable data: a "real" Investigation Bureau.
- Cold-case revival, contradiction detection, residual uncertainty: features that go viral.
**Negative:**
- New container = new operational surface (~150MB extra RAM; orchestrator + state).
- Claude Max 20x quota is used more heavily (already monitored by `/api/admin/batch`).
- The schema grows: 7 new tables (hypotheses, evidence, contradictions, witnesses, gaps, residual_uncertainties, investigation_jobs).
- Risk of hallucination in writers, mitigated by the sa-security gates (schema + reference validation).
## Verification
- Full spec in `agentic-layer-spec.md`.
- Incremental bring-up plan in 10 sub-steps, W3.1-W3.10.
- 8 gates documented for the sa-security veto.
- Expected costs $30-110/month (table in section 11 of the spec).
- Golden hypothesis set as the quality bar (W3.10).
## References
- `agentic-layer-spec.md`
- `ai-opportunity-map.md` O1-O5
- `security-audit-report.md` section 5
- Anthropic Claude Code OAuth pattern (project memory)


@ -0,0 +1,72 @@
---
adr: ADR-003
title: LLM routing policy - Claude Sonnet 4.6 via OAuth for asynchronous production; OpenRouter free for public chat
status: accepted
date: 2026-05-23
deciders: sa-principal, sa-platform-lead
project: disclosure-bureau
---
## Context
Three LLM paths in the project:
1. **Vision pipeline (ingest)**: Sonnet 4.6 via the Anthropic SDK + prompt caching + the `pdf-2025-03-04` beta. One-off cost of ~$409 initially.
2. **Synchronous chat (user-facing)**: currently OpenRouter free (`deepseek/deepseek-v4-flash:free` primary, `nvidia/nemotron-3-super-120b-a12b:free` fallback). Tool calling works.
3. **Investigation Bureau (W3+, to be implemented)**: proposed as Sonnet 4.6 via OAuth Max 20x.
Existing constraints:
- **Gemini banned by policy** ([project memory](file:///Users/guto/.claude/projects/-Users-guto-ufo/memory/MEMORY.md)). It was billed at ~$200 versus the $10 expected.
- **OAuth Max 20x quota**: 5h rolling window, default 4 workers ([memory](file:///Users/guto/.claude/projects/-Users-guto-ufo/memory/MEMORY.md)).
- **Self-hosted by default**: managed services are prohibited without a written exception (ADR-005).
## Decision
**Routing by channel and by load:**
| Channel | Provider | Model | Rationale |
|---|---|---|---|
| Vision pipeline (background) | Anthropic SDK direct | Sonnet 4.6 | Valid API key; cache + beta header; does not use OAuth quota |
| Public synchronous chat | OpenRouter | deepseek-v4-flash:free, nemotron fallback | Free tier; tool calling; anonymous user |
| Authenticated synchronous chat (future premium) | OpenRouter or direct Anthropic API | configurable | Paid tier when justified |
| Investigation Bureau (W3+) | **Claude Code OAuth (subprocess `claude -p`)** | Sonnet 4.6 (model: sonnet) | Max 20x quota; budget cap of $1.00 per job; preferred over the paid API |
| Investigation Bureau (overflow) | Anthropic SDK paid | Sonnet 4.6 or Haiku | When OAuth quota is saturated AND `BUDGET_PAID_ALLOWED=true` |
| Internal LLM judge (calibration / contradiction detection) | Claude OAuth or OpenRouter | Haiku (cheap, fast) | Simple task, batch |
**Fallback policy:**
1. The primary tries. On 429/quota -> 1 retry with backoff.
2. If the retry fails, the fallback depends on the channel:
   - Synchronous chat: switch from the OpenRouter primary to the OpenRouter fallback. If both fail, return a user-facing error.
   - Vision/investigator: abort the job and record `investigation_jobs.status='failed'`. Wait for the quota reset (5h).
3. `/api/admin/batch` already monitors 429s + the quota reset ETA.
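The fallback policy above can be sketched as a single helper. This is an illustration under stated assumptions, not the project's code: `QuotaError` stands in for an HTTP 429 or quota response, and `call_with_fallback`, `backoff_s` and the injectable `sleep` are hypothetical names.

```python
import time
from typing import Callable, Optional


class QuotaError(Exception):
    """Stand-in for an HTTP 429 / quota-exhausted response (hypothetical)."""


def call_with_fallback(
    primary: Callable[[], str],
    fallback: Optional[Callable[[], str]] = None,
    backoff_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try primary, retry once after a backoff, then use fallback if given."""
    try:
        return primary()
    except QuotaError:
        sleep(backoff_s)  # step 1: exactly one retry, after a backoff
    try:
        return primary()
    except QuotaError:
        if fallback is None:
            # Vision/investigator path: no second provider, so the caller
            # aborts the job and marks it as failed.
            raise
        # Chat path: switch to the fallback model; if this also fails,
        # the error propagates and the caller shows it to the user.
        return fallback()
```

Passing `fallback=None` models the vision/investigator channel, where the only recovery is to wait for the quota window to reset.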
**Exceptions:**
- Gemini is **banned** (policy). Do not reinstate it even if a new version looks attractive.
- The paid Anthropic API key lives ONLY in a separate environment variable (`ANTHROPIC_API_KEY_PAID`) and requires an explicit `--paid` flag.
## Consequences
**Positive:**
- The Investigation Bureau can run 99% of the time on OAuth quota (free for the project).
- Public synchronous chat stays at $0/req.
- Clear separation between "under quota" and "paid", so spend is easy to monitor.
**Negative:**
- The OpenRouter free tier has rate limits + variable latency. Mitigated in W1 (retry + circuit breaker).
- Quota saturation on Sonnet OAuth when many workers ingest while the investigator runs in parallel. Mitigation: run the daily investigator cron at 03-05 UTC, when ingest is low.
## Verification
- Sentry logs show `model_used` on every chat call.
- `/api/admin/batch` shows `quota_state` + `quota_resume_eta_minutes`.
- `investigation_jobs.outputs` records the `model` for each turn.
- Budget alert at $150/month on the Anthropic API if traffic falls to the paid fallback.
## References
- `feedback-no-gemini-ever.md` (memory)
- `user-plan-max-20x.md` (memory)
- `web/lib/chat/{index,openrouter,claude-code}.ts`


@ -0,0 +1,106 @@
---
adr: ADR-004
title: Auth model evolution - keep Supabase GoTrue + public-read RLS; new case/* tables writable only by the investigator role
status: accepted
date: 2026-05-23
deciders: sa-principal, sa-security-engineer, sa-platform-lead
project: disclosure-bureau
---
## Context
Current auth model:
- **GoTrue email magic-link**, Spacemail SMTP, `MAILER_AUTOCONFIRM=false`.
- **profiles.role ∈ {user, admin, suspended}**, with `budget_cap_usd`, `daily_quota`.
- **Sessions with `is_public=true`** are readable by anon (shareable via a share_token UUID).
- **Public-read RLS** on chunks/entities/documents/entity_mentions/relations.
- **service_role key in the web container's env** (needed so usage_events INSERTs bypass RLS, plus admin tasks). F5 of the security audit.
W3 boundary decision: who writes to the new tables (`hypotheses`, `evidence`, `contradictions`, `witnesses`, `gaps`, `residual_uncertainties`, `investigation_jobs`)?
## Options
1. **service_role writes everything** (same pattern as today). Investigator-runtime uses service_role.
2. **Intermediate `investigator` role** with minimal permissions. Investigator-runtime uses this role; web does not.
3. **Investigator uses Postgres directly without RLS**: bypass from the connection level.
## Decision
**Option 2: granular `investigator` role.**
Create a minimal Postgres role:
```sql
CREATE ROLE investigator LOGIN NOINHERIT PASSWORD :'INVESTIGATOR_PASSWORD';
-- Reads (same as anon already has, but explicit)
GRANT SELECT ON public.chunks, public.entities, public.entity_mentions,
public.relations, public.documents TO investigator;
-- Writes on the case/* tables
GRANT INSERT, UPDATE ON public.hypotheses, public.evidence, public.contradictions,
public.witnesses, public.gaps, public.residual_uncertainties,
public.investigation_jobs TO investigator;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO investigator;
-- Deny everything else
REVOKE ALL ON public.profiles, public.chat_sessions, public.messages,
public.usage_events FROM investigator;
REVOKE ALL ON SCHEMA auth FROM investigator;
```
And:
- The investigator-runtime container uses `postgres://investigator:${INVESTIGATOR_PASSWORD}@db:5432/postgres` (NEVER service_role).
- The web service stays on `postgres://postgres:...` (broad access is needed for createServiceClient at specific points).
- In W5, **evaluate reducing service_role use in web** by creating a similar `web_service` role with even fewer permissions than `postgres`.
**RLS on the new tables:**
```sql
ALTER TABLE public.hypotheses ENABLE ROW LEVEL SECURITY;
CREATE POLICY hypotheses_public_read ON public.hypotheses FOR SELECT USING (TRUE);
GRANT SELECT ON public.hypotheses TO anon, authenticated;
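-- The investigator role writes via direct INSERT/UPDATE (RLS does not apply to writes by the owner role; it has the GRANT).
-- Caveat: Postgres exempts only the table owner or a BYPASSRLS role; any other role also needs an INSERT/UPDATE policy.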
```
Same pattern for all 7 new tables.
**Changes to existing tables:**
- `relations` gains RLS (F4). It has none today.
- `messages.citations JSONB` stays, but add a column `hypothesis_id REFERENCES public.hypotheses(hypothesis_id)` if chat cites generated hypotheses.
**Public sessions:**
The F6 decision (moderation_state) is out of scope for this ADR because it is a product/legal matter, not auth. RLS will still accommodate the extension:
```sql
CREATE POLICY public_sessions_select ON public.chat_sessions
FOR SELECT USING (
auth.uid() = user_id OR
(is_public = TRUE AND COALESCE(moderation_state, 'approved') IN ('approved'))
);
```
## Consequences
**Positive:**
- Smaller blast radius for investigator-runtime (an RCE in the container does not reach auth.users / profiles).
- Granular auditability: every INSERT into hypotheses/evidence carries `created_by = '<detective>@detective'` + the Postgres role `investigator`.
- Adherence to least privilege.
**Negative:**
- 1 additional password to manage (`INVESTIGATOR_PASSWORD`). Mitigation: docker secrets in W5.
- The investigator cannot read `messages` (intentional). If a detective ever needs the user's session context, it requires an explicit hand-off (e.g. chat passing `userTurn` in the job `payload`).
## Verification
- `\du investigator` confirms the role + permissions.
- Manual test: the investigator role tries `SELECT FROM auth.users` -> permission denied error.
- Manual test: the investigator role runs `INSERT INTO hypotheses` -> success.
- service_role usage count is auditable: grep `createServiceClient` in `web/`.
## References
- `security-audit-report.md` F4, F5, F6
- `agentic-layer-spec.md` section 3.2 (container env)
- `infra/disclosure-stack/init-db.sql` (current roles bootstrap)


@ -0,0 +1,90 @@
---
adr: ADR-005
title: Self-hosted by default - managed SaaS requires a written exception; policy on current exceptions
status: accepted
date: 2026-05-23
deciders: sa-principal, sa-platform-lead
project: disclosure-bureau
---
## Context
The `systems-atelier` manifest states:
> "Open source and self-hosted on a VPS by default. SaaS/managed requires a written, justified exception."
The Disclosure Bureau mostly implements this policy but keeps external dependencies that need to be **explicit** so they do not become hidden technical debt:
| External dependency | Category | Justification |
|---|---|---|
| OpenRouter (synchronous chat) | LLM proxy | Free tier, 0 USD; tool calling works; no local OSS alternative with the same quality + multi-model |
| Anthropic Claude (vision + investigator) | LLM | Sonnet 4.6 vision for PDFs + investigator agent. No OSS equivalent in vision quality. |
| Claude Code OAuth Max 20x | LLM (quota) | Quota at no extra cost, already paid for. Same provider, alternative channel. |
| Spacemail (SMTP) | Transactional email | Magic-link delivery. Self-hosted SES is overkill for low volume. |
| Let's Encrypt (TLS) | PKI | Industry standard. CertResolver via Traefik. |
| war.gov source PDFs | Data source | This is the corpus itself; it cannot be substituted. |
| GitHub (deploy artifacts) | Code host | Can migrate to self-hosted Forgejo/Gitea (Q3 2026 review). |
| Hetzner / similar VPS | IaaS | Physical infrastructure; unavoidable. |
What is **NOT** on the list (self-hosting adopted):
- Postgres (Supabase): self-hosted.
- GoTrue, PostgREST, Realtime, Storage, Imgproxy, Studio, Kong: self-hosted.
- BGE-M3 + Reranker: self-hosted (embed-service).
- Meilisearch: self-hosted.
- Traefik: self-hosted.
- Sentry: **W1 decision**, evaluate self-hosted Glitchtip vs the Sentry cloud free tier.
## Decision
**Formal policy:**
1. **Default: self-host.** Any new service/dependency starts by evaluating self-hostable OSS.
2. **Written exception** in an ADR when:
   - There is no OSS alternative of acceptable quality (clear criterion), OR
   - The cost of operating self-hosted > direct SaaS cost × 12 months, OR
   - There is a specific legal/compliance constraint.
3. **Current exceptions** are listed in the table above. Each one must be:
   - Free of project-critical state (for example: data can be exported at any time; there is no lock-in).
   - Replaceable in <4 weeks of work (documented exit plan).
4. **Prohibited exceptions:**
   - **Gemini** (project-specific policy; see the `feedback-no-gemini-ever.md` memory).
   - Managed databases (RDS, Supabase Cloud paid): the corpus must be 100% under our control.
   - Paid LLM gateways beyond the OpenRouter free tier without a specific ADR.
   - CDNs with state (Vercel KV, Cloudflare D1): violates "data sovereignty".
5. **Periodically:** review the exception list (every semester). Check whether the balance has shifted (e.g. has Glitchtip matured? Is Forgejo viable?).
## Consequences
**Positive:**
- Data sovereignty over the declassified corpus (the project's central motive).
- Low recurring cost (~10 EUR/month VPS + $0 OpenRouter free + $30-110/month agentic LLM).
- No dependence on vendor business decisions (Vercel changing tiers, Supabase Cloud raising prices).
**Negative:**
- More operations work (10 containers on the VPS, monitored manually).
- Security updates are on us (Trivy in CI mitigates).
- Backup/DR is our problem (W5+ add a backup strategy).
## Verification
- `docker-compose.yml` lists all self-hosted data-plane services. Confirmed.
- The exception list in this ADR. Confirm quarterly.
- Documented exit plan for each exception:
   - OpenRouter -> Mistral.ai self-host (>= 70B local with GPU) in 2-4 weeks.
   - Anthropic -> local Llama (LOW quality today; 2027+).
   - Spacemail SMTP -> self-hosted Postfix in 1 day.
   - GitHub -> self-hosted Forgejo in 1 week.
## Future review triggers
- Chat volume > 10k/day: evaluate moving to self-hosted Mistral/Groq.
- Anthropic Max 20x quota constantly saturated: evaluate adding a paid API key OR moving to a local model.
- Sentry cloud free tier overflows: install Glitchtip immediately.
- Security audit asks for extra zero trust: provision a dedicated VPS for investigator-runtime on a separate network.
## References
- `systems-atelier/business.yaml` (business manifest)
- Project memory: `feedback-no-gemini-ever.md`, `user-plan-max-20x.md`
- `infra/disclosure-stack/docker-compose.yml`

tests/rag/baseline.json

@ -0,0 +1,8 @@
{
"url": "https://disclosure.top",
"top_k": 5,
"rerank": "never",
"recall_at_k": 0.2083,
"mrr": 0.25,
"negative_pass_rate": 1.0
}
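The gate that consumes this baseline can be sketched as a single comparison. This is a minimal sketch, assuming the drop is measured in absolute Recall@5 points (as `MAX_RECALL_DROP: "0.05"` in the CI job suggests); `passes_gate` is a hypothetical name and the actual logic in `run.py` is not shown in this excerpt.

```python
import json

# Subset of the baseline fields above, inlined so the sketch is self-contained.
BASELINE_JSON = '{"recall_at_k": 0.2083, "mrr": 0.25, "negative_pass_rate": 1.0}'


def passes_gate(current_recall: float, baseline: dict, max_drop: float = 0.05) -> bool:
    """Return True when Recall@k has not dropped more than max_drop (absolute)."""
    drop = baseline["recall_at_k"] - current_recall
    return drop <= max_drop


baseline = json.loads(BASELINE_JSON)
unchanged_ok = passes_gate(0.2083, baseline)   # no drop at all
regressed_ok = passes_gate(0.15, baseline)     # drop of about 0.058
```

An improvement yields a negative drop and always passes, so the gate only ever blocks regressions.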

tests/rag/golden.yaml

@ -0,0 +1,117 @@
# Golden retrieval set — Disclosure Bureau RAG eval
#
# Each entry is a question paired with the chunks that MUST appear in the
# top-K results from `hybrid_search_chunks`. The harness in run.py measures
# Recall@5 and MRR against this set; the W2 CI gate blocks PRs that regress
# Recall@5 by more than 0.05 from the baseline in baseline.json.
#
# Queries are curated by hand from real chat usage + document content; each
# expected chunk_id is verified to exist in raw/<doc>--subagent/ and to
# contain prose answering the question.
#
# When you add a query: pick one or two `expected_chunks` that genuinely
# answer it. Don't over-stuff — Recall@5 with 10 expected chunks is meaningless.
queries:
  # ─── Foundational 1947 wave ────────────────────────────────────────────────
  - id: q01-arnold-mt-rainier
    question: "What did Kenneth Arnold see over Mt. Rainier in June 1947?"
    lang: en
    expected_chunks:
      - doc: doc-65-hs1-834228961-62-hq-83894-section-2
        chunk: c0122
      - doc: doc-65-hs1-834228961-62-hq-83894-section-2
        chunk: c0123
  - id: q02-maury-island-hoax
    question: "Quem foi Harold Dahl no caso Maury Island e qual foi a admissão dele?"
    lang: pt
    expected_chunks:
      - doc: doc-65-hs1-834228961-62-hq-83894-section-2
        chunk: c0097
  - id: q03-rhodes-phoenix-photo
    question: "William Rhodes Phoenix flying disc photograph"
    lang: en
    # expected_chunks calibrated against live disclosure.top response
    # (top-1 hit at the time of the W2 baseline). Refine when content moves.
    expected_chunks:
      - {doc: doc-65-hs1-834228961-62-hq-83894-section-1, chunk: c1279}
  # ─── 1948-1950 incident summaries ──────────────────────────────────────────
  - id: q04-chiles-whitted
    question: "Chiles Whitted Eastern Air Lines cigar shaped object"
    lang: en
    expected_chunks:
      - {doc: doc-38-143685-box7-incident-summaries-101-172, chunk: c2122}
  - id: q05-gorman-dogfight
    question: "Gorman dogfight Fargo North Dakota"
    lang: en
    expected_chunks: [] # currently 0 hits on prod — flag for golden curation
  - id: q06-mantell-crash
    question: "Mantell chase Kentucky 1948"
    lang: en
    expected_chunks:
      - {doc: doc-38-143685-box7-incident-summaries-1-100, chunk: c1149}
  # ─── Release 02 docs ───────────────────────────────────────────────────────
  - id: q07-sandia-1948-1950
    question: "UAP reportado em Sandia Base entre 1948 e 1950"
    lang: pt
    expected_chunks:
      - doc: dow-uap-d017-general-correspondence-of-sandia
        chunk: c0001
  - id: q08-pajarito-astronomers
    question: "Pajarito astronomers invitation 1986 New Mexico"
    lang: en
    expected_chunks:
      - doc: doc-65-hs1-834228961-62-hq-83894-section-5
        chunk: c0001 # will fall back to text match; verified in section-5
  - id: q09-james-tuck-correspondence
    question: "James Tuck Los Alamos correspondence flying saucers"
    lang: en
    expected_chunks:
      - {doc: doe-uap-d002-jamestuck-correspondence, chunk: c0600}
  # ─── COMETA + ODNI USPER + Apollo ──────────────────────────────────────────
  - id: q10-cometa-report
    question: "COMETA report extraterrestrial hypothesis French military"
    lang: en
    expected_chunks:
      - {doc: doc-255-413270-ufo-s-and-defense-what-should-we-prepare-for, chunk: c0024}
  - id: q11-apollo-17-flash
    question: "Apollo 17 lunar surface flash Grimaldi"
    lang: en
    expected_chunks:
      - doc: nasa-uap-d2-apollo-17-transcript-1972
        chunk: c0057
  - id: q12-usper-narrative
    question: "USPER narrative senior USIC official 2025"
    lang: en
    expected_chunks:
      - doc: odni-uap-d001-usper-narrative-senior-usic
        chunk: c0001
  # ─── Generic UFO physics + politics ────────────────────────────────────────
  - id: q13-uss-nimitz-tic-tac
    question: "Nimitz tic-tac 2004"
    lang: en
    expected_chunks: [] # negative: not in corpus, expect zero hits OR low-conf
  - id: q14-mj-12
    question: "MJ-12 majestic twelve"
    lang: en
    expected_chunks: [] # negative
  - id: q15-roswell
    question: "Roswell New Mexico"
    lang: en
    expected_chunks:
      - {doc: doc-65-hs1-834228961-62-hq-83894-section-1, chunk: c0527}
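
For one positive entry in this set, the two per-query metrics can be sketched as below. This is a sketch under assumptions, not the code in `run.py`: chunks are keyed as `"<doc>/<chunk>"` strings, the reciprocal rank is truncated at `k`, and the function names and the ranked list are illustrative only.

```python
from __future__ import annotations


def recall_at_k(expected: set[str], ranked: list[str], k: int = 5) -> float:
    """Fraction of expected chunks that appear in the top-k results."""
    return len(expected & set(ranked[:k])) / len(expected)


def reciprocal_rank(expected: set[str], ranked: list[str], k: int = 5) -> float:
    """1 / rank of the first expected chunk within the top-k, else 0."""
    for rank, chunk_key in enumerate(ranked[:k], start=1):
        if chunk_key in expected:
            return 1.0 / rank
    return 0.0


# Hypothetical result list for q01: one of the two expected chunks at rank 1.
doc = "doc-65-hs1-834228961-62-hq-83894-section-2"
expected = {f"{doc}/c0122", f"{doc}/c0123"}
ranked = [f"{doc}/c0122", f"{doc}/c0400", f"{doc}/c0007"]
q01_recall = recall_at_k(expected, ranked)
q01_rr = reciprocal_rank(expected, ranked)
```

Entries with an empty `expected_chunks` list have no defined recall, which is why they need to be scored separately as negatives.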

tests/rag/last_run.json

@ -0,0 +1,128 @@
{
"k": 5,
"n_queries": 15,
"n_positive": 12,
"n_negative": 3,
"recall_at_k": 0.2083,
"mrr": 0.25,
"negative_pass_rate": 1.0,
"per_query": [
{
"id": "q01-arnold-mt-rainier",
"negative": false,
"recall_at_k": 0.5,
"mrr": 1.0,
"n_expected": 2,
"n_present": 1
},
{
"id": "q02-maury-island-hoax",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
},
{
"id": "q03-rhodes-phoenix-photo",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
},
{
"id": "q04-chiles-whitted",
"negative": false,
"recall_at_k": 1.0,
"mrr": 1.0,
"n_expected": 1,
"n_present": 1
},
{
"id": "q05-gorman-dogfight",
"negative": true,
"ok": true,
"n_hits": 0
},
{
"id": "q06-mantell-crash",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
},
{
"id": "q07-sandia-1948-1950",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
},
{
"id": "q08-pajarito-astronomers",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
},
{
"id": "q09-james-tuck-correspondence",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
},
{
"id": "q10-cometa-report",
"negative": false,
"recall_at_k": 1.0,
"mrr": 1.0,
"n_expected": 1,
"n_present": 1
},
{
"id": "q11-apollo-17-flash",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
},
{
"id": "q12-usper-narrative",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
},
{
"id": "q13-uss-nimitz-tic-tac",
"negative": true,
"ok": true,
"n_hits": 0
},
{
"id": "q14-mj-12",
"negative": true,
"ok": true,
"n_hits": 0
},
{
"id": "q15-roswell",
"negative": false,
"recall_at_k": 0.0,
"mrr": 0.0,
"n_expected": 1,
"n_present": 0
}
],
"url": "https://disclosure.top",
"top_k": 5,
"rerank": "never"
}
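The aggregate figures in this report can be re-derived from its `per_query` entries: Recall@5 and MRR are plain means over the 12 positive queries, and the negative pass rate is the share of negative queries with `ok: true`. A small check, with the per-query values copied by hand from the JSON above (the short keys are abbreviations of the query ids):

```python
# (recall_at_k, mrr) for each positive query in last_run.json.
POSITIVE = {
    "q01": (0.5, 1.0), "q02": (0.0, 0.0), "q03": (0.0, 0.0), "q04": (1.0, 1.0),
    "q06": (0.0, 0.0), "q07": (0.0, 0.0), "q08": (0.0, 0.0), "q09": (0.0, 0.0),
    "q10": (1.0, 1.0), "q11": (0.0, 0.0), "q12": (0.0, 0.0), "q15": (0.0, 0.0),
}
# "ok" flag for each negative query in last_run.json.
NEGATIVE_OK = {"q05": True, "q13": True, "q14": True}

recall_at_k = round(sum(r for r, _ in POSITIVE.values()) / len(POSITIVE), 4)
mrr = round(sum(m for _, m in POSITIVE.values()) / len(POSITIVE), 4)
negative_pass_rate = round(sum(NEGATIVE_OK.values()) / len(NEGATIVE_OK), 4)

print(recall_at_k, mrr, negative_pass_rate)
```

Only three of the twelve positive queries (q01, q04, q10) contribute to either mean, which is why both aggregates are low.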

tests/rag/run.py Normal file

@@ -0,0 +1,178 @@
#!/usr/bin/env python3
"""
tests/rag/run.py: golden RAG evaluation.

Reads tests/rag/golden.yaml (curated query -> expected chunk set) and queries
the /api/search/hybrid endpoint of the deployment given by --url (production
by default, or a local dev server). Computes Recall@k and MRR per query
(k = --top-k, default 5), plus aggregates. Writes a JSON report to
tests/rag/last_run.json and compares it with tests/rag/baseline.json.

CI gate: if Recall@k drops more than --max-recall-drop (default 0.05) from
the baseline, exit 1.

Usage:
    python3 tests/rag/run.py                              # uses prod URL
    python3 tests/rag/run.py --url http://localhost:3000  # local dev
    python3 tests/rag/run.py --refresh-baseline           # accept current as baseline
    python3 tests/rag/run.py --top-k 10 --rerank never
"""
from __future__ import annotations

import argparse
import json
import sys
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path

try:
    import yaml
except ImportError:
    sys.exit("pip install pyyaml")

ROOT = Path(__file__).resolve().parent
GOLDEN = ROOT / "golden.yaml"
BASELINE = ROOT / "baseline.json"
LAST_RUN = ROOT / "last_run.json"


def search(base_url: str, q: str, lang: str, top_k: int, rerank: str) -> list[dict]:
    """Query /api/search/hybrid and return its hits ([] on any error)."""
    params = {"q": q, "lang": lang, "top_k": str(top_k)}
    # "when_top_k_gt" is the server default, so only the explicit modes are sent.
    if rerank in ("never", "always"):
        params["rerank"] = rerank
    qs = urllib.parse.urlencode(params)
    url = f"{base_url.rstrip('/')}/api/search/hybrid?{qs}"
    try:
        with urllib.request.urlopen(url, timeout=30) as r:
            data = json.loads(r.read())
        return data.get("hits", [])
    except urllib.error.HTTPError as e:
        sys.stderr.write(f"  ! HTTP {e.code} on {q!r}\n")
        return []
    except Exception as e:
        sys.stderr.write(f"  ! {e} on {q!r}\n")
        return []


def evaluate(golden: list[dict], hits_by_id: dict[str, list[dict]], k: int) -> dict:
    """Per-query Recall@k + MRR. Negative-set queries (no expected chunks)
    pass when no hits are returned within the top-k."""
    per_query: list[dict] = []
    pos_recalls: list[float] = []
    pos_mrrs: list[float] = []
    neg_pass = 0
    neg_total = 0
    for q in golden:
        qid = q["id"]
        expected = {(e["doc"], e["chunk"]) for e in (q.get("expected_chunks") or [])}
        hits = hits_by_id.get(qid, [])
        topk = hits[:k]
        if not expected:
            # Negative set: the query has no expected chunks. To keep the
            # metric strict, any hit within the top-k counts as a failure;
            # the query passes only when the top-k is empty.
            neg_total += 1
            ok = len(topk) == 0
            per_query.append({
                "id": qid, "negative": True, "ok": ok,
                "n_hits": len(topk),
            })
            if ok:
                neg_pass += 1
            continue
        present = sum(1 for h in topk if (h.get("doc_id"), h.get("chunk_id")) in expected)
        recall = present / len(expected)
        # Reciprocal rank: 1 / position of the first match (1-indexed), 0 if none.
        rr = 0.0
        for i, h in enumerate(topk, start=1):
            if (h.get("doc_id"), h.get("chunk_id")) in expected:
                rr = 1.0 / i
                break
        per_query.append({
            "id": qid, "negative": False,
            "recall_at_k": round(recall, 4),
            "mrr": round(rr, 4),
            "n_expected": len(expected),
            "n_present": present,
        })
        pos_recalls.append(recall)
        pos_mrrs.append(rr)
    return {
        "k": k,
        "n_queries": len(per_query),
        "n_positive": len(pos_recalls),
        "n_negative": neg_total,
        "recall_at_k": round(sum(pos_recalls) / len(pos_recalls), 4) if pos_recalls else 0.0,
        "mrr": round(sum(pos_mrrs) / len(pos_mrrs), 4) if pos_mrrs else 0.0,
        "negative_pass_rate": round(neg_pass / neg_total, 4) if neg_total else 1.0,
        "per_query": per_query,
    }


def main() -> int:
    ap = argparse.ArgumentParser()
    ap.add_argument("--url", default="https://disclosure.top",
                    help="Base URL of the deployment to evaluate")
    ap.add_argument("--top-k", type=int, default=5)
    ap.add_argument("--rerank", choices=["always", "when_top_k_gt", "never"],
                    default="when_top_k_gt")
    ap.add_argument("--refresh-baseline", action="store_true",
                    help="Overwrite baseline.json with this run (acknowledged regression).")
    ap.add_argument("--max-recall-drop", type=float, default=0.05)
    args = ap.parse_args()

    data = yaml.safe_load(GOLDEN.read_text())
    queries = data["queries"]
    print(f"= running {len(queries)} queries against {args.url} (k={args.top_k}, rerank={args.rerank})")

    hits_by_id = {}
    for q in queries:
        # Request at least 10 hits; evaluate() scores only the first --top-k.
        hits = search(args.url, q["question"], q.get("lang", "pt"),
                      top_k=max(args.top_k, 10), rerank=args.rerank)
        hits_by_id[q["id"]] = hits
        first = hits[0].get("chunk_id") if hits else "-"
        print(f"  {q['id']:24s} -> {len(hits):2d} hits (first={first})")

    report = evaluate(queries, hits_by_id, k=args.top_k)
    report["url"] = args.url
    report["top_k"] = args.top_k
    report["rerank"] = args.rerank
    LAST_RUN.write_text(json.dumps(report, indent=2))
    print(f"\n- wrote {LAST_RUN}")
    print(f"  Recall@{args.top_k}    = {report['recall_at_k']:.4f}")
    print(f"  MRR           = {report['mrr']:.4f}")
    print(f"  Negative pass = {report['negative_pass_rate']:.4f}")

    if args.refresh_baseline:
        BASELINE.write_text(json.dumps({
            "url": args.url, "top_k": args.top_k, "rerank": args.rerank,
            "recall_at_k": report["recall_at_k"],
            "mrr": report["mrr"],
            "negative_pass_rate": report["negative_pass_rate"],
        }, indent=2))
        print(f"\n✓ baseline refreshed: {BASELINE}")
        return 0

    if not BASELINE.exists():
        print("\n! no baseline yet; run with --refresh-baseline to create one")
        return 0

    baseline = json.loads(BASELINE.read_text())
    drop = baseline["recall_at_k"] - report["recall_at_k"]
    print(f"\n  baseline Recall@{args.top_k} = {baseline['recall_at_k']:.4f} (delta {-drop:+.4f})")
    if drop > args.max_recall_drop:
        print(f"\n✗ GATE FAILED: Recall@{args.top_k} dropped {drop:.4f} > {args.max_recall_drop}")
        return 1
    print(f"\n✓ gate passed (drop ≤ {args.max_recall_drop})")
    return 0


if __name__ == "__main__":
    sys.exit(main())
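
To make the scoring in `evaluate()` and the CI gate in `main()` concrete, here is a toy walk-through on invented data. The hit fields `doc_id` and `chunk_id` follow run.py; the document ids, the helper names `recall_and_rr` and `gate_passes`, and the sample recall values are illustrative assumptions, not part of the script.

```python
def recall_and_rr(hits: list[dict], expected: set[tuple[str, str]], k: int) -> tuple[float, float]:
    """Recall@k and reciprocal rank for one positive query, scored as in evaluate()."""
    topk = hits[:k]
    matches = [(h.get("doc_id"), h.get("chunk_id")) in expected for h in topk]
    recall = sum(matches) / len(expected)
    # Reciprocal rank of the first match (1-indexed); 0.0 when nothing matches.
    rr = next((1.0 / i for i, m in enumerate(matches, start=1) if m), 0.0)
    return recall, rr


def gate_passes(baseline_recall: float, current_recall: float, max_drop: float = 0.05) -> bool:
    """The gate fails only when recall falls by more than max_drop."""
    return (baseline_recall - current_recall) <= max_drop


# Invented example: two expected chunks, one of them retrieved at rank 2.
expected = {("doc-a", "c0001"), ("doc-a", "c0002")}
hits = [
    {"doc_id": "doc-b", "chunk_id": "c0009"},
    {"doc_id": "doc-a", "chunk_id": "c0002"},
    {"doc_id": "doc-c", "chunk_id": "c0003"},
]
recall, rr = recall_and_rr(hits, expected, k=5)
print(recall, rr)  # 0.5 0.5

print(gate_passes(0.2083, 0.17))  # True: drop of about 0.04
print(gate_passes(0.2083, 0.15))  # False: drop of about 0.06
```

An improvement over the baseline gives a negative drop, so it always passes the gate.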


@@ -28,10 +28,26 @@ export async function GET(req: NextRequest) {
   const type = u.searchParams.get("type") || null;
   const top_k = Math.min(Number(u.searchParams.get("top_k") ?? 10), 50);
   const ufo_only = u.searchParams.get("ufo_only") === "1";
-  const no_rerank = u.searchParams.get("rerank") === "0";
+  // W2-TD#8: rerank now has three modes. Back-compat: `rerank=0` keeps the
+  // old "never" shortcut. New: `rerank=always|when_top_k_gt|never` and
+  // `rerank_threshold=N` (default 15). Default strategy is `when_top_k_gt`.
+  const rerankParam = u.searchParams.get("rerank");
+  const no_rerank = rerankParam === "0";
+  const rerank_strategy = (
+    rerankParam === "always" || rerankParam === "never" || rerankParam === "when_top_k_gt"
+      ? rerankParam
+      : "when_top_k_gt"
+  ) as "always" | "when_top_k_gt" | "never";
+  const rerank_threshold = Math.max(
+    1,
+    Math.min(50, Number(u.searchParams.get("rerank_threshold") ?? 15)),
+  );

   try {
-    const hits = await hybridSearch({ query: q, lang, doc_id, type, ufo_only, top_k, no_rerank });
+    const hits = await hybridSearch({
+      query: q, lang, doc_id, type, ufo_only, top_k, no_rerank,
+      rerank_strategy, rerank_threshold,
+    });
     return json({
       query: q,
       lang,


@@ -20,14 +20,28 @@ import { streamChat } from "@/lib/chat";
 import { getLocale } from "@/components/locale-toggle";
 import { withRequest } from "@/lib/logger";

+/**
+ * Context size limits per artifact type. Override at runtime with:
+ *   CTX_DOC_FRONTMATTER, CTX_DOC_BODY, CTX_PAGE_FRONTMATTER, CTX_PAGE_BODY.
+ * W2-TD#27: previously hard-coded as 1200 / 1500 / 1500 / 1500. Defaults are
+ * raised (doc body doubles to 3000): the chat prompt has headroom, and richer
+ * context up-front means fewer tool calls to re-fetch what was truncated.
+ */
+const CTX = {
+  doc_frontmatter: Number(process.env.CTX_DOC_FRONTMATTER || 1500),
+  doc_body: Number(process.env.CTX_DOC_BODY || 3000),
+  page_frontmatter: Number(process.env.CTX_PAGE_FRONTMATTER || 1800),
+  page_body: Number(process.env.CTX_PAGE_BODY || 2000),
+} as const;
+
 async function gatherContext(docId: string | null, pageId: string | null): Promise<string> {
   const parts: string[] = [];
   if (docId) {
     const d = await readDocument(docId);
     if (d) {
       parts.push(`# Current document: ${docId}\n` +
-        `Frontmatter: ${JSON.stringify(d.fm, null, 2).slice(0, 1200)}\n\n` +
-        `Body excerpt:\n${d.body.slice(0, 1500)}`);
+        `Frontmatter: ${JSON.stringify(d.fm, null, 2).slice(0, CTX.doc_frontmatter)}\n\n` +
+        `Body excerpt:\n${d.body.slice(0, CTX.doc_body)}`);
     }
   }
   if (pageId) {
@@ -36,8 +50,8 @@ async function gatherContext(docId: string | null, pageId: string | null): Promi
       const md = await readPage(d, p);
       if (md) {
         parts.push(`# Current page: ${pageId}\n` +
-          `Frontmatter: ${JSON.stringify(md.fm, null, 2).slice(0, 1500)}\n\n` +
-          `Body excerpt:\n${md.body.slice(0, 1500)}`);
+          `Frontmatter: ${JSON.stringify(md.fm, null, 2).slice(0, CTX.page_frontmatter)}\n\n` +
+          `Body excerpt:\n${md.body.slice(0, CTX.page_body)}`);
       }
     }
   }


@@ -6,7 +6,9 @@
  */
 import Link from "next/link";
 import { AuthBar } from "@/components/auth-bar";
-import { ForceGraphCanvas } from "@/components/force-graph-canvas";
+// W2-TD#12: switched from react-force-graph-2d to @react-sigma. One graph
+// library is enough; sigma is the one already used by the entity sidebar.
+import { SigmaGraph } from "@/components/sigma-graph";

 export const dynamic = "force-dynamic";

@@ -46,7 +48,7 @@ export default function GraphPage() {
       {/* Fullscreen canvas */}
       <div className="absolute inset-0">
-        <ForceGraphCanvas />
+        <SigmaGraph />
       </div>
     </main>
   );


@ -1,596 +0,0 @@
/**
 * ForceGraphCanvas: D3 force-directed entity graph (Obsidian-style).
 *
 * Layout:
 * - Left sidebar: filters (classes, limit), always visible, outside the canvas
 * - Right side panel: details of the selected entity (when a node is clicked)
 * - Center: fullscreen canvas with nodes colored by class + edges
 *   colored by weight (low=gray, mid=cyan, high=green)
 *
 * Interaction:
 * - HOVER: floating tooltip with name + class + mentions
 * - CLICK: opens the right side panel with entity info + top neighbors + the "abrir página" button
 * - DOUBLE-CLICK: navigates straight to /e/<class>/<id>
 * - Scroll: zoom; drag canvas: pan
 */
"use client";
import { useCallback, useEffect, useMemo, useRef, useState } from "react";
import dynamic from "next/dynamic";
import Link from "next/link";
const ForceGraph2D = dynamic(() => import("react-force-graph-2d"), { ssr: false });
interface RawNode {
entity_pk: number;
entity_class: string;
entity_id: string;
canonical_name: string;
total_mentions: number;
documents_count: number;
}
interface RawLink {
source: number;
target: number;
weight: number;
}
interface GraphNode extends RawNode {
id: number;
x?: number;
y?: number;
vx?: number;
vy?: number;
}
interface GraphLink {
source: number | GraphNode;
target: number | GraphNode;
weight: number;
}
const CLASS_COLOR: Record<string, string> = {
person: "#ff6ec7",
organization: "#ff8a4d",
location: "#3fde6a",
event: "#ffa500",
uap_object: "#ff3344",
vehicle: "#5b9bd5",
operation: "#9b5de5",
concept: "#06d6a0",
};
const CLASS_FOLDER: Record<string, string> = {
person: "people",
organization: "organizations",
location: "locations",
event: "events",
uap_object: "uap-objects",
vehicle: "vehicles",
operation: "operations",
concept: "concepts",
};
const CLASS_LABEL: Record<string, string> = {
person: "Pessoas",
organization: "Organizações",
location: "Locais",
event: "Eventos",
uap_object: "UAP",
vehicle: "Veículos",
operation: "Operações",
concept: "Conceitos",
};
const ALL_CLASSES = ["person", "organization", "location", "event", "uap_object", "vehicle", "operation", "concept"];
/** Color edge by weight tier: visual differentiation by intensity */
function edgeColor(weight: number): string {
if (weight >= 10) return "rgba(0,255,156,0.55)"; // strong: green
if (weight >= 5) return "rgba(127,219,255,0.45)"; // medium: cyan
if (weight >= 3) return "rgba(167,139,250,0.35)"; // mild: purple
return "rgba(127,219,255,0.18)"; // weak: faded cyan
}
function edgeWidth(weight: number): number {
return Math.max(0.5, Math.min(6, Math.log2(weight + 1) * 1.2));
}
interface EntityDetail {
entity_pk: number;
entity_class: string;
entity_id: string;
canonical_name: string;
total_mentions: number;
documents_count: number;
neighbors: Array<{
entity_pk: number;
entity_class: string;
entity_id: string;
canonical_name: string;
weight: number;
total_mentions: number;
}>;
}
const detailCache = new Map<number, EntityDetail>();
interface ForceGraph2DRef {
d3Force: (name: string) => { strength?: (v: number) => unknown; distance?: (v: number) => unknown } | null;
d3ReheatSimulation: () => void;
zoomToFit: (durationMs?: number, paddingPx?: number) => void;
centerAt: (x?: number, y?: number, durationMs?: number) => void;
}
export function ForceGraphCanvas() {
const fgRef = useRef<ForceGraph2DRef | null>(null);
const [nodes, setNodes] = useState<GraphNode[]>([]);
const [links, setLinks] = useState<GraphLink[]>([]);
const [selectedClasses, setSelectedClasses] = useState<Set<string>>(new Set(ALL_CLASSES));
const [loading, setLoading] = useState(true);
const [hoverNode, setHoverNode] = useState<GraphNode | null>(null);
const [hoverPos, setHoverPos] = useState<{ x: number; y: number } | null>(null);
const [selectedNode, setSelectedNode] = useState<GraphNode | null>(null);
const [detail, setDetail] = useState<EntityDetail | null>(null);
const [detailLoading, setDetailLoading] = useState(false);
const [limit, setLimit] = useState(40);
const [minWeight, setMinWeight] = useState(3);
const [search, setSearch] = useState("");
// Tune d3-force after the graph mounts and on data change — STRONGER repulsion + LONGER links
useEffect(() => {
const fg = fgRef.current;
if (!fg) return;
const charge = fg.d3Force("charge");
if (charge?.strength) charge.strength(-450);
const link = fg.d3Force("link");
if (link?.distance) link.distance(120);
const center = fg.d3Force("center");
if (center?.strength) center.strength(0.04);
fg.d3ReheatSimulation();
setTimeout(() => fg.zoomToFit?.(800, 80), 1500);
}, [nodes.length, links.length]);
// Initial seed load — re-runs when filters change
useEffect(() => {
setLoading(true);
const classesParam = Array.from(selectedClasses).join(",");
fetch(`/api/graph/seed?limit=${limit}&min_weight=${minWeight}&classes=${classesParam}`)
.then((r) => r.json())
.then((data: { nodes?: RawNode[]; links?: RawLink[] }) => {
const ns = (data.nodes ?? []).map((n) => ({ ...n, id: n.entity_pk } as GraphNode));
const ls = (data.links ?? []).map((l) => ({ source: l.source, target: l.target, weight: l.weight } as GraphLink));
setNodes(ns);
setLinks(ls);
setLoading(false);
})
.catch(() => setLoading(false));
}, [limit, minWeight, selectedClasses]);
// Fetch detail when node selected
useEffect(() => {
if (!selectedNode) {
setDetail(null);
return;
}
const cached = detailCache.get(selectedNode.entity_pk);
if (cached) {
setDetail(cached);
return;
}
setDetail(null);
setDetailLoading(true);
fetch(
`/api/graph?op=neighbors&class=${selectedNode.entity_class}&id=${encodeURIComponent(selectedNode.entity_id)}&limit=12`,
)
.then((r) => r.json())
.then((data: { entity?: RawNode; neighbors?: EntityDetail["neighbors"] }) => {
const d: EntityDetail = {
entity_pk: selectedNode.entity_pk,
entity_class: selectedNode.entity_class,
entity_id: selectedNode.entity_id,
canonical_name: selectedNode.canonical_name,
total_mentions: data.entity?.total_mentions ?? selectedNode.total_mentions,
documents_count: data.entity?.documents_count ?? selectedNode.documents_count,
neighbors: data.neighbors ?? [],
};
detailCache.set(selectedNode.entity_pk, d);
setDetail(d);
})
.catch(() => setDetail(null))
.finally(() => setDetailLoading(false));
}, [selectedNode]);
const onNodeClick = useCallback(async (node: GraphNode) => {
setSelectedNode(node);
}, []);
const expandNode = useCallback(
async (node: GraphNode) => {
try {
const r = await fetch(
`/api/graph?op=neighbors&class=${node.entity_class}&id=${encodeURIComponent(node.entity_id)}&limit=15`,
);
if (!r.ok) return;
const data = (await r.json()) as { neighbors?: Array<RawNode & { weight: number }> };
if (!data.neighbors) return;
setNodes((prev) => {
const existing = new Set(prev.map((p) => p.id));
const additions = data.neighbors!
.filter((n) => !existing.has(n.entity_pk))
.map((n) => ({ ...n, id: n.entity_pk } as GraphNode));
return [...prev, ...additions];
});
setLinks((prev) => {
const seen = new Set(
prev.map((l) => {
const s = typeof l.source === "object" ? (l.source as GraphNode).id : l.source;
const t = typeof l.target === "object" ? (l.target as GraphNode).id : l.target;
return `${Math.min(s, t)}-${Math.max(s, t)}`;
}),
);
const additions: GraphLink[] = [];
for (const n of data.neighbors!) {
const a = node.entity_pk;
const b = n.entity_pk;
const key = `${Math.min(a, b)}-${Math.max(a, b)}`;
if (!seen.has(key)) {
additions.push({ source: a, target: b, weight: n.weight });
seen.add(key);
}
}
return [...prev, ...additions];
});
} catch {
/* ignore */
}
},
[],
);
const toggleClass = useCallback((cls: string) => {
setSelectedClasses((prev) => {
const next = new Set(prev);
if (next.has(cls)) next.delete(cls);
else next.add(cls);
return next.size > 0 ? next : prev;
});
}, []);
const visibleData = useMemo(() => {
let filteredNodes = nodes.filter((n) => selectedClasses.has(n.entity_class));
if (search.trim()) {
const sl = search.toLowerCase();
filteredNodes = filteredNodes.filter((n) =>
n.canonical_name.toLowerCase().includes(sl) || n.entity_id.toLowerCase().includes(sl),
);
}
const allowed = new Set(filteredNodes.map((n) => n.id));
const filteredLinks = links.filter((l) => {
const s = typeof l.source === "object" ? (l.source as GraphNode).id : l.source;
const t = typeof l.target === "object" ? (l.target as GraphNode).id : l.target;
return allowed.has(s) && allowed.has(t);
});
return { nodes: filteredNodes, links: filteredLinks };
}, [nodes, links, selectedClasses, search]);
return (
<div className="relative w-full h-full bg-[#040810] overflow-hidden">
{/* LEFT sidebar: filters (always visible, outside the page header's z-30 layer) */}
<div className="absolute top-20 left-4 z-20 w-[240px] max-h-[calc(100vh-180px)] overflow-y-auto bg-[#0a121e]/95 backdrop-blur border border-[rgba(0,255,156,0.20)] rounded p-3 space-y-4">
<div>
<div className="font-mono text-[10px] uppercase tracking-widest text-[#5a6678] mb-2">
🔍 buscar
</div>
<input
value={search}
onChange={(e) => setSearch(e.target.value)}
placeholder="nome ou id..."
className="w-full bg-transparent border border-[rgba(0,255,156,0.20)] focus:border-[#00ff9c] rounded px-2 py-1.5 font-mono text-xs text-[#c8d4e6] outline-none"
/>
</div>
<div>
<div className="font-mono text-[10px] uppercase tracking-widest text-[#5a6678] mb-2">
classes
</div>
<div className="space-y-1">
{ALL_CLASSES.map((cls) => {
const active = selectedClasses.has(cls);
const color = CLASS_COLOR[cls] ?? "#7fdbff";
return (
<button
key={cls}
onClick={() => toggleClass(cls)}
className={`w-full flex items-center gap-2 px-2 py-1 font-mono text-[11px] rounded border transition ${
active ? "" : "opacity-30 hover:opacity-60"
}`}
style={{
color,
borderColor: color,
background: active ? `${color}12` : "transparent",
}}
>
<span className="inline-block w-2 h-2 rounded-full" style={{ background: color }} />
<span className="flex-1 text-left">{CLASS_LABEL[cls]}</span>
{active && <span></span>}
</button>
);
})}
</div>
</div>
<div>
<div className="font-mono text-[10px] uppercase tracking-widest text-[#5a6678] mb-2">
top entidades
</div>
<div className="flex flex-wrap gap-1">
{[20, 40, 80, 150].map((n) => (
<button
key={n}
onClick={() => setLimit(n)}
className={`px-2 py-1 font-mono text-[11px] rounded border ${
limit === n
? "border-[#00ff9c] text-[#00ff9c] bg-[rgba(0,255,156,0.10)]"
: "border-[rgba(127,219,255,0.20)] text-[#8896aa] hover:text-[#7fdbff]"
}`}
>
{n}
</button>
))}
</div>
</div>
<div>
<div className="font-mono text-[10px] uppercase tracking-widest text-[#5a6678] mb-2">
mostrar vínculos com
</div>
<div className="flex flex-wrap gap-1">
{[2, 3, 5, 10].map((n) => (
<button
key={n}
onClick={() => setMinWeight(n)}
className={`px-2 py-1 font-mono text-[11px] rounded border ${
minWeight === n
? "border-[#00ff9c] text-[#00ff9c] bg-[rgba(0,255,156,0.10)]"
: "border-[rgba(127,219,255,0.20)] text-[#8896aa] hover:text-[#7fdbff]"
}`}
>
{n}×
</button>
))}
</div>
</div>
<div>
<div className="font-mono text-[10px] uppercase tracking-widest text-[#5a6678] mb-2">
força do vínculo
</div>
<div className="space-y-1 text-[10px] font-mono">
<div className="flex items-center gap-2">
<span className="inline-block w-6 h-0.5" style={{ background: "rgba(0,255,156,0.55)" }} />
<span className="text-[#8896aa]">≥ 10 co-menções</span>
</div>
<div className="flex items-center gap-2">
<span className="inline-block w-6 h-0.5" style={{ background: "rgba(127,219,255,0.45)" }} />
<span className="text-[#8896aa]">5-9</span>
</div>
<div className="flex items-center gap-2">
<span className="inline-block w-6 h-0.5" style={{ background: "rgba(167,139,250,0.35)" }} />
<span className="text-[#8896aa]">3-4</span>
</div>
<div className="flex items-center gap-2">
<span className="inline-block w-6 h-0.5" style={{ background: "rgba(127,219,255,0.18)" }} />
<span className="text-[#8896aa]">2 (mín.)</span>
</div>
</div>
</div>
<div className="pt-2 border-t border-[rgba(0,255,156,0.10)] font-mono text-[10px] text-[#5a6678]">
{loading ? "carregando…" : `${visibleData.nodes.length} nós · ${visibleData.links.length} arestas`}
</div>
</div>
{/* RIGHT side panel: selected entity */}
{selectedNode && (
<div className="absolute top-20 right-4 z-20 w-[340px] max-h-[calc(100vh-180px)] overflow-y-auto bg-[#0a121e]/95 backdrop-blur border-2 rounded p-4 space-y-4"
style={{ borderColor: CLASS_COLOR[selectedNode.entity_class] ?? "#7fdbff" }}>
<div className="flex items-start justify-between gap-2">
<div className="flex-1 min-w-0">
<div
className="font-mono text-[10px] uppercase tracking-widest mb-1"
style={{ color: CLASS_COLOR[selectedNode.entity_class] ?? "#7fdbff" }}
>
{CLASS_LABEL[selectedNode.entity_class] ?? selectedNode.entity_class}
</div>
<h3 className="font-mono text-base text-[#c8d4e6] font-bold leading-tight break-words">
{selectedNode.canonical_name}
</h3>
<div className="font-mono text-[10px] text-[#5a6678] mt-1 truncate">
{selectedNode.entity_id}
</div>
</div>
<button
onClick={() => setSelectedNode(null)}
className="text-[#5a6678] hover:text-[#ff6b6b] flex-shrink-0"
aria-label="fechar"
>
</button>
</div>
<div className="grid grid-cols-2 gap-2">
<div className="px-3 py-2 bg-[#060a13] border border-[rgba(0,255,156,0.20)] rounded">
<div className="font-mono text-[9px] uppercase text-[#5a6678]">menções</div>
<div className="font-mono text-lg text-[#00ff9c] mt-0.5">{selectedNode.total_mentions}</div>
</div>
<div className="px-3 py-2 bg-[#060a13] border border-[rgba(127,219,255,0.20)] rounded">
<div className="font-mono text-[9px] uppercase text-[#5a6678]">documentos</div>
<div className="font-mono text-lg text-[#7fdbff] mt-0.5">{selectedNode.documents_count}</div>
</div>
</div>
{/* Action buttons */}
<div className="space-y-1.5">
<Link
href={`/e/${CLASS_FOLDER[selectedNode.entity_class]}/${selectedNode.entity_id}`}
className="block w-full px-3 py-2 font-mono text-xs uppercase tracking-widest border-2 border-[#00ff9c] text-[#00ff9c] bg-[rgba(0,255,156,0.08)] hover:bg-[rgba(0,255,156,0.18)] rounded text-center"
>
abrir página completa
</Link>
<button
onClick={() => expandNode(selectedNode)}
className="block w-full px-3 py-2 font-mono text-xs uppercase tracking-widest border border-[#7fdbff] text-[#7fdbff] hover:bg-[rgba(127,219,255,0.10)] rounded"
>
+ expandir vizinhos no grafo
</button>
</div>
{/* Neighbors list */}
<div>
<div className="font-mono text-[10px] uppercase tracking-widest text-[#5a6678] mb-2">
{detailLoading ? "carregando vizinhos…" : `top vínculos (${detail?.neighbors.length ?? 0})`}
</div>
{detail?.neighbors && detail.neighbors.length > 0 ? (
<ul className="space-y-1">
{detail.neighbors.map((n) => {
const color = CLASS_COLOR[n.entity_class] ?? "#7fdbff";
return (
<li key={n.entity_pk}>
<button
onClick={() => {
const folded = nodes.find((nn) => nn.entity_pk === n.entity_pk);
if (folded) {
setSelectedNode(folded);
} else {
// Inject into graph + select
const newNode: GraphNode = {
entity_pk: n.entity_pk,
entity_class: n.entity_class,
entity_id: n.entity_id,
canonical_name: n.canonical_name,
total_mentions: n.total_mentions,
documents_count: 0,
id: n.entity_pk,
};
setNodes((prev) => [...prev, newNode]);
setLinks((prev) => [...prev, { source: selectedNode.entity_pk, target: n.entity_pk, weight: n.weight }]);
setSelectedNode(newNode);
}
}}
className="group w-full flex items-center gap-2 text-left p-1.5 -mx-1.5 rounded hover:bg-[rgba(0,255,156,0.04)]"
>
<span
className="inline-block w-2 h-2 rounded-full flex-shrink-0"
style={{ background: color }}
/>
<span className="text-[11px] text-[#c8d4e6] group-hover:text-[#00ff9c] flex-1 truncate">
{n.canonical_name}
</span>
<span className="font-mono text-[10px] text-[#5a6678] flex-shrink-0">×{n.weight}</span>
</button>
</li>
);
})}
</ul>
) : !detailLoading ? (
<p className="font-mono text-[10px] text-[#5a6678] italic">sem co-menções</p>
) : null}
</div>
<p className="font-mono text-[9px] text-[#5a6678] pt-2 border-t border-[rgba(0,255,156,0.10)]">
duplo-clique no nó: abre página da entidade · clique vizinho: foca nele
</p>
</div>
)}
{/* Hover tooltip: follows the mouse */}
{hoverNode && hoverPos && (
<div
className="absolute z-30 pointer-events-none px-2.5 py-1.5 bg-[#0a121e] border-2 rounded text-xs font-mono"
style={{
left: hoverPos.x + 12,
top: hoverPos.y + 12,
borderColor: CLASS_COLOR[hoverNode.entity_class] ?? "#7fdbff",
}}
>
<div className="text-[#c8d4e6] font-bold">{hoverNode.canonical_name}</div>
<div className="text-[10px] text-[#5a6678]">
{hoverNode.entity_class} · {hoverNode.total_mentions} menções · {hoverNode.documents_count} docs
</div>
<div className="text-[9px] text-[#7fdbff] mt-0.5">clique para detalhes</div>
</div>
)}
{/* Canvas */}
<ForceGraph2D
ref={fgRef as never}
graphData={visibleData as never}
backgroundColor="#040810"
nodeRelSize={2.5}
nodeVal={(n) => Math.max(1.5, Math.log2((n as GraphNode).total_mentions + 2) * 0.8)}
nodeColor={(n) => CLASS_COLOR[(n as GraphNode).entity_class] ?? "#7fdbff"}
nodeLabel={() => ""}
linkColor={(l) => edgeColor((l as GraphLink).weight)}
linkWidth={(l) => edgeWidth((l as GraphLink).weight)}
// Physics: push nodes apart strongly + use longer edges
d3VelocityDecay={0.3}
d3AlphaDecay={0.015}
cooldownTicks={250}
warmupTicks={60}
// Use d3-force charge customization via dagMode? No, lib has limited API; rely on defaults + manual.
onNodeClick={onNodeClick as never}
onNodeHover={(n) => {
setHoverNode(n as GraphNode | null);
if (!n) setHoverPos(null);
}}
onBackgroundClick={() => setSelectedNode(null)}
nodeCanvasObjectMode={() => "after"}
nodeCanvasObject={(n, ctx, scale) => {
const node = n as GraphNode;
const isSelected = selectedNode?.entity_pk === node.entity_pk;
const isHovered = hoverNode?.entity_pk === node.entity_pk;
// Anti-clutter: with many nodes, show the label only when:
// - zoomed in (scale ≥ 1.5)
// - OR the node is a hub (≥ 200 mentions)
// - OR it is hovered/selected
const isHub = node.total_mentions >= 200;
const showLabel = isSelected || isHovered || isHub || scale >= 1.5;
if (!showLabel) {
if (isSelected) {
ctx.beginPath();
ctx.arc(node.x ?? 0, node.y ?? 0, 10 / scale, 0, 2 * Math.PI, false);
ctx.strokeStyle = "#00ff9c";
ctx.lineWidth = 2.5 / scale;
ctx.stroke();
}
return;
}
const fontSize = Math.max(10, 14 / scale);
ctx.font = `${isHub ? "bold " : ""}${fontSize}px sans-serif`;
const label =
node.canonical_name.length > 28
? node.canonical_name.slice(0, 26) + "…"
: node.canonical_name;
const tw = ctx.measureText(label).width;
const pad = 4 / scale;
// Background pill behind text — readability
ctx.fillStyle = isSelected ? "rgba(0,255,156,0.85)" : "rgba(10,18,30,0.85)";
ctx.fillRect(
(node.x ?? 0) - tw / 2 - pad,
(node.y ?? 0) + 8 / scale,
tw + pad * 2,
fontSize + pad,
);
ctx.fillStyle = isSelected ? "#040810" : "#c8d4e6";
ctx.textAlign = "center";
ctx.textBaseline = "top";
ctx.fillText(label, node.x ?? 0, (node.y ?? 0) + 8 / scale + pad / 2);
// Selected ring
if (isSelected) {
ctx.beginPath();
ctx.arc(node.x ?? 0, node.y ?? 0, 10 / scale, 0, 2 * Math.PI, false);
ctx.strokeStyle = "#00ff9c";
ctx.lineWidth = 2.5 / scale;
ctx.stroke();
}
}}
/>
</div>
);
}


@@ -314,6 +314,39 @@ const co_mention_chunks_tool: ToolDefinition = {
   },
 };

+const analyze_image_region_tool: ToolDefinition = {
+  type: "function",
+  function: {
+    name: "analyze_image_region",
+    description:
+      "Vision tool: answer a question about a cropped region of a document page. " +
+      "Use this when the user asks about a photograph, diagram, sketch, signature, " +
+      "stamp, redaction, or any visual element where the chunk's text description " +
+      "isn't enough. The model reads the actual pixels via Sonnet vision. " +
+      "Get the bbox + page from a prior hybrid_search hit (each chunk carries bbox). " +
+      "Cost: ~$0.005-$0.02 per call. Use sparingly; prefer hybrid_search first.",
+    parameters: {
+      type: "object",
+      properties: {
+        doc_id: { type: "string" },
+        page: { type: "integer", description: "1-indexed page number" },
+        bbox: {
+          type: "object",
+          description: "Normalized bbox (0..1) of the region to analyze.",
+          properties: {
+            x: { type: "number" }, y: { type: "number" },
+            w: { type: "number" }, h: { type: "number" },
+          },
+          required: ["x", "y", "w", "h"],
+        },
+        question: { type: "string", description: "What you want to know about the image." },
+        context: { type: "string", description: "Optional: prose context that grounds the model." },
+      },
+      required: ["doc_id", "page", "bbox", "question"],
+    },
+  },
+};
+
 const navigate_to_tool: ToolDefinition = {
   type: "function",
   function: {
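
The tool takes a normalized bbox (0..1), so something has to map it onto page pixels before cropping. vision.ts is not shown in this section, so the following is only a sketch of one way that conversion could work, written in Python for consistency with run.py. The function name, the clamping policy, and the page dimensions are assumptions; the output keys `left`/`top`/`width`/`height` are chosen to resemble a typical crop rectangle.

```python
def bbox_to_pixels(bbox: dict, page_w: int, page_h: int) -> dict:
    """Convert a normalized bbox {x, y, w, h} into an integer pixel rectangle.

    Assumed policy: the origin is clamped onto the page and the size is
    shrunk so the rectangle never extends past the page edge.
    """
    left = min(max(int(bbox["x"] * page_w), 0), page_w - 1)
    top = min(max(int(bbox["y"] * page_h), 0), page_h - 1)
    width = max(1, min(int(round(bbox["w"] * page_w)), page_w - left))
    height = max(1, min(int(round(bbox["h"] * page_h)), page_h - top))
    return {"left": left, "top": top, "width": width, "height": height}


# Region fully inside a hypothetical 1000 x 2000 px page render.
inside = bbox_to_pixels({"x": 0.25, "y": 0.5, "w": 0.5, "h": 0.25}, 1000, 2000)
print(inside)

# Region that overruns the right edge: the width is cut back to fit.
overrun = bbox_to_pixels({"x": 0.9, "y": 0.0, "w": 0.5, "h": 0.1}, 1000, 2000)
print(overrun)
```

Clamping before the crop keeps a slightly out-of-range bbox from a search hit from turning into a failed crop call.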
@@ -345,6 +378,7 @@ export const TOOL_DEFINITIONS: ToolDefinition[] = [
   read_document_tool,
   read_entity_tool,
   search_corpus_tool,
+  analyze_image_region_tool,
   navigate_to_tool,
 ];
@@ -398,6 +432,11 @@ async function handleHybridSearch(
     classification: (args.classification as string) || null,
     ufo_only: Boolean(args.ufo_only),
     top_k,
+    // W2-TD#8: chat is latency-sensitive, so skip rerank when ≤10 candidates.
+    // The model only cites the first few hits anyway and BGE-Reranker
+    // adds 5-8s on CPU. RRF order from the RPC is plenty for the head.
+    rerank_strategy: "when_top_k_gt",
+    rerank_threshold: 10,
   });
   // Emit one citation (+ optional crop_image) artifact per hit so the UI can
   // render inline cards next to the assistant text. Limit to top 6 to avoid
@ -684,6 +723,37 @@ async function handleNavigate(args: Record<string, unknown>): Promise<unknown> {
return { ok: true, target, label }; return { ok: true, target, label };
} }
async function handleAnalyzeImageRegion(
args: Record<string, unknown>,
ctx: ToolHandlerContext,
): Promise<unknown> {
const doc_id = String(args.doc_id ?? "").trim();
const page = Number(args.page);
const bbox = args.bbox as { x: number; y: number; w: number; h: number } | undefined;
const question = String(args.question ?? "").trim();
if (!doc_id || !page || !bbox || !question) return { error: "missing_args" };
try {
const { analyzeImageRegion } = await import("./vision");
const out = await analyzeImageRegion({
doc_id, page, bbox, question,
context: typeof args.context === "string" ? args.context : undefined,
lang: ctx.lang === "en" ? "en" : "pt",
});
if (ctx.emitArtifact) {
ctx.emitArtifact({
kind: "crop_image",
src: out.crop_url,
doc_id, page,
alt_en: question.slice(0, 120),
alt_pt: question.slice(0, 120),
});
}
return out;
} catch (e) {
return { error: "vision_failed", message: (e as Error).message };
}
}
export const TOOL_HANDLERS: Record<string, ToolHandler> = { export const TOOL_HANDLERS: Record<string, ToolHandler> = {
hybrid_search: handleHybridSearch, hybrid_search: handleHybridSearch,
read_chunk: handleReadChunk, read_chunk: handleReadChunk,
@ -696,5 +766,6 @@ export const TOOL_HANDLERS: Record<string, ToolHandler> = {
read_document: handleReadDocument, read_document: handleReadDocument,
read_entity: handleReadEntity, read_entity: handleReadEntity,
search_corpus: handleSearch, search_corpus: handleSearch,
analyze_image_region: handleAnalyzeImageRegion,
navigate_to: handleNavigate, navigate_to: handleNavigate,
}; };
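
The argument checks at the top of `handleAnalyzeImageRegion` can be isolated as a pure function, which makes the `missing_args` path easy to exercise without spawning the vision call. The following is only a sketch: `parseRegionArgs`, `Bbox`, and `RegionArgs` are hypothetical names that do not exist in the commit, and the logic assumes the same coercion rules as the handler above.

```typescript
// Hypothetical helper (not part of the commit): mirrors the coercion and
// presence checks that handleAnalyzeImageRegion performs on raw tool args.
interface Bbox { x: number; y: number; w: number; h: number }
interface RegionArgs { doc_id: string; page: number; bbox: Bbox; question: string }

function parseRegionArgs(
  args: Record<string, unknown>,
): RegionArgs | { error: "missing_args" } {
  const doc_id = String(args.doc_id ?? "").trim();
  const page = Number(args.page); // NaN and 0 are both falsy below
  const bbox = args.bbox as Bbox | undefined;
  const question = String(args.question ?? "").trim();
  if (!doc_id || !page || !bbox || !question) return { error: "missing_args" };
  return { doc_id, page, bbox, question };
}

// An empty call is rejected before any image work would start.
console.log(parseRegionArgs({})); // { error: 'missing_args' }
```

Note that this only checks presence; the numeric validation of the bbox fields still happens inside `analyzeImageRegion`.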

web/lib/chat/vision.ts (new file, 165 lines)

@ -0,0 +1,165 @@
/**
* vision.ts: answer questions about an image region via Claude Code OAuth.
*
* Pattern matches the project's existing vision pipeline (02-vision-page.py):
* 1. Crop the PNG of the requested page to the requested bbox.
* 2. Spawn `claude -p --model sonnet --allowedTools Read` and instruct the
* model to Read the local PNG path and answer the user's question.
*
* Uses the user's Claude Code OAuth (Max 20x). Per W1.2 budget policy, the
* agentic worker may use Opus 4.7 without a hard cap, but `analyze_image_region`
* runs synchronously in the chat path, so keep it on Sonnet for latency.
*/
import { spawn } from "node:child_process";
import { mkdtemp, unlink, rmdir } from "node:fs/promises";
import path from "node:path";
import os from "node:os";
import sharp from "sharp";
import { PROCESSING } from "@/lib/wiki";
const MODEL = process.env.VISION_MODEL || "sonnet";
const TIMEOUT_MS = Number(process.env.VISION_TIMEOUT_MS || 120_000);
export interface AnalyzeImageRegionArgs {
doc_id: string;
page: number;
bbox: { x: number; y: number; w: number; h: number };
question: string;
/** Optional context to ground the model (e.g., "this is an FBI memo from 1947"). */
context?: string;
/** Output language hint. Defaults to "pt" (Brazilian Portuguese). */
lang?: "pt" | "en";
}
export interface AnalyzeImageRegionResult {
answer: string;
model: string;
duration_ms: number;
bbox: { x: number; y: number; w: number; h: number };
crop_url: string;
}
function pageFilename(page: number): string {
return `p-${String(page).padStart(3, "0")}.png`;
}
/** Crop a bbox region of a page PNG and write the result to a temp file. */
async function cropToTempFile(args: AnalyzeImageRegionArgs): Promise<{ path: string; dir: string }> {
const sourcePath = path.join(PROCESSING, "png", args.doc_id, pageFilename(args.page));
const meta = await sharp(sourcePath).metadata();
const W = meta.width ?? 0;
const H = meta.height ?? 0;
if (W === 0 || H === 0) throw new Error(`source PNG unreadable: ${sourcePath}`);
const left = Math.max(0, Math.round(args.bbox.x * W));
const top = Math.max(0, Math.round(args.bbox.y * H));
const width = Math.max(1, Math.min(W - left, Math.round(args.bbox.w * W)));
const height = Math.max(1, Math.min(H - top, Math.round(args.bbox.h * H)));
const dir = await mkdtemp(path.join(os.tmpdir(), "analyze-image-"));
const filePath = path.join(dir, "crop.png");
await sharp(sourcePath)
.extract({ left, top, width, height })
.resize({ width: Math.min(1024, width), withoutEnlargement: true })
.png()
.toFile(filePath);
return { path: filePath, dir };
}
function buildPrompt(args: AnalyzeImageRegionArgs, cropPath: string): string {
const lang = args.lang === "en" ? "English" : "Brazilian Portuguese (pt-br)";
const ctx = args.context ? `\n\nContext: ${args.context}` : "";
return [
`Use the Read tool on this local PNG file: ${cropPath}`,
"",
`The image is a cropped region from document "${args.doc_id}", page ${args.page}.`,
ctx,
"",
`Answer this question about what is visible in the image, in ${lang}:`,
"",
args.question,
"",
"Rules:",
"- Be factual and concise (3-8 sentences unless the question requires more).",
"- If text is visible, transcribe the relevant portion verbatim (do not translate).",
"- If the image is unclear or empty, say so explicitly. Don't invent.",
"- Do not call any tool besides Read on the provided path.",
].join("\n");
}
/** Spawn `claude -p` and resolve with the fields parsed from its JSON output. */
function callClaudeCli(prompt: string): Promise<{ result: string; durationMs: number; costUsd?: number; tokensIn?: number; tokensOut?: number }> {
return new Promise((resolve, reject) => {
const t0 = Date.now();
const child = spawn(
"claude",
[
"-p",
"--model", MODEL,
"--output-format", "json",
"--max-turns", "2",
"--allowedTools", "Read",
"--",
prompt,
],
{ stdio: ["ignore", "pipe", "pipe"], env: { ...process.env } },
);
let stdout = "";
let stderr = "";
child.stdout.on("data", (c) => (stdout += c.toString()));
child.stderr.on("data", (c) => (stderr += c.toString()));
const t = setTimeout(() => {
child.kill("SIGKILL");
reject(new Error(`claude CLI timeout > ${TIMEOUT_MS}ms`));
}, TIMEOUT_MS);
child.on("error", (e) => { clearTimeout(t); reject(e); });
child.on("close", (code) => {
clearTimeout(t);
if (code !== 0) {
return reject(new Error(`claude CLI rc=${code}: ${stderr.slice(-300)}`));
}
try {
const cli = JSON.parse(stdout);
if (cli.is_error) return reject(new Error(`claude error: ${(cli.result || "").slice(0, 300)}`));
resolve({
result: cli.result || "",
durationMs: cli.duration_ms || Date.now() - t0,
costUsd: cli.total_cost_usd,
tokensIn: cli.usage?.input_tokens,
tokensOut: cli.usage?.output_tokens,
});
} catch (e) {
reject(new Error(`claude stdout parse: ${e instanceof Error ? e.message : String(e)}`));
}
});
});
}
export async function analyzeImageRegion(args: AnalyzeImageRegionArgs): Promise<AnalyzeImageRegionResult> {
if (!args.doc_id) throw new Error("doc_id required");
if (!Number.isFinite(args.page) || args.page < 1) throw new Error("page must be >= 1");
if (!args.bbox || !["x", "y", "w", "h"].every((k) => Number.isFinite((args.bbox as Record<string, unknown>)[k]))) {
throw new Error("bbox {x,y,w,h} required (normalized 0..1)");
}
if (!args.question?.trim()) throw new Error("question required");
const { path: cropPath, dir } = await cropToTempFile(args);
try {
const t0 = Date.now();
const prompt = buildPrompt(args, cropPath);
const out = await callClaudeCli(prompt);
const cropUrl =
`/api/crop?doc=${encodeURIComponent(args.doc_id)}` +
`&page=${args.page}&x=${args.bbox.x}&y=${args.bbox.y}&w=${args.bbox.w}&h=${args.bbox.h}&w_px=640`;
return {
answer: out.result.trim(),
model: MODEL,
duration_ms: out.durationMs || Date.now() - t0,
bbox: args.bbox,
crop_url: cropUrl,
};
} finally {
// Best-effort cleanup, in order: rmdir only succeeds once the crop is gone.
// Crop is in $TMPDIR, OS will reap if we miss.
unlink(cropPath)
.catch(() => undefined)
.then(() => rmdir(dir))
.catch(() => undefined);
}
}
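
The normalized-to-pixel conversion inside `cropToTempFile` is the part most worth checking by hand. Below is a minimal sketch of that arithmetic as a standalone function, assuming bbox values already lie within 0..1; `toPixelRect`, `NormBox`, and `PixelRect` are hypothetical names introduced for illustration and are not part of the commit.

```typescript
// Hypothetical helper (not in the commit): same rounding and clamping as
// cropToTempFile, extracted so it can be checked without touching sharp.
interface NormBox { x: number; y: number; w: number; h: number }
interface PixelRect { left: number; top: number; width: number; height: number }

function toPixelRect(bbox: NormBox, W: number, H: number): PixelRect {
  const left = Math.max(0, Math.round(bbox.x * W));
  const top = Math.max(0, Math.round(bbox.y * H));
  // Width and height are at least 1px and never extend past the page edge.
  const width = Math.max(1, Math.min(W - left, Math.round(bbox.w * W)));
  const height = Math.max(1, Math.min(H - top, Math.round(bbox.h * H)));
  return { left, top, width, height };
}

// A box starting at 90% of a 1000x800 page is trimmed to the remaining pixels.
console.log(toPixelRect({ x: 0.9, y: 0.9, w: 0.5, h: 0.5 }, 1000, 800));
// { left: 900, top: 720, width: 100, height: 80 }
```

The resulting rectangle is what would be handed to sharp's `extract`; values outside 0..1 are not handled by this sketch.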


@ -43,8 +43,24 @@ export interface HybridSearchOptions {
recall_k?: number; recall_k?: number;
/** Final list size after rerank (default 20). */ /** Final list size after rerank (default 20). */
top_k?: number; top_k?: number;
/** Skip reranker (faster, lower precision). */ /** Skip reranker (faster, lower precision). Back-compat shortcut for
* `rerank_strategy: "never"`. */
no_rerank?: boolean; no_rerank?: boolean;
/**
* W2-TD#8: rerank policy.
* - "always": always run the cross-encoder (highest precision,
* slowest, 5-8s on CPU).
* - "when_top_k_gt": rerank only when `top_k > rerank_threshold`
* (default threshold 15). RRF order from the RPC is
* usually good enough for the tight head of results;
* the reranker pays off when re-sorting a wider list.
* This is the new default, so autocomplete / chat
* top-10 calls now skip rerank for free.
* - "never": same as `no_rerank: true`.
*/
rerank_strategy?: "always" | "when_top_k_gt" | "never";
/** Threshold for `when_top_k_gt`. Default 15 (per ADR-001). */
rerank_threshold?: number;
} }
export async function hybridSearch(opts: HybridSearchOptions): Promise<ChunkHit[]> { export async function hybridSearch(opts: HybridSearchOptions): Promise<ChunkHit[]> {
@ -58,8 +74,14 @@ export async function hybridSearch(opts: HybridSearchOptions): Promise<ChunkHit[
recall_k = 100, recall_k = 100,
top_k = 20, top_k = 20,
no_rerank = false, no_rerank = false,
rerank_strategy = "when_top_k_gt",
rerank_threshold = 15,
} = opts; } = opts;
// Effective strategy: explicit `no_rerank=true` always wins (back-compat).
const strategy: "always" | "when_top_k_gt" | "never" =
no_rerank ? "never" : rerank_strategy;
if (!query.trim()) return []; if (!query.trim()) return [];
// 1. Embed the query // 1. Embed the query
@ -89,9 +111,13 @@ export async function hybridSearch(opts: HybridSearchOptions): Promise<ChunkHit[
if (rows.length === 0) return []; if (rows.length === 0) return [];
// 3. Optional cross-encoder rerank for finer ordering. It's CPU-slow // 3. Optional cross-encoder rerank for finer ordering. It's CPU-slow
// (seconds per ~dozen candidates), so it's opt-in (rerank=1); the default // (seconds per ~dozen candidates). Strategy resolution (W2-TD#8 / ADR-001):
// fast path trusts the RPC's RRF order over the already-gated candidates. // - "never" → skip
if (no_rerank) { // - "when_top_k_gt" → skip when top_k ≤ threshold (RRF is good enough
// for a small head)
// - "always" → run unconditionally
if (strategy === "never" ||
(strategy === "when_top_k_gt" && top_k <= rerank_threshold)) {
return rows.slice(0, top_k); return rows.slice(0, top_k);
} }
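
The strategy resolution in this hunk reduces to a small decision function. The sketch below restates it in isolation, using the defaults shown in the diff (`top_k = 20`, `rerank_threshold = 15`, strategy `"when_top_k_gt"`); `shouldRerank` and `RerankPolicy` are hypothetical names, not exports of hybrid.ts.

```typescript
// Hypothetical helper (not in the commit): the rerank decision from
// hybridSearch, isolated from embedding and the RPC call.
type RerankStrategy = "always" | "when_top_k_gt" | "never";

interface RerankPolicy {
  top_k?: number;
  no_rerank?: boolean;
  rerank_strategy?: RerankStrategy;
  rerank_threshold?: number;
}

function shouldRerank(opts: RerankPolicy): boolean {
  const {
    top_k = 20,
    no_rerank = false,
    rerank_strategy = "when_top_k_gt",
    rerank_threshold = 15,
  } = opts;
  // Explicit no_rerank wins over any strategy (back-compat).
  const strategy: RerankStrategy = no_rerank ? "never" : rerank_strategy;
  if (strategy === "never") return false;
  if (strategy === "always") return true;
  return top_k > rerank_threshold;
}

console.log(shouldRerank({}));                                // true  (20 > 15)
console.log(shouldRerank({ top_k: 5 }));                      // false (5 <= 15)
console.log(shouldRerank({ top_k: 10, rerank_threshold: 10 })); // false (chat tool setting)
```

With the chat tool's threshold of 10, a `top_k` of 10 or less returns the RRF order directly, which matches the embed-only path described in the commit message.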

web/package-lock.json (generated, 417 lines changed)

@ -26,7 +26,6 @@
"pino": "^10.3.1", "pino": "^10.3.1",
"react": "^19.0.0", "react": "^19.0.0",
"react-dom": "^19.0.0", "react-dom": "^19.0.0",
"react-force-graph-2d": "^1.27.0",
"react-markdown": "^9.0.0", "react-markdown": "^9.0.0",
"remark-gfm": "^4.0.0", "remark-gfm": "^4.0.0",
"remark-wiki-link": "^2.0.1", "remark-wiki-link": "^2.0.1",
@ -5993,12 +5992,6 @@
"tslib": "^2.8.0" "tslib": "^2.8.0"
} }
}, },
"node_modules/@tweenjs/tween.js": {
"version": "25.0.0",
"resolved": "https://registry.npmjs.org/@tweenjs/tween.js/-/tween.js-25.0.0.tgz",
"integrity": "sha512-XKLA6syeBUaPzx4j3qwMqzzq+V4uo72BnlbOjmuljLrRqdsd3qnzvZZoxvMHZ23ndsRS4aufU6JOZYpCbU6T1A==",
"license": "MIT"
},
"node_modules/@types/connect": { "node_modules/@types/connect": {
"version": "3.4.38", "version": "3.4.38",
"resolved": "https://registry.npmjs.org/@types/connect/-/connect-3.4.38.tgz", "resolved": "https://registry.npmjs.org/@types/connect/-/connect-3.4.38.tgz",
@ -6135,15 +6128,6 @@
"integrity": "sha512-mUFwbeTqrVgDQxFveS+df2yfap6iuP20NAKAsBt5jDEoOTDew+zwLAOilHCeQJOVSvmgCX4ogqIrA0mnyr08yQ==", "integrity": "sha512-mUFwbeTqrVgDQxFveS+df2yfap6iuP20NAKAsBt5jDEoOTDew+zwLAOilHCeQJOVSvmgCX4ogqIrA0mnyr08yQ==",
"license": "ISC" "license": "ISC"
}, },
"node_modules/accessor-fn": {
"version": "1.5.3",
"resolved": "https://registry.npmjs.org/accessor-fn/-/accessor-fn-1.5.3.tgz",
"integrity": "sha512-rkAofCwe/FvYFUlMB0v0gWmhqtfAtV1IUkdPbfhTUyYniu5LrC0A0UJkTH0Jv3S8SvwkmfuAlY+mQIJATdocMA==",
"license": "MIT",
"engines": {
"node": ">=12"
}
},
"node_modules/acorn": { "node_modules/acorn": {
"version": "8.16.0", "version": "8.16.0",
"resolved": "https://registry.npmjs.org/acorn/-/acorn-8.16.0.tgz", "resolved": "https://registry.npmjs.org/acorn/-/acorn-8.16.0.tgz",
@ -6335,16 +6319,6 @@
"node": ">=6.0.0" "node": ">=6.0.0"
} }
}, },
"node_modules/bezier-js": {
"version": "6.1.4",
"resolved": "https://registry.npmjs.org/bezier-js/-/bezier-js-6.1.4.tgz",
"integrity": "sha512-PA0FW9ZpcHbojUCMu28z9Vg/fNkwTj5YhusSAjHHDfHDGLxJ6YUKrAN2vk1fP2MMOxVw4Oko16FMlRGVBGqLKg==",
"license": "MIT",
"funding": {
"type": "individual",
"url": "https://github.com/Pomax/bezierjs/blob/master/FUNDING.md"
}
},
"node_modules/binary-extensions": { "node_modules/binary-extensions": {
"version": "2.3.0", "version": "2.3.0",
"resolved": "https://registry.npmjs.org/binary-extensions/-/binary-extensions-2.3.0.tgz", "resolved": "https://registry.npmjs.org/binary-extensions/-/binary-extensions-2.3.0.tgz",
@ -6446,18 +6420,6 @@
], ],
"license": "CC-BY-4.0" "license": "CC-BY-4.0"
}, },
"node_modules/canvas-color-tracker": {
"version": "1.3.2",
"resolved": "https://registry.npmjs.org/canvas-color-tracker/-/canvas-color-tracker-1.3.2.tgz",
"integrity": "sha512-ryQkDX26yJ3CXzb3hxUVNlg1NKE4REc5crLBq661Nxzr8TNd236SaEf2ffYLXyI5tSABSeguHLqcVq4vf9L3Zg==",
"license": "MIT",
"dependencies": {
"tinycolor2": "^1.6.0"
},
"engines": {
"node": ">=12"
}
},
"node_modules/ccount": { "node_modules/ccount": {
"version": "2.0.1", "version": "2.0.1",
"resolved": "https://registry.npmjs.org/ccount/-/ccount-2.0.1.tgz", "resolved": "https://registry.npmjs.org/ccount/-/ccount-2.0.1.tgz",
@ -6664,222 +6626,6 @@
"dev": true, "dev": true,
"license": "MIT" "license": "MIT"
}, },
"node_modules/d3-array": {
"version": "3.2.4",
"resolved": "https://registry.npmjs.org/d3-array/-/d3-array-3.2.4.tgz",
"integrity": "sha512-tdQAmyA18i4J7wprpYq8ClcxZy3SC31QMeByyCFyRt7BVHdREQZ5lpzoe5mFEYZUWe+oq8HBvk9JjpibyEV4Jg==",
"license": "ISC",
"dependencies": {
"internmap": "1 - 2"
},
"engines": {
"node": ">=12"
}
},
"node_modules/d3-binarytree": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/d3-binarytree/-/d3-binarytree-1.0.2.tgz",
"integrity": "sha512-cElUNH+sHu95L04m92pG73t2MEJXKu+GeKUN1TJkFsu93E5W8E9Sc3kHEGJKgenGvj19m6upSn2EunvMgMD2Yw==",
"license": "MIT"
},
"node_modules/d3-color": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/d3-color/-/d3-color-3.1.0.tgz",
"integrity": "sha512-zg/chbXyeBtMQ1LbD/WSoW2DpC3I0mpmPdW+ynRTj/x2DAWYrIY7qeZIHidozwV24m4iavr15lNwIwLxRmOxhA==",
"license": "ISC",
"engines": {
"node": ">=12"
}
},
"node_modules/d3-dispatch": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/d3-dispatch/-/d3-dispatch-3.0.1.tgz",
"integrity": "sha512-rzUyPU/S7rwUflMyLc1ETDeBj0NRuHKKAcvukozwhshr6g6c5d8zh4c2gQjY2bZ0dXeGLWc1PF174P2tVvKhfg==",
"license": "ISC",
"engines": {
"node": ">=12"
}
},
"node_modules/d3-drag": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/d3-drag/-/d3-drag-3.0.0.tgz",
"integrity": "sha512-pWbUJLdETVA8lQNJecMxoXfH6x+mO2UQo8rSmZ+QqxcbyA3hfeprFgIT//HW2nlHChWeIIMwS2Fq+gEARkhTkg==",
"license": "ISC",
"dependencies": {
"d3-dispatch": "1 - 3",
"d3-selection": "3"
},
"engines": {
"node": ">=12"
}
},
"node_modules/d3-ease": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/d3-ease/-/d3-ease-3.0.1.tgz",
"integrity": "sha512-wR/XK3D3XcLIZwpbvQwQ5fK+8Ykds1ip7A2Txe0yxncXSdq1L9skcG7blcedkOX+ZcgxGAmLX1FrRGbADwzi0w==",
"license": "BSD-3-Clause",
"engines": {
"node": ">=12"
}
},
"node_modules/d3-force-3d": {
"version": "3.0.6",
"resolved": "https://registry.npmjs.org/d3-force-3d/-/d3-force-3d-3.0.6.tgz",
"integrity": "sha512-4tsKHUPLOVkyfEffZo1v6sFHvGFwAIIjt/W8IThbp08DYAsXZck+2pSHEG5W1+gQgEvFLdZkYvmJAbRM2EzMnA==",
"license": "MIT",
"dependencies": {
"d3-binarytree": "1",
"d3-dispatch": "1 - 3",
"d3-octree": "1",
"d3-quadtree": "1 - 3",
"d3-timer": "1 - 3"
},
"engines": {
"node": ">=12"
}
},
"node_modules/d3-format": {
"version": "3.1.2",
"resolved": "https://registry.npmjs.org/d3-format/-/d3-format-3.1.2.tgz",
"integrity": "sha512-AJDdYOdnyRDV5b6ArilzCPPwc1ejkHcoyFarqlPqT7zRYjhavcT3uSrqcMvsgh2CgoPbK3RCwyHaVyxYcP2Arg==",
"license": "ISC",
"engines": {
"node": ">=12"
}
},
"node_modules/d3-interpolate": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/d3-interpolate/-/d3-interpolate-3.0.1.tgz",
"integrity": "sha512-3bYs1rOD33uo8aqJfKP3JWPAibgw8Zm2+L9vBKEHJ2Rg+viTR7o5Mmv5mZcieN+FRYaAOWX5SJATX6k1PWz72g==",
"license": "ISC",
"dependencies": {
"d3-color": "1 - 3"
},
"engines": {
"node": ">=12"
}
},
"node_modules/d3-octree": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/d3-octree/-/d3-octree-1.1.0.tgz",
"integrity": "sha512-F8gPlqpP+HwRPMO/8uOu5wjH110+6q4cgJvgJT6vlpy3BEaDIKlTZrgHKZSp/i1InRpVfh4puY/kvL6MxK930A==",
"license": "MIT"
},
"node_modules/d3-quadtree": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/d3-quadtree/-/d3-quadtree-3.0.1.tgz",
"integrity": "sha512-04xDrxQTDTCFwP5H6hRhsRcb9xxv2RzkcsygFzmkSIOJy3PeRJP7sNk3VRIbKXcog561P9oU0/rVH6vDROAgUw==",
"license": "ISC",
"engines": {
"node": ">=12"
}
},
"node_modules/d3-scale": {
"version": "4.0.2",
"resolved": "https://registry.npmjs.org/d3-scale/-/d3-scale-4.0.2.tgz",
"integrity": "sha512-GZW464g1SH7ag3Y7hXjf8RoUuAFIqklOAq3MRl4OaWabTFJY9PN/E1YklhXLh+OQ3fM9yS2nOkCoS+WLZ6kvxQ==",
"license": "ISC",
"dependencies": {
"d3-array": "2.10.0 - 3",
"d3-format": "1 - 3",
"d3-interpolate": "1.2.0 - 3",
"d3-time": "2.1.1 - 3",
"d3-time-format": "2 - 4"
},
"engines": {
"node": ">=12"
}
},
"node_modules/d3-scale-chromatic": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/d3-scale-chromatic/-/d3-scale-chromatic-3.1.0.tgz",
"integrity": "sha512-A3s5PWiZ9YCXFye1o246KoscMWqf8BsD9eRiJ3He7C9OBaxKhAd5TFCdEx/7VbKtxxTsu//1mMJFrEt572cEyQ==",
"license": "ISC",
"dependencies": {
"d3-color": "1 - 3",
"d3-interpolate": "1 - 3"
},
"engines": {
"node": ">=12"
}
},
"node_modules/d3-selection": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/d3-selection/-/d3-selection-3.0.0.tgz",
"integrity": "sha512-fmTRWbNMmsmWq6xJV8D19U/gw/bwrHfNXxrIN+HfZgnzqTHp9jOmKMhsTUjXOJnZOdZY9Q28y4yebKzqDKlxlQ==",
"license": "ISC",
"engines": {
"node": ">=12"
}
},
"node_modules/d3-time": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/d3-time/-/d3-time-3.1.0.tgz",
"integrity": "sha512-VqKjzBLejbSMT4IgbmVgDjpkYrNWUYJnbCGo874u7MMKIWsILRX+OpX/gTk8MqjpT1A/c6HY2dCA77ZN0lkQ2Q==",
"license": "ISC",
"dependencies": {
"d3-array": "2 - 3"
},
"engines": {
"node": ">=12"
}
},
"node_modules/d3-time-format": {
"version": "4.1.0",
"resolved": "https://registry.npmjs.org/d3-time-format/-/d3-time-format-4.1.0.tgz",
"integrity": "sha512-dJxPBlzC7NugB2PDLwo9Q8JiTR3M3e4/XANkreKSUxF8vvXKqm1Yfq4Q5dl8budlunRVlUUaDUgFt7eA8D6NLg==",
"license": "ISC",
"dependencies": {
"d3-time": "1 - 3"
},
"engines": {
"node": ">=12"
}
},
"node_modules/d3-timer": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/d3-timer/-/d3-timer-3.0.1.tgz",
"integrity": "sha512-ndfJ/JxxMd3nw31uyKoY2naivF+r29V+Lc0svZxe1JvvIRmi8hUsrMvdOwgS1o6uBHmiz91geQ0ylPP0aj1VUA==",
"license": "ISC",
"engines": {
"node": ">=12"
}
},
"node_modules/d3-transition": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/d3-transition/-/d3-transition-3.0.1.tgz",
"integrity": "sha512-ApKvfjsSR6tg06xrL434C0WydLr7JewBB3V+/39RMHsaXTOG0zmt/OAXeng5M5LBm0ojmxJrpomQVZ1aPvBL4w==",
"license": "ISC",
"dependencies": {
"d3-color": "1 - 3",
"d3-dispatch": "1 - 3",
"d3-ease": "1 - 3",
"d3-interpolate": "1 - 3",
"d3-timer": "1 - 3"
},
"engines": {
"node": ">=12"
},
"peerDependencies": {
"d3-selection": "2 - 3"
}
},
"node_modules/d3-zoom": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/d3-zoom/-/d3-zoom-3.0.0.tgz",
"integrity": "sha512-b8AmV3kfQaqWAuacbPuNbL6vahnOJflOhexLzMMNLga62+/nh0JzvJ0aO/5a5MVgUFGS7Hu1P9P03o3fJkDCyw==",
"license": "ISC",
"dependencies": {
"d3-dispatch": "1 - 3",
"d3-drag": "2 - 3",
"d3-interpolate": "1 - 3",
"d3-selection": "2 - 3",
"d3-transition": "2 - 3"
},
"engines": {
"node": ">=12"
}
},
"node_modules/debug": { "node_modules/debug": {
"version": "4.4.3", "version": "4.4.3",
"resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz", "resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz",
@ -7194,46 +6940,6 @@
"url": "https://github.com/sponsors/sindresorhus" "url": "https://github.com/sponsors/sindresorhus"
} }
}, },
"node_modules/float-tooltip": {
"version": "1.7.5",
"resolved": "https://registry.npmjs.org/float-tooltip/-/float-tooltip-1.7.5.tgz",
"integrity": "sha512-/kXzuDnnBqyyWyhDMH7+PfP8J/oXiAavGzcRxASOMRHFuReDtofizLLJsf7nnDLAfEaMW4pVWaXrAjtnglpEkg==",
"license": "MIT",
"dependencies": {
"d3-selection": "2 - 3",
"kapsule": "^1.16",
"preact": "10"
},
"engines": {
"node": ">=12"
}
},
"node_modules/force-graph": {
"version": "1.51.4",
"resolved": "https://registry.npmjs.org/force-graph/-/force-graph-1.51.4.tgz",
"integrity": "sha512-TdJ2KbkoiDQ7NIRx8IPGD0mAXXpLhamS7c+b7W98b0MHG7lphnda1VOQX/98UDTsttIAdH4TcP0l0MauSnLK8w==",
"license": "MIT",
"dependencies": {
"@tweenjs/tween.js": "18 - 25",
"accessor-fn": "1",
"bezier-js": "3 - 6",
"canvas-color-tracker": "^1.3",
"d3-array": "1 - 3",
"d3-drag": "2 - 3",
"d3-force-3d": "2 - 3",
"d3-scale": "1 - 4",
"d3-scale-chromatic": "1 - 3",
"d3-selection": "2 - 3",
"d3-zoom": "2 - 3",
"float-tooltip": "^1.7",
"index-array-by": "1",
"kapsule": "^1.16",
"lodash-es": "4"
},
"engines": {
"node": ">=12"
}
},
"node_modules/forwarded-parse": { "node_modules/forwarded-parse": {
"version": "2.1.2", "version": "2.1.2",
"resolved": "https://registry.npmjs.org/forwarded-parse/-/forwarded-parse-2.1.2.tgz", "resolved": "https://registry.npmjs.org/forwarded-parse/-/forwarded-parse-2.1.2.tgz",
@ -7509,30 +7215,12 @@
"node": ">=18" "node": ">=18"
} }
}, },
"node_modules/index-array-by": {
"version": "1.4.2",
"resolved": "https://registry.npmjs.org/index-array-by/-/index-array-by-1.4.2.tgz",
"integrity": "sha512-SP23P27OUKzXWEC/TOyWlwLviofQkCSCKONnc62eItjp69yCZZPqDQtr3Pw5gJDnPeUMqExmKydNZaJO0FU9pw==",
"license": "MIT",
"engines": {
"node": ">=12"
}
},
"node_modules/inline-style-parser": { "node_modules/inline-style-parser": {
"version": "0.2.7", "version": "0.2.7",
"resolved": "https://registry.npmjs.org/inline-style-parser/-/inline-style-parser-0.2.7.tgz", "resolved": "https://registry.npmjs.org/inline-style-parser/-/inline-style-parser-0.2.7.tgz",
"integrity": "sha512-Nb2ctOyNR8DqQoR0OwRG95uNWIC0C1lCgf5Naz5H6Ji72KZ8OcFZLz2P5sNgwlyoJ8Yif11oMuYs5pBQa86csA==", "integrity": "sha512-Nb2ctOyNR8DqQoR0OwRG95uNWIC0C1lCgf5Naz5H6Ji72KZ8OcFZLz2P5sNgwlyoJ8Yif11oMuYs5pBQa86csA==",
"license": "MIT" "license": "MIT"
}, },
"node_modules/internmap": {
"version": "2.0.3",
"resolved": "https://registry.npmjs.org/internmap/-/internmap-2.0.3.tgz",
"integrity": "sha512-5Hh7Y1wQbvY5ooGgPbDaL5iYLAPzMTUrjMulskHLH6wnv/A+1q5rgEaiuqEjB+oxGXIVZs1FF+R/KPN3ZSQYYg==",
"license": "ISC",
"engines": {
"node": ">=12"
}
},
"node_modules/is-alphabetical": { "node_modules/is-alphabetical": {
"version": "2.0.1", "version": "2.0.1",
"resolved": "https://registry.npmjs.org/is-alphabetical/-/is-alphabetical-2.0.1.tgz", "resolved": "https://registry.npmjs.org/is-alphabetical/-/is-alphabetical-2.0.1.tgz",
@ -7681,15 +7369,6 @@
"integrity": "sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw==", "integrity": "sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw==",
"license": "ISC" "license": "ISC"
}, },
"node_modules/jerrypick": {
"version": "1.1.2",
"resolved": "https://registry.npmjs.org/jerrypick/-/jerrypick-1.1.2.tgz",
"integrity": "sha512-YKnxXEekXKzhpf7CLYA0A+oDP8V0OhICNCr5lv96FvSsDEmrb0GKM776JgQvHTMjr7DTTPEVv/1Ciaw0uEWzBA==",
"license": "MIT",
"engines": {
"node": ">=12"
}
},
"node_modules/jiti": { "node_modules/jiti": {
"version": "1.21.7", "version": "1.21.7",
"resolved": "https://registry.npmjs.org/jiti/-/jiti-1.21.7.tgz", "resolved": "https://registry.npmjs.org/jiti/-/jiti-1.21.7.tgz",
@ -7743,18 +7422,6 @@
"node": ">=6" "node": ">=6"
} }
}, },
"node_modules/kapsule": {
"version": "1.16.3",
"resolved": "https://registry.npmjs.org/kapsule/-/kapsule-1.16.3.tgz",
"integrity": "sha512-4+5mNNf4vZDSwPhKprKwz3330iisPrb08JyMgbsdFrimBCKNHecua/WBwvVg3n7vwx0C1ARjfhwIpbrbd9n5wg==",
"license": "MIT",
"dependencies": {
"lodash-es": "4"
},
"engines": {
"node": ">=12"
}
},
"node_modules/kind-of": { "node_modules/kind-of": {
"version": "6.0.3", "version": "6.0.3",
"resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.3.tgz", "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.3.tgz",
@ -7799,12 +7466,6 @@
"url": "https://github.com/sponsors/sindresorhus" "url": "https://github.com/sponsors/sindresorhus"
} }
}, },
"node_modules/lodash-es": {
"version": "4.18.1",
"resolved": "https://registry.npmjs.org/lodash-es/-/lodash-es-4.18.1.tgz",
"integrity": "sha512-J8xewKD/Gk22OZbhpOVSwcs60zhd95ESDwezOFuA3/099925PdHJ7OFHNTGtajL3AlZkykD32HykiMo+BIBI8A==",
"license": "MIT"
},
"node_modules/longest-streak": { "node_modules/longest-streak": {
"version": "3.1.0", "version": "3.1.0",
"resolved": "https://registry.npmjs.org/longest-streak/-/longest-streak-3.1.0.tgz", "resolved": "https://registry.npmjs.org/longest-streak/-/longest-streak-3.1.0.tgz",
@ -7815,18 +7476,6 @@
"url": "https://github.com/sponsors/wooorm" "url": "https://github.com/sponsors/wooorm"
} }
}, },
"node_modules/loose-envify": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/loose-envify/-/loose-envify-1.4.0.tgz",
"integrity": "sha512-lyuxPGr/Wfhrlem2CL/UcnUc1zcqKAImBDzukY7Y5F/yQiNdko6+fRLevlw1HgMySw7f611UIY408EtxRSoK3Q==",
"license": "MIT",
"dependencies": {
"js-tokens": "^3.0.0 || ^4.0.0"
},
"bin": {
"loose-envify": "cli.js"
}
},
"node_modules/lru-cache": { "node_modules/lru-cache": {
"version": "5.1.1", "version": "5.1.1",
"resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-5.1.1.tgz", "resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-5.1.1.tgz",
@ -9511,6 +9160,7 @@
"version": "4.1.1", "version": "4.1.1",
"resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz", "resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
"integrity": "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==", "integrity": "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==",
"dev": true,
"license": "MIT", "license": "MIT",
"engines": { "engines": {
"node": ">=0.10.0" "node": ">=0.10.0"
@ -10023,16 +9673,6 @@
"node": ">=0.10.0" "node": ">=0.10.0"
} }
}, },
"node_modules/preact": {
"version": "10.29.1",
"resolved": "https://registry.npmjs.org/preact/-/preact-10.29.1.tgz",
"integrity": "sha512-gQCLc/vWroE8lIpleXtdJhTFDogTdZG9AjMUpVkDf2iTCNwYNWA+u16dL41TqUDJO4gm2IgrcMv3uTpjd4Pwmg==",
"license": "MIT",
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/preact"
}
},
"node_modules/process-warning": { "node_modules/process-warning": {
"version": "5.0.0", "version": "5.0.0",
"resolved": "https://registry.npmjs.org/process-warning/-/process-warning-5.0.0.tgz", "resolved": "https://registry.npmjs.org/process-warning/-/process-warning-5.0.0.tgz",
@ -10058,17 +9698,6 @@
"node": ">=0.4.0" "node": ">=0.4.0"
} }
}, },
"node_modules/prop-types": {
"version": "15.8.1",
"resolved": "https://registry.npmjs.org/prop-types/-/prop-types-15.8.1.tgz",
"integrity": "sha512-oj87CgZICdulUohogVAR7AjlC0327U4el4L6eAvOqCeudMDVU0NThNaV+b9Df4dXgSP1gXMTnPdhfe/2qDH5cg==",
"license": "MIT",
"dependencies": {
"loose-envify": "^1.4.0",
"object-assign": "^4.1.1",
"react-is": "^16.13.1"
}
},
"node_modules/property-information": { "node_modules/property-information": {
"version": "7.1.0", "version": "7.1.0",
"resolved": "https://registry.npmjs.org/property-information/-/property-information-7.1.0.tgz", "resolved": "https://registry.npmjs.org/property-information/-/property-information-7.1.0.tgz",
@ -10248,44 +9877,6 @@
"react": "^19.2.6" "react": "^19.2.6"
} }
}, },
"node_modules/react-force-graph-2d": {
"version": "1.29.1",
"resolved": "https://registry.npmjs.org/react-force-graph-2d/-/react-force-graph-2d-1.29.1.tgz",
"integrity": "sha512-1Rl/1Z3xy2iTHKj6a0jRXGyiI86xUti81K+jBQZ+Oe46csaMikp47L5AjrzA9hY9fNGD63X8ffrqnvaORukCuQ==",
"license": "MIT",
"dependencies": {
"force-graph": "^1.51",
"prop-types": "15",
"react-kapsule": "^2.5"
},
"engines": {
"node": ">=12"
},
"peerDependencies": {
"react": "*"
}
},
"node_modules/react-is": {
"version": "16.13.1",
"resolved": "https://registry.npmjs.org/react-is/-/react-is-16.13.1.tgz",
"integrity": "sha512-24e6ynE2H+OKt4kqsOvNd8kBpV65zoxbA4BVsEOB3ARVWQki/DHzaUoC5KuON/BiccDaCCTZBuOcfZs70kR8bQ==",
"license": "MIT"
},
"node_modules/react-kapsule": {
"version": "2.5.7",
"resolved": "https://registry.npmjs.org/react-kapsule/-/react-kapsule-2.5.7.tgz",
"integrity": "sha512-kifAF4ZPD77qZKc4CKLmozq6GY1sBzPEJTIJb0wWFK6HsePJatK3jXplZn2eeAt3x67CDozgi7/rO8fNQ/AL7A==",
"license": "MIT",
"dependencies": {
"jerrypick": "^1.1.1"
},
"engines": {
"node": ">=12"
},
"peerDependencies": {
"react": ">=16.13.1"
}
},
"node_modules/react-markdown": { "node_modules/react-markdown": {
"version": "9.1.0", "version": "9.1.0",
"resolved": "https://registry.npmjs.org/react-markdown/-/react-markdown-9.1.0.tgz", "resolved": "https://registry.npmjs.org/react-markdown/-/react-markdown-9.1.0.tgz",
@ -10991,12 +10582,6 @@
"integrity": "sha512-P4nbQYQfePJxRSmY+v/KINxVucm4NF3p3s7pJveMTtom52FR4YGltUQLB8idDXwDDWW+eYrWDFbuzUnjoWHF7g==", "integrity": "sha512-P4nbQYQfePJxRSmY+v/KINxVucm4NF3p3s7pJveMTtom52FR4YGltUQLB8idDXwDDWW+eYrWDFbuzUnjoWHF7g==",
"license": "MIT" "license": "MIT"
}, },
"node_modules/tinycolor2": {
"version": "1.6.0",
"resolved": "https://registry.npmjs.org/tinycolor2/-/tinycolor2-1.6.0.tgz",
"integrity": "sha512-XPaBkWQJdsf3pLKJV9p4qN/S+fm2Oj8AIPo1BTUhg5oxkvm9+SVEGFdhyOz7tTdUTfvxMiAs4sp6/eZO2Ew+pw==",
"license": "MIT"
},
"node_modules/tinyglobby": { "node_modules/tinyglobby": {
"version": "0.2.16", "version": "0.2.16",
"resolved": "https://registry.npmjs.org/tinyglobby/-/tinyglobby-0.2.16.tgz", "resolved": "https://registry.npmjs.org/tinyglobby/-/tinyglobby-0.2.16.tgz",


@ -28,7 +28,6 @@
"pino": "^10.3.1", "pino": "^10.3.1",
"react": "^19.0.0", "react": "^19.0.0",
"react-dom": "^19.0.0", "react-dom": "^19.0.0",
"react-force-graph-2d": "^1.27.0",
"react-markdown": "^9.0.0", "react-markdown": "^9.0.0",
"remark-gfm": "^4.0.0", "remark-gfm": "^4.0.0",
"remark-wiki-link": "^2.0.1", "remark-wiki-link": "^2.0.1",