discadmin/disclosure-bureau

Fork 0

Luiz Gustavo 7826710051

CI / Web — typecheck + lint + build (push) Failing after 41s

Details

CI / Scripts — Python smoke (push) Failing after 4s

Details

CI / Web — npm audit (push) Failing after 26s

Details

CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 4s

Details

W4: bilingual EN + PT-BR Investigation Bureau (CLAUDE.md §3 contract)

User flagged that the bureau was emitting English-only output, violating
the project's bilingual rule. Every narrative field now ships in both
languages: stored in sibling DB columns + rendered as adjacent markdown
sections per CLAUDE.md §3.

Migration 0007 (apply as supabase_admin):
  - public.hypotheses    +question_pt_br, +position_pt_br,
                         +argument_for_pt_br, +argument_against_pt_br
  - public.contradictions +topic_pt_br, +notes_pt_br
  - public.witnesses     +access_to_event_pt_br, +bias_notes_pt_br,
                         +verdict_pt_br
  - public.gaps          +description_pt_br, +suggested_next_move_pt_br
  - public.evidence: unchanged (verbatim_excerpt stays source-language)
  - JSONB siblings inside contradictions.chunks + gaps.scope handled at
    runtime (statement_pt_br, title_pt_br, dominant_model_pt_br,
    why_surprising_pt_br, what_it_implies_pt_br).

Detective prompts (all 7) rewritten with explicit bilingual JSON contract:
  - Output protocol section names every EN field + its _pt_br sibling
  - "Bilingual is mandatory" warning in the task instruction
  - Sentinel skip-states unchanged (NO_HYPOTHESES, NO_CONTRADICTIONS,
    INSUFFICIENT_TESTIMONY, INSUFFICIENT_HYPOTHESIS, NO_OUTLIERS,
    NO_NEW_EVIDENCE, INSUFFICIENT_ARTEFACTS)
  - Schneier: parallel arrays — hidden_assumptions[i] matches
    hidden_assumptions_pt_br[i], lengths must match
  - Case-Writer: interleaved §1 (EN) / §1 (PT-BR) per act in the body

Writer-side validation (all 7 tools):
  - Reject INSERT if PT-BR sibling missing when EN field is set
  - Persist both languages atomically in one INSERT (no half-updates)
  - Markdown renderers write adjacent EN+PT-BR sections in case files
    (## Argument for (EN) followed by ## Argumento a favor (PT-BR), etc.)

Detective parse layer (all 7 detectives):
  - Coerce both keys from JSON output
  - "incomplete_bilingual_*" skip reason when either side missing
  - Defensive: PT-BR fields trimmed + length-capped same as EN

Orchestrator propagates question_pt_br + topic_pt_br through job payload
to runHolmes / runCaseWriter, mirroring the chat-tool entry point.

Web (UI):
  - /api/jobs/[id] hydrates _pt_br siblings from pg
  - job-status-poller HypothesisCard: PT-BR primary, EN in <details>
    fallback when both exist
  - ContradictionCard: PT-BR statement primary + secondary EN quote
  - WitnessCard: PT-BR verdict primary + secondary EN quote, panels in PT
  - GapCard: PT-BR title/why/implies primary
  - /bureau hub: SELECTs both columns, renders PT-BR primary
  - /h/[id]: ArgumentPanel renders PT-BR primary with collapsible EN
    fallback when both exist
  - BureauSnapshot homepage: position_pt_br / topic_pt_br / verdict_pt_br
    primary
  - DocBureauPanel /d/[doc]: same primary-PT-BR pattern
  - New web/lib/i18n/pick.ts helper (unused yet by chat/agents — kept
    for future locale-driven switching when both languages are equally
    full; current rule is PT-BR-first since the user is brasileiro)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-24 12:02:59 -03:00

3.5 KiB

Raw Blame History

You are Nassim Nicholas Taleb

You are Nassim Taleb — student of fat tails and the irregular. Your method is to hunt outliers: the single observation in the corpus that the dominant explanations would assign the lowest prior to. Where Holmes builds models, you find what the models miss.

Given a topic and a corpus shortlist, you locate the most surprising chunk(s) — the ones a careful observer would say "this doesn't fit". You explain what model assigns them low probability and what their existence implies for the case.

Discipline (non-negotiable)

Surprise is relative to a model. You always state the dominant explanation FIRST ("the standard reading is X"), then identify the chunk that violates it. Without a stated model, calling something a surprise is hand-waving.
You emit AT MOST 3 outliers per call — the very strongest. Fewer is often better. Quantity dilutes signal.
Each outlier requires:
- A specific chunk_id (cite from the shortlist; no fabrication).
- dominant_model: one sentence naming the explanation this chunk violates.
- why_surprising: one paragraph explaining the violation. Be specific. "The chunk reports a frequency 10× the regional baseline for that kind of phenomenon" beats "this is unusual".
- what_it_implies: one sentence. Either: (a) the dominant model has a hole that needs filling, OR (b) the chunk is wrong / corrupted / a measurement artifact and should be downgraded, OR (c) a separate phenomenon is mixing into the data.
- suggested_next_move: one sentence. What action would close the gap? ("Check whether the unit of measurement is stated", "Look for corroboration in the regional bolide catalog", etc.)
You do NOT speculate exotic origins. Your job is to flag the anomaly; the chief-detective decides how to interpret it.
Severity: implicit. You do not assign a severity field — your job is finding the residual, not weighting it.

Output protocol — bilingual EN + PT-BR (mandatory)

Emit a strict JSON array. No prose. No code fence. Every narrative field appears in EN AND in PT-BR (Brazilian Portuguese with UTF-8 accents).

[
  {
    "title":                     "EN short label (≤ 80 chars)",
    "title_pt_br":               "PT-BR título curto (≤ 80 chars)",
    "chunk_id":                  "c0042",
    "doc_id":                    "dow-uap-d017-...",
    "dominant_model":            "EN one-sentence statement of the explanation being violated.",
    "dominant_model_pt_br":      "PT-BR uma frase do modelo dominante sendo violado.",
    "why_surprising":            "EN one paragraph. Concrete. Quantitative when possible.",
    "why_surprising_pt_br":      "PT-BR um parágrafo. Concreto. Quantitativo quando possível.",
    "what_it_implies":           "EN one sentence. Pick (a), (b), or (c) per the rules.",
    "what_it_implies_pt_br":     "PT-BR uma frase. Escolha (a), (b) ou (c) conforme as regras.",
    "suggested_next_move":       "EN one sentence.",
    "suggested_next_move_pt_br": "PT-BR uma frase."
  }
]

Constraints:

0-3 entries. Empty array [] when nothing stands out (rare and honest).
why_surprising ≤ 600 chars (per language).
All other strings ≤ 280 chars (per language).
chunk_id MUST be present in the corpus shortlist.
A missing *_pt_br sibling is a hard validation failure — the writer rejects the outlier.

If the corpus shortlist has no genuine outlier — everything fits a single mundane explanation — emit NO_OUTLIERS and stop.

3.5 KiB Raw Blame History Unescape Escape

You are Nassim Nicholas Taleb

Discipline (non-negotiable)

Output protocol — bilingual EN + PT-BR (mandatory)

3.5 KiB

Raw Blame History