Commit graph

11 commits

Author SHA1 Message Date
Luiz Gustavo
2ac42b99a7 W5.5 (Phase 3C): Sun-Tzu strategist feeder + entity hero illustrations
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 33s
CI / Scripts — Python smoke (push) Failing after 5s
CI / Web — npm audit (push) Failing after 24s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 3s
Sun-Tzu (silent backend) — builds the strongest pro-anomaly brief the
corpus supports for any topic. Bilingual JSON: thesis + 2-4 pillars
(each with claim + citation-backed support) + honest residual
unexplained clause. NEVER surfaced reader-facing.

  Migration 0009 (apply as supabase_admin):
    public.pro_anomaly_briefs  brief_pk BIGSERIAL PK
                               brief_id B-NNNN unique
                               topic + topic_pt_br
                               thesis + thesis_pt_br
                               pillars JSONB
                               unexplained + unexplained_pt_br
                               doc_id, job_id, created_by, created_at
    + brief_id_seq sequence
    + GIN trigram indexes on topic + topic_pt_br
    + RLS policies (investigator INSERT, public SELECT)
    + GRANTs on seq + table to investigator

  prompts/sun-tzu.md
    "Adversarial strategist who plays the pro-disclosure side with the
    same rigour a red-team plays skeptic" — single thesis, 2-4 pillars,
    honest residual. Every claim cites a chunk. No fabrication from
    training-time knowledge. Output INTERNAL — case-writer pulls it.
    Bilingual mandatory. NO_STRONG_CASE sentinel when corpus is thin.

  detectives/sun_tzu.ts
    Grounds with hybridSearch top 18 chunks, calls Sonnet, parses
    JSON strict, calls writeProAnomalyBrief.

  tools/write_pro_anomaly_brief.ts
    Validates 2-4 pillars with bilingual claim+support, requires at
    least one [[wiki-link]] citation per pillar, INSERTs.
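
A minimal sketch of the pillar checks described above, not the shipped tool. The Pillar field names and the rule that a citation in either language's support text is enough are assumptions made for illustration.

```typescript
// Hypothetical pillar shape; the real JSONB layout may differ.
interface Pillar {
  claim: string;
  claim_pt_br: string;
  support: string;
  support_pt_br: string;
}

// Matches a [[wiki-link]] style citation.
const WIKI_LINK = /\[\[[^\[\]]+\]\]/;

// Returns a list of validation errors; an empty list means the brief may be inserted.
function validatePillars(pillars: Pillar[]): string[] {
  const errors: string[] = [];
  if (pillars.length < 2 || pillars.length > 4) {
    errors.push(`expected 2-4 pillars, got ${pillars.length}`);
  }
  pillars.forEach((p, i) => {
    for (const key of ["claim", "claim_pt_br", "support", "support_pt_br"] as const) {
      if (p[key].trim() === "") errors.push(`pillar ${i}: ${key} is empty`);
    }
    // Assumption: a citation in either language's support text satisfies the rule.
    if (!WIKI_LINK.test(p.support) && !WIKI_LINK.test(p.support_pt_br)) {
      errors.push(`pillar ${i}: no [[wiki-link]] citation`);
    }
  });
  return errors;
}
```

Returning every error at once, rather than throwing on the first, would let the audit log show the full reason a brief was rejected.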

  orchestrator: new kind "anomaly_brief" dispatches Sun-Tzu.

Case-writer integration (detectives/case_writer.ts):
  - Pulls most recent matching brief via ILIKE on topic or doc_id.
  - Renders brief as a separate prompt section labelled
    "Strategic brief (internal — do NOT cite or attribute)".
  - Instructs the narrator to weave the thesis as a quiet through-
    line, use pillar facts in scenes, let the unexplained clause
    inform the closing paragraph. Forbidden to name "the analyst",
    say "a brief argues", or use the words "thesis"/"pillar"
    explicitly. Translate it into prose.

Entity hero illustrations:
  - 3 painterly editorial illustrations generated via Nano Banana
    Pro at 2K, stored under /data/disclosure/processing/case-art/:
    * EV-1947-06-24-kenneth-arnold-sighting.png — cockpit POV of
      Arnold in a CallAir A-2 over Mount Rainier, 9 chevron disc
      objects in formation, 1947 Life-magazine register.
    * EV-1947-07-08-roswell-incident.png — debris field in NM
      desert, USAAF officer in 1947 uniform examining foil
      fragments, period staff car.
    * EV-1947-06-21-maury-island-incident.png — wooden patrol
      boat on Puget Sound, 6 doughnut craft hovering, one
      shedding glowing slag, Harold Dahl + son + dog watching.
  - app/e/[cls]/[id]/page.tsx: full-bleed editorial hero replaces
    the old gradient header card when an illustration exists for
    that entity_id. Title sits over the painting with gradient
    overlay. "Ilustração editorial" chip in the top-right.

Quota note: Claude OAuth still rate-limited as of this commit, so
Sun-Tzu hasn't been smoke-tested in production. Code is shipped and
ready; first brief will land when the weekly quota refreshes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 16:41:20 -03:00
Luiz Gustavo
f2b7b116ce W5.3 (Phase 3A): entity summaries — sub-pages get magazine-grade prose
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 45s
CI / Scripts — Python smoke (push) Failing after 4s
CI / Web — npm audit (push) Failing after 41s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 3s
Today /sightings, /witnesses, /objects, /locations and /operations show
a name + mention count and nothing else. After this commit, each row carries a
60-100 word bilingual narrative summary written from the chunks where
the entity actually appears.

Migration 0008 (apply as supabase_admin):
  public.entities  +summary_en TEXT
                   +summary_pt_br TEXT
                   +summary_generated_at TIMESTAMPTZ
                   +summary_model TEXT
                   +summary_status TEXT
                     CHECK ('pending'|'ai_generated'|'curated'|'refused')
  + index on summary_status
  + GRANT UPDATE (summary_*) ON entities TO investigator
  + new policy entities_investigator_update_summary (RLS UPDATE for
    investigator role)

Enrichment script (investigator-runtime/scripts/enrich_entity_summaries.ts):
  - Per-class config (chunk_k, min_mentions, max_per_class)
  - Path A: entity_mentions JOIN chunks (high-precision linker)
  - Path B (fallback): hybridSearch on canonical_name + aliases when
    entity_mentions returns zero. This is what unlocked Kenneth Arnold
    and similar entities — their wiki YAML has high total_mentions
    counted from frontmatter mentioned_in[], but the entity_mentions
    extractor was silent because the matches came from the wiki text,
    not the OCR chunks.
  - Sonnet 4.6 via OAuth Max, ~$0.04 per entity, ~$10 for the full
    260-entity bulk run.
  - INSUFFICIENT skip when chunks can't sustain a 60-word summary —
    refused entries get summary_status='refused' so they're not retried.

UI uplift:
  - lib/retrieval/entity-pages.ts: getEntityCore now prefers the DB
    summary (ai_generated or curated) over wiki YAML narrative.
  - components/entity-list-page.tsx:
    * SELECT now pulls summary_en, summary_pt_br, summary_status
    * Sorted with summary-enriched rows first (so the magazine grid
      lands on quality content immediately)
    * MagazineGrid: 4-line summary preview replaces aliases line
    * CompactGrid: enriched rows render as full editorial cards,
      bare rows fall back to a compact table below
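
The "enriched rows first" ordering could be sketched as below. This is an illustration only: the EntityRow shape and the mention-count tie-break are assumptions, and the real ordering may well happen in SQL.

```typescript
// Hypothetical row shape for illustration.
interface EntityRow {
  name: string;
  total_mentions: number;
  summary_status: string | null;
}

// Statuses that count as "summary-enriched".
const ENRICHED = new Set(["ai_generated", "curated"]);

// Returns a new array: enriched rows first, then by mention count (descending).
function sortEnrichedFirst(rows: EntityRow[]): EntityRow[] {
  const rank = (r: EntityRow): number => (ENRICHED.has(r.summary_status ?? "") ? 0 : 1);
  return [...rows].sort((a, b) => {
    const byStatus = rank(a) - rank(b);
    if (byStatus !== 0) return byStatus;
    return b.total_mentions - a.total_mentions; // assumed tie-break
  });
}
```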

Smoke results:
  - Kenneth Arnold sighting: "On June 24, 1947, pilot Kenneth Arnold
    reported sighting unidentified objects over the Pacific Northwest,
    and the account spread worldwide. It set off a run of similar
    reports: County Commissioner Crankes saw comparable objects after
    Arnold's account reached the press, and United Airlines pilot
    Emil H. Smith spotted flying discs on July 4 during a routine
    flight out of Boise, Idaho..."
  - Roswell Incident: includes Colonel Corso's 1997 book + the 1995
    GAO finding that radio messages from Oct 46–Feb 47 were destroyed
    + Senator Strom Thurmond's foreword. Real magazine-grade content.

Background bulk run kicked off across all 5 classes (event,
uap_object, person, location, organization) — populating live as
the homepage rebuilds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 15:37:01 -03:00
Luiz Gustavo
7826710051 W4: bilingual EN + PT-BR Investigation Bureau (CLAUDE.md §3 contract)
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 41s
CI / Scripts — Python smoke (push) Failing after 4s
CI / Web — npm audit (push) Failing after 26s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 4s
User flagged that the bureau was emitting English-only output, violating
the project's bilingual rule. Every narrative field now ships in both
languages: stored in sibling DB columns + rendered as adjacent markdown
sections per CLAUDE.md §3.

Migration 0007 (apply as supabase_admin):
  - public.hypotheses    +question_pt_br, +position_pt_br,
                         +argument_for_pt_br, +argument_against_pt_br
  - public.contradictions +topic_pt_br, +notes_pt_br
  - public.witnesses     +access_to_event_pt_br, +bias_notes_pt_br,
                         +verdict_pt_br
  - public.gaps          +description_pt_br, +suggested_next_move_pt_br
  - public.evidence: unchanged (verbatim_excerpt stays source-language)
  - JSONB siblings inside contradictions.chunks + gaps.scope handled at
    runtime (statement_pt_br, title_pt_br, dominant_model_pt_br,
    why_surprising_pt_br, what_it_implies_pt_br).

Detective prompts (all 7) rewritten with explicit bilingual JSON contract:
  - Output protocol section names every EN field + its _pt_br sibling
  - "Bilingual is mandatory" warning in the task instruction
  - Sentinel skip-states unchanged (NO_HYPOTHESES, NO_CONTRADICTIONS,
    INSUFFICIENT_TESTIMONY, INSUFFICIENT_HYPOTHESIS, NO_OUTLIERS,
    NO_NEW_EVIDENCE, INSUFFICIENT_ARTEFACTS)
  - Schneier: parallel arrays — hidden_assumptions[i] matches
    hidden_assumptions_pt_br[i], lengths must match
  - Case-Writer: interleaved §1 (EN) / §1 (PT-BR) per act in the body

Writer-side validation (all 7 tools):
  - Reject INSERT if PT-BR sibling missing when EN field is set
  - Persist both languages atomically in one INSERT (no half-updates)
  - Markdown renderers write adjacent EN+PT-BR sections in case files
    (## Argument for (EN) followed by ## Argumento a favor (PT-BR), etc.)
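
One way to express the "reject if the PT-BR sibling is missing" rule is a small helper like the one below. It is a sketch under assumptions: the function name and the idea of passing an explicit list of narrative fields are hypothetical, not taken from the writer tools.

```typescript
// Returns the EN fields that are set but whose `_pt_br` sibling is missing or blank.
// `narrativeFields` is an assumed, caller-supplied list (e.g. "question", "position").
function missingPtBrSiblings(
  record: Record<string, unknown>,
  narrativeFields: readonly string[],
): string[] {
  const filled = (v: unknown): boolean => typeof v === "string" && v.trim() !== "";
  return narrativeFields.filter(
    (field) => filled(record[field]) && !filled(record[`${field}_pt_br`]),
  );
}
```

A writer could refuse the INSERT whenever this list is non-empty, which keeps both languages in one atomic statement.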

Detective parse layer (all 7 detectives):
  - Coerce both keys from JSON output
  - "incomplete_bilingual_*" skip reason when either side missing
  - Defensive: PT-BR fields trimmed + length-capped same as EN

Orchestrator propagates question_pt_br + topic_pt_br through job payload
to runHolmes / runCaseWriter, mirroring the chat-tool entry point.

Web (UI):
  - /api/jobs/[id] hydrates _pt_br siblings from pg
  - job-status-poller HypothesisCard: PT-BR primary, EN in <details>
    fallback when both exist
  - ContradictionCard: PT-BR statement primary + secondary EN quote
  - WitnessCard: PT-BR verdict primary + secondary EN quote, panels in PT
  - GapCard: PT-BR title/why/implies primary
  - /bureau hub: SELECTs both columns, renders PT-BR primary
  - /h/[id]: ArgumentPanel renders PT-BR primary with collapsible EN
    fallback when both exist
  - BureauSnapshot homepage: position_pt_br / topic_pt_br / verdict_pt_br
    primary
  - DocBureauPanel /d/[doc]: same primary-PT-BR pattern
  - New web/lib/i18n/pick.ts helper (not yet used by chat/agents; kept
    for future locale-driven switching when both languages are equally
    full; current rule is PT-BR-first since the user is Brazilian)
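
A PT-BR-first picker with an EN fallback might look like the sketch below. The signature and the Locale type are assumptions; the actual pick.ts is not shown in this log.

```typescript
type Locale = "pt-BR" | "en";

// Returns the preferred-language text, falling back to the other language
// when the preferred one is missing or blank. Defaults to PT-BR first.
function pick(
  en: string | null | undefined,
  ptBr: string | null | undefined,
  locale: Locale = "pt-BR",
): string {
  const clean = (s: string | null | undefined): string => (s ?? "").trim();
  const [first, second] =
    locale === "pt-BR" ? [clean(ptBr), clean(en)] : [clean(en), clean(ptBr)];
  return first !== "" ? first : second;
}
```

With the default locale, pick("Hello", "Olá") yields the Portuguese text, and pick("Hello", null) falls back to English.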

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 12:02:59 -03:00
Luiz Gustavo
f013bea462 W3.9 followup: mount case/ ro into web container
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 30s
CI / Scripts — Python smoke (push) Failing after 4s
CI / Web — npm audit (push) Failing after 35s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 3s
/c/[slug] returned 404 even after the W3.9 web rebuild because the web
container's volume list didn't include the case/ directory the
investigator-runtime writes to. The BureauSnapshot file-listing for
Case reports gracefully fell back to empty, but /c/<slug> can't fall
back: it has to read the markdown.

Fix:
  - Mount ${CASE_ROOT:-/data/disclosure/case}:/data/ufo/case:ro (read-only,
    same pattern as wiki/processing/raw).
  - Set CASE_ROOT=/data/ufo/case env in the web container so the
    /c/[slug] page and BureauSnapshot resolve the same path.

Verified live: /c/green-fireballs-sandia now serves HTTP 200 with the
Watson narrative parsed + rendered via MarkdownBody.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 22:45:00 -03:00
Luiz Gustavo
54a26f8db8 W3 followup: drop _FOR_WEB token, fix claude CLI args + writer guards, BIGSERIAL grants
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 46s
CI / Scripts — Python smoke (push) Failing after 4s
CI / Web — npm audit (push) Failing after 34s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 4s
Token consolidation:
- docker-compose web service now reads ${CLAUDE_CODE_OAUTH_TOKEN} directly,
  drop the W1-F8 CLAUDE_CODE_OAUTH_TOKEN_FOR_WEB indirection (user feedback:
  one var name, no _FOR_WEB suffix).

investigator-runtime claude.ts:
- --system-prompt silently dropped by CLI v2.1.150 for multi-KB prompts;
  inline the system content into the user prompt with a separator
  (mirrors scripts/reextract/run.py pattern).
- Multi-line prompts via positional -- broke ("Input must be provided …");
  pipe via stdin instead.
- --allowedTools "" is rejected; when no tools wanted, omit it and explicitly
  --disallowedTools the writer/reader set so the model can't reach for any.

investigator-runtime locard.ts:
- Log the raw response (first 600 chars) to container stderr; this saved hours
  of debugging when the writer rejected the output.
- Grade fallback: when Locard omits `grade` but provides custody_steps,
  infer the highest grade that fits (≥3 → A, ≥2 → B, ≥1 → C).
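
The grade fallback thresholds can be read as the following sketch. The function name is hypothetical, and returning null for an empty custody chain is an assumption, since the commit does not say what happens with zero steps.

```typescript
type Grade = "A" | "B" | "C";

// Infers the highest grade that fits the number of custody steps:
// >=3 -> A, >=2 -> B, >=1 -> C. Returns null when there are no steps (assumed).
function inferGrade(custodySteps: readonly unknown[]): Grade | null {
  const n = custodySteps.length;
  if (n >= 3) return "A";
  if (n >= 2) return "B";
  if (n >= 1) return "C";
  return null;
}
```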

investigator-runtime write_evidence.ts:
- Filter related_hypotheses entries with empty/null hypothesis_id silently
  (Locard sometimes emits [{}] when it knows no link yet) instead of
  failing the whole write.
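
Filtering out the empty links before the write could be as small as this sketch. The RelatedHypothesis shape is an assumption for illustration.

```typescript
// Hypothetical entry shape; Locard may emit `{}` when it knows no link yet.
interface RelatedHypothesis {
  hypothesis_id?: string | null;
  [extra: string]: unknown;
}

// Keeps only entries that carry a non-empty hypothesis_id.
function dropUnlinked(entries: readonly RelatedHypothesis[]): RelatedHypothesis[] {
  return entries.filter(
    (e) => typeof e.hypothesis_id === "string" && e.hypothesis_id.trim() !== "",
  );
}
```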

Migration 0006_investigator_serial_sequences.sql:
- BIGSERIAL on the 7 investigation tables created auto-sequences
  (evidence_evidence_pk_seq etc) that 0004 forgot to GRANT to the
  investigator role. Without those grants every INSERT failed with
  "permission denied for sequence …". Grant USAGE/SELECT/UPDATE on each
  auto-seq.

Verified live: Locard wrote E-0002 + E-0003 from real Sandia chunks
(green fireball Feb 1949; cobalt particle analysis). Grade B, confidence
high, custody chain of 3 steps with honest gaps. Cost $0.09 for both,
~70s wall.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 21:05:35 -03:00
Luiz Gustavo
189a771cbe W3.1-W3.4: Investigation Bureau foundation — migrations, runtime, Locard
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 38s
CI / Scripts — Python smoke (push) Failing after 3s
CI / Web — npm audit (push) Failing after 33s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 4s
Migrations:
- 0004_investigation_bureau.sql: 7 new tables (investigation_jobs + evidence,
  hypotheses, contradictions, witnesses, gaps, residual_uncertainties), id
  sequences, pg_notify trigger on investigation_jobs, RLS read-only public,
  investigator role with least-privilege grants (no service_role).
- 0005_investigator_write_policies.sql: fixup adding RLS INSERT/UPDATE
  policies bound to investigator + service_role + postgres (RLS with only a
  SELECT policy was silently blocking the worker's claim UPDATE).

investigator-runtime/ (new Bun + TS container):
- src/main.ts: LISTEN/NOTIFY poller, claim-with-SKIP-LOCKED, drain pool,
  healthcheck file, graceful SIGTERM shutdown.
- src/orchestrator.ts: chief-detective dispatch (evidence_chain → Locard).
  Marks job failed when all per-item outputs error; surfaces first errors.
- src/lib/{env,pg,audit,ids,claude}.ts: typed config (gate #8), pool +
  dedicated LISTEN client, NDJSON audit, sequence allocator (E-NNNN etc),
  claude -p subprocess with quota detection (api_error_status=429).
- src/tools/write_evidence.ts: schema-validate (grade A/B/C custody steps),
  resolve chunk_pk via FK, verify verbatim_excerpt actually appears in
  chunk content, INSERT + render case/evidence/E-NNNN.md + audit.
- src/detectives/locard.ts: load chunk → call Claude with locard.md system
  prompt → parse strict JSON → call writeEvidence locally.
- Dockerfile installs `claude` CLI (OAuth) at build time.
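
The verbatim-excerpt check in write_evidence.ts might be sketched as below. Collapsing whitespace before comparing is an assumption added here (OCR chunks often differ in line breaks); the shipped tool may compare strictly.

```typescript
// Collapses runs of whitespace so line breaks in OCR text do not defeat the match.
const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();

// True when the (non-empty) excerpt occurs in the chunk content after normalization.
function excerptAppearsIn(chunkContent: string, excerpt: string): boolean {
  const needle = normalize(excerpt);
  return needle.length > 0 && normalize(chunkContent).includes(needle);
}
```

Rejecting the write when this returns false is what stops a model-paraphrased "quote" from being stored as verbatim evidence.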

Compose:
- new `investigator` service builds from investigator-runtime/, connects
  with low-privilege role, mounts case/ RW and wiki/+raw/ RO, 512m mem cap.

Web:
- /api/admin/investigate/test (POST+GET) gated by middleware (W0-F1).
  POST creates a job, GET polls status. In W3.6 it becomes the chat tool.

End-to-end smoke: INSERT job → pg_notify → claim → Locard dispatch →
claude subprocess invoked. Auth works (CLI v2.1.150). Currently quota
exhausted (weekly limit · resets 3pm UTC) — pipeline catches the typed
isQuota error, marks job failed with surfaced reason. Architecture proven;
quota reset enables real evidence creation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:49:33 -03:00
Luiz Gustavo
55cac8a395 W0+W1+W1.2: security hardening, observability, autocomplete, glitchtip, forgejo CI
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 1m30s
CI / Scripts — Python smoke (push) Failing after 32s
CI / Web — npm audit (push) Failing after 37s
W0 — security hardening (5 fixes verified live on disclosure.top)
- middleware: gate /api/admin/* same as /admin/* (F1)
- imgproxy: tighten LOCAL_FILESYSTEM_ROOT from / to /var/lib/storage (F2)
- studio: real basic-auth label (bcrypt hash, middleware reference) (F3)
- relations: ENABLE ROW LEVEL SECURITY + public SELECT policy (F4)
- migration 0003: fold is_searchable + hybrid_search update into canonical (TD#2)

W1 — observability + resilience + autocomplete
- studio: HOSTNAME=0.0.0.0 so Next.js binds on all interfaces (loopback included) for the healthcheck
- compose: PG_POOL_MAX=20, CLAUDE_CODE_OAUTH_TOKEN gated by separate env
- claude-code.ts: subprocess timeout configurable (CLAUDE_CODE_TIMEOUT_MS)
- openrouter.ts: retry with exponential backoff + Retry-After + in-memory
  circuit breaker (promotes FALLBACK after CB_THRESHOLD failures)
- lib/logger.ts: pino logger (NDJSON prod / pretty dev) + withRequest helper
- middleware: mints correlation_id, stamps x-correlation-id response header,
  emits structured http_request log per /api/* call
- messages/route.ts: switch to structured logger
- 60_meili_index.py: push documents + chunks into Meilisearch
- /api/search/autocomplete: parallel meili search (docs + chunks), 5-8ms p50
- search-autocomplete.tsx: debounced dropdown wired into search-panel
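
The retry and circuit-breaker behaviour listed for openrouter.ts could be sketched as follows. Everything here is illustrative: the base delay, the cap, and the class and function names are assumptions, and only the delta-seconds form of Retry-After is handled (an HTTP-date value falls back to plain backoff).

```typescript
// Delay before the next attempt: honour Retry-After (seconds) when present,
// otherwise exponential backoff (baseMs * 2^attempt), both capped at capMs.
function retryDelayMs(
  attempt: number,
  retryAfter: string | null,
  baseMs = 500,
  capMs = 30_000,
): number {
  const raw = retryAfter?.trim() ?? "";
  const seconds = raw === "" ? NaN : Number(raw);
  if (Number.isFinite(seconds) && seconds >= 0) return Math.min(seconds * 1000, capMs);
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// In-memory breaker: opens after `threshold` consecutive failures.
class CircuitBreaker {
  private failures = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  recordFailure(): void {
    this.failures += 1;
  }

  recordSuccess(): void {
    this.failures = 0; // any success closes the breaker again
  }

  isOpen(): boolean {
    return this.failures >= this.threshold;
  }
}

// While the breaker is open, route requests to the fallback model.
function chooseModel(breaker: CircuitBreaker, primary: string, fallback: string): string {
  return breaker.isOpen() ? fallback : primary;
}
```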

W1.2 — Glitchtip + Forgejo self-hosted
- compose: glitchtip-redis + glitchtip-web + glitchtip-worker (v4.2)
- compose: forgejo + forgejo-runner (server v9, runner v6) with group_add=988
- @sentry/nextjs SDK wired (instrumentation.ts + sentry.{client,server}.config.ts)
- /api/admin/throw smoke endpoint (gated by W0-F1 middleware)
- Synthetic event ingestion verified at glitchtip.disclosure.top
- forgejo.disclosure.top up, repo discadmin/disclosure-bureau created,
  runner registered (labels: ubuntu-latest, docker)
- .forgejo/workflows/ci.yml: typecheck + lint + build + npm audit + python
  syntax + compose validation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 18:18:42 -03:00
Luiz Gustavo
504b20fa5c search: gate dense recall by cosine-distance threshold in the RPC
Root-cause fix for "search returns garbage for absent terms". The hybrid RPC's
dense branch always returned its k nearest vectors regardless of distance, so a
query for a term not in the corpus (e.g. "varginha") surfaced unrelated chunks.
The cross-encoder reranker would filter these but costs 18-62s on CPU —
unusable for interactive search.

Add max_dense_dist (default 0.40) to hybrid_search_chunks: dense neighbours
beyond that cosine distance are dropped server-side. Calibrated from measured
distances — strong semantic match ~0.12-0.20, no real match ~0.46-0.53. BM25
full-text still matches literal terms; the reranker becomes opt-in refinement.
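
The fix itself lives in the SQL RPC; the TypeScript below only illustrates the same gate on a list of dense neighbours. The DenseHit shape and function name are assumptions.

```typescript
// Hypothetical shape of one dense-branch neighbour.
interface DenseHit {
  chunkId: number;
  cosineDistance: number; // 0 = identical direction, larger = less similar
}

// Drops neighbours farther than maxDenseDist (default 0.40, as in the commit).
function gateDenseHits(hits: readonly DenseHit[], maxDenseDist = 0.4): DenseHit[] {
  return hits.filter((h) => h.cosineDistance <= maxDenseDist);
}
```

With the calibration quoted above, a strong match at ~0.15 survives the gate while a no-match neighbour at ~0.48 is dropped, so a query for an absent term returns nothing from the dense branch.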

Verified live: varginha/abducao → 0, disco voador/roswell → relevant, all <1s.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 16:36:56 -03:00
guto
7d13f93393 ship: synthesize 158 entities, AG-UI artifacts, chat persistence, auth flow
Phase 3, wave 2 (entity synthesis at scale):
- scripts/synthesize/20_entity_summary.py: queries DB for entities with
  total_mentions ≥ threshold + top-K verbatim chunk snippets via
  entity_mentions JOIN, prompts Sonnet (Holmes-Watson voice, bilingual),
  writes narrative_summary EN+PT-BR + summary_status=synthesized.
  Ran on 187 candidates (mentions ≥ 20) → 158 OK · 1 err · 29 skipped (no
  snippets). Combined with anchor curation: 20 curated + 158 synthesized
  = 178 entities with real narrative (vs 0 a day ago).

Phase 4 (chat with typed artifacts + persistence):
- lib/chat/agui.ts: AG-UI v1 typed Artifact union (citation, crop_image,
  entity_card, evidence_card, hypothesis_card, case_card, navigation_offer)
  alongside the existing event types.
- lib/chat/tools.ts + openrouter.ts: hybrid_search emits up to 6
  citation + crop_image artifacts per query. Provider collects them and
  returns in done.artifacts so the route can persist.
- api/sessions/[id]/messages: persist artifacts to messages.citations.
- components/chat-bubble.tsx: ArtifactCard renders inline cards (citation,
  crop_image, entity_card, navigation_offer) for streamed and persisted
  messages. activeId now persisted in localStorage so navigation between
  pages keeps the same conversation. New sessions are lazy (only when user
  has zero). loadMessages hydrates tools + artifacts from server. CRUD UI:
  rename (✎) + archive (🗑) buttons per session in the list.

Home search:
- doc-list-filters: input now fires hybrid_search (rerank=0 for speed)
  in parallel with the local title filter; chunk hits render above the doc
  grid with snippet + score + classification.
- api/search/hybrid: accept ?rerank=0 to skip the cross-encoder (1.3s vs 60s).

Auth flow:
- infra: SMTP_HOST=mail.spacemail.com:587 + DMARC published; mail now lands
  in inbox. GOTRUE_MAILER_AUTOCONFIRM=false (real email verification).
- kong.yml: proxy /auth/callback on api.disclosure.top → web:3000 so PKCE
  email links don't 404 at the gateway.
- web/app/auth/callback: handle both ?code= (OAuth) and ?token=&type=
  (PKCE); redirect to the public site host before verifyOtp so the session
  cookie lands on the right domain.
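
Telling the two callback shapes apart could look like the sketch below. It only classifies the query string, following the commit's own labels for the two forms; the type and function names are hypothetical, and the real route also performs the redirect and session exchange.

```typescript
type Callback =
  | { kind: "code"; code: string }
  | { kind: "token"; token: string; type: string }
  | { kind: "invalid" };

// Classifies an /auth/callback URL as ?code=, ?token=&type=, or neither.
function classifyCallback(callbackUrl: string): Callback {
  const params = new URL(callbackUrl).searchParams;
  const code = params.get("code");
  if (code) return { kind: "code", code };
  const token = params.get("token");
  const type = params.get("type");
  if (token && type) return { kind: "token", token, type };
  return { kind: "invalid" };
}
```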

Audit deliverables:
- .nirvana/outputs/disclosure-bureau/.../systems-atelier/: 5 docs (code
  analysis, tech debt, discovery brief, system arch, 5 ADRs) authored by
  sa-principal that produced this roadmap. Kept in-tree for traceability.
2026-05-18 03:52:59 -03:00
guto
4459bd17e4 phase-0: kill stubs, ship 20 curated anchor events, configure SMTP
- scripts/03-dedup-entities.py: stop emitting placeholder narrative ("Stub. Will
  be enriched in Phase 7"); write summary_status=none + null fields instead.
- scripts/maintain/41_strip_stubs.py: idempotent migration that cleaned the
  22,096 entity .md files (now zero stub strings in wiki/).
- scripts/synthesize/01_anchor_events.py: curated 20 anchor UAP events
  (Roswell, Nimitz Tic-Tac, Phoenix Lights, Operação Prato, AATIP, etc.) with
  bilingual Holmes-Watson narrative via claude -p --model sonnet
  (CLAUDE_CODE_OAUTH_TOKEN). All summary_status=curated, confidence=high.
- web/api/timeline + timeline-view: filter narrative-less events by default,
  render "curado" badge for hand-vetted ones, drop the date-only display.
- CLAUDE-schema-full.md: document the summary_status enum and the four states.
- docker-compose.yml: SMTP_HOST=mail.spacemail.com configured;
  GOTRUE_MAILER_AUTOCONFIRM flipped to false (real email confirmation working).
- .nirvana/outputs/.../systems-atelier/: 5 deliverables of the architecture
  audit that produced this roadmap.
2026-05-18 00:44:17 -03:00
guto
19d0678e55 baseline: Disclosure Bureau pipeline + Next.js UI + Supabase stack
2026-05-17 22:44:36 -03:00