Commit graph

4 commits

Author SHA1 Message Date
Luiz Gustavo
70b2fe687f W5.4 (Phase 3B): sitemap + robots + Article schema + magazine reading view
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 31s
CI / Scripts — Python smoke (push) Failing after 5s
CI / Web — npm audit (push) Failing after 27s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 5s
GEO/SEO surface area:

  app/robots.ts (new — Next.js dynamic robots)
    Explicitly ALLOWS major AI crawlers: GPTBot, OAI-SearchBot,
    ChatGPT-User, ClaudeBot, Claude-Web, anthropic-ai, PerplexityBot,
    Perplexity-User, Google-Extended, Applebot-Extended, CCBot,
    DuckAssistBot, YouBot, Bytespider, Amazonbot. The site exists to
    be cited by LLMs answering UAP/UFO questions — we want them in.
    /api/admin/, /admin/, /auth/ disallowed for everyone.
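
A minimal sketch of the policy described above, not the actual contents of app/robots.ts. The `Robots` type is a local stand-in for Next.js's `MetadataRoute.Robots` so the snippet runs on its own, and `SITE_URL` is an assumption based on the live host named elsewhere in this log.

```typescript
// Local stand-in for the shape of Next.js's MetadataRoute.Robots.
type RobotsRule = {
  userAgent: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
};
type Robots = { rules: RobotsRule[]; sitemap?: string };

// Crawler names copied from the commit message.
const AI_CRAWLERS: string[] = [
  "GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "Claude-Web",
  "anthropic-ai", "PerplexityBot", "Perplexity-User", "Google-Extended",
  "Applebot-Extended", "CCBot", "DuckAssistBot", "YouBot", "Bytespider",
  "Amazonbot",
];

// Paths the commit message says are disallowed for everyone.
const PRIVATE_PATHS: string[] = ["/api/admin/", "/admin/", "/auth/"];

// Assumption: the public origin of the site.
const SITE_URL = "https://disclosure.top";

// In a Next.js app this function would be the default export of app/robots.ts.
function robots(): Robots {
  return {
    rules: [
      // Named group: explicit allow for the AI crawlers, private paths still blocked.
      { userAgent: AI_CRAWLERS, allow: "/", disallow: PRIVATE_PATHS },
      // Catch-all group for every other crawler.
      { userAgent: "*", allow: "/", disallow: PRIVATE_PATHS },
    ],
    sitemap: `${SITE_URL}/sitemap.xml`,
  };
}
```

The disallow list is repeated in the named group because a crawler that matches a specific user-agent group ignores the `*` group entirely.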

  app/sitemap.ts (new — Next.js dynamic sitemap)
    Lists 9 top-level routes + every /d/<doc> + every /c/<slug> from
    the filesystem + up to 500 entity URLs per class
    (event, person, uap_object, location, organization),
    sorted with summary-enriched entities first. ~3000 URLs total at
    current corpus size. lastModified honours summary_generated_at so
    crawlers re-index when entities are re-enriched.
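
The entity portion of that sitemap could be sketched roughly as follows. The `Entity` shape, the `/e/<class>/<slug>` URL pattern and the function name are hypothetical; only the 500-per-class cap, the enriched-first ordering and the use of summary_generated_at for lastModified come from the description above.

```typescript
// Hypothetical row shape; the real schema is not shown in this commit.
type Entity = {
  entityClass: string;               // e.g. "event", "person", "uap_object"
  slug: string;                      // hypothetical URL identifier
  summaryGeneratedAt: string | null; // ISO timestamp, null when not yet enriched
};

type SitemapEntry = { url: string; lastModified?: Date };

const MAX_PER_CLASS = 500;

function entityEntries(baseUrl: string, entities: Entity[]): SitemapEntry[] {
  // Group entities by class so the cap applies per class.
  const byClass = new Map<string, Entity[]>();
  for (const e of entities) {
    const bucket = byClass.get(e.entityClass) || [];
    bucket.push(e);
    byClass.set(e.entityClass, bucket);
  }

  const entries: SitemapEntry[] = [];
  byClass.forEach((bucket, cls) => {
    // Enriched entities (non-null summary timestamp) sort ahead of bare ones.
    const sorted = bucket.slice().sort(
      (a, b) =>
        Number(b.summaryGeneratedAt !== null) -
        Number(a.summaryGeneratedAt !== null),
    );
    for (const e of sorted.slice(0, MAX_PER_CLASS)) {
      entries.push({
        // Assumption: URL pattern invented for illustration.
        url: `${baseUrl}/e/${cls}/${e.slug}`,
        // lastModified follows the enrichment timestamp when one exists.
        lastModified: e.summaryGeneratedAt
          ? new Date(e.summaryGeneratedAt)
          : undefined,
      });
    }
  });
  return entries;
}
```

Sorting before slicing means that when a class exceeds the cap, bare entities are the ones dropped.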

  app/c/[slug]/page.tsx (rewritten — magazine reading view)
    - generateMetadata: per-case title, description (auto-extracted
      from the locale-preferred lead paragraph), canonical URL,
      hreflang alternate, OpenGraph article type with publishedTime,
      Twitter card.
    - JSON-LD Article schema embedded at end of page: schema.org
      Article + Organization publisher + inLanguage + isAccessibleForFree.
      This is meant to help the case surface as a citable source in
      Google AI Overviews / Perplexity / ChatGPT search.
    - Reading view rewritten: display-serif headline (Fraunces), italic
      blockquotes with gold accent, prose-typography styling, no more
      detective stats line, no more "written by case-writer@detective"
      attribution. Locale-aware: PT-BR pulls topic_pt_br + lead in PT;
      the English view mirrors it.
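
For illustration, the JSON-LD payload described above might be assembled along these lines. The `CaseArticle` shape, the function names and the `/c/<slug>` URL construction are assumptions; the schema.org properties (Article, Organization publisher, inLanguage, isAccessibleForFree) are the ones named in the commit message.

```typescript
// Hypothetical input shape for one case page.
type CaseArticle = {
  slug: string;
  headline: string;
  description: string;
  locale: string;        // e.g. "pt-BR" or "en"
  publishedTime: string; // ISO 8601 timestamp
};

function buildArticleJsonLd(
  c: CaseArticle,
  siteUrl: string,
  publisherName: string,
): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: c.headline,
    description: c.description,
    inLanguage: c.locale,
    isAccessibleForFree: true,
    datePublished: c.publishedTime,
    mainEntityOfPage: `${siteUrl}/c/${c.slug}`,
    publisher: { "@type": "Organization", name: publisherName },
  };
}

// Serialize for a <script type="application/ld+json"> tag. Escaping "<"
// keeps user-visible text from closing the script element early.
function serializeJsonLd(data: Record<string, unknown>): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```

The escaped string parses back to the same object, so the extra step is invisible to consumers of the structured data.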

  tailwind.config.ts
    + @tailwindcss/typography plugin
    + font-display family wired to var(--font-display) (Fraunces)

  package.json
    + "@tailwindcss/typography" devDependency

Phase 3A note: bulk entity enrichment hit the Claude OAuth weekly quota mid-run.
6 events + 3 uap_objects landed bilingual summaries before the quota was
exhausted. UI gracefully splits enriched vs bare entities so /sightings
shows the magazine-grade cards (Kenneth Arnold 1947, Roswell, Maury Island,
Joseph Perry 1960 lunar photo, Civil Defense Director 1966, etc.) on top
of a compact table of the rest. Re-run when quota refreshes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 16:09:50 -03:00
Luiz Gustavo
eaf282c535 W2: rerank opt-in, analyze_image_region tool, RAG eval, graph cleanup, ADRs
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 40s
CI / Scripts — Python smoke (push) Failing after 3s
CI / Web — npm audit (push) Failing after 29s
CI / Retrieval — golden set (Recall@5 + MRR) (push) Failing after 3s
- TD#8 hybrid.ts: rerank_strategy {always|when_top_k_gt|never} + threshold
  (default skips rerank for top_k ≤ 15; chat tool uses threshold 10)
- O11 vision.ts + tools.ts: analyze_image_region tool — sharp-crops the
  bbox, claude CLI reads the temp PNG via Read tool, Sonnet vision answers
- TD#12 /graph: SigmaGraph replaces ForceGraphCanvas; react-force-graph-2d
  uninstalled (-37 transitive deps); force-graph-canvas.tsx deleted
- TD#27 messages/route.ts gatherContext slice sizes via CTX_* env vars
- TD#22 tests/rag/: golden.yaml (15 queries) + run.py (Recall@k + MRR +
  negative-pass rate) + baseline.json + CI job in .forgejo/workflows/ci.yml
- docs/adrs/: ADR-001..005 published from systems-atelier deliverables
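
The rerank_strategy switch from TD#8 can be pictured with a small sketch. The type and function names are hypothetical, and treating `when_top_k_gt` with threshold 15 as the default is an inference from "default skips rerank for top_k ≤ 15", not something the commit states outright.

```typescript
type RerankStrategy = "always" | "when_top_k_gt" | "never";

interface RerankOptions {
  strategy: RerankStrategy;
  threshold: number; // only consulted by "when_top_k_gt"
}

// Assumption: default inferred from the commit message.
const DEFAULT_RERANK: RerankOptions = { strategy: "when_top_k_gt", threshold: 15 };

// Decide whether the rerank stage should run for a given top_k.
function shouldRerank(topK: number, opts: RerankOptions = DEFAULT_RERANK): boolean {
  switch (opts.strategy) {
    case "always":
      return true;
    case "never":
      return false;
    case "when_top_k_gt":
      // Strictly greater than: top_k equal to the threshold skips rerank.
      return topK > opts.threshold;
  }
}
```

With these defaults `shouldRerank(5)` is false, which matches the embed-only top_k=5 path; a caller using threshold 10, as the chat tool does, would rerank from top_k = 11 upward.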

Verified live on disclosure.top: top_k=5 path skips rerank (6.7s embed-only,
was 12-15s with rerank); rerank=always still available on demand.
First RAG baseline: Recall@5 = 0.2083, MRR = 0.25, Negative pass = 1.0.
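
For readers unfamiliar with the two metrics, here is a sketch of the standard definitions of Recall@k and MRR. It is written in TypeScript for consistency with the rest of the codebase and is not a port of tests/rag/run.py; the `GoldenResult` shape is hypothetical, and negative queries are left out because the commit does not say how they are scored.

```typescript
// Hypothetical per-query record: expected chunk ids and the ranked ids returned.
type GoldenResult = { relevant: string[]; retrieved: string[] };

// Fraction of the relevant ids that appear in the first k retrieved ids.
function recallAtK(r: GoldenResult, k: number): number {
  if (r.relevant.length === 0) return 0;
  const topK = new Set(r.retrieved.slice(0, k));
  const hits = r.relevant.filter((id) => topK.has(id)).length;
  return hits / r.relevant.length;
}

// 1 / rank of the first relevant id, or 0 when none was retrieved.
function reciprocalRank(r: GoldenResult): number {
  const relevant = new Set(r.relevant);
  const idx = r.retrieved.findIndex((id) => relevant.has(id));
  return idx === -1 ? 0 : 1 / (idx + 1);
}

function mean(xs: number[]): number {
  return xs.length === 0 ? 0 : xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Aggregate both metrics over a set of queries.
function evaluate(results: GoldenResult[], k: number): { recall: number; mrr: number } {
  return {
    recall: mean(results.map((r) => recallAtK(r, k))),
    mrr: mean(results.map(reciprocalRank)),
  };
}
```

Both metrics are averaged per query, so a single query with many relevant chunks does not dominate the score.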

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:20:09 -03:00
Luiz Gustavo
55cac8a395 W0+W1+W1.2: security hardening, observability, autocomplete, glitchtip, forgejo CI
Some checks failed
CI / Web — typecheck + lint + build (push) Failing after 1m30s
CI / Scripts — Python smoke (push) Failing after 32s
CI / Web — npm audit (push) Failing after 37s
W0 — security hardening (5 fixes verified live on disclosure.top)
- middleware: gate /api/admin/* same as /admin/* (F1)
- imgproxy: tighten LOCAL_FILESYSTEM_ROOT from / to /var/lib/storage (F2)
- studio: real basic-auth label (bcrypt hash, middleware reference) (F3)
- relations: ENABLE ROW LEVEL SECURITY + public SELECT policy (F4)
- migration 0003: fold is_searchable + hybrid_search update into canonical (TD#2)

W1 — observability + resilience + autocomplete
- studio: HOSTNAME=0.0.0.0 so Next.js binds on all interfaces (loopback included) for healthcheck
- compose: PG_POOL_MAX=20, CLAUDE_CODE_OAUTH_TOKEN gated by separate env
- claude-code.ts: subprocess timeout configurable (CLAUDE_CODE_TIMEOUT_MS)
- openrouter.ts: retry with exponential backoff + Retry-After + in-memory
  circuit breaker (promotes FALLBACK after CB_THRESHOLD failures)
- lib/logger.ts: pino logger (NDJSON prod / pretty dev) + withRequest helper
- middleware: mints correlation_id, stamps x-correlation-id response header,
  emits structured http_request log per /api/* call
- messages/route.ts: switch to structured logger
- 60_meili_index.py: push documents + chunks into Meilisearch
- /api/search/autocomplete: parallel meili search (docs + chunks), 5-8ms p50
- search-autocomplete.tsx: debounced dropdown wired into search-panel
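
The retry and circuit-breaker behaviour listed for openrouter.ts could look something like the sketch below. Every name here is hypothetical, the upstream call is injected so the snippet makes no network requests, and the breaker is deliberately simplified: it counts consecutive failures and has no cool-down or half-open state, which the commit neither confirms nor rules out.

```typescript
// Result of one upstream attempt; retryAfterMs is a parsed Retry-After value, if any.
type Attempt<T> = { ok: true; value: T } | { ok: false; retryAfterMs?: number };

interface RetryOptions {
  maxRetries: number;
  baseMs: number;
  sleep: (ms: number) => Promise<void>;
}

const DEFAULT_RETRY: RetryOptions = {
  maxRetries: 3,
  baseMs: 250,
  sleep: (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
};

// In-memory breaker: opens after `threshold` consecutive failures.
class CircuitBreaker {
  private failures = 0;
  private threshold: number;
  constructor(threshold: number) {
    this.threshold = threshold;
  }
  isOpen(): boolean {
    return this.failures >= this.threshold;
  }
  recordSuccess(): void {
    this.failures = 0;
  }
  recordFailure(): void {
    this.failures += 1;
  }
}

// A server-provided Retry-After wins; otherwise the delay doubles per attempt.
function backoffDelay(attempt: number, baseMs: number, retryAfterMs?: number): number {
  return retryAfterMs !== undefined ? retryAfterMs : baseMs * Math.pow(2, attempt);
}

async function callWithFallback<T>(
  primary: () => Promise<Attempt<T>>,
  fallback: () => Promise<T>,
  breaker: CircuitBreaker,
  opts: RetryOptions = DEFAULT_RETRY,
): Promise<T> {
  // Breaker already open: go straight to the fallback.
  if (breaker.isOpen()) return fallback();
  for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
    const res = await primary();
    if (res.ok) {
      breaker.recordSuccess();
      return res.value;
    }
    breaker.recordFailure();
    if (breaker.isOpen() || attempt === opts.maxRetries) break;
    await opts.sleep(backoffDelay(attempt, opts.baseMs, res.retryAfterMs));
  }
  // Retries exhausted or breaker tripped mid-loop.
  return fallback();
}
```

Injecting `sleep` keeps the delay logic testable without real timers; with the defaults above the waits would be 250 ms, 500 ms and 1000 ms unless the upstream supplies its own Retry-After.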

W1.2 — Glitchtip + Forgejo self-hosted
- compose: glitchtip-redis + glitchtip-web + glitchtip-worker (v4.2)
- compose: forgejo + forgejo-runner (server v9, runner v6) with group_add=988
- @sentry/nextjs SDK wired (instrumentation.ts + sentry.{client,server}.config.ts)
- /api/admin/throw smoke endpoint (gated by W0-F1 middleware)
- Synthetic event ingestion verified at glitchtip.disclosure.top
- forgejo.disclosure.top up, repo discadmin/disclosure-bureau created,
  runner registered (labels: ubuntu-latest, docker)
- .forgejo/workflows/ci.yml: typecheck + lint + build + npm audit + python
  syntax + compose validation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 18:18:42 -03:00
guto
19d0678e55 baseline: Disclosure Bureau pipeline + Next.js UI + Supabase stack
2026-05-17 22:44:36 -03:00