W5.2: best-seller case-writer — single voice, scene-driven, anti-skeptic

User: "shouldn't mention the names of the mind-clones, should merge all
analyses and write like a best-seller author would, about what happened."

Voice rewrite (prompts/case-writer.md):
- Reference voices: Erik Larson, Sam Kean, John McPhee, Mark Bowden.
Plainspoken non-fiction, scene-driven, fascinated.
- One narrator. NEVER say "Sherlock Holmes argues" / "Sun-Tzu builds
the case" / "the team concluded". No internal-process names reach
the reader.
- Hook the first paragraph. Open in a scene with a date, place, and
person doing something specific. NOT "This case investigates..."
- Show, don't argue. Verbatim quotes stay source-language in
blockquotes; the narration around them is the narrator's voice.
- Every claim cites a chunk with [[doc-id/pNNN#cNNNN]].
- Forbidden ceremony: "In summary…", "Em suma…", "Ultimately…",
"It is worth noting…", detective names, probability tables,
hypothesis tournaments.
- The honest unknown is the subject, not a failure: "Whatever was in
the sky over Sandia in December 1948, the government never said."
- 4-6 numbered scenes, each title-cased specifically ("The Green
Sphere Over Highway 60" not "Background").
- Bilingual EN + PT-BR per CLAUDE.md §3 — sections alternate, no
mid-paragraph language mixing.
- Refusal: emit INSUFFICIENT_ARTEFACTS rather than padding when the
corpus is thin.
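
The citation rule above lends itself to a mechanical check. The following is a minimal sketch, not part of the commit: `findCitations` and `uncitedParagraphs` are hypothetical helpers, and the regular expression assumes that page numbers are zero-padded to at least three digits and that chunk ids look like `c` followed by digits.

```typescript
// Hypothetical helpers for illustration only; they do not exist in the repository.
interface ChunkCitation {
  docId: string;
  page: number;
  chunkId: string;
}

// Assumed citation shape: [[<doc-id>/p<3+ digits>#c<digits>]]
const CITATION_RE = /\[\[([^\[\]\/]+)\/p(\d{3,})#(c\d+)\]\]/g;

// Collect every chunk citation found in a piece of markdown.
function findCitations(markdown: string): ChunkCitation[] {
  const found: ChunkCitation[] = [];
  CITATION_RE.lastIndex = 0; // the regex is global, so reset its cursor before each scan
  let m: RegExpExecArray | null;
  while ((m = CITATION_RE.exec(markdown)) !== null) {
    found.push({ docId: m[1], page: Number(m[2]), chunkId: m[3] });
  }
  return found;
}

// Return narration paragraphs (not headings, not blockquotes) that carry no citation.
function uncitedParagraphs(markdown: string): string[] {
  return markdown
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0 && !p.startsWith("#") && !p.startsWith(">"))
    .filter((p) => findCitations(p).length === 0);
}
```

A check like this could run on the generated body before publishing; paragraphs it flags would be candidates for review rather than automatic rejection, since not every sentence of narration states a fact.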

Raw-material pipeline (src/detectives/case_writer.ts):
- hybridSearch(topic, lang, top_k=18) gives the narrator real corpus
scenes with verbatim text + chunk_id citations + bbox metadata.
This is what was missing — v1 only saw pre-digested hypothesis
artefacts, which is how the academic prose got there.
- Dropped the hypotheses + contradictions queries from the loader.
They were skeptic-framing scaffolding that doesn't belong in the
raw material a best-seller narrator works from.
- New buildPrompt sections: "Primary-source scenes", "Curated
verbatim quotes", "Anomalies and surprises", "Named witnesses".
Anomalies (Taleb's outlier gaps) reframed: drop dominant_model
skeptic baseline, keep title + why_surprising as gold material.
- Refusal floor: < 4 scenes from hybridSearch → skip with reason.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
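
The refusal floor and the anomaly reframing described in the commit message can be sketched as two small pure functions. This is a simplified illustration under stated assumptions: `Gap` is a cut-down stand-in for the commit's `GapRow`, `refusalReason` and `anomaliesForNarrator` are hypothetical names, and the title fallback here ignores the `title_pt_br` preference that the real `renderAnomalies` applies.

```typescript
// Simplified stand-in for the commit's GapRow; only the fields used here.
interface Gap {
  description: string;
  scope: Record<string, unknown> | null;
}

const MIN_SCENES = 4; // refusal floor named in the commit message

// Returns a skip reason when too few scenes were retrieved, otherwise null.
function refusalReason(nScenes: number): string | null {
  return nScenes < MIN_SCENES
    ? `insufficient_scenes_${nScenes}_of_${MIN_SCENES}`
    : null;
}

// Keeps outlier gaps only and passes on title + why_surprising.
// dominant_model is deliberately never copied into the output.
function anomaliesForNarrator(gaps: Gap[]): Array<{ title: string; why: string }> {
  return gaps
    .filter((g) => g.scope?.kind === "outlier")
    .map((g) => {
      const s = g.scope ?? {};
      const rawTitle = s.title;
      const rawWhy = s.why_surprising;
      const title = typeof rawTitle === "string" && rawTitle ? rawTitle : g.description;
      const why = typeof rawWhy === "string" ? rawWhy : "";
      return { title, why };
    });
}
```

Keeping the floor as a function of the scene count alone mirrors the commit's choice to gate on retrieved corpus material rather than on the number of derived artefacts.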
parent b6fc9dc84e
commit b3a6a3c1a3
2 changed files with 199 additions and 196 deletions
--- a/prompts/case-writer.md
+++ b/prompts/case-writer.md
@@ -1,89 +1,106 @@
-# You are the Case-Writer (Dr. Watson)
+# You are the narrator of The Disclosure Bureau
 
-You are the case-writer — the Watson to the bureau's detectives. Your task
-is to take the structured artefacts that Holmes, Locard, Dupin, Poirot,
-Schneier, Taleb and Tetlock have written, and **assemble them into a
-narrative** an intelligent reader can follow start to finish.
+You write the case files that get published on a public archive read by
+people who are curious about UAP/UFO history. Your job is to tell the
+reader **what happened**, drawn directly from declassified primary
+sources, with the voice and craft of a non-fiction best-seller.
 
-You do NOT produce new facts. You weave existing artefacts. Every claim
-in your narrative comes from one of: a hypothesis, an evidence card, a
-contradiction, a witness analysis, an outlier, or a calibration.
+Reference voices: Erik Larson (Devil in the White City), Sam Kean (The
+Disappearing Spoon), John McPhee (Annals of the Former World), Mark Bowden
+(Black Hawk Down). Plainspoken, scene-driven, factual, fascinated. You are
+a reporter who has read the entire file and is going to walk the reader
+through it.
 
-## Discipline (non-negotiable)
+## Hard rules — the voice
 
-1. The narrative has a fixed five-act structure:
-   - **§1 — The case at hand.** State the question or topic in one
-     paragraph. Why the bureau opened a file.
-   - **§2 — The evidence chain.** Walk the reader through the catalogued
-     evidence (E-NNNN). For each piece you mention: state the grade,
-     give the verbatim excerpt as a blockquote, cite the source
-     `[[doc-id/pNNN#cNNNN]]`.
-   - **§3 — The rival hypotheses.** Present the H-NNNN tournament.
-     For each rival: state its position, prior, posterior, band, and
-     ONE sentence summarising argument_for + ONE summarising
-     argument_against. Quote a chunk citation per claim.
-   - **§4 — Contradictions, outliers, witnesses.** Cite each R-NNNN
-     contradiction with its topic and positions. Cite each G-NNNN
-     outlier with its dominant_model + why_surprising. Cite each
-     W-NNNN witness analysis with its credibility + verdict.
-   - **§5 — The case as it stands.** ONE paragraph (the closer) that
-     names the leading hypothesis, the strongest single rival, the
-     remaining residual uncertainty (≥ 1 named gap), and what
-     observation could move the needle.
-2. Use `[[wiki-link]]` syntax for EVERY artefact reference:
-   - Evidence: `[[evidence/E-NNNN]]`
-   - Hypothesis: `[[hypothesis/H-NNNN]]`
-   - Contradiction: `[[relation/R-NNNN]]` (R- shares the slot per CLAUDE.md)
-   - Witness: `[[witness/W-NNNN]]`
-   - Outlier: `[[gap/G-NNNN]]`
-   - Chunk: `[[doc-id/pNNN#cNNNN]]`
-3. You do not editorialise beyond what the artefacts support. If the
-   bureau hasn't ruled something out, don't rule it out. If a hypothesis
-   is `speculation` band, label it speculation in your prose.
-4. Length: 800–2500 words. Tight is better than padded.
-5. Voice: Watson's plainspoken English (or Portuguese, per the request).
-   The prose is for an educated reader, not a specialist. Avoid jargon.
+1. **One voice.** You do not say "Sherlock Holmes argues" or "Sun-Tzu
+   builds the case" or "the team concluded". You never name your
+   sources of reasoning. You speak as a single narrator who has read
+   the documents.
 
-## Output protocol — bilingual EN + PT-BR (mandatory)
+2. **Hook the first paragraph.** Start in a scene: a date, a place, a
+   person doing something specific. Not a thesis statement. Not "This
+   case file investigates..." *Example opener:* "On the night of
+   December 5, 1948, a state police officer pulled to the shoulder of
+   Highway 60 outside Las Vegas, New Mexico, and watched a green
+   sphere drop out of the sky."
 
-Emit ONLY the markdown body of the narrative. NO frontmatter (the runtime
-adds it). NO code fence.
+3. **Show, don't argue.** Verbatim quotes from the corpus stay in the
+   chunk's source language (usually English) and appear as
+   blockquotes. The narration around them is yours. Do not adjudicate
+   whether the events were "real" or "explained" — let the reader sit
+   with what the documents say.
 
-The narrative is **bilingual** with EN and PT-BR sections **interleaved
-per act**, in this exact structure (per CLAUDE.md §3 "adjacent sections"):
+4. **Every claim cites a chunk.** `[[doc-id/pNNN#cNNNN]]` appears next
+   to specific facts. The reader can click through. You do not invent
+   facts the corpus doesn't carry.
+
+5. **Forbidden ceremony.** No "In summary…", "Ultimately…", "Em suma…",
+   "Em última análise…". No "It is worth noting…". No detective names.
+   No probability tables. No hypothesis tournaments.
+
+6. **The honest unknown.** When the corpus doesn't resolve a question,
+   you say so plainly. "Whatever was in the sky over Sandia in
+   December 1948, the government never said." The unknown is the
+   subject, not a failure.
+
+## Bilingual structure (mandatory — CLAUDE.md §3)
+
+Emit ONLY the markdown body. NO frontmatter. NO code fence. Bilingual
+EN + PT-BR with PT-BR being **Brazilian Portuguese** (full UTF-8
+accents preserved).
+
+Structure: each section appears once in EN then once in PT-BR. Do not
+mix languages mid-paragraph. Use this exact heading pattern (replace
+`<title>` with your title):
 
 ```markdown
-# Title (EN)
+# <Title in English>
 
-# Título (PT-BR)
+# <Título em Português Brasileiro>
 
-## §1 — The Case at Hand (EN)
+## I. <English scene-title>
 
-<English §1 body>
+<English prose body — 2 to 5 paragraphs, verbatim quotes in blockquotes,
+chunk citations as [[wiki-links]]>
 
-## §1 — O Caso em Mãos (PT-BR)
+## I. <Título em Português>
 
-<corpo §1 em português brasileiro>
+<corpo em português brasileiro — mesmo conteúdo, mesmas citações>
 
-## §2 — The Evidence Chain (EN)
+## II. <next scene, EN>
 
-<English §2 body>
+...
 
-## §2 — A Cadeia de Evidência (PT-BR)
+## II. <próxima cena, PT-BR>
 
-<corpo §2 em português brasileiro>
-
-... (continue alternating per act through §5) ...
+...
 ```
 
-Rules:
-- Both languages must appear; do NOT emit only EN or only PT-BR.
-- PT-BR is **Brazilian Portuguese** with UTF-8 accents preserved.
-- Verbatim chunk quotes stay in the chunk's source language (usually
-  English in this corpus); only the surrounding narration is translated.
-- `[[wiki-links]]` are technical identifiers — keep them as-is in both
-  versions; do not translate IDs.
+A typical case has 4–6 numbered sections. Each is a scene or a turn in
+the story, not a five-act formal structure. Title each scene
+**specifically** ("The Green Sphere Over Highway 60", not "Background").
 
-If the bureau has insufficient artefacts (e.g. 0 hypotheses AND 0
-evidence on the topic), emit `INSUFFICIENT_ARTEFACTS` and stop. Do not
-fabricate the case.
+## What to write about
+
+You receive a bundle of artefacts: chunks, quotes, anomalies, named
+witnesses, locations, dates. Use them to tell the story. Anchor each
+section in:
+- **A scene** (a date, a place, an action — make the reader see it)
+- **A primary-source quote** (one strong verbatim from the corpus)
+- **A consequence** (what happened next, what changed, what didn't)
+
+If you have a verbatim observation of the object — color, motion, size,
+duration — quote it in full. Those are the moments enthusiasts open
+this archive to read.
+
+Length: 1500–3000 words total across both languages. Tight is better
+than padded. If the corpus is thin, write a shorter file rather than
+inflating it.
+
+## Refusal
+
+If the artefacts contain almost nothing about the topic (no verbatim
+quotes, no named witnesses, no specific dates), emit
+`INSUFFICIENT_ARTEFACTS` and stop. Better to publish nothing than to
+publish a thin case file that disappoints the reader.
--- a/src/detectives/case_writer.ts
+++ b/src/detectives/case_writer.ts
@@ -16,6 +16,7 @@ import { audit } from "../lib/audit";
 import { callClaude } from "../lib/claude";
 import { env } from "../lib/env";
 import { query } from "../lib/pg";
+import { hybridSearch, type SearchHit } from "../lib/search";
 import { writeCaseReport } from "../tools/write_case_report";
 
 const HERE = path.dirname(fileURLToPath(import.meta.url));
@@ -42,27 +43,6 @@ interface EvidenceRow {
   related_hypotheses: unknown;
 }
 
-interface HypothesisRow {
-  hypothesis_id: string;
-  question: string;
-  position: string;
-  argument_for: string | null;
-  argument_against: string | null;
-  prior: number | string | null;
-  posterior: number | string | null;
-  confidence_band: string | null;
-  status: string;
-  reviewed_by: string | null;
-}
-
-interface ContradictionRow {
-  contradiction_id: string;
-  topic: string;
-  chunks: unknown;
-  resolution_status: string;
-  notes: string | null;
-}
-
 interface WitnessRow {
   witness_id: string;
   canonical_name: string | null;
@@ -90,118 +70,124 @@ function topicSlug(topic: string): string {
     .slice(0, 80);
 }
 
-function renderEvidence(rows: EvidenceRow[]): string {
-  if (rows.length === 0) return "_(no evidence catalogued for this topic)_";
+// (Legacy render* functions removed in W5.2 — the narrator now works from
+// retrieved scenes + curated verbatim quotes + anomalies + named witnesses,
+// not from pre-digested hypothesis/contradiction artefacts.)
+
+function renderScenes(hits: SearchHit[], lang: "pt" | "en"): string {
+  if (hits.length === 0) return "_(no primary-source scenes retrieved)_";
+  return hits.map((h, i) => {
+    const text = (lang === "en" ? h.content_en : h.content_pt) || h.content_en || h.content_pt || "";
+    const pageStr = String(h.page).padStart(3, "0");
+    return [
+      `### Scene ${i + 1} — [[${h.doc_id}/p${pageStr}#${h.chunk_id}]]`,
+      `Type: ${h.type}${h.classification ? ` · Classification: ${h.classification}` : ""}`,
+      "",
+      text.slice(0, 1200),
+    ].join("\n");
+  }).join("\n\n");
+}
+
+function renderVerbatimQuotes(rows: EvidenceRow[]): string {
+  if (rows.length === 0) return "_(no curated verbatim quotes on this topic yet)_";
   return rows.map((e) => [
-    `### ${e.evidence_id} (Grade ${e.grade}${e.confidence_band ? `, ${e.confidence_band}` : ""})`,
-    `Source page: ${e.source_page_id}`,
+    `### Verbatim — source ${e.source_page_id}`,
     "",
-    `> ${e.verbatim_excerpt.slice(0, 700)}`,
+    `> ${(e.verbatim_excerpt || "").trim().replace(/\n+/g, " ")}`,
   ].join("\n")).join("\n\n");
 }
 
-function renderHypotheses(rows: HypothesisRow[]): string {
-  if (rows.length === 0) return "_(no hypotheses in the tournament for this topic)_";
-  return rows.map((h) => [
-    `### ${h.hypothesis_id} — ${h.confidence_band ?? "—"} (prior ${h.prior ?? "—"} → posterior ${h.posterior ?? "—"}, status ${h.status})`,
-    `**Position.** ${h.position}`,
-    h.reviewed_by ? `Reviewed by ${h.reviewed_by}` : "",
-    "",
-    "**Argument for.**",
-    h.argument_for || "_(none recorded)_",
-    "",
-    "**Argument against.**",
-    h.argument_against || "_(none recorded)_",
-  ].filter(Boolean).join("\n")).join("\n\n");
-}
-
-function renderContradictions(rows: ContradictionRow[]): string {
-  if (rows.length === 0) return "_(no contradictions on file for this topic)_";
-  return rows.map((c) => {
-    const positions = Array.isArray(c.chunks) ? c.chunks as Array<Record<string, unknown>> : [];
-    const posLines = positions.map((p, i) => {
-      const stance = p.stance ? ` (${p.stance})` : "";
-      return ` ${i + 1}. ${String(p.statement ?? "—")}${stance} → [[${p.doc_id}/p${String(p.page).padStart(3, "0")}#${p.chunk_id}]]`;
-    }).join("\n");
-    return [
-      `### ${c.contradiction_id} — ${c.topic} (${c.resolution_status})`,
-      posLines || "_(no positions recorded)_",
-      c.notes ? `\n_Notes: ${c.notes}_` : "",
-    ].filter(Boolean).join("\n");
-  }).join("\n\n");
-}
-
-function renderWitnesses(rows: WitnessRow[]): string {
-  if (rows.length === 0) return "_(no witness analyses on file)_";
-  return rows.map((w) => [
-    `### ${w.witness_id} — ${w.canonical_name ?? "—"} (${w.credibility ?? "—"})`,
-    w.verdict ? `**Verdict.** ${w.verdict}` : "",
-    w.access_to_event ? `Access: ${w.access_to_event}` : "",
-    w.bias_notes ? `Bias: ${w.bias_notes}` : "",
-  ].filter(Boolean).join("\n")).join("\n\n");
-}
-
-function renderGaps(rows: GapRow[]): string {
-  if (rows.length === 0) return "_(no outliers / gaps on file)_";
-  return rows.map((g) => {
+function renderAnomalies(rows: GapRow[]): string {
+  // Outliers (Taleb's gaps with scope.kind=outlier) are gold material for a
+  // best-seller narrator — they're the moments where the corpus itself
+  // surprises. Strip the dominant_model framing (skeptic baseline) and just
+  // pass the anomaly title + why_surprising.
+  const outliers = rows.filter((g) => {
     const s = g.scope as Record<string, unknown> | null;
-    const kind = s?.kind === "outlier" ? " (outlier)" : "";
-    const why = s?.why_surprising ? `\n_Why surprising:_ ${String(s.why_surprising)}` : "";
-    const model = s?.dominant_model ? `\n_Dominant model:_ ${String(s.dominant_model)}` : "";
+    return s?.kind === "outlier";
+  });
+  if (outliers.length === 0) return "_(no anomalies catalogued)_";
+  return outliers.map((g) => {
+    const s = (g.scope ?? {}) as Record<string, unknown>;
+    const title = (s.title_pt_br as string) || (s.title as string) || g.description;
+    const why = (s.why_surprising as string) || "";
     return [
-      `### ${g.gap_id} — ${g.description}${kind} (${g.status})`,
-      model,
+      `### Anomaly — ${title}`,
+      "",
       why,
-      g.suggested_next_move ? `\n_Next move:_ ${g.suggested_next_move}` : "",
-    ].filter(Boolean).join("\n");
+    ].join("\n");
   }).join("\n\n");
 }
 
+function renderNamedWitnesses(rows: WitnessRow[]): string {
+  if (rows.length === 0) return "_(no named witness profiles)_";
+  // Strict witness rows only (Poirot's floor enforced). Pass canonical name +
+  // verdict so the narrator can introduce them. No "credibility" framing.
+  return rows.map((w) => [
+    `### ${w.canonical_name ?? "—"}`,
+    w.verdict ? `Profile: ${w.verdict}` : "",
+  ].filter(Boolean).join("\n")).join("\n\n");
+}
+
 function buildPrompt(
   task: CaseWriterTask,
+  scenes: SearchHit[],
   evidence: EvidenceRow[],
-  hypotheses: HypothesisRow[],
-  contradictions: ContradictionRow[],
   witnesses: WitnessRow[],
   gaps: GapRow[],
+  lang: "pt" | "en",
 ): string {
   return [
-    `# Case folder`,
+    `# Topic`,
     "",
-    `**Topic (EN).** ${task.topic}`,
-    `**Tópico (PT-BR).** ${task.topic_pt_br ?? task.topic}`,
+    `**EN.** ${task.topic}`,
+    `**PT-BR.** ${task.topic_pt_br ?? task.topic}`,
+    task.doc_id ? `\nScoped to document: ${task.doc_id}` : "",
     "",
-    task.doc_id ? `Scoped to document: ${task.doc_id}` : "Scope: all documents",
+    "You are writing a case file for a public archive read by people",
+    "curious about UAP/UFO history. Use the raw material below to weave",
+    "a non-fiction best-seller-quality story. Do not name any internal",
+    "process or source-of-reasoning. Tell what happened.",
     "",
-    "**Bilingual output mandatory.** Write each act in BOTH English and",
-    "Brazilian Portuguese (PT-BR), interleaved per the system-prompt",
-    "structure. UTF-8 accents preserved. Verbatim chunk quotes stay in",
-    "their source language; only the surrounding narration is translated.",
+    "## Primary-source scenes (retrieved from the corpus)",
     "",
-    "## Artefacts available",
+    "These are the chunks the search returned. They contain the verbatim",
+    "text from the documents — pick the most specific, scene-driving ones",
+    "to anchor each section of your case file, and quote them in",
+    "blockquotes with `[[doc-id/pNNN#cNNNN]]` citations.",
     "",
-    `### Evidence (E-NNNN) · ${evidence.length}`,
-    renderEvidence(evidence),
+    renderScenes(scenes, lang),
     "",
-    `### Hypotheses (H-NNNN) · ${hypotheses.length}`,
-    renderHypotheses(hypotheses),
+    "## Curated verbatim quotes",
     "",
-    `### Contradictions (R-NNNN) · ${contradictions.length}`,
-    renderContradictions(contradictions),
+    "These are the highest-grade quotes already pulled from the corpus.",
+    "Use them as load-bearing blockquotes in your scenes.",
     "",
-    `### Witness analyses (W-NNNN) · ${witnesses.length}`,
-    renderWitnesses(witnesses),
+    renderVerbatimQuotes(evidence),
     "",
-    `### Outliers / gaps (G-NNNN) · ${gaps.length}`,
-    renderGaps(gaps),
+    "## Anomalies and surprises",
+    "",
+    "Moments where the corpus surprises itself — language slips, frequency",
+    "anomalies, things the analysts couldn't fit into their model. Strong",
+    "material for the closing of a section.",
+    "",
+    renderAnomalies(gaps),
+    "",
+    "## Named witnesses with documented testimony",
+    "",
+    "People whose direct testimony appears in the corpus. Introduce them",
+    "in scene, not as a list.",
+    "",
+    renderNamedWitnesses(witnesses),
     "",
     "## Your task",
     "",
-    "Assemble the five-act Watson narrative per the system prompt. Emit",
-    "ONLY the markdown body — start with the `# ` heading, no",
-    "frontmatter, no code fence. If the artefacts are too thin, emit",
-    "`INSUFFICIENT_ARTEFACTS` and stop.",
-  ].join("\n");
+    "Write the case file per the system prompt: bilingual EN+PT-BR with",
+    "alternating section pairs, scene-driven opening, verbatim quotes with",
+    "citations, no detective names, no skeptic framing, no \"in summary\".",
+    "Emit ONLY the markdown body starting with `# <title>`. If the raw",
+    "material is too thin, emit `INSUFFICIENT_ARTEFACTS` and stop.",
+  ].filter(Boolean).join("\n");
 }
 
 function extractBody(text: string): string | null {
@@ -223,8 +209,21 @@ export async function runCaseWriter(task: CaseWriterTask): Promise<
 > {
   const topic = task.topic.trim();
   const slug = task.slug ?? topicSlug(topic);
+  const lang: "pt" | "en" = task.lang ?? "pt";
 
   const filter = `%${topic.toLowerCase()}%`;
+
+  // Grounding pass — retrieve top scenes from the corpus via hybrid_search.
+  // This is what gives the narrator real verbatim material to weave. Without
+  // this, the case-writer only sees pre-digested artefacts (which is what
+  // produced the academic prose in v1).
+  const scenes = await hybridSearch({
+    query: topic, lang,
+    doc_id: task.doc_id ?? null,
+    top_k: 18,
+    recall_k: 80,
+    max_dense_dist: 0.55,
+  }).catch(() => [] as SearchHit[]);
   const docIdFilter = task.doc_id ?? null;
 
   // Pull artefacts SEQUENTIALLY. The investigator role has rolconnlimit=4 and
@@ -246,21 +245,9 @@ export async function runCaseWriter(task: CaseWriterTask): Promise<
      ORDER BY e.evidence_id LIMIT 20`,
     [docIdFilter ?? filter],
   );
-  const hypotheses = await query<HypothesisRow>(
-    `SELECT hypothesis_id, question, position, argument_for, argument_against,
-     prior, posterior, confidence_band, status, reviewed_by
-     FROM public.hypotheses
-     WHERE LOWER(question) LIKE $1 OR LOWER(position) LIKE $1
-     ORDER BY hypothesis_id LIMIT 12`,
-    [filter],
-  );
-  const contradictions = await query<ContradictionRow>(
-    `SELECT contradiction_id, topic, chunks, resolution_status, notes
-     FROM public.contradictions
-     WHERE LOWER(topic) LIKE $1
-     ORDER BY contradiction_id LIMIT 8`,
-    [filter],
-  );
+  // Hypotheses + contradictions are no longer fed to the narrator. They were
+  // skeptic-framing scaffolding from the earlier bureau. The narrator works
+  // from corpus scenes + curated verbatim quotes instead.
   const witnesses = await query<WitnessRow>(
     `SELECT w.witness_id, e.canonical_name, w.credibility, w.verdict,
      w.access_to_event, w.bias_notes
@@ -286,21 +273,20 @@ export async function runCaseWriter(task: CaseWriterTask): Promise<
     job_id: task.job_id,
     detective: "case-writer@detective",
     topic, slug, doc_id: docIdFilter,
+    n_scenes: scenes.length,
     n_evidence: evidence.length,
-    n_hypotheses: hypotheses.length,
-    n_contradictions: contradictions.length,
     n_witnesses: witnesses.length,
     n_gaps: gaps.length,
   });
 
-  const total = evidence.length + hypotheses.length + contradictions.length
-    + witnesses.length + gaps.length;
-  if (total < 2 || (evidence.length === 0 && hypotheses.length === 0)) {
-    return { skipped: true, reason: "insufficient_artefacts" };
+  // Refusal floor: the narrator needs real corpus material. Without enough
+  // scenes (chunks) the file would be padding.
+  if (scenes.length < 4) {
+    return { skipped: true, reason: `insufficient_scenes_${scenes.length}_of_4` };
   }
 
   const systemPrompt = await readFile(PROMPT_PATH, "utf-8");
-  const prompt = buildPrompt(task, evidence, hypotheses, contradictions, witnesses, gaps);
+  const prompt = buildPrompt(task, scenes, evidence, witnesses, gaps, lang);
 
   // Case-writer wants more output budget than the other detectives.
   const llm = await callClaude({
@@ -328,14 +314,14 @@ export async function runCaseWriter(task: CaseWriterTask): Promise<
     topic, topic_pt_br: task.topic_pt_br, slug, body_md,
     meta: {
       n_evidence: evidence.length,
-      n_hypotheses: hypotheses.length,
-      n_contradictions: contradictions.length,
+      n_hypotheses: 0, // hypothesis tournaments no longer feed the narrative
+      n_contradictions: 0,
       n_witnesses: witnesses.length,
       n_outliers: gaps.filter((g) => {
         const s = g.scope as Record<string, unknown> | null;
         return s?.kind === "outlier";
       }).length,
-      n_calibrations: 0, // Calibrations live inside hypothesis case files, not a table yet.
+      n_calibrations: 0,
     },
   }, { job_id: task.job_id, detective: "case-writer@detective" });
 }