W5.2: best-seller case-writer — single voice, scene-driven, anti-skeptic
User: "shouldn't mention the names of the mind-clones, should merge all
analyses and write like a best-seller author would, about what happened."
Voice rewrite (prompts/case-writer.md):
- Reference voices: Erik Larson, Sam Kean, John McPhee, Mark Bowden.
Plainspoken non-fiction, scene-driven, fascinated.
- One narrator. NEVER say "Sherlock Holmes argues" / "Sun-Tzu builds
the case" / "the team concluded". No internal-process names reach
the reader.
- Hook the first paragraph. Open in a scene with a date, place, and
person doing something specific. NOT "This case investigates..."
- Show, don't argue. Verbatim quotes stay source-language in
blockquotes; the narration around them is the narrator's voice.
- Every claim cites a chunk with [[doc-id/pNNN#cNNNN]].
- Forbidden ceremony: "In summary…", "Em suma…", "Ultimately…",
"It is worth noting…", detective names, probability tables,
hypothesis tournaments.
- The honest unknown is the subject, not a failure: "Whatever was in
the sky over Sandia in December 1948, the government never said."
- 4-6 numbered scenes, each title-cased specifically ("The Green
Sphere Over Highway 60" not "Background").
- Bilingual EN + PT-BR per CLAUDE.md §3 — sections alternate, no
mid-paragraph language mixing.
- Refusal: emit INSUFFICIENT_ARTEFACTS rather than padding when the
corpus is thin.
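A minimal sketch of the citation rule above, assuming a chunk reference carries `doc_id`, `page`, and a `chunk_id` that already includes its `c` prefix. The names `ChunkRef`, `formatCitation`, and `hasCitation` are illustrative and not part of this commit; the padding mirrors what `renderScenes` does in the diff.

```typescript
// Hypothetical shape: only the three fields used below are assumed.
interface ChunkRef {
  doc_id: string;
  page: number;
  chunk_id: string; // assumed to look like "c0042"
}

// Build a [[doc-id/pNNN#cNNNN]] link, zero-padding the page to three digits.
function formatCitation(ref: ChunkRef): string {
  const page = String(ref.page).padStart(3, "0");
  return `[[${ref.doc_id}/p${page}#${ref.chunk_id}]]`;
}

// Loose pattern for the citation form named in the prompt rules.
const CITATION_RE = /\[\[[^\]\/]+\/p\d{3}#c\d{4}\]\]/;

// True when a paragraph carries at least one chunk citation.
function hasCitation(paragraph: string): boolean {
  return CITATION_RE.test(paragraph);
}
```

A check like `hasCitation` could flag uncited paragraphs in the narrator's output, but whether the runtime does anything of the kind is not stated in this commit.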
Raw-material pipeline (src/detectives/case_writer.ts):
- hybridSearch(topic, lang, top_k=18) gives the narrator real corpus
scenes with verbatim text + chunk_id citations + bbox metadata.
This is what was missing — v1 only saw pre-digested hypothesis
artefacts, which is how the academic prose got there.
- Dropped the hypotheses + contradictions queries from the loader.
They were skeptic-framing scaffolding that doesn't belong in the
raw material a best-seller narrator works from.
- New buildPrompt sections: "Primary-source scenes", "Curated
verbatim quotes", "Anomalies and surprises", "Named witnesses".
Anomalies (Taleb's outlier gaps) reframed: drop dominant_model
skeptic baseline, keep title + why_surprising as gold material.
- Refusal floor: < 4 scenes from hybridSearch → skip with reason.
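The last two bullets can be sketched as follows, under stated assumptions: `Gap`, `Anomaly`, `MIN_SCENES`, `refusalReason`, and `toAnomalies` are hypothetical names for illustration, not the functions in `case_writer.ts`. The sketch keeps only outlier gaps, drops the `dominant_model` baseline, and refuses below four scenes.

```typescript
// Hypothetical row shape: a gap with a free-form scope object.
interface Gap {
  description: string;
  scope: Record<string, unknown> | null;
}

interface Anomaly {
  title: string;
  why_surprising: string;
}

const MIN_SCENES = 4; // refusal floor named in the commit message

// Returns a skip reason when too few scenes came back, otherwise null.
function refusalReason(nScenes: number): string | null {
  return nScenes < MIN_SCENES
    ? `insufficient_scenes_${nScenes}_of_${MIN_SCENES}`
    : null;
}

// Keep only outlier gaps; pass the title and why_surprising, nothing else.
function toAnomalies(gaps: Gap[]): Anomaly[] {
  return gaps
    .filter((g) => g.scope?.kind === "outlier")
    .map((g) => {
      const s: Record<string, unknown> = g.scope ?? {};
      return {
        title: typeof s.title === "string" ? s.title : g.description,
        why_surprising:
          typeof s.why_surprising === "string" ? s.why_surprising : "",
      };
    });
}
```

Returning a reason string rather than throwing keeps a thin corpus an ordinary outcome, which matches the "skip with reason" wording above.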
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent b6fc9dc84e
commit b3a6a3c1a3
2 changed files with 199 additions and 196 deletions
prompts/case-writer.md
@@ -1,89 +1,106 @@
-# You are the Case-Writer (Dr. Watson)
+# You are the narrator of The Disclosure Bureau
 
-You are the case-writer — the Watson to the bureau's detectives. Your task
-is to take the structured artefacts that Holmes, Locard, Dupin, Poirot,
-Schneier, Taleb and Tetlock have written, and **assemble them into a
-narrative** an intelligent reader can follow start to finish.
+You write the case files that get published on a public archive read by
+people who are curious about UAP/UFO history. Your job is to tell the
+reader **what happened**, drawn directly from declassified primary
+sources, with the voice and craft of a non-fiction best-seller.
 
-You do NOT produce new facts. You weave existing artefacts. Every claim
-in your narrative comes from one of: a hypothesis, an evidence card, a
-contradiction, a witness analysis, an outlier, or a calibration.
+Reference voices: Erik Larson (Devil in the White City), Sam Kean (The
+Disappearing Spoon), John McPhee (Annals of the Former World), Mark Bowden
+(Black Hawk Down). Plainspoken, scene-driven, factual, fascinated. You are
+a reporter who has read the entire file and is going to walk the reader
+through it.
 
-## Discipline (non-negotiable)
+## Hard rules — the voice
 
-1. The narrative has a fixed five-act structure:
-   - **§1 — The case at hand.** State the question or topic in one
-     paragraph. Why the bureau opened a file.
-   - **§2 — The evidence chain.** Walk the reader through the catalogued
-     evidence (E-NNNN). For each piece you mention: state the grade,
-     give the verbatim excerpt as a blockquote, cite the source
-     `[[doc-id/pNNN#cNNNN]]`.
-   - **§3 — The rival hypotheses.** Present the H-NNNN tournament.
-     For each rival: state its position, prior, posterior, band, and
-     ONE sentence summarising argument_for + ONE summarising
-     argument_against. Quote a chunk citation per claim.
-   - **§4 — Contradictions, outliers, witnesses.** Cite each R-NNNN
-     contradiction with its topic and positions. Cite each G-NNNN
-     outlier with its dominant_model + why_surprising. Cite each
-     W-NNNN witness analysis with its credibility + verdict.
-   - **§5 — The case as it stands.** ONE paragraph (the closer) that
-     names the leading hypothesis, the strongest single rival, the
-     remaining residual uncertainty (≥ 1 named gap), and what
-     observation could move the needle.
-2. Use `[[wiki-link]]` syntax for EVERY artefact reference:
-   - Evidence: `[[evidence/E-NNNN]]`
-   - Hypothesis: `[[hypothesis/H-NNNN]]`
-   - Contradiction: `[[relation/R-NNNN]]` (R- shares the slot per CLAUDE.md)
-   - Witness: `[[witness/W-NNNN]]`
-   - Outlier: `[[gap/G-NNNN]]`
-   - Chunk: `[[doc-id/pNNN#cNNNN]]`
-3. You do not editorialise beyond what the artefacts support. If the
-   bureau hasn't ruled something out, don't rule it out. If a hypothesis
-   is `speculation` band, label it speculation in your prose.
-4. Length: 800–2500 words. Tight is better than padded.
-5. Voice: Watson's plainspoken English (or Portuguese, per the request).
-   The prose is for an educated reader, not a specialist. Avoid jargon.
+1. **One voice.** You do not say "Sherlock Holmes argues" or "Sun-Tzu
+   builds the case" or "the team concluded". You never name your
+   sources of reasoning. You speak as a single narrator who has read
+   the documents.
 
-## Output protocol — bilingual EN + PT-BR (mandatory)
+2. **Hook the first paragraph.** Start in a scene: a date, a place, a
+   person doing something specific. Not a thesis statement. Not "This
+   case file investigates..." *Example opener:* "On the night of
+   December 5, 1948, a state police officer pulled to the shoulder of
+   Highway 60 outside Las Vegas, New Mexico, and watched a green
+   sphere drop out of the sky."
 
-Emit ONLY the markdown body of the narrative. NO frontmatter (the runtime
-adds it). NO code fence.
+3. **Show, don't argue.** Verbatim quotes from the corpus stay in the
+   chunk's source language (usually English) and appear as
+   blockquotes. The narration around them is yours. Do not adjudicate
+   whether the events were "real" or "explained" — let the reader sit
+   with what the documents say.
 
-The narrative is **bilingual** with EN and PT-BR sections **interleaved
-per act**, in this exact structure (per CLAUDE.md §3 "adjacent sections"):
+4. **Every claim cites a chunk.** `[[doc-id/pNNN#cNNNN]]` appears next
+   to specific facts. The reader can click through. You do not invent
+   facts the corpus doesn't carry.
 
+5. **Forbidden ceremony.** No "In summary…", "Ultimately…", "Em suma…",
+   "Em última análise…". No "It is worth noting…". No detective names.
+   No probability tables. No hypothesis tournaments.
+
+6. **The honest unknown.** When the corpus doesn't resolve a question,
+   you say so plainly. "Whatever was in the sky over Sandia in
+   December 1948, the government never said." The unknown is the
+   subject, not a failure.
+
+## Bilingual structure (mandatory — CLAUDE.md §3)
+
+Emit ONLY the markdown body. NO frontmatter. NO code fence. Bilingual
+EN + PT-BR with PT-BR being **Brazilian Portuguese** (full UTF-8
+accents preserved).
+
+Structure: each section appears once in EN then once in PT-BR. Do not
+mix languages mid-paragraph. Use this exact heading pattern (replace
+`<title>` with your title):
+
 ```markdown
-# Title (EN)
+# <Title in English>
 
-# Título (PT-BR)
+# <Título em Português Brasileiro>
 
-## §1 — The Case at Hand (EN)
+## I. <English scene-title>
 
-<English §1 body>
+<English prose body — 2 to 5 paragraphs, verbatim quotes in blockquotes,
+chunk citations as [[wiki-links]]>
 
-## §1 — O Caso em Mãos (PT-BR)
+## I. <Título em Português>
 
-<corpo §1 em português brasileiro>
+<corpo em português brasileiro — mesmo conteúdo, mesmas citações>
 
-## §2 — The Evidence Chain (EN)
+## II. <next scene, EN>
 
-<English §2 body>
+...
 
-## §2 — A Cadeia de Evidência (PT-BR)
+## II. <próxima cena, PT-BR>
 
-<corpo §2 em português brasileiro>
-
-... (continue alternating per act through §5) ...
+...
 ```
 
-Rules:
-- Both languages must appear; do NOT emit only EN or only PT-BR.
-- PT-BR is **Brazilian Portuguese** with UTF-8 accents preserved.
-- Verbatim chunk quotes stay in the chunk's source language (usually
-  English in this corpus); only the surrounding narration is translated.
-- `[[wiki-links]]` are technical identifiers — keep them as-is in both
-  versions; do not translate IDs.
+A typical case has 4–6 numbered sections. Each is a scene or a turn in
+the story, not a five-act formal structure. Title each scene
+**specifically** ("The Green Sphere Over Highway 60", not "Background").
 
-If the bureau has insufficient artefacts (e.g. 0 hypotheses AND 0
-evidence on the topic), emit `INSUFFICIENT_ARTEFACTS` and stop. Do not
-fabricate the case.
+## What to write about
+
+You receive a bundle of artefacts: chunks, quotes, anomalies, named
+witnesses, locations, dates. Use them to tell the story. Anchor each
+section in:
+- **A scene** (a date, a place, an action — make the reader see it)
+- **A primary-source quote** (one strong verbatim from the corpus)
+- **A consequence** (what happened next, what changed, what didn't)
+
+If you have a verbatim observation of the object — color, motion, size,
+duration — quote it in full. Those are the moments enthusiasts open
+this archive to read.
+
+Length: 1500–3000 words total across both languages. Tight is better
+than padded. If the corpus is thin, write a shorter file rather than
+inflating it.
+
+## Refusal
+
+If the artefacts contain almost nothing about the topic (no verbatim
+quotes, no named witnesses, no specific dates), emit
+`INSUFFICIENT_ARTEFACTS` and stop. Better to publish nothing than to
+publish a thin case file that disappoints the reader.
src/detectives/case_writer.ts
@@ -16,6 +16,7 @@ import { audit } from "../lib/audit";
 import { callClaude } from "../lib/claude";
 import { env } from "../lib/env";
 import { query } from "../lib/pg";
+import { hybridSearch, type SearchHit } from "../lib/search";
 import { writeCaseReport } from "../tools/write_case_report";
 
 const HERE = path.dirname(fileURLToPath(import.meta.url));
@@ -42,27 +43,6 @@ interface EvidenceRow {
   related_hypotheses: unknown;
 }
 
-interface HypothesisRow {
-  hypothesis_id: string;
-  question: string;
-  position: string;
-  argument_for: string | null;
-  argument_against: string | null;
-  prior: number | string | null;
-  posterior: number | string | null;
-  confidence_band: string | null;
-  status: string;
-  reviewed_by: string | null;
-}
-
-interface ContradictionRow {
-  contradiction_id: string;
-  topic: string;
-  chunks: unknown;
-  resolution_status: string;
-  notes: string | null;
-}
-
 interface WitnessRow {
   witness_id: string;
   canonical_name: string | null;
@@ -90,118 +70,124 @@ function topicSlug(topic: string): string {
     .slice(0, 80);
 }
 
-function renderEvidence(rows: EvidenceRow[]): string {
-  if (rows.length === 0) return "_(no evidence catalogued for this topic)_";
+// (Legacy render* functions removed in W5.2 — the narrator now works from
+// retrieved scenes + curated verbatim quotes + anomalies + named witnesses,
+// not from pre-digested hypothesis/contradiction artefacts.)
+
+function renderScenes(hits: SearchHit[], lang: "pt" | "en"): string {
+  if (hits.length === 0) return "_(no primary-source scenes retrieved)_";
+  return hits.map((h, i) => {
+    const text = (lang === "en" ? h.content_en : h.content_pt) || h.content_en || h.content_pt || "";
+    const pageStr = String(h.page).padStart(3, "0");
+    return [
+      `### Scene ${i + 1} — [[${h.doc_id}/p${pageStr}#${h.chunk_id}]]`,
+      `Type: ${h.type}${h.classification ? ` · Classification: ${h.classification}` : ""}`,
+      "",
+      text.slice(0, 1200),
+    ].join("\n");
+  }).join("\n\n");
+}
+
+function renderVerbatimQuotes(rows: EvidenceRow[]): string {
+  if (rows.length === 0) return "_(no curated verbatim quotes on this topic yet)_";
   return rows.map((e) => [
-    `### ${e.evidence_id} (Grade ${e.grade}${e.confidence_band ? `, ${e.confidence_band}` : ""})`,
-    `Source page: ${e.source_page_id}`,
+    `### Verbatim — source ${e.source_page_id}`,
     "",
-    `> ${e.verbatim_excerpt.slice(0, 700)}`,
+    `> ${(e.verbatim_excerpt || "").trim().replace(/\n+/g, " ")}`,
   ].join("\n")).join("\n\n");
 }
 
-function renderHypotheses(rows: HypothesisRow[]): string {
-  if (rows.length === 0) return "_(no hypotheses in the tournament for this topic)_";
-  return rows.map((h) => [
-    `### ${h.hypothesis_id} — ${h.confidence_band ?? "—"} (prior ${h.prior ?? "—"} → posterior ${h.posterior ?? "—"}, status ${h.status})`,
-    `**Position.** ${h.position}`,
-    h.reviewed_by ? `Reviewed by ${h.reviewed_by}` : "",
-    "",
-    "**Argument for.**",
-    h.argument_for || "_(none recorded)_",
-    "",
-    "**Argument against.**",
-    h.argument_against || "_(none recorded)_",
-  ].filter(Boolean).join("\n")).join("\n\n");
-}
-
-function renderContradictions(rows: ContradictionRow[]): string {
-  if (rows.length === 0) return "_(no contradictions on file for this topic)_";
-  return rows.map((c) => {
-    const positions = Array.isArray(c.chunks) ? c.chunks as Array<Record<string, unknown>> : [];
-    const posLines = positions.map((p, i) => {
-      const stance = p.stance ? ` (${p.stance})` : "";
-      return ` ${i + 1}. ${String(p.statement ?? "—")}${stance} → [[${p.doc_id}/p${String(p.page).padStart(3, "0")}#${p.chunk_id}]]`;
-    }).join("\n");
-    return [
-      `### ${c.contradiction_id} — ${c.topic} (${c.resolution_status})`,
-      posLines || "_(no positions recorded)_",
-      c.notes ? `\n_Notes: ${c.notes}_` : "",
-    ].filter(Boolean).join("\n");
-  }).join("\n\n");
-}
-
-function renderWitnesses(rows: WitnessRow[]): string {
-  if (rows.length === 0) return "_(no witness analyses on file)_";
-  return rows.map((w) => [
-    `### ${w.witness_id} — ${w.canonical_name ?? "—"} (${w.credibility ?? "—"})`,
-    w.verdict ? `**Verdict.** ${w.verdict}` : "",
-    w.access_to_event ? `Access: ${w.access_to_event}` : "",
-    w.bias_notes ? `Bias: ${w.bias_notes}` : "",
-  ].filter(Boolean).join("\n")).join("\n\n");
-}
-
-function renderGaps(rows: GapRow[]): string {
-  if (rows.length === 0) return "_(no outliers / gaps on file)_";
-  return rows.map((g) => {
+function renderAnomalies(rows: GapRow[]): string {
+  // Outliers (Taleb's gaps with scope.kind=outlier) are gold material for a
+  // best-seller narrator — they're the moments where the corpus itself
+  // surprises. Strip the dominant_model framing (skeptic baseline) and just
+  // pass the anomaly title + why_surprising.
+  const outliers = rows.filter((g) => {
     const s = g.scope as Record<string, unknown> | null;
-    const kind = s?.kind === "outlier" ? " (outlier)" : "";
-    const why = s?.why_surprising ? `\n_Why surprising:_ ${String(s.why_surprising)}` : "";
-    const model = s?.dominant_model ? `\n_Dominant model:_ ${String(s.dominant_model)}` : "";
+    return s?.kind === "outlier";
+  });
+  if (outliers.length === 0) return "_(no anomalies catalogued)_";
+  return outliers.map((g) => {
+    const s = (g.scope ?? {}) as Record<string, unknown>;
+    const title = (s.title_pt_br as string) || (s.title as string) || g.description;
+    const why = (s.why_surprising as string) || "";
     return [
-      `### ${g.gap_id} — ${g.description}${kind} (${g.status})`,
-      model,
+      `### Anomaly — ${title}`,
+      "",
       why,
-      g.suggested_next_move ? `\n_Next move:_ ${g.suggested_next_move}` : "",
-    ].filter(Boolean).join("\n");
+    ].join("\n");
   }).join("\n\n");
 }
 
+function renderNamedWitnesses(rows: WitnessRow[]): string {
+  if (rows.length === 0) return "_(no named witness profiles)_";
+  // Strict witness rows only (Poirot's floor enforced). Pass canonical name +
+  // verdict so the narrator can introduce them. No "credibility" framing.
+  return rows.map((w) => [
+    `### ${w.canonical_name ?? "—"}`,
+    w.verdict ? `Profile: ${w.verdict}` : "",
+  ].filter(Boolean).join("\n")).join("\n\n");
+}
+
 function buildPrompt(
   task: CaseWriterTask,
+  scenes: SearchHit[],
   evidence: EvidenceRow[],
-  hypotheses: HypothesisRow[],
-  contradictions: ContradictionRow[],
   witnesses: WitnessRow[],
   gaps: GapRow[],
+  lang: "pt" | "en",
 ): string {
   return [
-    `# Case folder`,
+    `# Topic`,
     "",
-    `**Topic (EN).** ${task.topic}`,
-    `**Tópico (PT-BR).** ${task.topic_pt_br ?? task.topic}`,
+    `**EN.** ${task.topic}`,
+    `**PT-BR.** ${task.topic_pt_br ?? task.topic}`,
     task.doc_id ? `\nScoped to document: ${task.doc_id}` : "",
-    "",
+    task.doc_id ? `Scoped to document: ${task.doc_id}` : "Scope: all documents",
+    "You are writing a case file for a public archive read by people",
+    "curious about UAP/UFO history. Use the raw material below to weave",
+    "a non-fiction best-seller-quality story. Do not name any internal",
+    "process or source-of-reasoning. Tell what happened.",
     "",
-    "**Bilingual output mandatory.** Write each act in BOTH English and",
-    "Brazilian Portuguese (PT-BR), interleaved per the system-prompt",
-    "structure. UTF-8 accents preserved. Verbatim chunk quotes stay in",
-    "their source language; only the surrounding narration is translated.",
+    "## Primary-source scenes (retrieved from the corpus)",
     "",
-    "## Artefacts available",
+    "These are the chunks the search returned. They contain the verbatim",
+    "text from the documents — pick the most specific, scene-driving ones",
+    "to anchor each section of your case file, and quote them in",
+    "blockquotes with `[[doc-id/pNNN#cNNNN]]` citations.",
     "",
-    `### Evidence (E-NNNN) · ${evidence.length}`,
-    renderEvidence(evidence),
+    renderScenes(scenes, lang),
     "",
-    `### Hypotheses (H-NNNN) · ${hypotheses.length}`,
-    renderHypotheses(hypotheses),
+    "## Curated verbatim quotes",
     "",
-    `### Contradictions (R-NNNN) · ${contradictions.length}`,
-    renderContradictions(contradictions),
+    "These are the highest-grade quotes already pulled from the corpus.",
+    "Use them as load-bearing blockquotes in your scenes.",
     "",
-    `### Witness analyses (W-NNNN) · ${witnesses.length}`,
-    renderWitnesses(witnesses),
+    renderVerbatimQuotes(evidence),
     "",
-    `### Outliers / gaps (G-NNNN) · ${gaps.length}`,
-    renderGaps(gaps),
+    "## Anomalies and surprises",
     "",
+    "Moments where the corpus surprises itself — language slips, frequency",
+    "anomalies, things the analysts couldn't fit into their model. Strong",
+    "material for the closing of a section.",
+    "",
+    renderAnomalies(gaps),
+    "",
+    "## Named witnesses with documented testimony",
+    "",
+    "People whose direct testimony appears in the corpus. Introduce them",
+    "in scene, not as a list.",
+    "",
+    renderNamedWitnesses(witnesses),
+    "",
     "## Your task",
     "",
-    "Assemble the five-act Watson narrative per the system prompt. Emit",
-    "ONLY the markdown body — start with the `# ` heading, no",
-    "frontmatter, no code fence. If the artefacts are too thin, emit",
-    "`INSUFFICIENT_ARTEFACTS` and stop.",
-  ].join("\n");
+    "Write the case file per the system prompt: bilingual EN+PT-BR with",
+    "alternating section pairs, scene-driven opening, verbatim quotes with",
+    "citations, no detective names, no skeptic framing, no \"in summary\".",
+    "Emit ONLY the markdown body starting with `# <title>`. If the raw",
+    "material is too thin, emit `INSUFFICIENT_ARTEFACTS` and stop.",
+  ].filter(Boolean).join("\n");
 }
 
 function extractBody(text: string): string | null {
@@ -223,8 +209,21 @@ export async function runCaseWriter(task: CaseWriterTask): Promise<
 > {
   const topic = task.topic.trim();
   const slug = task.slug ?? topicSlug(topic);
+  const lang: "pt" | "en" = task.lang ?? "pt";
 
   const filter = `%${topic.toLowerCase()}%`;
+
+  // Grounding pass — retrieve top scenes from the corpus via hybrid_search.
+  // This is what gives the narrator real verbatim material to weave. Without
+  // this, the case-writer only sees pre-digested artefacts (which is what
+  // produced the academic prose in v1).
+  const scenes = await hybridSearch({
+    query: topic, lang,
+    doc_id: task.doc_id ?? null,
+    top_k: 18,
+    recall_k: 80,
+    max_dense_dist: 0.55,
+  }).catch(() => [] as SearchHit[]);
   const docIdFilter = task.doc_id ?? null;
 
   // Pull artefacts SEQUENTIALLY. The investigator role has rolconnlimit=4 and
@@ -246,21 +245,9 @@ export async function runCaseWriter(task: CaseWriterTask): Promise<
      ORDER BY e.evidence_id LIMIT 20`,
     [docIdFilter ?? filter],
   );
-  const hypotheses = await query<HypothesisRow>(
-    `SELECT hypothesis_id, question, position, argument_for, argument_against,
-            prior, posterior, confidence_band, status, reviewed_by
-       FROM public.hypotheses
-      WHERE LOWER(question) LIKE $1 OR LOWER(position) LIKE $1
-      ORDER BY hypothesis_id LIMIT 12`,
-    [filter],
-  );
-  const contradictions = await query<ContradictionRow>(
-    `SELECT contradiction_id, topic, chunks, resolution_status, notes
-       FROM public.contradictions
-      WHERE LOWER(topic) LIKE $1
-      ORDER BY contradiction_id LIMIT 8`,
-    [filter],
-  );
+  // Hypotheses + contradictions are no longer fed to the narrator. They were
+  // skeptic-framing scaffolding from the earlier bureau. The narrator works
+  // from corpus scenes + curated verbatim quotes instead.
   const witnesses = await query<WitnessRow>(
     `SELECT w.witness_id, e.canonical_name, w.credibility, w.verdict,
             w.access_to_event, w.bias_notes
@@ -286,21 +273,20 @@ export async function runCaseWriter(task: CaseWriterTask): Promise<
     job_id: task.job_id,
     detective: "case-writer@detective",
     topic, slug, doc_id: docIdFilter,
+    n_scenes: scenes.length,
     n_evidence: evidence.length,
-    n_hypotheses: hypotheses.length,
-    n_contradictions: contradictions.length,
     n_witnesses: witnesses.length,
     n_gaps: gaps.length,
   });
 
-  const total = evidence.length + hypotheses.length + contradictions.length
-    + witnesses.length + gaps.length;
-  if (total < 2 || (evidence.length === 0 && hypotheses.length === 0)) {
-    return { skipped: true, reason: "insufficient_artefacts" };
+  // Refusal floor: the narrator needs real corpus material. Without enough
+  // scenes (chunks) the file would be padding.
+  if (scenes.length < 4) {
+    return { skipped: true, reason: `insufficient_scenes_${scenes.length}_of_4` };
   }
 
   const systemPrompt = await readFile(PROMPT_PATH, "utf-8");
-  const prompt = buildPrompt(task, evidence, hypotheses, contradictions, witnesses, gaps);
+  const prompt = buildPrompt(task, scenes, evidence, witnesses, gaps, lang);
 
   // Case-writer wants more output budget than the other detectives.
   const llm = await callClaude({
@@ -328,14 +314,14 @@ export async function runCaseWriter(task: CaseWriterTask): Promise<
     topic, topic_pt_br: task.topic_pt_br, slug, body_md,
     meta: {
       n_evidence: evidence.length,
-      n_hypotheses: hypotheses.length,
-      n_contradictions: contradictions.length,
+      n_hypotheses: 0, // hypothesis tournaments no longer feed the narrative
+      n_contradictions: 0,
       n_witnesses: witnesses.length,
       n_outliers: gaps.filter((g) => {
         const s = g.scope as Record<string, unknown> | null;
         return s?.kind === "outlier";
       }).length,
-      n_calibrations: 0, // Calibrations live inside hypothesis case files, not a table yet.
+      n_calibrations: 0,
     },
   }, { job_id: task.job_id, detective: "case-writer@detective" });
 }