W4.3: Poirot direct-testimony floor — no defamatory verdicts on thin data

Live failure surfaced by user feedback: Poirot wrote a low-credibility
verdict on J. Edgar Hoover (W-0002) based on 1 actual chunk and 11
entity_mentions false positives where 'DIRECTOR'/'DIRETOR' was linked to
him by mistake. Poirot's own bias_notes correctly identified this, yet it
still produced a verdict. Published on a 'Disclosure Bureau' site, that's
libellously misleading.

Deleted W-0001 (Donald Keyhoe) and W-0002 (J. Edgar Hoover) from
public.witnesses + their .md files.

Prompt rewrite (prompts/poirot.md):
- New "What counts as testimony" section up front, before discipline.
  Direct testimony = the person AUTHORED, was QUOTED verbatim with
  attribution, or GAVE testimony in a recorded hearing. Not: third-party
  mentions, generic title appearances ('Director'/'Diretor' that
  entity-extraction speculatively linked), CC lines.
- HARD FLOOR rule: emit `direct_testimony_chunk_ids[]`. If < 3, refuse
  with INSUFFICIENT_TESTIMONY. For famous historical figures
  (Wikipedia-worthy public figures) the floor is 5.
- Bias claims MUST cite a specific chunk; ungrounded bias claims drop.
- Tone: "careful prosecutor preparing a brief, not debunker scoring
  points."

Defense in depth (poirot.ts):
- Detective enforces the same floor before calling writeWitnessAnalysis,
  using a FAMOUS slug list (j-edgar-hoover, donald-keyhoe,
  j-allen-hynek, curtis-lemay, vannevar-bush, eisenhower, truman,
  kennedy, ted-bloecher, ...).
- When the floor isn't met, emit `poirot_refused_floor` audit event +
  skip with reason like `insufficient_direct_testimony_1_of_5`.
- Sentinel parser now also catches INSUFFICIENT_TESTIMONY when it
  appears anywhere on the first line of an otherwise-prose response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 24f12a27f4
commit 33dee46060
2 changed files with 120 additions and 31 deletions
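
The floor rule from the commit message can be isolated as a pure function, which makes it easy to unit-test apart from the detective. The following is a minimal sketch, not the code in this commit: `checkTestimonyFloor`, `FloorDecision`, and the two constants are hypothetical names, and the de-duplication of chunk ids is an added assumption (the committed check counts every non-empty string). Only the thresholds (3 and 5) and the shape of the reason string come from the commit message.

```typescript
// Hypothetical helper; thresholds and reason format follow the commit message.
type FloorDecision =
  | { ok: true; floor: number }
  | { ok: false; floor: number; reason: string };

const DEFAULT_FLOOR = 3;
const FAMOUS_FLOOR = 5;

function checkTestimonyFloor(
  slug: string,
  directTestimonyChunkIds: readonly string[],
  famousSlugs: ReadonlySet<string>,
): FloorDecision {
  const floor = famousSlugs.has(slug.toLowerCase()) ? FAMOUS_FLOOR : DEFAULT_FLOOR;
  // Assumption: count distinct, non-empty ids so a repeated id cannot
  // satisfy the floor on its own.
  const distinct = new Set(
    directTestimonyChunkIds.map((id) => id.trim()).filter((id) => id.length > 0),
  );
  if (distinct.size < floor) {
    return {
      ok: false,
      floor,
      reason: `insufficient_direct_testimony_${distinct.size}_of_${floor}`,
    };
  }
  return { ok: true, floor };
}

// Usage: one real chunk for a famous figure is refused; three for an
// ordinary witness passes.
const famous = new Set(["j-edgar-hoover", "donald-keyhoe"]);
const hoover = checkTestimonyFloor("j-edgar-hoover", ["c0042"], famous);
const unknownWitness = checkTestimonyFloor("some-pilot", ["c0001", "c0002", "c0003"], famous);
```

With the Hoover input above, the refusal reason matches the example in the commit message, `insufficient_direct_testimony_1_of_5`.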
@@ -9,35 +9,78 @@ You read the chunks where a named person appears and produce a structured
 **witness analysis**: credibility, access_to_event, bias_notes,
 corroboration_refs, and a one-sentence verdict.
 
+## What counts as testimony (read this BEFORE you start)
+
+The corpus is indexed by an entity-extraction pipeline that has known false
+positives. A chunk being **tagged** with the person's entity_pk does NOT
+mean the person testified in it. Many tags are surface-form collisions: the
+word "Director", "Diretor", "the Bureau", "general", "officer", etc. gets
+linked to a famous title-holder by mistake.
+
+**Direct testimony** means at least ONE of the following:
+- The person AUTHORED the document the chunk is in (signed memo, dictated
+  letter, autograph statement).
+- The chunk QUOTES the person verbatim, with attribution to them by name.
+- The person GAVE testimony in an interview or hearing recorded in the
+  chunk.
+
+The following do NOT count as testimony from that person:
+- Someone else mentioning them by name ("Mr. Hoover was informed", "as the
+  Director instructed").
+- Generic title appearances ("Director", "Diretor", "the agency") that
+  entity-extraction speculatively linked to a famous holder of that title.
+- Documents written ABOUT the person by third parties.
+- The person's name appearing in a distribution list or CC line.
+
 ## Discipline (non-negotiable)
 
-1. You do not declare a witness credible because they are an authority. You
-   ask:
+1. **Read each chunk yourself.** Decide whether it actually contains
+   direct testimony from the named person (per the definition above).
+   Build a list of `direct_testimony_chunk_ids` — chunks where you would
+   testify under oath that the person actually spoke or wrote.
+
+2. **The refusal floor.** If `direct_testimony_chunk_ids.length < 3`,
+   you MUST emit the single word `INSUFFICIENT_TESTIMONY` and stop.
+   No exceptions. No "low credibility" verdict on famous historical
+   figures based on one chunk and ten false positives. This is the rule
+   that keeps the bureau from publishing libel.
+
+3. **The famous-figure ceiling.** When the subject is a widely-known
+   historical figure (J. Edgar Hoover, Donald Keyhoe, J. Allen Hynek,
+   Curtis LeMay, any other public figure with a Wikipedia article), the
+   refusal floor rises to **5** direct-testimony chunks. The bureau does
+   not publish credibility verdicts on public figures from thin corpora.
+
+4. **Bias claims require chunk citations.** Every clause in `bias_notes`
+   must be tied to a specific `[[doc-id/pNNN#cNNNN]]` in the chunks you
+   were given. "Career incentive" is too vague; "career incentive
+   visible in [[chunk]] where they wrote X" is fine. If you cannot
+   ground a bias claim, drop it.
+
+5. **You do not declare a witness credible because they are an authority.**
+   You ask:
    - **Access.** Were they in a position to observe what they testify to?
-     Direct observer? Hearsay at one or two removes? Reading a report? A
-     general giving testimony about an event they only learned about via
-     an underling matters differently than a pilot recounting an event
-     they flew.
+     Direct observer? Hearsay at one or two removes? Reading a report?
    - **Bias.** Career incentive, ideological commitment, prior public
-     position, institutional pressure, fear of reprisal. List the ones
-     you can ground in the chunks.
-   - **Corroboration.** Do other chunks (other people, other docs)
-     confirm the same factual claim, refute it, or stay silent? If two
-     witnesses independently say the same thing, that strengthens both;
-     if everyone got the story from one source, the corroboration is
-     illusory.
-2. You assign a single `credibility` band:
+     position, institutional pressure, fear of reprisal. Cite chunks.
+   - **Corroboration.** Do other chunks confirm the same factual claim,
+     refute it, or stay silent?
+
+6. You assign a single `credibility` band:
    - `high` — direct access, no strong bias, independent corroboration.
    - `medium` — partial access OR mild bias OR thin corroboration.
-   - `low` — second-hand OR active bias OR contradicted by other chunks.
+   - `low` — second-hand OR active bias documented in chunks OR
+     contradicted by other chunks.
    - `speculation` — the chunks describe the person only by name; no
-     basis to assess.
-3. `corroboration_refs` is an array of objects `{chunk_id, supports}` —
+     basis to assess. (You should normally emit `INSUFFICIENT_TESTIMONY`
+     instead of using this band.)
+
+7. `corroboration_refs` is an array of objects `{chunk_id, supports}` —
    each cites a different chunk that confirms (`supports: true`) or
-   refutes (`supports: false`) something the witness asserts. Aim for 2-5
-   entries when possible.
-4. `verdict` is ONE sentence (≤ 280 chars). Declarative. No hedging.
-   Hedging belongs in `credibility`, not in the wording.
+   refutes (`supports: false`) something the witness asserts. Aim for
+   2-5 entries when possible.
+
+8. `verdict` is ONE sentence (≤ 280 chars). Declarative. No hedging.
 
 ## Output protocol — bilingual EN + PT-BR (mandatory)
 
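
The rule that every bias claim must cite a chunk could also be checked mechanically after the model responds. The sketch below is an illustration under stated assumptions, not part of this commit: `citedChunkIds` and `keepGroundedClaims` are hypothetical helpers, the citation pattern is inferred from the `[[doc-id/pNNN#cNNNN]]` form in the prompt, and splitting `bias_notes` into individual claims is left to the caller because the prompt does not define clause boundaries.

```typescript
// Hypothetical helpers; the citation shape is inferred from the prompt text.
function citedChunkIds(text: string): string[] {
  // A fresh regex per call, so `lastIndex` state never leaks between calls.
  const re = /\[\[[^\[\]\/\s]+\/p\d+#(c\d+)\]\]/g;
  const ids: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    ids.push(m[1]);
  }
  return ids;
}

// Keep a claim only if it cites at least one chunk and every cited chunk
// is in the shortlist the model was given.
function keepGroundedClaims(
  claims: readonly string[],
  shortlist: ReadonlySet<string>,
): string[] {
  return claims.filter((claim) => {
    const ids = citedChunkIds(claim);
    return ids.length > 0 && ids.every((id) => shortlist.has(id));
  });
}

// Usage: the vague claim and the claim citing an unknown chunk are dropped.
const shortlist = new Set(["c0042", "c0087"]);
const kept = keepGroundedClaims(
  [
    "Career incentive.",
    "Career incentive visible in [[doc-a/p003#c0042]] where they wrote X.",
    "Institutional pressure per [[doc-b/p001#c9999]].",
  ],
  shortlist,
);
```

Dropping a claim that cites a chunk outside the shortlist is a stricter reading than the prompt states; a gentler variant might keep the claim and flag it for review.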
@@ -46,11 +89,12 @@ appears in EN AND in PT-BR (Brazilian Portuguese with UTF-8 accents).
 
 ```json
 {
-  "credibility": "high | medium | low | speculation",
-  "access_to_event": "EN one paragraph describing access. Ground specific facts in chunk_ids.",
-  "access_to_event_pt_br": "PT-BR um parágrafo descrevendo acesso. Fundamente fatos específicos em chunk_ids.",
-  "bias_notes": "EN one paragraph naming concrete biases visible in the corpus.",
-  "bias_notes_pt_br": "PT-BR um parágrafo nomeando vieses concretos visíveis no corpus.",
+  "direct_testimony_chunk_ids": ["c0042", "c0087", "c0091"],
+  "credibility": "high | medium | low",
+  "access_to_event": "EN one paragraph. Cite each fact with [[chunk]].",
+  "access_to_event_pt_br": "PT-BR um parágrafo. Fundamente cada fato com [[chunk]].",
+  "bias_notes": "EN. Every bias claim cites a chunk.",
+  "bias_notes_pt_br": "PT-BR. Cada afirmação de viés cita um chunk.",
   "corroboration_refs": [
     {"chunk_id": "c0042", "supports": true},
     {"chunk_id": "c0087", "supports": false}
@@ -61,14 +105,19 @@ appears in EN AND in PT-BR (Brazilian Portuguese with UTF-8 accents).
 ```
 
 Constraints:
+- `direct_testimony_chunk_ids` is the gating field. Below the floor (3
+  generally, 5 for famous figures), you do NOT emit this object. You
+  emit `INSUFFICIENT_TESTIMONY` and nothing else.
 - `access_to_event` and `bias_notes` ≤ 800 chars each (per language).
 - `corroboration_refs` ≤ 8 entries, MUST cite chunk_id values that appear
   in the corpus shortlist you were given.
 - `verdict` ≤ 280 chars (per language), no hedging language inside the
   sentence.
-- A missing `*_pt_br` sibling is a hard validation failure — the writer
-  rejects the analysis.
+- A missing `*_pt_br` sibling is a hard validation failure.
 
-If the corpus contains no chunks where the named person actually appears
-(only the entity card from the wiki without supporting passages), emit
-the literal word `INSUFFICIENT_TESTIMONY` and stop.
+## Tone
+
+Witness analysis published on a public investigative wiki carries
+reputational weight. Write as a careful prosecutor preparing a brief, not
+as a debunker scoring points. State what the corpus shows; do not
+extrapolate to character or motive that the corpus does not document.
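
The constraints listed in the prompt lend themselves to a small validator on the receiving side. This is a rough sketch under assumptions, not the writer's actual validation: `validateAnalysis` and the error codes are hypothetical, `verdict_pt_br` is an assumed field name following the `*_pt_br` convention, and "chars" is interpreted as JavaScript string length (UTF-16 code units).

```typescript
// Hypothetical validator; limits (800 / 280 / 8) come from the prompt's
// Constraints list, everything else is illustrative.
type Analysis = Record<string, unknown>;

const BILINGUAL_FIELDS = ["access_to_event", "bias_notes", "verdict"] as const;
const MAX_CHARS: Record<(typeof BILINGUAL_FIELDS)[number], number> = {
  access_to_event: 800,
  bias_notes: 800,
  verdict: 280,
};

function validateAnalysis(obj: Analysis, shortlist: ReadonlySet<string>): string[] {
  const errors: string[] = [];
  for (const field of BILINGUAL_FIELDS) {
    // Each EN field needs a PT-BR sibling, and both share the same limit.
    for (const key of [field, `${field}_pt_br`]) {
      const value = obj[key];
      if (typeof value !== "string" || value.trim().length === 0) {
        errors.push(`missing:${key}`);
      } else if (value.length > MAX_CHARS[field]) {
        errors.push(`too_long:${key}`);
      }
    }
  }
  const refs = Array.isArray(obj.corroboration_refs) ? obj.corroboration_refs : [];
  if (refs.length > 8) errors.push("too_many:corroboration_refs");
  for (const ref of refs) {
    // Every cited chunk must come from the shortlist given to the model.
    const id = (ref as { chunk_id?: unknown } | null)?.chunk_id;
    if (typeof id !== "string" || !shortlist.has(id)) {
      errors.push(`unknown_chunk:${String(id)}`);
    }
  }
  return errors;
}

// Usage: a missing PT-BR sibling and an out-of-shortlist chunk are reported.
const errors = validateAnalysis(
  {
    access_to_event: "Signed the memo in [[doc-a/p001#c0042]].",
    access_to_event_pt_br: "Assinou o memorando em [[doc-a/p001#c0042]].",
    bias_notes: "Prior public position stated in [[doc-a/p001#c0042]].",
    // bias_notes_pt_br is deliberately missing
    verdict: "Direct author of the memo.",
    verdict_pt_br: "Autor direto do memorando.",
    corroboration_refs: [
      { chunk_id: "c0042", supports: true },
      { chunk_id: "c9999", supports: false },
    ],
  },
  new Set(["c0042", "c0087"]),
);
// errors is ["missing:bias_notes_pt_br", "unknown_chunk:c9999"]
```

Returning a list of error codes rather than throwing on the first failure is a design choice here: it lets an audit event record every violation in one pass.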
@@ -102,6 +102,7 @@ function extractJsonObject(text: string): Record<string, unknown> | null {
   // The skip sentinel can appear bare, in backticks, or as the leading token
   // followed by Poirot's explanation prose. All count as "skipped".
   if (/^`?INSUFFICIENT_TESTIMONY`?\b/i.test(t)) return null;
+  if (/\bINSUFFICIENT_TESTIMONY\b/i.test(t.split("\n")[0])) return null;
   const stripped = t.replace(/^```(?:json)?\s*\n?/i, "").replace(/\n?```\s*$/i, "");
   const first = stripped.indexOf("{");
   const last = stripped.lastIndexOf("}");
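
To see which responses the two sentinel checks in `extractJsonObject` treat as a skip, the pair can be lifted into a standalone predicate. This is a sketch for illustration: `isSkipSentinel` is a hypothetical name, and it assumes the `t` in the hunk above is the trimmed model output, which the diff does not show.

```typescript
// Hypothetical wrapper around the two regexes from the hunk above.
function isSkipSentinel(raw: string): boolean {
  const t = raw.trim(); // assumption: `t` in the source is the trimmed response
  // Pre-existing check: sentinel as the leading token, bare or in backticks.
  if (/^`?INSUFFICIENT_TESTIMONY`?\b/i.test(t)) return true;
  // Check added in this commit: sentinel anywhere on the first line.
  return /\bINSUFFICIENT_TESTIMONY\b/i.test(t.split("\n")[0]);
}

const sentinelCases: Array<[string, boolean]> = [
  ["INSUFFICIENT_TESTIMONY", isSkipSentinel("INSUFFICIENT_TESTIMONY")],
  ["backticked", isSkipSentinel("`INSUFFICIENT_TESTIMONY`")],
  ["mid first line", isSkipSentinel("Verdict: INSUFFICIENT_TESTIMONY, only one chunk qualifies.")],
  ["second line only", isSkipSentinel("Some prose first.\nINSUFFICIENT_TESTIMONY")],
  ["multi-line JSON", isSkipSentinel('{\n  "verdict": "x"\n}')],
  ["one-line JSON mentioning it", isSkipSentinel('{"verdict":"INSUFFICIENT_TESTIMONY was considered"}')],
];
```

Two edge cases follow from the regexes as written: a sentinel that first appears on the second line is not caught, and a compact single-line JSON object that merely mentions the sentinel is treated as a skip. Whether either matters in practice depends on how the model formats its output.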
@@ -244,6 +245,45 @@ export async function runPoirot(task: PoirotTask): Promise<
     return { skipped: true, reason: "incomplete_bilingual_analysis" };
   }
 
+  // HARD FLOOR — defense in depth against entity_mentions false positives.
+  // If Poirot didn't surface direct_testimony_chunk_ids or it's below the
+  // floor, refuse to write. This is the rule that keeps a thin corpus from
+  // producing a defamatory verdict on a famous historical figure.
+  const directTestimony = Array.isArray(obj.direct_testimony_chunk_ids)
+    ? (obj.direct_testimony_chunk_ids as unknown[]).filter((x): x is string => typeof x === "string" && x.trim().length > 0)
+    : [];
+  // Famous-figure list — the rule asks ≥5 chunks for public figures.
+  // Lower-cased entity_id, anchored. Extend as the corpus grows.
+  const FAMOUS = new Set([
+    "j-edgar-hoover", "edgar-hoover", "hoover",
+    "donald-keyhoe", "keyhoe",
+    "j-allen-hynek", "allen-hynek", "hynek",
+    "curtis-lemay", "lemay",
+    "nathan-twining", "twining",
+    "vannevar-bush",
+    "john-f-kennedy", "kennedy",
+    "harry-truman", "truman",
+    "dwight-eisenhower", "eisenhower",
+    "ted-bloecher", "bloecher",
+  ]);
+  const slug = (task.person_id ?? "").toLowerCase();
+  const isFamous = FAMOUS.has(slug);
+  const floor = isFamous ? 5 : 3;
+  if (directTestimony.length < floor) {
+    await audit({
+      event: "poirot_refused_floor",
+      job_id: task.job_id,
+      detective: "poirot@detective",
+      person_id: task.person_id,
+      person_entity_pk: entity_pk,
+      canonical_name,
+      direct_testimony_count: directTestimony.length,
+      floor,
+      is_famous_figure: isFamous,
+    });
+    return { skipped: true, reason: `insufficient_direct_testimony_${directTestimony.length}_of_${floor}` };
+  }
+
   // Soft-truncate before sending to the writer: the prompt asks ≤ 280 chars
   // per language but the model occasionally goes slightly over (304 chars
   // observed live with j-edgar-hoover PT-BR). Truncate at sentence boundary
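
The diff is cut off before the truncation code itself, so the following is only a guess at what a soft truncation at a sentence boundary might look like. `softTruncate` is a hypothetical name, and the fallback to a word boundary with an ellipsis is an assumption, not something the source confirms.

```typescript
// Hypothetical sketch; the actual implementation is not visible in the diff.
function softTruncate(text: string, maxChars = 280): string {
  if (text.length <= maxChars) return text;
  // Look one character past the limit so a terminator sitting exactly at
  // the limit (followed by a space) still counts as a boundary.
  const probe = text.slice(0, maxChars + 1);
  const cut = Math.max(
    probe.lastIndexOf(". "),
    probe.lastIndexOf("! "),
    probe.lastIndexOf("? "),
  );
  if (cut > 0) return text.slice(0, cut + 1);
  // No sentence boundary fits: fall back to the last whole word and mark
  // the cut with an ellipsis, staying within maxChars.
  const head = text.slice(0, maxChars);
  const space = head.lastIndexOf(" ", maxChars - 2);
  return (space > 0 ? head.slice(0, space) : head.slice(0, maxChars - 1)) + "…";
}

// Usage: an overlong verdict is cut back to its first complete sentence.
const overlong = "First sentence here. " + "x".repeat(300);
const truncated = softTruncate(overlong); // "First sentence here."
```

Because the prompt asks for a single-sentence verdict, the sentence-boundary branch may rarely apply, which is why the sketch includes a word-boundary fallback.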