W4.3: Poirot direct-testimony floor — no defamatory verdicts on thin data

Live failure surfaced by user feedback: Poirot wrote a low-credibility
verdict on J. Edgar Hoover (W-0002) based on 1 actual chunk and 11
entity_mentions false positives where 'DIRECTOR'/'DIRETOR' was linked to
him by mistake. Poirot's own bias_notes correctly identified this, yet it
still produced a verdict. Published on a 'Disclosure Bureau' site, that's
libellously misleading.

Deleted W-0001 (Donald Keyhoe) and W-0002 (J. Edgar Hoover) from
public.witnesses, along with their .md files.

Prompt rewrite (prompts/poirot.md):
- New "What counts as testimony" section up front, before discipline.
  Direct testimony = the person AUTHORED, was QUOTED verbatim with
  attribution, or GAVE testimony in a recorded interview or hearing.
  Not: third-party mentions, generic title appearances
  ('Director'/'Diretor' that entity-extraction speculatively linked),
  CC lines.
- HARD FLOOR rule: emit `direct_testimony_chunk_ids[]`. If it holds fewer
  than 3 entries, refuse with INSUFFICIENT_TESTIMONY. For famous
  historical figures (Wikipedia-worthy public figures) the floor is 5.
- Bias claims MUST cite a specific chunk; ungrounded bias claims are
  dropped.
- Tone: "careful prosecutor preparing a brief, not debunker scoring
  points."
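
The bias-citation rule above could also be checked mechanically after the model responds. A minimal sketch, assuming citations follow the `[[doc-id/pNNN#cNNNN]]` shape named in the prompt; `dropUngroundedBiasClaims` and the naive sentence split are hypothetical and not part of this commit:

```typescript
// Hypothetical helper (not in this commit): keep only the bias_notes
// sentences that carry at least one [[doc-id/pNNN#cNNNN]] citation.
const CITATION = /\[\[[^\[\]\s]+\/p\d+#c\d+\]\]/;

function dropUngroundedBiasClaims(biasNotes: string): string {
  return biasNotes
    // Naive split on sentence-ending punctuation followed by whitespace;
    // assumes plain prose without abbreviations such as "Mr. Hoover".
    .split(/(?<=[.!?])\s+/)
    .filter((sentence) => CITATION.test(sentence))
    .join(" ");
}
```

A sentence such as "Career incentive is likely." would be dropped, while one that points at a concrete chunk would survive. A real implementation would need a sturdier sentence splitter.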

Defense in depth (poirot.ts):
- Detective enforces the same floor before calling writeWitnessAnalysis,
  using a FAMOUS slug list (j-edgar-hoover, donald-keyhoe,
  j-allen-hynek, curtis-lemay, vannevar-bush, eisenhower, truman,
  kennedy, ted-bloecher, ...).
- When the floor isn't met, emit a `poirot_refused_floor` audit event and
  skip with a reason like `insufficient_direct_testimony_1_of_5`.
- Sentinel parser now also catches INSUFFICIENT_TESTIMONY when it
  appears anywhere on the first line of an otherwise-prose response.
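
The two guards above can be summarized in isolation. This is a simplified sketch rather than the committed code: `FAMOUS_SLUGS` is abbreviated, and `isRefusalSentinel`, `checkFloor`, and `FloorDecision` are hypothetical names chosen for illustration.

```typescript
// Abbreviated, illustrative slug list; the commit's FAMOUS set is longer.
const FAMOUS_SLUGS: ReadonlySet<string> = new Set([
  "j-edgar-hoover",
  "donald-keyhoe",
  "j-allen-hynek",
]);

type FloorDecision = { ok: true } | { ok: false; reason: string };

// True when the sentinel appears anywhere on the first line of the response,
// whether bare, wrapped in backticks, or followed by explanatory prose.
function isRefusalSentinel(response: string): boolean {
  const firstLine = response.trim().split("\n")[0] ?? "";
  return /\bINSUFFICIENT_TESTIMONY\b/i.test(firstLine);
}

// Floor is 5 for slugs on the famous list and 3 for everyone else.
function checkFloor(
  personSlug: string,
  directTestimonyChunkIds: readonly string[],
): FloorDecision {
  const floor = FAMOUS_SLUGS.has(personSlug.toLowerCase()) ? 5 : 3;
  const count = directTestimonyChunkIds.length;
  return count < floor
    ? { ok: false, reason: `insufficient_direct_testimony_${count}_of_${floor}` }
    : { ok: true };
}
```

With one direct-testimony chunk for `j-edgar-hoover`, `checkFloor` yields the reason `insufficient_direct_testimony_1_of_5`, matching the skip reason quoted above.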

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 24f12a27f4
commit 33dee46060
2 changed files with 120 additions and 31 deletions
@@ -9,35 +9,78 @@ You read the chunks where a named person appears and produce a structured
 **witness analysis**: credibility, access_to_event, bias_notes,
 corroboration_refs, and a one-sentence verdict.
 
+## What counts as testimony (read this BEFORE you start)
+
+The corpus is indexed by an entity-extraction pipeline that has known false
+positives. A chunk being **tagged** with the person's entity_pk does NOT
+mean the person testified in it. Many tags are surface-form collisions: the
+word "Director", "Diretor", "the Bureau", "general", "officer", etc. gets
+linked to a famous title-holder by mistake.
+
+**Direct testimony** means at least ONE of the following:
+- The person AUTHORED the document the chunk is in (signed memo, dictated
+  letter, autograph statement).
+- The chunk QUOTES the person verbatim, with attribution to them by name.
+- The person GAVE testimony in an interview or hearing recorded in the
+  chunk.
+
+The following do NOT count as testimony from that person:
+- Someone else mentioning them by name ("Mr. Hoover was informed", "as the
+  Director instructed").
+- Generic title appearances ("Director", "Diretor", "the agency") that
+  entity-extraction speculatively linked to a famous holder of that title.
+- Documents written ABOUT the person by third parties.
+- The person's name appearing in a distribution list or CC line.
+
 ## Discipline (non-negotiable)
 
-1. You do not declare a witness credible because they are an authority. You
-   ask:
+1. **Read each chunk yourself.** Decide whether it actually contains
+   direct testimony from the named person (per the definition above).
+   Build a list of `direct_testimony_chunk_ids` — chunks where you would
+   testify under oath that the person actually spoke or wrote.
+
+2. **The refusal floor.** If `direct_testimony_chunk_ids.length < 3`,
+   you MUST emit the single word `INSUFFICIENT_TESTIMONY` and stop.
+   No exceptions. No "low credibility" verdict on famous historical
+   figures based on one chunk and ten false positives. This is the rule
+   that keeps the bureau from publishing libel.
+
+3. **The famous-figure ceiling.** When the subject is a widely-known
+   historical figure (J. Edgar Hoover, Donald Keyhoe, J. Allen Hynek,
+   Curtis LeMay, any other public figure with a Wikipedia article), the
+   refusal floor rises to **5** direct-testimony chunks. The bureau does
+   not publish credibility verdicts on public figures from thin corpora.
+
+4. **Bias claims require chunk citations.** Every clause in `bias_notes`
+   must be tied to a specific `[[doc-id/pNNN#cNNNN]]` in the chunks you
+   were given. "Career incentive" is too vague; "career incentive
+   visible in [[chunk]] where they wrote X" is fine. If you cannot
+   ground a bias claim, drop it.
+
+5. **You do not declare a witness credible because they are an authority.**
+   You ask:
    - **Access.** Were they in a position to observe what they testify to?
-     Direct observer? Hearsay at one or two removes? Reading a report? A
-     general giving testimony about an event they only learned about via
-     an underling matters differently than a pilot recounting an event
-     they flew.
+     Direct observer? Hearsay at one or two removes? Reading a report?
    - **Bias.** Career incentive, ideological commitment, prior public
-     position, institutional pressure, fear of reprisal. List the ones
-     you can ground in the chunks.
-   - **Corroboration.** Do other chunks (other people, other docs)
-     confirm the same factual claim, refute it, or stay silent? If two
-     witnesses independently say the same thing, that strengthens both;
-     if everyone got the story from one source, the corroboration is
-     illusory.
-2. You assign a single `credibility` band:
+     position, institutional pressure, fear of reprisal. Cite chunks.
+   - **Corroboration.** Do other chunks confirm the same factual claim,
+     refute it, or stay silent?
+
+6. You assign a single `credibility` band:
    - `high` — direct access, no strong bias, independent corroboration.
    - `medium` — partial access OR mild bias OR thin corroboration.
-   - `low` — second-hand OR active bias OR contradicted by other chunks.
+   - `low` — second-hand OR active bias documented in chunks OR
+     contradicted by other chunks.
    - `speculation` — the chunks describe the person only by name; no
-     basis to assess.
-3. `corroboration_refs` is an array of objects `{chunk_id, supports}` —
+     basis to assess. (You should normally emit `INSUFFICIENT_TESTIMONY`
+     instead of using this band.)
+
+7. `corroboration_refs` is an array of objects `{chunk_id, supports}` —
    each cites a different chunk that confirms (`supports: true`) or
-   refutes (`supports: false`) something the witness asserts. Aim for 2-5
-   entries when possible.
-4. `verdict` is ONE sentence (≤ 280 chars). Declarative. No hedging.
-   Hedging belongs in `credibility`, not in the wording.
+   refutes (`supports: false`) something the witness asserts. Aim for
+   2-5 entries when possible.
+
+8. `verdict` is ONE sentence (≤ 280 chars). Declarative. No hedging.
 
 ## Output protocol — bilingual EN + PT-BR (mandatory)
 
@@ -46,11 +89,12 @@ appears in EN AND in PT-BR (Brazilian Portuguese with UTF-8 accents).
 
 ```json
 {
-  "credibility": "high | medium | low | speculation",
-  "access_to_event": "EN one paragraph describing access. Ground specific facts in chunk_ids.",
-  "access_to_event_pt_br": "PT-BR um parágrafo descrevendo acesso. Fundamente fatos específicos em chunk_ids.",
-  "bias_notes": "EN one paragraph naming concrete biases visible in the corpus.",
-  "bias_notes_pt_br": "PT-BR um parágrafo nomeando vieses concretos visíveis no corpus.",
+  "direct_testimony_chunk_ids": ["c0042", "c0087", "c0091"],
+  "credibility": "high | medium | low",
+  "access_to_event": "EN one paragraph. Cite each fact with [[chunk]].",
+  "access_to_event_pt_br": "PT-BR um parágrafo. Fundamente cada fato com [[chunk]].",
+  "bias_notes": "EN. Every bias claim cites a chunk.",
+  "bias_notes_pt_br": "PT-BR. Cada afirmação de viés cita um chunk.",
   "corroboration_refs": [
     {"chunk_id": "c0042", "supports": true},
     {"chunk_id": "c0087", "supports": false}
@@ -61,14 +105,19 @@ appears in EN AND in PT-BR (Brazilian Portuguese with UTF-8 accents).
 ```
 
 Constraints:
+- `direct_testimony_chunk_ids` is the gating field. Below the floor (3
+  generally, 5 for famous figures), you do NOT emit this object. You
+  emit `INSUFFICIENT_TESTIMONY` and nothing else.
 - `access_to_event` and `bias_notes` ≤ 800 chars each (per language).
 - `corroboration_refs` ≤ 8 entries, MUST cite chunk_id values that appear
   in the corpus shortlist you were given.
 - `verdict` ≤ 280 chars (per language), no hedging language inside the
   sentence.
-- A missing `*_pt_br` sibling is a hard validation failure — the writer
-  rejects the analysis.
+- A missing `*_pt_br` sibling is a hard validation failure.
 
-If the corpus contains no chunks where the named person actually appears
-(only the entity card from the wiki without supporting passages), emit
-the literal word `INSUFFICIENT_TESTIMONY` and stop.
+## Tone
+
+Witness analysis published on a public investigative wiki carries
+reputational weight. Write as a careful prosecutor preparing a brief, not
+as a debunker scoring points. State what the corpus shows; do not
+extrapolate to character or motive that the corpus does not document.
@@ -102,6 +102,7 @@ function extractJsonObject(text: string): Record<string, unknown> | null {
   // The skip sentinel can appear bare, in backticks, or as the leading token
   // followed by Poirot's explanation prose. All count as "skipped".
   if (/^`?INSUFFICIENT_TESTIMONY`?\b/i.test(t)) return null;
+  if (/\bINSUFFICIENT_TESTIMONY\b/i.test(t.split("\n")[0])) return null;
   const stripped = t.replace(/^```(?:json)?\s*\n?/i, "").replace(/\n?```\s*$/i, "");
   const first = stripped.indexOf("{");
   const last = stripped.lastIndexOf("}");
@@ -244,6 +245,45 @@ export async function runPoirot(task: PoirotTask): Promise<
     return { skipped: true, reason: "incomplete_bilingual_analysis" };
   }
 
+  // HARD FLOOR — defense in depth against entity_mentions false positives.
+  // If Poirot didn't surface direct_testimony_chunk_ids or it's below the
+  // floor, refuse to write. This is the rule that keeps a thin corpus from
+  // producing a defamatory verdict on a famous historical figure.
+  const directTestimony = Array.isArray(obj.direct_testimony_chunk_ids)
+    ? (obj.direct_testimony_chunk_ids as unknown[]).filter((x): x is string => typeof x === "string" && x.trim().length > 0)
+    : [];
+  // Famous-figure list — the rule asks ≥5 chunks for public figures.
+  // Lower-cased entity_id, anchored. Extend as the corpus grows.
+  const FAMOUS = new Set([
+    "j-edgar-hoover", "edgar-hoover", "hoover",
+    "donald-keyhoe", "keyhoe",
+    "j-allen-hynek", "allen-hynek", "hynek",
+    "curtis-lemay", "lemay",
+    "nathan-twining", "twining",
+    "vannevar-bush",
+    "john-f-kennedy", "kennedy",
+    "harry-truman", "truman",
+    "dwight-eisenhower", "eisenhower",
+    "ted-bloecher", "bloecher",
+  ]);
+  const slug = (task.person_id ?? "").toLowerCase();
+  const isFamous = FAMOUS.has(slug);
+  const floor = isFamous ? 5 : 3;
+  if (directTestimony.length < floor) {
+    await audit({
+      event: "poirot_refused_floor",
+      job_id: task.job_id,
+      detective: "poirot@detective",
+      person_id: task.person_id,
+      person_entity_pk: entity_pk,
+      canonical_name,
+      direct_testimony_count: directTestimony.length,
+      floor,
+      is_famous_figure: isFamous,
+    });
+    return { skipped: true, reason: `insufficient_direct_testimony_${directTestimony.length}_of_${floor}` };
+  }
+
   // Soft-truncate before sending to the writer: the prompt asks ≤ 280 chars
   // per language but the model occasionally goes slightly over (304 chars
   // observed live with j-edgar-hoover PT-BR). Truncate at sentence boundary