W3.5: Holmes hypothesis tournament detective
Adds the second AI detective in the Investigation Bureau runtime: Sherlock
Holmes, who builds 2-3 rival hypotheses with calibrated priors + posteriors
against a corpus shortlist.
Pipeline:
1. hybridSearch() grounds Holmes with 8-15 chunks via the same
hybrid_search_chunks RPC the web uses (BM25 + dense + RRF). Default
max_dense_dist=0.55 (runtime favors recall over precision; web's
/api/search/hybrid stays at 0.40 for chat).
2. claude-sonnet-4-6 emits a strict JSON array with position +
argument_for + argument_against + prior + posterior + confidence_band
+ evidence_refs. Citations use [[doc-id/pNNN#cNNNN]] wiki-links.
3. writeHypothesis() clamps prior and posterior to [0,1], auto-corrects the
Tetlock band from the posterior (high ≥0.90, medium 0.60-0.89,
low 0.30-0.59, speculation <0.30), checks evidence_refs FK against
public.evidence, INSERTs into public.hypotheses + writes
case/hypotheses/H-NNNN.md.
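
Step 1 names RRF as the fusion stage. As a rough illustration of how reciprocal rank fusion merges a BM25 ranking with a dense ranking, here is a minimal TypeScript sketch. The function `rrfFuse`, the `RankedList` shape, and the chunk ids are hypothetical, and `k = 60` is simply the constant commonly used for RRF; the internals of `hybrid_search_chunks` are not part of this commit and may differ.

```typescript
// Reciprocal Rank Fusion: every list contributes 1 / (k + rank) for each id
// it returned; ids are then ordered by the summed score.
interface RankedList {
  source: string; // illustrative label, e.g. "bm25" or "dense"
  ids: string[];  // chunk ids, best match first
}

interface FusedHit {
  id: string;
  score: number;
}

function rrfFuse(lists: RankedList[], k = 60): FusedHit[] {
  const scores: Record<string, number> = {};
  for (const list of lists) {
    list.ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores[id] = (scores[id] ?? 0) + 1 / (k + rank);
    });
  }
  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] ?? 0 }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical chunk ids, for illustration only.
const fused = rrfFuse([
  { source: "bm25", ids: ["c0003", "c0001", "c0007"] },
  { source: "dense", ids: ["c0001", "c0009", "c0003"] },
]);
console.log(fused.map((h) => h.id));
```

A chunk that ranks well in both lists (here `c0001` and `c0003`) outranks one that appears in only a single list, which is the property that makes the fused head usable without a reranker.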
Discipline guarantees (prompts/holmes.md):
- posteriors across rivals sum to ≈1.0
- no claim without chunk citation
- prefer lower band when ambiguous (anti-inflation)
- declarative one-sentence position, no hedging
- emit `NO_HYPOTHESES` when corpus is silent (refuses to fabricate)
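
The first guarantee (posteriors summing to roughly 1.0) is stated in the prompt; the code in this commit does not appear to check it. A runtime-side check could look like the sketch below. The names `posteriorsSumToOne` and `normalizePosteriors` and the tolerance of 0.05 are assumptions, since the prompt only says "roughly".

```typescript
interface Rival {
  position: string;
  posterior: number;
}

// True when the posteriors of a rival set sum to 1.0 within `tol`.
// The default tolerance is an assumption, not a value from the prompt.
function posteriorsSumToOne(rivals: Rival[], tol = 0.05): boolean {
  const total = rivals.reduce((acc, r) => acc + r.posterior, 0);
  return Math.abs(total - 1) <= tol;
}

// Rescales posteriors so they sum to 1.0 (up to floating-point rounding).
function normalizePosteriors(rivals: Rival[]): Rival[] {
  const total = rivals.reduce((acc, r) => acc + r.posterior, 0);
  if (total <= 0) throw new Error("cannot normalize: posteriors sum to zero");
  return rivals.map((r) => ({ ...r, posterior: r.posterior / total }));
}

// Hypothetical rival set whose posteriors are inflated (sum 1.3).
const inflated: Rival[] = [
  { position: "A", posterior: 0.5 },
  { position: "B", posterior: 0.4 },
  { position: "C", posterior: 0.4 },
];
console.log(posteriorsSumToOne(inflated));                      // false
console.log(posteriorsSumToOne(normalizePosteriors(inflated))); // true
```

Whether to reject or silently renormalize an inflated set is a design choice; rejecting keeps the model accountable for its own numbers, while renormalizing keeps the job from failing.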
Smoke test (Sandia green fireballs 1948-49):
- H-0001 prior 0.5 → posterior 0.2 (speculation): natural meteoric
- H-0002 prior 0.3 → posterior 0.4 (low): classified weapons / tests
- H-0003 prior 0.2 → posterior 0.4 (low): genuinely unidentified
Bayesian update visible: "natural meteoric" fell 60% relative to its prior
(0.5 → 0.2); both rivals climbed. 4 unique chunk citations across the 3
hypotheses.
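
The percentages behind the smoke test can be reproduced with a few lines of arithmetic. The sketch below uses only the prior and posterior values listed above; the helper `relativeChange` and the `Update` shape are hypothetical names.

```typescript
interface Update {
  id: string;
  prior: number;
  posterior: number;
}

// Change of the posterior relative to the prior, e.g. -0.6 for a 60% drop.
function relativeChange(u: Update): number {
  return (u.posterior - u.prior) / u.prior;
}

const smoke: Update[] = [
  { id: "H-0001", prior: 0.5, posterior: 0.2 },
  { id: "H-0002", prior: 0.3, posterior: 0.4 },
  { id: "H-0003", prior: 0.2, posterior: 0.4 },
];

for (const u of smoke) {
  const pct = Math.round(relativeChange(u) * 100);
  console.log(`${u.id}: ${u.prior} -> ${u.posterior} (${pct >= 0 ? "+" : ""}${pct}%)`);
}
```

Rounded to whole percent, the three relative changes are -60%, +33% and +100%, and the posteriors 0.2 + 0.4 + 0.4 sum to 1.0 as the discipline rules require.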
The orchestrator dispatches jobs of kind `hypothesis_tournament` via runHolmes;
a job is marked `failed` if every rival errors and `complete` otherwise.
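
The status rule can be written as a small pure function. This is only a sketch of the rule as worded above: the code that sets the job status is not part of the hunks shown in this commit, and `TournamentResult`, `attempted`, `written` and `jobStatus` are hypothetical names.

```typescript
type JobStatus = "complete" | "failed";

interface TournamentResult {
  attempted: number; // rivals the runtime tried to persist
  written: number;   // rivals persisted without error
}

function jobStatus(r: TournamentResult): JobStatus {
  // Fail only when at least one rival was attempted and none was written.
  if (r.attempted > 0 && r.written === 0) return "failed";
  return "complete";
}

console.log(jobStatus({ attempted: 3, written: 0 })); // "failed"
console.log(jobStatus({ attempted: 3, written: 1 })); // "complete"
```

Under this reading, a partially successful tournament still completes, so a downstream consumer that needs a full rival set would have to compare the number of written hypotheses itself.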
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 54a26f8db8
commit 4d4c02a8e1
5 changed files with 508 additions and 0 deletions
investigator-runtime/prompts/holmes.md (Normal file, 62 lines added)
@@ -0,0 +1,62 @@
# You are Sherlock Holmes

You are Sherlock Holmes — deductive detective whose method is to construct
**rival hypotheses** for any phenomenon, argue for each from observable
evidence, and assign a posterior probability so the field of possibilities
narrows toward what remains, however improbable.

## Discipline (non-negotiable)

1. Given a question and a corpus of cited chunks, you produce **2 or 3 rival
   hypotheses**. Each is a one-sentence proposition that could explain the
   phenomenon.
2. For each hypothesis you write a brief `argument_for` (≤ 6 sentences) and
   `argument_against` (≤ 6 sentences). **Every claim cites a chunk** via the
   wiki-link grammar `[[doc-id/pNNN#cNNNN]]`. No chunk citation → no claim.
3. You assign:
   * `prior` — your baseline probability before reading the chunks (≈ how
     unusual the proposition is in the literature).
   * `posterior` — the probability after weighing the cited evidence.
   * **Posteriors across the rival set should sum to roughly 1.0**. If they
     don't, you adjust until they do.
4. `confidence_band` follows Tetlock:
   * `high` ≥ 0.90 · `medium` 0.60-0.89 · `low` 0.30-0.59 · `speculation` < 0.30.
   * When evidence is ambiguous, prefer the lower band. Inflation is a sin.
5. You do not invent `chunk_id`s. If you cannot find a chunk that supports
   a claim, state "[no evidence in corpus]" inline and lower the posterior
   accordingly.
6. You do not hedge in prose. The position is **one sentence**, declarative.
   Hedging belongs in the posterior, not in the wording.

## Output protocol

Emit a strict JSON array. No prose around it. No code fence. Just the array.

```json
[
  {
    "position": "...",
    "argument_for": "...",
    "argument_against": "...",
    "prior": 0.30,
    "posterior": 0.55,
    "confidence_band": "low",
    "evidence_refs": [
      {"evidence_id": "E-0042", "supports": true},
      {"evidence_id": "E-0043", "supports": false}
    ]
  },
  { ... another rival ... },
  { ... another rival ... }
]
```

Note:
- `evidence_refs` is **optional** — leave as `[]` if no `E-NNNN` evidence has
  been catalogued yet for this question; chunk citations in the prose are
  sufficient for v0.
- `question` is supplied by the runtime; you do not echo it.
- The runtime owns the writer; you emit data only.

If the corpus contains nothing relevant to the question, emit the literal
single word `NO_HYPOTHESES` and stop.
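
The prompt requires every claim to carry a `[[doc-id/pNNN#cNNNN]]` citation and forbids invented chunk ids. One way the runtime could check this after the fact is to extract the wiki-links from the prose and compare them with the shortlist it fed to the model. The sketch below assumes a doc-id without slashes, a zero-padded page of at least three digits, and a chunk id of the form `c` plus four digits; `extractCitations` and `unknownCitations` are hypothetical helpers, not part of this commit.

```typescript
interface ChunkCitation {
  raw: string;     // the full wiki-link as written, e.g. "[[doc-a/p012#c0034]]"
  docId: string;
  page: number;
  chunkId: string;
}

// Assumed grammar: [[<doc-id>/p<3+ digits>#c<4 digits>]]
const CITATION_RE = /\[\[([^\[\]\/#]+)\/p(\d{3,})#(c\d{4})\]\]/g;

function extractCitations(text: string): ChunkCitation[] {
  const found: ChunkCitation[] = [];
  CITATION_RE.lastIndex = 0; // the regex is global, so reset it before each scan
  let m: RegExpExecArray | null;
  while ((m = CITATION_RE.exec(text)) !== null) {
    found.push({ raw: m[0], docId: m[1], page: Number(m[2]), chunkId: m[3] });
  }
  return found;
}

// Returns the citations in `text` that are not in the shortlist given to the model.
function unknownCitations(text: string, shortlist: string[]): string[] {
  return extractCitations(text)
    .filter((c) => shortlist.indexOf(c.raw) === -1)
    .map((c) => c.raw);
}

// Hypothetical ids, for illustration only.
const prose = "Witnesses agree [[doc-a/p012#c0034]]; one report differs [[doc-z/p001#c9999]].";
console.log(unknownCitations(prose, ["[[doc-a/p012#c0034]]"]));
```

An empty result means every citation in the argument maps back to a chunk the model was actually shown.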
investigator-runtime/src/detectives/holmes.ts (Normal file, 177 lines added)
@@ -0,0 +1,177 @@
/**
 * holmes.ts — hypothesis tournament detective.
 *
 * Workflow (matches agentic-layer-spec sec 7):
 *   1. The runtime grounds Holmes with a small corpus shortlist via
 *      hybridSearch — Holmes never gets the whole DB, just the relevant 8-15
 *      chunks.
 *   2. Claude Sonnet 4.6 reads the question + chunks, emits a JSON array of
 *      2-3 rival hypotheses with priors/posteriors/citations.
 *   3. The runtime parses the array and calls writeHypothesis() for each.
 *      The writer enforces posterior bounds + Tetlock band + FK to evidence.
 *
 * Holmes does NOT get tool calls. All grounding is pre-fed; all writes are
 * applied by the runtime after validation (sa-security gate #2).
 */
import { readFile } from "node:fs/promises";
import path from "node:path";
import { fileURLToPath } from "node:url";
import { audit } from "../lib/audit";
import { callClaude } from "../lib/claude";
import { env } from "../lib/env";
import { hybridSearch, type SearchHit } from "../lib/search";
import { writeHypothesis, type WriteHypothesisArgs } from "../tools/write_hypothesis";

const HERE = path.dirname(fileURLToPath(import.meta.url));
const PROMPT_PATH = path.resolve(HERE, "..", "..", "prompts", "holmes.md");

export interface HolmesTask {
  job_id: string;
  question: string;
  /** Optional scope narrowing — restrict the search to one doc / entity. */
  doc_id?: string;
  lang?: "pt" | "en";
  /** How many chunks to feed Holmes. Default 12. */
  context_chunks?: number;
  budget_cap_usd?: number;
}

function renderChunkBlock(hits: SearchHit[], lang: "pt" | "en"): string {
  const blocks = hits.map((h, i) => {
    // Prefer the requested language, then fall back to whichever text exists.
    const text = (lang === "en" ? h.content_en : h.content_pt) || h.content_en || h.content_pt || "";
    const pageStr = String(h.page).padStart(3, "0");
    return [
      `--- chunk ${i + 1} ---`,
      `id: [[${h.doc_id}/p${pageStr}#${h.chunk_id}]]`,
      `type: ${h.type}`,
      h.classification ? `classification: ${h.classification}` : null,
      "",
      text.slice(0, 1200),
    ]
      // Drop only the null placeholder. filter(Boolean) would also drop the ""
      // entry and lose the blank line between the metadata and the text.
      .filter((line): line is string => line !== null)
      .join("\n");
  });
  return blocks.join("\n\n");
}

function buildPrompt(task: HolmesTask, hits: SearchHit[], lang: "pt" | "en"): string {
  const block = renderChunkBlock(hits, lang);
  return [
    `# Question to investigate`,
    "",
    task.question,
    "",
    `## Corpus shortlist (${hits.length} chunks${task.doc_id ? `, scoped to ${task.doc_id}` : ""})`,
    "",
    block,
    "",
    "## Your task",
    "",
    "Build 2-3 rival hypotheses about the question above. Each must cite at",
    "least one chunk via [[doc-id/pNNN#cNNNN]] in argument_for and",
    "argument_against. Assign priors + posteriors summing roughly to 1.0.",
    "Emit the JSON array exactly as specified by the system prompt — no prose,",
    "no code fence, no preamble.",
  ].join("\n");
}

function extractJsonArray(text: string): unknown[] | null {
  const t = text.trim();
  if (t === "NO_HYPOTHESES") return null;
  // Tolerate a Markdown code fence even though the prompt forbids one.
  const stripped = t.replace(/^```(?:json)?\s*\n?/i, "").replace(/\n?```\s*$/i, "");
  const first = stripped.indexOf("[");
  const last = stripped.lastIndexOf("]");
  // Also reject a "]" that precedes the first "[" (no well-formed array span).
  if (first === -1 || last === -1 || last < first) {
    throw new Error(`holmes returned no JSON array: ${t.slice(0, 200)}`);
  }
  const parsed = JSON.parse(stripped.slice(first, last + 1));
  if (!Array.isArray(parsed)) throw new Error("holmes JSON is not an array");
  return parsed;
}

export async function runHolmes(task: HolmesTask): Promise<
  | { hypotheses: Array<{ hypothesis_id: string; case_file: string }> }
  | { skipped: true; reason: string }
> {
  const lang: "pt" | "en" = task.lang ?? "pt";
  const k = task.context_chunks ?? 12;

  // 1. Ground with hybrid_search.
  const hits = await hybridSearch({
    query: task.question,
    lang,
    doc_id: task.doc_id ?? null,
    top_k: k,
    recall_k: 60,
  });
  await audit({
    event: "holmes_grounded",
    job_id: task.job_id,
    detective: "holmes@detective",
    question: task.question,
    n_chunks: hits.length,
    doc_id: task.doc_id ?? null,
  });
  if (hits.length === 0) {
    return { skipped: true, reason: "no_corpus_match" };
  }

  // 2. Call Claude.
  const systemPrompt = await readFile(PROMPT_PATH, "utf-8");
  const prompt = buildPrompt(task, hits, lang);
  const llm = await callClaude({
    prompt,
    systemPrompt,
    model: env.CLAUDE_MODEL,
    allowedTools: [],
    timeoutMs: env.JOB_TIMEOUT_SECONDS * 1000,
    budgetCapUsd: task.budget_cap_usd ?? env.BUDGET_CAP_USD_PER_JOB,
  });
  await audit({
    event: "detective_completed",
    job_id: task.job_id,
    detective: "holmes@detective",
    cost_usd: llm.costUsd,
    tokens_in: llm.tokensIn,
    tokens_out: llm.tokensOut,
    duration_ms: llm.durationMs,
  });

  console.error(`[holmes] response (${llm.text.length} chars): ${llm.text.slice(0, 800)}`);

  // 3. Parse + write.
  const arr = extractJsonArray(llm.text);
  if (arr === null) return { skipped: true, reason: "NO_HYPOTHESES" };

  const out: Array<{ hypothesis_id: string; case_file: string }> = [];
  for (const raw of arr.slice(0, 3)) {
    // Skip non-object entries (e.g. null) so the property reads below cannot throw.
    if (typeof raw !== "object" || raw === null) continue;
    const args: WriteHypothesisArgs = {
      question: task.question,
      position: String((raw as { position?: unknown }).position ?? "").trim(),
      argument_for: typeof (raw as { argument_for?: unknown }).argument_for === "string"
        ? (raw as { argument_for: string }).argument_for : undefined,
      argument_against: typeof (raw as { argument_against?: unknown }).argument_against === "string"
        ? (raw as { argument_against: string }).argument_against : undefined,
      prior: Number((raw as { prior?: unknown }).prior),
      posterior: Number((raw as { posterior?: unknown }).posterior),
      confidence_band: (raw as { confidence_band?: WriteHypothesisArgs["confidence_band"] }).confidence_band,
      evidence_refs: Array.isArray((raw as { evidence_refs?: unknown }).evidence_refs)
        ? (raw as { evidence_refs: Array<{ evidence_id?: string; supports?: boolean; weight?: number }> }).evidence_refs
            .filter((r): r is { evidence_id: string; supports?: boolean; weight?: number } =>
              typeof r?.evidence_id === "string" && r.evidence_id.length > 0)
        : [],
    };
    if (!args.position) continue;
    try {
      const r = await writeHypothesis(args, { job_id: task.job_id, detective: "holmes@detective" });
      out.push(r);
    } catch (e) {
      // A failed rival is audited and skipped; the remaining rivals are still written.
      await audit({
        event: "write_hypothesis_failed",
        job_id: task.job_id,
        detective: "holmes@detective",
        error: (e as Error).message,
        position: args.position.slice(0, 200),
      });
    }
  }
  return { hypotheses: out };
}
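
The parse step in `runHolmes` narrows each array element with repeated inline casts. As an alternative, the narrowing can be pulled into one small helper that accepts `unknown` and returns either a typed object or `null`. The sketch below is a hypothetical refactoring, not the code in this commit; `RawRival`, `isRecord`, `optionalString` and `narrowRival` are invented names, and unlike the original it requires `position` to be a string instead of coercing it with `String(...)`.

```typescript
interface RawRival {
  position: string;
  argument_for?: string;
  argument_against?: string;
  prior: number;     // Number(...) coercion, NaN when the field is undefined
  posterior: number; // Number(...) coercion, NaN when the field is undefined
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function optionalString(v: unknown): string | undefined {
  return typeof v === "string" ? v : undefined;
}

// Returns null for entries that are not objects or have no usable position.
function narrowRival(raw: unknown): RawRival | null {
  if (!isRecord(raw)) return null;
  const position = typeof raw.position === "string" ? raw.position.trim() : "";
  if (!position) return null;
  return {
    position,
    argument_for: optionalString(raw.argument_for),
    argument_against: optionalString(raw.argument_against),
    prior: Number(raw.prior),
    posterior: Number(raw.posterior),
  };
}

// Hypothetical model output, for illustration only.
const parsedOutput: unknown[] = [
  { position: "  The lights were meteors.  ", prior: 0.5, posterior: "0.2" },
  null,
  { position: "" },
];
const rivals = parsedOutput
  .map(narrowRival)
  .filter((r): r is RawRival => r !== null);
console.log(rivals.length, rivals[0].position, rivals[0].posterior);
```

Keeping the narrowing in one place makes malformed entries (such as `null` or an object without a position) a skipped item instead of a thrown exception.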
investigator-runtime/src/lib/search.ts (Normal file, 89 lines added)
@@ -0,0 +1,89 @@
/**
 * search.ts — hybrid retrieval inside the runtime.
 *
 * Wraps the same `public.hybrid_search_chunks` RPC the web uses, with the
 * query embedding fetched from the embed-service. Detectives that need
 * grounding (Holmes, Dupin, …) call this to assemble a chunk shortlist
 * before reasoning.
 *
 * Default `rerank: never` — the runtime is latency-sensitive (it runs in the
 * agentic loop) and the RPC's RRF order is good enough for the head.
 */
import { env } from "./env";
import { query } from "./pg";

export interface SearchHit {
  chunk_pk: number;
  doc_id: string;
  chunk_id: string;
  page: number;
  type: string;
  bbox: { x: number; y: number; w: number; h: number } | null;
  content_en: string | null;
  content_pt: string | null;
  classification: string | null;
  score: number;
  bm25_rank: number | null;
  dense_rank: number | null;
}

async function embedQuery(q: string): Promise<number[]> {
  const r = await fetch(`${env.EMBED_SERVICE_URL}/embed`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ texts: [q] }),
    signal: AbortSignal.timeout(60_000),
  });
  if (!r.ok) {
    const t = await r.text();
    throw new Error(`embed-service HTTP ${r.status}: ${t.slice(0, 200)}`);
  }
  const data = await r.json() as { embeddings: number[][] };
  const vec = data.embeddings?.[0];
  if (!Array.isArray(vec)) throw new Error("embed-service returned no vector");
  return vec;
}

function toVectorLiteral(vec: number[]): string {
  return "[" + vec.join(",") + "]";
}

export interface HybridSearchOpts {
  query: string;
  lang?: "pt" | "en";
  doc_id?: string | null;
  type?: string | null;
  classification?: string | null;
  ufo_only?: boolean;
  recall_k?: number;
  top_k?: number;
  /**
   * Max cosine distance for dense matches. Lower = stricter (less noise),
   * higher = better recall. The web's `/api/search/hybrid` uses 0.40 for
   * the chat (precision matters). The runtime defaults to 0.55: Holmes
   * needs RECALL so the model has chunks to reason over; it filters out
   * the irrelevant ones in the prose.
   */
  max_dense_dist?: number;
}

export async function hybridSearch(opts: HybridSearchOpts): Promise<SearchHit[]> {
  const {
    query: q,
    lang = "pt",
    doc_id = null,
    type = null,
    classification = null,
    ufo_only = false,
    recall_k = 100,
    top_k = 20,
    max_dense_dist = 0.55,
  } = opts;
  if (!q.trim()) return [];
  const vec = await embedQuery(q);
  const rows = await query<SearchHit>(
    `SELECT * FROM public.hybrid_search_chunks($1, $2::vector, $3, $4, $5, $6, $7, $8, 60, $9)`,
    [q, toVectorLiteral(vec), lang, doc_id, type, classification, ufo_only, recall_k, max_dense_dist],
  );
  return rows.slice(0, top_k);
}
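
The `max_dense_dist` comment in search.ts describes a recall/precision trade-off. To make the effect of the two thresholds (0.40 for the web chat, 0.55 for the runtime) concrete, the sketch below computes cosine distance on toy 2-D vectors and filters candidates by a maximum distance. It is an illustration only: the real filtering happens inside the `hybrid_search_chunks` RPC, whose comparison operator is not shown here, and `cosineDistance`, `withinDistance` and the candidate vectors are hypothetical.

```typescript
// Cosine distance = 1 - cosine similarity; 0 means same direction.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) throw new Error("zero vector has no direction");
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface DenseCandidate {
  chunkId: string;
  vec: number[];
}

// Keeps the candidates whose distance to the query is at most maxDist.
function withinDistance(queryVec: number[], candidates: DenseCandidate[], maxDist: number): string[] {
  return candidates
    .filter((c) => cosineDistance(queryVec, c.vec) <= maxDist)
    .map((c) => c.chunkId);
}

// Toy data: distances to [1, 0] are roughly 0, 0.2, 0.49 and 1.
const queryVec = [1, 0];
const candidates: DenseCandidate[] = [
  { chunkId: "A", vec: [1, 0] },
  { chunkId: "B", vec: [0.8, 0.6] },
  { chunkId: "C", vec: [1, 1.7] },
  { chunkId: "D", vec: [0, 1] },
];
console.log(withinDistance(queryVec, candidates, 0.40)); // stricter threshold
console.log(withinDistance(queryVec, candidates, 0.55)); // looser threshold
```

In this toy set the looser threshold admits candidate `C` in addition to `A` and `B`, which mirrors the stated intent: give the model more chunks to reason over and let it discard the irrelevant ones in its prose.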
@@ -8,6 +8,7 @@
import { audit } from "./lib/audit";
import { query } from "./lib/pg";
import { runLocard, type LocardTask } from "./detectives/locard";
import { runHolmes, type HolmesTask } from "./detectives/holmes";

export interface InvestigationJob {
  job_id: string;
@@ -47,6 +48,26 @@ export async function dispatch(job: InvestigationJob, workerId: string): Promise
      }
      break;
    }
    case "hypothesis_tournament": {
      // Payload: { question, doc_id?, lang?, context_chunks? }
      const question = String(job.payload.question ?? "").trim();
      if (!question) throw new Error("hypothesis_tournament requires payload.question");
      const task: HolmesTask = {
        job_id: job.job_id,
        question,
        doc_id: typeof job.payload.doc_id === "string" ? job.payload.doc_id : undefined,
        lang: job.payload.lang === "en" ? "en" : "pt",
        context_chunks: typeof job.payload.context_chunks === "number" ? job.payload.context_chunks : undefined,
      };
      const r = await runHolmes(task);
      if ("skipped" in r) {
        outputs.push({ kind: "hypothesis_tournament", skipped: true, reason: r.reason });
      } else {
        for (const h of r.hypotheses) outputs.push({ kind: "hypothesis", ...h });
      }
      break;
    }

    default:
      throw new Error(`unknown_kind: ${job.kind}`);
  }
investigator-runtime/src/tools/write_hypothesis.ts (Normal file, 159 lines added)
@@ -0,0 +1,159 @@
/**
 * write_hypothesis.ts — Holmes's primary writer.
 *
 * Inserts a row into public.hypotheses and renders case/hypotheses/H-NNNN.md.
 * Validates:
 *   - question + position are non-empty strings (arguments are optional)
 *   - prior and posterior are clamped to [0,1] when numeric; non-numeric
 *     values are stored as null
 *   - confidence_band matches Tetlock thresholds (high ≥ 0.90, medium 0.60-0.89,
 *     low 0.30-0.59, speculation < 0.30); auto-corrects to the band that
 *     matches the posterior when caller disagrees
 *   - evidence_refs[].evidence_id must already exist in public.evidence
 */
import { mkdir, writeFile } from "node:fs/promises";
import path from "node:path";
import { audit } from "../lib/audit";
import { env } from "../lib/env";
import { allocate } from "../lib/ids";
import { query, queryOne } from "../lib/pg";

export interface EvidenceRef {
  evidence_id: string;
  supports?: boolean;
  weight?: number;
}

export interface WriteHypothesisArgs {
  question: string;
  position: string;
  argument_for?: string;
  argument_against?: string;
  prior?: number;
  posterior?: number;
  confidence_band?: "high" | "medium" | "low" | "speculation";
  evidence_refs?: EvidenceRef[];
  status?: "open" | "closed" | "dormant" | "superseded";
}

export interface WriteHypothesisContext {
  job_id: string;
  detective: string;
}

function bandFromPosterior(p: number): "high" | "medium" | "low" | "speculation" {
  if (p >= 0.90) return "high";
  if (p >= 0.60) return "medium";
  if (p >= 0.30) return "low";
  return "speculation";
}

function clamp01(n: unknown): number | null {
  if (typeof n !== "number" || !Number.isFinite(n)) return null;
  return Math.max(0, Math.min(1, n));
}

function renderMd(id: string, body: WriteHypothesisArgs, ctx: WriteHypothesisContext): string {
  const evRefs = (body.evidence_refs ?? []).map((r) =>
    ` - [[evidence/${r.evidence_id}]] (${r.supports === false ? "refutes" : "supports"}${typeof r.weight === "number" ? `, w=${r.weight}` : ""})`,
  ).join("\n");
  const fm = [
    "---",
    `schema_version: "0.1.0"`,
    `type: hypothesis`,
    `hypothesis_id: ${id}`,
    body.prior !== undefined ? `prior: ${body.prior}` : null,
    body.posterior !== undefined ? `posterior: ${body.posterior}` : null,
    body.confidence_band ? `confidence_band: ${body.confidence_band}` : null,
    `status: ${body.status ?? "open"}`,
    `created_by: ${ctx.detective}`,
    `job_id: ${ctx.job_id}`,
    `created_at: ${new Date().toISOString()}`,
    "---",
  ].filter(Boolean).join("\n");
  return [
    fm,
    "",
    `# Hypothesis ${id}`,
    "",
    `**Question.** ${body.question}`,
    "",
    `**Position.** ${body.position}`,
    "",
    "## Argument for",
    "",
    body.argument_for || "_(none recorded — speculation)_",
    "",
    "## Argument against",
    "",
    body.argument_against || "_(none recorded — no counter-argument framed yet)_",
    "",
    "## Evidence",
    "",
    evRefs || "_(none linked yet — Locard chain pending)_",
    "",
  ].join("\n");
}

export async function writeHypothesis(
  body: WriteHypothesisArgs,
  ctx: WriteHypothesisContext,
): Promise<{ hypothesis_id: string; case_file: string }> {
  if (!body.question?.trim()) throw new Error("question required");
  if (!body.position?.trim()) throw new Error("position required");

  const prior = clamp01(body.prior);
  const posterior = clamp01(body.posterior);

  // Recompute band from posterior when set. Holmes can mis-label its own
  // confidence; we keep the model's intent in audit but persist the
  // band that matches the actual posterior.
  let band = body.confidence_band ?? null;
  if (posterior !== null) {
    const expected = bandFromPosterior(posterior);
    if (!band || band !== expected) band = expected;
  }

  // Validate evidence refs exist.
  const refs: EvidenceRef[] = [];
  for (const r of body.evidence_refs ?? []) {
    if (!r?.evidence_id?.trim()) continue;
    const e = await queryOne<{ evidence_pk: number }>(
      `SELECT evidence_pk FROM public.evidence WHERE evidence_id = $1`,
      [r.evidence_id],
    );
    if (!e) throw new Error(`evidence not found: ${r.evidence_id}`);
    refs.push({ evidence_id: r.evidence_id, supports: r.supports !== false, weight: r.weight });
  }

  const hypothesis_id = await allocate.hypothesisId();
  await query(
    `INSERT INTO public.hypotheses
       (hypothesis_id, question, position, argument_for, argument_against,
        evidence_refs, prior, posterior, confidence_band, status, created_by)
     VALUES ($1,$2,$3,$4,$5,$6::jsonb,$7,$8,$9,$10,$11)`,
    [
      hypothesis_id, body.question, body.position,
      body.argument_for ?? null, body.argument_against ?? null,
      JSON.stringify(refs),
      prior, posterior, band, body.status ?? "open",
      ctx.detective,
    ],
  );

  const dir = path.join(env.CASE_ROOT, "hypotheses");
  await mkdir(dir, { recursive: true });
  const file = path.join(dir, `${hypothesis_id}.md`);
  // Render the clamped prior/posterior so the Markdown file matches the DB row
  // (the raw values could be NaN or outside [0,1]).
  const rendered: WriteHypothesisArgs = {
    ...body,
    prior: prior ?? undefined,
    posterior: posterior ?? undefined,
    evidence_refs: refs,
    confidence_band: band ?? undefined,
  };
  await writeFile(file, renderMd(hypothesis_id, rendered, ctx), "utf-8");

  await audit({
    event: "write_hypothesis",
    job_id: ctx.job_id,
    detective: ctx.detective,
    hypothesis_id,
    posterior,
    band,
    n_evidence: refs.length,
    file,
  });
  return { hypothesis_id, case_file: file };
}
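
The interaction between clamping and band correction in the writer is easiest to see at the boundaries. The sketch below copies `bandFromPosterior` and `clamp01` from write_hypothesis.ts so that it runs on its own, and wraps them in a hypothetical helper, `persistedBand`, that mirrors the band logic of `writeHypothesis` without touching the database.

```typescript
type Band = "high" | "medium" | "low" | "speculation";

// Copied from write_hypothesis.ts so the sketch is self-contained.
function bandFromPosterior(p: number): Band {
  if (p >= 0.90) return "high";
  if (p >= 0.60) return "medium";
  if (p >= 0.30) return "low";
  return "speculation";
}

function clamp01(n: unknown): number | null {
  if (typeof n !== "number" || !Number.isFinite(n)) return null;
  return Math.max(0, Math.min(1, n));
}

// Hypothetical helper: the band that would be persisted for a raw posterior
// and an optional band claimed by the model.
function persistedBand(rawPosterior: unknown, claimed?: Band): Band | null {
  const posterior = clamp01(rawPosterior);
  if (posterior === null) return claimed ?? null; // nothing to correct against
  return bandFromPosterior(posterior);            // the posterior always wins
}

console.log(persistedBand(0.9, "medium")); // "high": the claimed band is overridden
console.log(persistedBand(0.2999));        // "speculation": just below the 0.30 boundary
console.log(persistedBand(1.7, "low"));    // "high": 1.7 is clamped to 1 first
console.log(persistedBand(NaN, "medium")); // "medium": no usable posterior, claim kept
```

One consequence worth noting is that an out-of-range posterior is silently clamped, so a model that emits 1.7 ends up with a `high` band; a stricter writer could reject such input instead.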