Two bugs combined to make the chat reply with only cards and no prose:
1. SQL trigger rollup_session_stats was failing with "column reference
total_cost_usd is ambiguous" because the UPDATE on public.profiles had
a FROM public.chat_sessions clause and both tables expose that column.
Every user-message insert failed at this point, so sessions were
created in the DB but stayed at message_count=0. Applied a SQL fix
that qualifies the columns with p./s. aliases (production DB updated;
ALTER FUNCTION run live, not yet codified in a migration file).
2. The free-tier model (nemotron-3-super:free) spent all 5 tool-loop
turns on hybrid_search calls and never wrote any prose, returning
content_len=0. Added a forced-synthesis pass in openrouter.ts: when
the loop exits with empty assembledText but the model did call tools,
we send ONE final turn with tools omitted from the request payload
and a user message instructing the model to answer in 3-8 sentences
citing chunks. openrouterStreamCall now accepts a `withTools` option
so the synthesis call can disable tool calling entirely.
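For reviewers, the forced-synthesis decision in item 2 can be sketched
roughly as below. This is an illustration under assumptions, not the
code in openrouter.ts: the types, the buildSynthesisRequest helper, the
prompt wording, and the whitespace trim are all hypothetical. Only the
`withTools` option and the "empty text but tools were called" condition
come from this change.

```typescript
// Hypothetical shapes; the real types in openrouter.ts may differ.
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

interface StreamCallOptions {
  withTools: boolean;
}

interface SynthesisRequest {
  messages: ChatMessage[];
  options: StreamCallOptions;
}

// Illustrative wording only; the actual instruction text is not shown here.
const SYNTHESIS_PROMPT =
  "Answer the original question in 3-8 sentences, citing the retrieved chunks.";

// Returns the one extra request to send after the tool loop exits,
// or null when no forced synthesis is needed.
function buildSynthesisRequest(
  history: ChatMessage[],
  assembledText: string,
  toolCallCount: number,
): SynthesisRequest | null {
  // Assumption: whitespace-only output is treated as empty.
  const hasProse = assembledText.trim().length > 0;
  if (hasProse || toolCallCount === 0) {
    return null;
  }
  return {
    // Copy the history so the caller's array is not mutated.
    messages: [...history, { role: "user", content: SYNTHESIS_PROMPT }],
    // Tools are left out of the payload so the model can only write prose.
    options: { withTools: false },
  };
}
```

A caller would pass the returned messages and options to
openrouterStreamCall exactly once, so the synthesis turn cannot start
another tool loop.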
Verified end-to-end with the actual user query "O que os astronautas
viram? Quem foi que viu?" on /d/nasa-uap-d6-apollo-17-...:
- content_len: 0 → 947 chars (real synthesis citing Schmitt)
- artifacts: 44 preserved
- assistant message persisted with tool_calls + citations columns