Engineering · May 8, 2026 · 8 min read
Cutting Voice AI Costs: Knowledge Bases vs. Mega-Prompts
Retell bills LLM usage per minute, but prompts over ~3,500 tokens trigger proportional surcharges. A 15,000-token mega-prompt is more than four times that threshold, so it can roughly 4× your LLM line item.
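To make the arithmetic concrete, here is a rough sketch of how the surcharge plays out. It assumes the multiplier is simply prompt size divided by the ~3,500-token threshold, which is our reading of "proportional" and not an official formula. The function names and the per-minute rate in the usage line are hypothetical placeholders; check your own invoice for real numbers.

```python
# Rough cost model. Assumption: above the threshold, the LLM charge scales
# linearly with prompt size. This is an illustration, not a billing formula.
TOKEN_THRESHOLD = 3_500  # approximate threshold cited in this post


def llm_cost_multiplier(prompt_tokens: int, threshold: int = TOKEN_THRESHOLD) -> float:
    """Return the assumed multiplier applied to the base LLM rate."""
    if prompt_tokens <= threshold:
        return 1.0  # at or under the threshold: base rate only
    return prompt_tokens / threshold


def llm_cost_per_minute(base_rate: float, prompt_tokens: int) -> float:
    """Effective per-minute LLM cost for a given prompt size."""
    return base_rate * llm_cost_multiplier(prompt_tokens)


if __name__ == "__main__":
    # 0.05 is a made-up base rate, used only to show the shape of the math.
    for tokens in (2_000, 3_500, 15_000):
        print(tokens, round(llm_cost_multiplier(tokens), 2), llm_cost_per_minute(0.05, tokens))
```

Under that assumption, a 15,000-token prompt lands at about 4.3× the base rate, which is where the "4×" figure comes from.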
Fix one: deduplicate. If your platform injects tool schemas, don't also paste full tool docs in the prompt.
Fix two: knowledge base. Move clinic hours, procedure lists, and FAQs into KB chunks retrieved per turn.
Fix three: multi-state LLM. Booking, refills, and records each get a short state prompt and only their tools.
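Fixes two and three can be sketched together as a per-turn prompt builder. This is a platform-agnostic illustration, not Retell's API: the `State` class, the tool names, the sample KB text, and the keyword-overlap `retrieve` function are all hypothetical stand-ins (a real knowledge base would typically use embedding search). The point is only the shape: each turn sees one short state prompt, that state's tools, and the few KB chunks relevant to what the caller just said.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    name: str
    prompt: str              # short, state-specific instructions
    tools: tuple[str, ...]   # only the tools this state needs


# Hypothetical states and tool names, mirroring booking / refills / records.
STATES = {
    "booking": State("booking", "Help the caller book, move, or cancel an appointment.",
                     ("check_availability", "book_appointment")),
    "refills": State("refills", "Help the caller request a prescription refill.",
                     ("request_refill",)),
    "records": State("records", "Help the caller request their medical records.",
                     ("send_records",)),
}

# Placeholder KB content; in practice this lives in the platform's knowledge base.
KB_CHUNKS = [
    "Clinic hours: Monday to Friday, 8am to 5pm.",
    "Procedures offered: cleanings, fillings, crowns, and whitening.",
    "FAQ: We accept most major insurance plans.",
]


def _words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Toy retrieval: rank chunks by how many words they share with the query."""
    query_words = _words(query)
    ranked = sorted(chunks, key=lambda c: len(query_words & _words(c)), reverse=True)
    return ranked[:top_k]


def build_turn_context(state_name: str, utterance: str) -> dict:
    """Assemble what the LLM sees this turn: state prompt, retrieved facts, state tools."""
    state = STATES[state_name]
    facts = retrieve(utterance, KB_CHUNKS)
    prompt = state.prompt + "\n\nRelevant facts:\n" + "\n".join(f"- {f}" for f in facts)
    return {"prompt": prompt, "tools": list(state.tools)}


if __name__ == "__main__":
    ctx = build_turn_context("booking", "What are your hours on Friday?")
    print(ctx["prompt"])
    print(ctx["tools"])
```

In this sketch, a booking turn about Friday hours carries the booking instructions, the clinic-hours chunk, and two tools. The procedure list, the insurance FAQ, and the refill and records tools never enter the prompt, which is what keeps the token count down.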
We've seen customers cut effective LLM cost by 40–60% with these three changes alone, without sacrificing call quality.