Engineering | May 8, 2026 | 8 min read

Cutting Voice AI Costs: Knowledge Bases vs. Mega-Prompts

Retell bills LLM usage per minute, but prompts over ~3,500 tokens trigger proportional surcharges. A 15,000-token mega-prompt can 4× your LLM line item.
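To make the arithmetic concrete, here is a minimal sketch of how a proportional surcharge could scale. It assumes a simple linear model: no surcharge up to the 3,500-token threshold, then cost growing in proportion to prompt size. The function names and the linear formula are illustrative assumptions, not Retell's published billing logic.

```python
FREE_TOKENS = 3_500  # assumed threshold, taken from the "~3,500 tokens" figure above


def llm_cost_multiplier(prompt_tokens: int, free_tokens: int = FREE_TOKENS) -> float:
    """Assumed model: 1x up to free_tokens, then linear in prompt size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(1.0, prompt_tokens / free_tokens)


def llm_line_item(base_rate_per_min: float, minutes: float, prompt_tokens: int) -> float:
    """Hypothetical LLM charge: per-minute base rate times the surcharge multiplier."""
    return base_rate_per_min * minutes * llm_cost_multiplier(prompt_tokens)


if __name__ == "__main__":
    for tokens in (3_000, 7_000, 15_000):
        print(tokens, round(llm_cost_multiplier(tokens), 2))
```

Under this assumed model a 15,000-token prompt lands at a multiplier of about 4.3, which is in line with the "4×" figure above. Check your own invoice or the platform's pricing page before relying on the exact curve.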

Fix one: deduplicate. If your platform injects tool schemas, don't also paste full tool docs in the prompt.
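One rough way to spot this kind of duplication is to check whether a tool's schema description also appears verbatim in the prompt. The sketch below assumes tool schemas are plain dictionaries with `name` and `description` keys; that shape and the function name are hypothetical, so adapt them to whatever your platform exports.

```python
def find_duplicated_tool_docs(prompt: str, tool_schemas: list[dict]) -> list[str]:
    """Return names of tools whose description text is repeated in the prompt.

    Comparison is case-insensitive and ignores differences in whitespace.
    """
    normalized_prompt = " ".join(prompt.lower().split())
    duplicated = []
    for tool in tool_schemas:
        description = " ".join(tool.get("description", "").lower().split())
        if description and description in normalized_prompt:
            duplicated.append(tool["name"])
    return duplicated


if __name__ == "__main__":
    # Hypothetical tools and prompt, for illustration only.
    tools = [
        {"name": "book_appointment", "description": "Books a visit for a patient."},
        {"name": "request_refill", "description": "Requests a prescription refill."},
    ]
    prompt = "You are a clinic assistant.\nbook_appointment: Books a visit   for a patient."
    print(find_duplicated_tool_docs(prompt, tools))
```

This only catches exact repeats. Paraphrased tool docs need a manual read-through, but a verbatim check is a cheap first pass.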

Fix two: knowledge base. Move clinic hours, procedure lists, and FAQs into KB chunks retrieved per turn.
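The idea of per-turn retrieval can be sketched as follows. This is a toy keyword-overlap retriever, not how any particular platform's knowledge base works internally (production systems typically use embeddings). The chunk contents and function names are placeholders invented for illustration.

```python
import re

# Placeholder KB chunks; real content would come from your clinic's documents.
KB_CHUNKS = [
    "Clinic hours: Monday to Friday, 8am to 5pm.",
    "Procedures offered: cleanings, fillings, and crowns.",
    "FAQ: Bring a photo ID and your insurance card to every visit.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k chunks sharing the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    scored = [pair for pair in scored if pair[0] > 0]  # drop chunks with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def build_turn_prompt(base_prompt: str, utterance: str, chunks: list[str]) -> str:
    """Append only the retrieved chunks to a short base prompt for this turn."""
    context = "\n".join(retrieve(utterance, chunks))
    return f"{base_prompt}\n\nRelevant facts:\n{context}" if context else base_prompt


if __name__ == "__main__":
    print(build_turn_prompt("You are a clinic assistant.", "What are your hours on Friday?", KB_CHUNKS))
```

The point of the design is that the base prompt stays short on every turn, and only the one or two chunks relevant to the caller's question are added.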

Fix three: multi-state LLM. Booking, refills, and records each get a short state prompt and only their tools.
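A minimal sketch of the multi-state layout, assuming each state is just a short prompt plus a list of tool names. The state names follow the three flows above; the prompt text, tool names, and `context_for` helper are hypothetical and do not reflect any platform's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    prompt: str             # short, state-specific instructions
    tools: tuple[str, ...]  # only the tools this state needs


# Hypothetical states and tool names, for illustration only.
STATES = {
    "booking": State(
        prompt="Help the caller book, move, or cancel an appointment.",
        tools=("check_availability", "book_appointment"),
    ),
    "refills": State(
        prompt="Collect the medication name and pharmacy, then request a refill.",
        tools=("request_refill",),
    ),
    "records": State(
        prompt="Verify the caller's identity, then arrange a records transfer.",
        tools=("send_records",),
    ),
}


def context_for(state_name: str) -> dict:
    """Return only the prompt and tools for the active state."""
    state = STATES[state_name]
    return {"system_prompt": state.prompt, "tools": list(state.tools)}


if __name__ == "__main__":
    print(context_for("refills"))
```

Because each turn carries one state's prompt and tools instead of all three, the per-turn token count stays well below what a single combined prompt would need.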

We've seen customers cut effective LLM cost by 40–60% with these three changes alone, without sacrificing call quality.
