Transcription Mode

The confusion you’re experiencing is common: the latency shown at the top is the overall pipeline latency (ASR + LLM + TTS), not just the ASR setting you changed (1). Even if you adjust the transcription config, the displayed latency may not shift much unless the change directly affects the full processing chain (1).
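To make the point concrete, here is a rough, illustrative model (the stage timings below are made-up numbers, not measurements from your setup): the displayed figure is the sum of all stages, so tuning one stage moves the total only by that stage's contribution.

```python
# Illustrative only: invented stage timings showing why the displayed
# latency (the sum of ASR + LLM + TTS) barely moves when one stage
# is tweaked.
stage_latency_ms = {"asr": 300, "llm": 800, "tts": 400}

def pipeline_latency(stages):
    """Overall pipeline latency is the sum of every stage's latency."""
    return sum(stages.values())

before = pipeline_latency(stage_latency_ms)  # total across all stages
stage_latency_ms["asr"] = 250                # tweak only the ASR stage
after = pipeline_latency(stage_latency_ms)   # total shifts by just 50 ms
```

Shaving 50 ms off ASR here changes the overall number by only about 3%, which is why an ASR-only tweak can look like it "did nothing" in the top-line metric.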

For improving transcription accuracy specifically with English, the documentation recommends using Boosted Keywords (2). You can add custom keywords to enhance the model’s vocabulary, with support for up to 100 custom keywords (2). This is particularly helpful for domain-specific terms or uncommon words that might otherwise be missing from transcripts (2).
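A minimal sketch of building such a config, enforcing the documented 100-keyword limit. The field name `boosted_keywords` is illustrative, not necessarily your platform's actual API field:

```python
# Hypothetical sketch: "boosted_keywords" is an assumed field name, not
# confirmed against the platform's actual transcriber schema.
MAX_KEYWORDS = 100  # documented limit of 100 custom keywords

def build_transcriber_config(keywords):
    """Return a transcriber config dict with boosted keywords attached."""
    keywords = list(keywords)
    if len(keywords) > MAX_KEYWORDS:
        raise ValueError(f"at most {MAX_KEYWORDS} custom keywords are supported")
    return {
        "language": "en",
        "boosted_keywords": keywords,
    }

config = build_transcriber_config(["Kubernetes", "OAuth", "webhook"])
```

Checking the limit client-side gives you a clear error before the request is sent, rather than a rejection from the API.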

Additionally, if transcription quality suffers because sentences are being cut off (the transcript is finalized before the speaker has actually finished), you can switch the transcription mode to one optimized for accuracy (2).

The endpointing setting you’re adjusting controls when the system decides the speaker has finished speaking; it primarily affects transcription quality, not the overall latency metric displayed at the top (3). That displayed latency covers the entire conversation pipeline, so changing ASR endpointing alone won’t dramatically shift the number (3).
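As a rough sketch of the trade-off, assuming a hypothetical `endpointing_ms` parameter (an illustrative name, not confirmed against your platform's settings): a longer silence window reduces premature cut-offs, at the cost of the transcript finalizing slightly later.

```python
# Hypothetical sketch: "endpointing_ms" is an assumed parameter name.
def build_endpointing_config(silence_ms=500):
    """Wait this long after speech stops before finalizing the transcript.

    Longer windows reduce mid-sentence cut-offs; shorter windows finalize
    faster but risk splitting sentences. Either way, this tunes the ASR
    stage only, not the LLM or TTS stages that dominate overall latency.
    """
    if silence_ms <= 0:
        raise ValueError("silence_ms must be positive")
    return {"endpointing_ms": silence_ms}

cfg = build_endpointing_config(silence_ms=800)
```

If sentences are being cut off, raising the window (e.g. from 500 to 800 ms here) is the usual first adjustment to try.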

For further assistance with your specific configuration: