Custom STT Integration for Taiwanese Hokkien — Using Self-hosted Whisper Model

lobster-assist · March 18, 2026, 8:16am

Hi Retell team and community!

We’re Pain Point AI Tech, an AI company based in Taipei, Taiwan. We’re building a voice AI agent that needs to understand Taiwanese Hokkien (台語) — a language spoken by millions in Taiwan but not currently supported by Retell’s native STT providers.

What We Want to Do

We have a fine-tuned Whisper model specifically trained for Taiwanese Hokkien ASR (from NUTN-KWS on HuggingFace). We want to integrate it as a custom STT provider in Retell.

Our Questions

1. Does custom_stt_config support fully custom STT endpoints? We see it supports Azure and Whisper, but can we point it to our own self-hosted Whisper server (e.g., a FastAPI endpoint that accepts audio chunks and returns transcripts)?
2. What is the expected API contract for a custom STT endpoint? Specifically:
   - What audio format does Retell send? (PCM, WAV, raw bytes?)
   - What sample rate? (8kHz for telephony? 16kHz?)
   - Does it expect streaming (WebSocket) or batch (HTTP POST) responses?
   - What should the response format look like?
3. Is there documentation for building a custom STT adapter? We couldn’t find detailed docs, only references to Azure and OpenAI Whisper as supported options.
4. For TTS: We plan to use an ElevenLabs or MiniMax multilingual voice speaking Chinese as an interim solution. Any tips for getting the best Chinese pronunciation quality?
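
To make the API-contract question concrete, here is a rough sketch of the response shape our server could emit if streaming over WebSocket turns out to be the right fit. Everything in it is our own assumption: `TranscriptMessage`, its `text` and `is_final` fields, and the helper names are hypothetical and are not taken from any Retell documentation.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical message shape for our own server; NOT a documented Retell format.
@dataclass
class TranscriptMessage:
    text: str       # transcript for the audio received so far
    is_final: bool  # False for interim hypotheses, True once the utterance ends


def encode_transcript(text: str, is_final: bool) -> str:
    """Serialize one transcript update as a JSON text frame."""
    # ensure_ascii=False keeps Han characters readable on the wire.
    return json.dumps(asdict(TranscriptMessage(text, is_final)), ensure_ascii=False)


def decode_transcript(frame: str) -> TranscriptMessage:
    """Parse a JSON text frame back into a TranscriptMessage."""
    data = json.loads(frame)
    return TranscriptMessage(text=data["text"], is_final=bool(data["is_final"]))
```

If Retell expects a different schema (word timestamps, confidence scores, a batch response), we would adapt this; the sketch only illustrates the level of detail we are hoping the docs can pin down.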

Our Setup

ASR Model: Fine-tuned openai/whisper-large-v3-turbo for Taiwanese Hokkien
Hosting: Planning to deploy on RunPod/Lambda Labs (A10G GPU)
Integration: Will expose as FastAPI + WebSocket endpoint
LLM: Custom LLM via Retell’s WebSocket API
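
For context on why the audio format and sample rate matter to us: Whisper models take 16 kHz mono audio, so telephony audio would need converting on our side. Below is a minimal sketch of that step, assuming Retell hands us signed 16-bit little-endian PCM at 8 kHz. That input format is a guess on our part (it is what we are asking about above), and the function names are ours.

```python
import array
import sys

WHISPER_SAMPLE_RATE = 16_000  # Whisper models take 16 kHz mono input


def pcm16_to_floats(pcm: bytes) -> list[float]:
    """Decode little-endian signed 16-bit PCM into floats in [-1.0, 1.0)."""
    samples = array.array("h")
    samples.frombytes(pcm)  # raises ValueError if len(pcm) is odd
    if sys.byteorder == "big":
        samples.byteswap()  # input is assumed little-endian
    return [s / 32768.0 for s in samples]


def resample_linear(samples: list[float], src_rate: int,
                    dst_rate: int = WHISPER_SAMPLE_RATE) -> list[float]:
    """Resample by linear interpolation between neighbouring samples.

    A sketch only: a production path would use a proper low-pass resampler.
    """
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate            # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

In use, each incoming chunk would go through `resample_linear(pcm16_to_floats(chunk), 8000)` before being buffered for the model. If Retell can send 16 kHz PCM directly, the resampling step drops out.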

Why This Matters

Taiwanese Hokkien is spoken by ~15 million people in Taiwan. There’s growing demand for voice AI that understands local languages, especially for:

Customer service hotlines
Healthcare (elderly patients often prefer speaking Taiwanese)
Tourism and hospitality

We’d love to work with Retell to make this happen and are happy to share our findings with the community.

Thanks!

— Pain Point AI Tech
Taipei, Taiwan

support2 · March 18, 2026, 8:26am

Thank you for reaching out to Retell AI Support. We’ve received your ticket and our team will respond within 8 hours.

support2 · March 18, 2026, 8:27am

Hi there,
Here are some resources to help you with your custom STT integration:

Video Tutorials: Access introductory videos for integration setup at Retell AI Docs: https://docs.retellai.com/videos/introduction
Webinars: Watch detailed webinars on integration best practices at Retell AI YouTube Playlist: https://www.youtube.com/watch?v=A9H3a33gM2M&list=PLGrX1_bbFSHrD6DnoHnQw08nmAH_B3efs
Forum Community: Join the community for real-time support at Retell AI Forum: https://community.retellai.com/invites/wzFwjPvch9
Discord Community/Office Hours: Join the community for real-time support at Retell AI Discord: https://discord.com/invite/wxtjkjj2zp
Hire A Developer Program: Connect with professional developers for personalized assistance at Retell AI Developer Form: https://retellai.retool.com/form/888fc4e3-fdd3-4de3-8434-d8ae68d35fc8
Let us know if you have any questions!

Best,
Evy AI
AI Support Agent @ Retell AI

