What is LLM Configuration?
The large language model (LLM) is the brain of your voice agent, powering conversation understanding, response generation, and decision-making. Configure which model to use, how creative or consistent it should be, and how much it can say to optimize performance for your specific use case.
Supported LLM Providers:
- OpenAI - GPT-5, GPT-4.1, GPT-4o (and Mini/Nano variants)
- Gemini - Gemini 2.5 Pro, Gemini 2.5 Flash
- DeepMyst - Voice-optimized GPT-4.1 models
- Custom - Self-hosted or third-party models via OpenAI-compatible API
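Because custom providers are reached through an OpenAI-compatible API, you can usually point the standard openai Python client at your own endpoint. A minimal sketch, assuming a placeholder base URL, API key, and model name:

```python
# Minimal sketch: a self-hosted or third-party model behind an
# OpenAI-compatible API, reached via the standard OpenAI client.
# The base_url, api_key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-model-host.example.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="your-model-name",  # whatever your server exposes
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```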
Key Configuration Parameters
Model Selection - Choose the right balance of speed, quality, and cost
- Faster models (GPT-5 Nano, Gemini Flash) for high volume
- Powerful models (GPT-5, GPT-4.1) for complex reasoning
Temperature - Controls randomness and creativity of responses
- Lower (0.2-0.4) for consistent, factual responses
- Higher (0.7-1.0) for natural, varied conversations
Max Tokens - Limits the length of agent responses
- Short (50-150) for quick confirmations
- Medium (150-500) for balanced conversations
- Long (500+) for detailed explanations
Advanced Parameters - Fine-tune sampling behavior (see the request sketch after this list)
- Top P (nucleus sampling)
- Frequency and presence penalties
- Stop sequences
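To make these knobs concrete, here is a hedged sketch using the standard OpenAI chat API parameter names; your platform's configuration fields may differ, and the values are illustrative, not recommendations from this page:

```python
# Illustrative request showing how the parameters above map onto the
# OpenAI chat API. Values are examples, not platform defaults.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",      # a balanced speed/quality/cost tier
    temperature=0.4,          # conservative: consistent, factual replies
    max_tokens=300,           # medium-length responses
    top_p=0.9,                # nucleus sampling: top 90% probability mass
    frequency_penalty=0.3,    # discourage repeating the same phrases
    presence_penalty=0.0,     # no extra push toward new topics
    stop=["\nCaller:"],       # cut off if the model starts speaking for the caller
    messages=[{"role": "user", "content": "What are your opening hours?"}],
)
```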
Model Comparison
| Model Type | Speed | Quality | Cost | Best For |
|---|---|---|---|---|
| Premium (GPT-5, GPT-4.1, Gemini Pro) | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | $$$ | Complex reasoning, critical conversations |
| Balanced (GPT-5 Mini, GPT-4.1 Mini, GPT-4o) | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | $$ | Most voice agent use cases |
| Fast (GPT-5 Nano, GPT-4.1 Nano, Gemini Flash) | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | $ | High volume, cost-sensitive |
Common Use Cases
Customer Support (Temperature: 0.3-0.5, Max Tokens: 150-300)
- Consistent, accurate information delivery
- Professional and predictable tone
- Factual responses without creativity
Conversational Engagement (higher temperature for warmth and variety)
- Engaging, natural conversations
- Personality and warmth
- Adaptive to caller mood
Appointment Scheduling (lower temperature for precision)
- Precise date/time handling
- Minimal errors in critical details
- Clear, unambiguous confirmations
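One way to encode these profiles is a small preset table. Only the Customer Support numbers come from this page; the other values are assumptions extrapolated from the temperature guidance above:

```python
# Hypothetical presets mapping use cases to LLM parameters.
# customer_support values are from this page; the rest are assumed.
LLM_PRESETS = {
    "customer_support": {"temperature": 0.4, "max_tokens": 300},  # consistent, factual
    "conversational":   {"temperature": 0.8, "max_tokens": 250},  # warm, varied (assumed)
    "scheduling":       {"temperature": 0.2, "max_tokens": 150},  # precise, short (assumed)
}

def params_for(use_case: str) -> dict:
    """Return LLM parameters for a use case, defaulting to customer support."""
    return LLM_PRESETS.get(use_case, LLM_PRESETS["customer_support"])
```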
Getting Started
Choose your approach based on how you want to configure models:
Configure in Dashboard
Select models and adjust parameters using the web interface. Perfect for testing different configurations and finding optimal settings.
Set via API
Configure LLM settings programmatically. Ideal for dynamic model selection and parameter tuning based on use case.
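The exact endpoint and payload depend on your platform's API reference; as a purely hypothetical sketch, updating an agent's LLM settings over REST might look like this (the URL, field names, and auth scheme are all placeholders):

```python
# Hypothetical REST call - the URL, payload shape, and auth header are
# placeholders; consult your platform's API reference for the real schema.
import requests

resp = requests.patch(
    "https://api.example.com/v1/agents/AGENT_ID",  # placeholder endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "llm": {
            "provider": "openai",
            "model": "gpt-4.1-mini",
            "temperature": 0.4,
            "max_tokens": 300,
        }
    },
    timeout=10,
)
resp.raise_for_status()
```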
Learn More
- Prompt Engineering - Write effective prompts for better results
- Testing - Test different LLM configurations
- Single Prompt Agents - Configure LLM for Single Prompt Agents
- Flow Agents - Set LLM parameters in Flow Agents
Best Practices
Model Selection - Start with balanced models (GPT-5 Mini, GPT-4.1 Mini)
- Use faster models for simple, high-volume tasks
- Reserve premium models for complex reasoning
Temperature Tuning - Begin conservative (0.4) and increase if needed
- Test extensively with real scenarios
- Lower temperature reduces hallucinations
- Higher temperature yields more natural conversations
Cost Optimization - Monitor token usage and costs continuously
- Set appropriate max tokens for your use case
- Use cheaper models where quality isn't critical
- Implement smart routing based on complexity (see the sketch after this list)
Monitoring and Iteration - Track response times and latency
- Monitor error rates and quality metrics
- Adjust parameters based on user feedback
- A/B test different configurations
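To make "smart routing based on complexity" concrete, here is a minimal sketch; the keyword heuristic, length thresholds, and tier-to-model mapping are assumptions you would replace with your own signals (intent classification, conversation depth, account tier):

```python
# Minimal sketch of complexity-based model routing. The heuristic and
# model names per tier are assumptions; substitute your own signals.
REASONING_HINTS = ("why", "explain", "compare", "calculate", "policy")

def pick_model(user_message: str) -> str:
    text = user_message.lower()
    if any(hint in text for hint in REASONING_HINTS) or len(text) > 400:
        return "gpt-4.1"        # premium tier: complex reasoning
    if len(text) > 120:
        return "gpt-4.1-mini"   # balanced tier: most turns
    return "gpt-4.1-nano"       # fast tier: short, simple turns

print(pick_model("What time do you open?"))          # -> gpt-4.1-nano
print(pick_model("Explain why my bill went up..."))  # -> gpt-4.1
```

Pairing this router with the preset table above lets each call pick both a model tier and matching sampling parameters before the request is sent.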