What is LLM Configuration?

The large language model (LLM) is the brain of your voice agent, powering conversation understanding, response generation, and decision-making. Configure which model to use, how creative or consistent it should be, and how long its responses can run to tune performance for your specific use case.
Supported LLM Providers:
  • OpenAI - GPT-5, GPT-4.1, GPT-4o (and Mini/Nano variants)
  • Gemini - Gemini 2.5 Pro, Gemini 2.5 Flash
  • DeepMyst - Voice-optimized GPT-4.1 models
  • Custom - Self-hosted or third-party models via OpenAI-compatible API
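Because custom providers expose an OpenAI-compatible API, pointing at a self-hosted model is mostly a matter of changing the base URL and model name. The sketch below builds the request body for a `POST /v1/chat/completions` call without sending it; the endpoint URL and model name are placeholders, not real values.

```python
import json

# Hypothetical self-hosted endpoint; any OpenAI-compatible server
# (e.g. vLLM, Ollama, LM Studio) serves the same /v1/chat/completions route.
BASE_URL = "https://llm.internal.example.com/v1"

def build_chat_request(model: str, messages: list, **params) -> dict:
    """Build the JSON body for a POST to {BASE_URL}/chat/completions."""
    body = {"model": model, "messages": messages}
    body.update(params)  # temperature, max_tokens, top_p, ...
    return body

body = build_chat_request(
    "my-finetuned-llama",  # placeholder model name
    [{"role": "user", "content": "Hello"}],
    temperature=0.4,
    max_tokens=150,
)
print(json.dumps(body, indent=2))
```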

Key Configuration Parameters

Model Selection
  • Choose the right balance of speed, quality, and cost
  • Faster models (GPT-5 Nano, Gemini Flash) for high volume
  • Powerful models (GPT-5, GPT-4.1) for complex reasoning
Temperature (0.0 - 2.0)
  • Controls randomness and creativity of responses
  • Lower (0.2-0.4) for consistent, factual responses
  • Higher (0.7-1.0) for natural, varied conversations
Max Tokens (50 - 4096)
  • Limits the length of agent responses
  • Short (50-150) for quick confirmations
  • Medium (150-500) for balanced conversations
  • Long (500+) for detailed explanations
Advanced Parameters
  • Top P (nucleus sampling)
  • Frequency and presence penalties
  • Stop sequences
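The parameters above can be gathered into a single configuration record. This is a minimal sketch, not the platform's actual schema: the field names follow common OpenAI-style parameters, and the validation simply enforces the documented ranges (temperature 0.0–2.0, max tokens 50–4096).

```python
from dataclasses import dataclass, field

@dataclass
class LLMConfig:
    model: str = "gpt-4.1-mini"
    temperature: float = 0.4          # 0.0 - 2.0
    max_tokens: int = 300             # 50 - 4096
    top_p: float = 1.0                # nucleus sampling
    frequency_penalty: float = 0.0    # discourage repeated tokens
    presence_penalty: float = 0.0     # encourage new topics
    stop: list = field(default_factory=list)  # stop sequences

    def __post_init__(self):
        # Reject values outside the documented ranges.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        if not 50 <= self.max_tokens <= 4096:
            raise ValueError("max_tokens must be between 50 and 4096")

# A customer-support-style configuration with a stop sequence that
# keeps the agent from speaking the caller's turn.
cfg = LLMConfig(temperature=0.3, max_tokens=150, stop=["\nCaller:"])
```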

Model Comparison

Model Type                                    | Speed  | Quality | Cost | Best For
Premium (GPT-5, GPT-4.1, Gemini Pro)          | ⭐⭐⭐⭐   | ⭐⭐⭐⭐⭐   | $$$  | Complex reasoning, critical conversations
Balanced (GPT-5 Mini, GPT-4.1 Mini, GPT-4o)   | ⭐⭐⭐⭐⭐  | ⭐⭐⭐⭐    | $$   | Most voice agent use cases
Fast (GPT-5 Nano, GPT-4.1 Nano, Gemini Flash) | ⭐⭐⭐⭐⭐  | ⭐⭐⭐     | $    | High volume, cost-sensitive

Common Use Cases

Customer Support (Temperature: 0.3-0.5, Max Tokens: 150-300)
  • Consistent, accurate information delivery
  • Professional and predictable tone
  • Factual responses without creativity
Sales & Lead Qualification (Temperature: 0.6-0.8, Max Tokens: 200-400)
  • Engaging, natural conversations
  • Personality and warmth
  • Adaptive to caller mood
Appointment Scheduling (Temperature: 0.3-0.5, Max Tokens: 100-200)
  • Precise date/time handling
  • Minimal errors in critical details
  • Clear, unambiguous confirmations
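The use-case guidance above can be captured as a small preset table. The values below are midpoints of the recommended ranges, not official defaults, and the use-case keys are illustrative.

```python
# Presets derived from the recommended ranges above (midpoint values).
PRESETS = {
    "customer_support": {"temperature": 0.4, "max_tokens": 225},
    "sales":            {"temperature": 0.7, "max_tokens": 300},
    "scheduling":       {"temperature": 0.4, "max_tokens": 150},
}

def settings_for(use_case: str) -> dict:
    # Unknown use cases fall back to a conservative default.
    return PRESETS.get(use_case, {"temperature": 0.4, "max_tokens": 300})

print(settings_for("scheduling"))  # {'temperature': 0.4, 'max_tokens': 150}
```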

Getting Started

Choose your approach based on how you want to configure models:

Configure in Dashboard

Select models and adjust parameters using the web interface. Perfect for testing different configurations and finding optimal settings.

Set via API

Configure LLM settings programmatically. Ideal for dynamic model selection and parameter tuning based on use case.
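As a rough sketch of what a programmatic update looks like: the snippet below builds (but does not send) a PATCH request that changes an agent's LLM settings. The endpoint path, auth header, and field names here are assumptions for illustration, not the platform's documented API — check the API reference for the real routes.

```python
import json
import urllib.request

def build_update_request(agent_id: str, api_key: str, llm_settings: dict):
    """Build a PATCH request updating an agent's LLM settings (illustrative URL)."""
    req = urllib.request.Request(
        f"https://api.example.com/v1/agents/{agent_id}",  # hypothetical endpoint
        data=json.dumps({"llm": llm_settings}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
    return req  # send with urllib.request.urlopen(req)

req = build_update_request(
    "agent_123", "sk-...",
    {"model": "gpt-4.1-mini", "temperature": 0.5, "max_tokens": 300},
)
```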

Learn More

Prompt Engineering

Write effective prompts for better results

Testing

Test different LLM configurations

Single Prompt Agents

Configure LLM for Single Prompt Agents

Flow Agents

Set LLM parameters in Flow Agents

Best Practices

Model Selection
  • Start with balanced models (GPT-5 Mini, GPT-4.1 Mini)
  • Use faster models for simple, high-volume tasks
  • Reserve premium models for complex reasoning
Temperature Tuning
  • Begin conservative (0.4) and increase if needed
  • Test extensively with real scenarios
  • Lower temperature reduces hallucinations
  • Higher temperature for more natural conversations
Cost Optimization
  • Monitor token usage and costs continuously
  • Set appropriate max tokens for your use case
  • Use cheaper models where quality isn’t critical
  • Implement smart routing based on complexity
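Smart routing can be as simple as a heuristic that sends short, simple turns to a fast model and escalates longer or reasoning-heavy requests to a premium one. This is a rough sketch: the keyword list and length threshold are illustrative assumptions, not tuned values.

```python
def route_model(user_turn: str) -> str:
    """Pick a model tier based on a crude complexity heuristic (illustrative)."""
    complex_markers = ("why", "explain", "compare", "refund", "policy")
    words = user_turn.lower().split()
    # Long turns or reasoning-style keywords escalate to the premium tier.
    if len(words) > 30 or any(m in words for m in complex_markers):
        return "gpt-5"       # premium: complex reasoning
    return "gpt-5-nano"      # fast: high-volume simple turns

print(route_model("yes please"))                           # gpt-5-nano
print(route_model("can you explain your refund policy"))   # gpt-5
```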
Performance Monitoring
  • Track response times and latency
  • Monitor error rates and quality metrics
  • Adjust parameters based on user feedback
  • A/B test different configurations
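For A/B testing configurations, a deterministic bucket assignment keeps each caller on the same variant across calls. A minimal sketch, assuming the caller ID is available as a string; the variant settings are illustrative.

```python
import hashlib

# Two candidate configurations under test (illustrative values).
VARIANTS = {"A": {"temperature": 0.4}, "B": {"temperature": 0.7}}

def assign_variant(caller_id: str) -> str:
    """Hash the caller ID so assignment is stable across calls."""
    digest = hashlib.sha256(caller_id.encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

# The same caller always lands in the same bucket.
assert assign_variant("+15551234567") == assign_variant("+15551234567")
```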