Overview
Intelligence features are advanced AI capabilities that enhance your agent’s understanding, responsiveness, and contextual awareness. These features use machine learning and natural language processing to provide smarter, more personalized interactions.
Available Intelligence Features:
- Gender Detection - Identify caller gender for personalized speech
- Smart Call End - Automatically detect conversation completion
- Speaker Identification - Distinguish multiple speakers on same call
- Agentic RAG - Advanced knowledge retrieval with reasoning
- Language Dialect Switcher - Adapt to caller’s specific dialect
Intelligence features are optional and can be enabled/disabled independently. Each feature has trade-offs between capability and latency.
Gender Detection
What is Gender Detection?
Automatically analyzes voice characteristics to detect the caller’s likely gender (male or female) for personalized responses.
How it works (see the sketch after this list):
- Analyzes pitch, resonance, and vocal patterns
- Determines probability: male, female, or uncertain
- Makes result available to agent via variable
- Detection occurs early in conversation (first 5-10 seconds)
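The sketch below is purely illustrative: it shows how a probabilistic result might be collapsed into the three outcomes (male, female, uncertain). The function name and threshold are hypothetical, not the platform’s internals.

```python
# Illustrative only: mapping detection probabilities to the three outcomes.
# resolve_gender and the 0.75 threshold are hypothetical, not platform internals.

def resolve_gender(p_male: float, p_female: float, threshold: float = 0.75) -> str:
    """Collapse detection probabilities into 'male', 'female', or 'uncertain'."""
    if p_male >= threshold:
        return "male"
    if p_female >= threshold:
        return "female"
    return "uncertain"  # ambiguous voices fall back to a neutral value

print(resolve_gender(0.62, 0.38))  # -> "uncertain"
```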
Enabling Gender Detection
To enable:
- Open Call Settings section
- Locate User Gender Detection toggle
- Enable the toggle
- Save changes
- No additional settings required
- Detection is automatic once enabled
- Results available immediately after detection
Using Gender Detection in Prompts
Access detected gender via the {{user_gender}} variable, for example:
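The fragment below is a hypothetical example of prompt wording that references the variable; only the {{user_gender}} name comes from the documentation.

```python
# Hypothetical prompt fragment referencing {{user_gender}}.
# Only the variable name is documented; the surrounding wording is an example.
GENDER_AWARE_PROMPT = """
The caller's detected gender is {{user_gender}}.
- If it is "male", you may address the caller as "sir".
- If it is "female", you may address the caller as "ma'am".
- If it is "uncertain" (or the caller corrects you), use neutral, friendly language.
Never assume interests or needs based on gender.
"""
print(GENDER_AWARE_PROMPT)
```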
Use Cases for Gender Detection
Personalized Greetings
Best Practices
✅ Do:
- Use gender for grammatical agreement (languages with gendered grammar)
- Personalize formal address (sir/ma’am) when culturally appropriate
- Provide gender-specific product recommendations when relevant
- Always have gender-neutral fallback
❌ Don’t:
- Make stereotypical assumptions about interests or preferences
- Use gender to limit options or suggestions
- Over-emphasize gender when not relevant
- Assume gender defines needs or wants
Privacy and Compliance Considerations
Transparency:
- Consider mentioning in privacy policy
- Disclose if asked by caller
- Explain it’s for personalization, not discrimination
Sensitivity:
- Gender is not binary in all cultures
- Some users may not identify with detected gender
- Always allow for correction or preference
Accuracy and Limitations
Accuracy Rate: 85-95% for binary gender detection
Factors Affecting Accuracy:
- Voice quality and clarity
- Background noise levels
- Age of speaker (children harder to detect)
- Vocal training or characteristics
- Trans individuals may not align with detected gender
Gender detection is probabilistic, not deterministic. Build your prompts to gracefully handle all outcomes including uncertain detection.
Smart Call End
What is Smart Call End?
Automatically detects when the conversation objectives are complete and the caller is ready to end the call, then gracefully concludes the interaction.
How it works:
- LLM monitors conversation for completion signals
- Detects farewell indicators (“goodbye”, “that’s all”, “thanks, bye”)
- Identifies when objectives are met
- Triggers graceful call conclusion
- Prevents unnecessary conversation extension
Enabling Smart Call End
To enable:
- Open Call Settings section
- Locate Smart Call End toggle
- Enable the toggle
- Save changes
How Smart Call End Detects Completion
Farewell Indicators (illustrated in the sketch after these lists):
- “Goodbye”
- “Bye”
- “That’s all I needed”
- “Thank you, goodbye”
- “I’m all set”
- “Have a good day” (from user)
Completion Signals:
- Task explicitly completed
- User confirms satisfaction
- No additional questions when asked
- Natural conversation conclusion
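The snippet below is a simplified sketch of farewell matching. The actual feature is LLM-based and reasons over full conversation context, so the phrase list and helper here are illustrative only.

```python
# Simplified sketch of farewell matching. The real feature is LLM-based and
# context-aware; this phrase list and helper are illustrative only.

FAREWELL_PHRASES = {
    "goodbye", "bye", "that's all i needed",
    "thank you, goodbye", "i'm all set",
}

def sounds_like_farewell(utterance: str) -> bool:
    """Return True when the caller's utterance contains a farewell indicator."""
    text = utterance.lower().strip(" .!?")
    return any(phrase in text for phrase in FAREWELL_PHRASES)

print(sounds_like_farewell("Thanks, I'm all set!"))  # True
```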
Configuring Smart Call End Behavior
Control Smart Call End behavior in your prompt (see the example below).
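For instance, guidance along these lines (illustrative wording, not a required syntax) tells the agent to confirm before concluding:

```python
# Example prompt guidance (illustrative wording, not platform-documented syntax)
# that shapes when Smart Call End concludes the conversation.
SMART_END_GUIDANCE = """
Before concluding, ask: "Is there anything else I can help you with?"
Only end the call after the caller confirms they are done or says goodbye.
When ending, thank the caller briefly and close with a warm farewell.
"""
print(SMART_END_GUIDANCE)
```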
Use Cases for Smart Call End
Appointment Booking
Benefits of Smart Call End
✅ Advantages:
Better User Experience:
- No awkward “how do I end this call?” moments
- Natural conversation flow
- Users feel in control
- Prevents unnecessarily long calls
- Reduces average call duration
- Eliminates “dead air” at end
- Smooth, confident conclusions
- No stammering or uncertain endings
- Clean, polished experience
- No need to manually script every ending
- Adapts to various conclusion patterns
- Handles unexpected farewell styles
Smart Call End vs. Max Call Duration
Both features can end calls, but serve different purposes:
| Feature | Purpose | When It Triggers | Type |
|---|---|---|---|
| Smart Call End | Natural conclusion | Conversation complete, objectives met | Intelligent |
| Max Call Duration | Safety net | Time limit reached | Hard limit |
- Smart Call End: Handles 90%+ of calls naturally
- Max Call Duration: Safety net for edge cases
Troubleshooting Smart Call End
Problem: Calls Don’t End When They Should
Solution:
- Verify Smart Call End is enabled
- Check prompt doesn’t override with “never end calls”
Problem: Calls End Too Early
Agent ends call when user still has questions.
Solution:
- Simplify farewell detection in prompt
- Review call transcripts to find patterns
- Adjust prompt to consistently ask if anything else needed
- Ensure clear task completion markers
Speaker Identification
What is Speaker Identification?
Distinguishes between different speakers on the same call, identifying who said what in multi-person conversations.
How it works:
- Analyzes voice biometrics (pitch, timbre, resonance)
- Assigns unique identifiers to each speaker
- Tracks conversation by speaker
- Maintains context per speaker
Enabling Speaker Identification
To enable:
- Open Call Settings section
- Locate Speaker Identification toggle (marked Beta)
- Enable the toggle
- Save changes
Speaker Identification is in beta. Accuracy improves with clear, distinct voices and minimal background noise.
Use Cases for Speaker Identification
Speakerphone Calls
Technical Implementation
In Conversation Transcript: each turn is attributed to a speaker, as in the sketch below.
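A hypothetical shape for speaker-tagged transcript data; the field names and speaker labels here are assumptions for illustration, not the platform’s documented schema.

```python
# Hypothetical speaker-tagged transcript; field names and speaker labels are
# assumptions for illustration, not the platform's documented schema.
transcript = [
    {"speaker": "Speaker 1", "text": "Hi, we're calling about our joint account."},
    {"speaker": "Speaker 2", "text": "I have a question about last month's statement."},
    {"speaker": "Speaker 1", "text": "And I'd like to update our address."},
]

# Group utterances per speaker to maintain context for each person.
by_speaker: dict[str, list[str]] = {}
for turn in transcript:
    by_speaker.setdefault(turn["speaker"], []).append(turn["text"])

for speaker, lines in by_speaker.items():
    print(speaker, "->", lines)
```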
Accuracy and Limitations
Accuracy: 80-90% with clear, distinct voices
Works Best With:
- Distinct male/female voices
- Clear audio quality
- Minimal background noise
- Significant voice characteristic differences
- Speakerphone or conference calls
Limitations:
- Similar voices may be confused
- Background noise reduces accuracy
- Very brief utterances harder to identify
- More than 3-4 speakers becomes challenging
- Requires distinct speech patterns
Best Practices
✅ Do:
- Ask speakers to identify themselves by name
- Use speaker ID to enhance conversation flow
- Address speakers by name when identified
- Gracefully handle when identification is uncertain
❌ Don’t:
- Use for authentication or security
- Assume perfect accuracy
- Ignore verbal identification in favor of voice only
- Make critical decisions based solely on speaker ID
Agentic RAG
What is Agentic RAG?
Advanced knowledge retrieval where the AI agent actively decides when and what to search in your knowledge base, using multi-step reasoning to find the best answer. A sketch of the loop follows the comparison below.
Standard RAG (Retrieval Augmented Generation):
- User asks question
- System retrieves potentially relevant documents
- LLM uses retrieved docs to answer
- Single-pass retrieval
Agentic RAG:
- User asks question
- Agent reasons about what information is needed
- Agent searches knowledge base strategically
- Agent evaluates results
- Agent may search again with refined query
- Agent synthesizes final answer
- Multi-step, iterative retrieval
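The sketch below illustrates that multi-step loop. The helper functions are hypothetical stand-ins for LLM and knowledge-base calls, stubbed out so the control flow runs on its own; none of them are platform APIs.

```python
# Sketch of the multi-step loop described above. The helper functions are
# hypothetical stand-ins for LLM and knowledge-base calls, stubbed out here
# so the control flow runs on its own; none of them are platform APIs.

def search_knowledge_base(query: str) -> list[str]:
    return [f"doc snippet for: {query}"]        # stub retrieval

def is_sufficient(question: str, gathered: list[str]) -> bool:
    return len(gathered) >= 2                   # stub "agent evaluates results"

def refine_query(question: str, gathered: list[str]) -> str:
    return question + " (refined)"              # stub query refinement

def synthesize_answer(question: str, gathered: list[str]) -> str:
    return f"Answer to '{question}' built from {len(gathered)} snippet(s)."

def agentic_rag(question: str, max_steps: int = 3) -> str:
    """Multi-step, iterative retrieval: search, evaluate, refine, synthesize."""
    query = question
    gathered: list[str] = []
    for _ in range(max_steps):
        gathered.extend(search_knowledge_base(query))
        if is_sufficient(question, gathered):
            break
        query = refine_query(question, gathered)
    return synthesize_answer(question, gathered)

print(agentic_rag("What is your return policy for electronics bought online?"))
```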
Enabling Agentic RAG
To enable:
- Open Knowledge Base section in Configuration
- Locate Agentic RAG toggle
- Enable the toggle
- Save changes
Agentic RAG requires having knowledge base items configured. If no knowledge base is attached, this feature has no effect.
How Agentic RAG Works
Example Flow:
User: “What’s your return policy for electronics purchased online during holiday sales?”
Standard RAG: retrieves the general return-policy document in a single pass and may miss the online-purchase or holiday-sale specifics.
Agentic RAG: identifies the parts of the question (electronics, online purchase, holiday sale), searches the knowledge base for each, evaluates the results, and synthesizes a complete answer.
Benefits of Agentic RAG
✅ Advantages:
Higher Accuracy:
- More precise answers to complex questions
- Better handling of multi-faceted queries
- Fewer incorrect or partial answers
- Understands nuanced questions
- Identifies what information is actually needed
- Handles ambiguity better
- Can answer questions requiring multiple pieces of information
- Synthesizes information from different sources
- Handles “compare X and Y” questions
- Only searches when actually needed
- Refines search based on what’s found
- Avoids information overload
Trade-offs
⚠️ Disadvantages:
Higher Latency:
- Additional LLM calls for reasoning
- Multiple search operations
- Adds 500-2000ms per response
Higher Cost:
- More LLM API calls
- More processing overhead
- Higher token usage
More Complexity:
- More moving parts
- Harder to debug
- Requires well-structured knowledge base
When to Use Agentic RAG
✅ Good Use Cases:
Complex Technical Support
Optimizing Agentic RAG Performance
Knowledge Base Structure:
✅ Good Structure:
- 20 well-written docs > 100 poorly organized docs
- Clear, concise content
- Up-to-date information
- Consistent formatting
Monitoring Agentic RAG
Metrics to Track:
Accuracy Improvement:
- Compare answer accuracy vs standard RAG
- Track questions correctly answered
- Monitor user satisfaction
Latency Impact:
- Average response time increase
- P95/P99 latency
- User feedback on speed
Cost Impact:
- Additional LLM costs
- Token usage increase
- ROI vs improved accuracy
Search Behavior:
- How often does it search multiple times?
- Average number of searches per query
- Which questions trigger multi-step reasoning?
Agentic RAG Configuration in Prompts
Guide agentic behavior in your prompt; an example is sketched below.
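For instance, prompt wording along these lines (illustrative, not a documented syntax) encourages strategic, multi-step searching:

```python
# Example prompt guidance for agentic retrieval (illustrative wording,
# not platform-documented syntax).
AGENTIC_RAG_GUIDANCE = """
When a question has several parts, first identify what information is needed.
Search the knowledge base for each part, and search again with a refined query
if the first results are incomplete. Combine what you find into one clear answer,
and say so plainly if the knowledge base does not cover part of the question.
"""
print(AGENTIC_RAG_GUIDANCE)
```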
Agentic RAG + Other Features
Combined with Smart Call End:
- Complex questions answered thoroughly
- Natural conclusion once complete
- Better user satisfaction
Combined with Speaker Identification:
- Multiple people with different questions
- Agent handles each person’s needs
- Retrieves relevant info for each speaker
Combined with Gender Detection:
- Gender-specific information retrieval
- Personalized recommendations from knowledge base
Language Dialect Switcher
What is Language Dialect Switcher?
Automatically detects the caller’s specific dialect or accent within the same language and adapts speech recognition for better understanding.
How it works (see the sketch after this list):
- Analyzes pronunciation patterns
- Identifies regional dialect
- Switches speech recognition model
- Improves transcription accuracy
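The sketch below only illustrates the detect-then-switch idea; the platform handles this automatically, and the dialect codes and model names here are hypothetical placeholders.

```python
# Illustrative sketch of the detect-then-switch idea. The platform does this
# automatically; the dialect codes and model names here are hypothetical.

RECOGNITION_MODELS = {
    "ar-EG": "egyptian-arabic-stt",
    "ar-SA": "saudi-arabic-stt",
    "en-US": "us-english-stt",
    "en-GB": "uk-english-stt",
}

def pick_recognition_model(detected_dialect: str, configured_dialect: str) -> str:
    """Use the caller's detected dialect when supported, else keep the configured one."""
    return RECOGNITION_MODELS.get(detected_dialect,
                                  RECOGNITION_MODELS[configured_dialect])

# Agent configured for Egyptian Arabic; caller detected as Saudi.
print(pick_recognition_model("ar-SA", "ar-EG"))  # -> "saudi-arabic-stt"
```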
Enabling Language Dialect Switcher
To enable:
- Open Call Settings section
- Locate Language/Dialect Switcher toggle
- Enable the toggle
- Save changes
How Dialect Switching Works
Arabic Example:
Scenario:
- Agent configured with Egyptian Arabic voice
- Caller speaks in Saudi dialect
Without Dialect Switcher:
- Speech recognition optimized for Egyptian
- Saudi pronunciation may be misunderstood
- Some words transcribed incorrectly
With Dialect Switcher:
- Detects Saudi dialect from first few sentences
- Switches to Saudi-optimized recognition
- Better transcription accuracy
- Agent still speaks in Egyptian voice
English Example:
Scenario:
- Agent configured with US English
- Caller has strong Scottish accent
Without Dialect Switcher:
- US English speech recognition
- Accent causes transcription errors
- May misunderstand key words
With Dialect Switcher:
- Detects UK/Scottish accent
- Adapts recognition
- Better understands caller
- Agent still speaks in US English voice
Supported Dialects
Arabic:
- Egyptian (EGY)
- Saudi Arabian (KSA)
- Emirati (UAE)
- Jordanian (JOR)
- Lebanese (LEB)
- Syrian (SYR)
- Palestinian (PLS)
- Iraqi (IRQ)
- Bahraini (BAH)
English:
- US English
- UK English
- Australian English
- Canadian English
- Indian English
- Various regional accents
Benefits of Dialect Switching
✅ Advantages:
Improved Understanding:
- Better transcription accuracy
- Fewer misunderstood words
- Reduced need for repetition
- Single agent handles multiple dialects
- No need for separate agents per region
- Consistent brand voice across markets
- Callers feel understood
- Natural conversation flow
- Less frustration
- One agent instead of many region-specific agents
- Reduced development complexity
Latency and Performance
Latency Impact:
- Detection: 2-5 seconds (during first few sentences)
- Switching: ~50-100ms
- Total impact: Minimal (one-time cost at call start)
Accuracy:
- Dialect detection: 85-95% accuracy
- Recognition improvement: 10-30% better transcription for non-standard dialects
Dialect switching happens once at the beginning of the call. There’s no ongoing latency impact after initial detection.
Use Cases
Multi-Region Service Centers
Limitations
Does Not Change Agent’s Voice:
- Agent continues speaking in configured dialect
- Only affects how agent understands caller
- For voice change, need separate agents
Does Not Switch Languages:
- Cannot switch English ↔ Arabic
- Only within same language
- For multi-language, need separate agents
Accuracy Limits:
- Works best with standard dialects
- Heavy accents may still challenge system
- Non-standard speech patterns may not be recognized
Language Dialect Switcher improves understanding of caller speech. It does NOT change the agent’s speaking voice or dialect.
Best Practices
✅ Do:
- Enable for agents serving multi-region markets
- Test with speakers from target dialects
- Monitor transcription accuracy improvements
- Combine with standard RAG or knowledge base
❌ Don’t:
- Expect it to change agent’s speaking dialect
- Rely on it for completely different languages
- Assume 100% accuracy across all accents
- Skip testing with actual dialect speakers
Combining Intelligence Features
Intelligence features work well together.
Recommended Combinations
Customer Service Excellence
Feature Selection Matrix
| Use Case | Gender | Smart End | Speaker ID | Agentic RAG | Dialect |
|---|---|---|---|---|---|
| Simple FAQ | Optional | ✓ | ✗ | ✗ | ✓ |
| Customer Support | ✓ | ✓ | ✗ | Optional | ✓ |
| Technical Support | Optional | ✓ | Optional | ✓ | ✓ |
| Sales Calls | ✓ | ✓ | ✗ | ✗ | ✓ |
| Multi-Person | Optional | ✓ | ✓ | Optional | ✓ |
| Complex Research | Optional | ✓ | ✗ | ✓ | ✓ |
| Quick Transactions | Optional | ✓ | ✗ | ✗ | Optional |
Performance Considerations
Latency Impact by Feature
| Feature | Latency Added | When Applied |
|---|---|---|
| Gender Detection | ~100-200ms | Once, early in call |
| Smart Call End | ~0ms | Throughout (no extra latency) |
| Speaker Identification | ~50-100ms | Per speaker change |
| Agentic RAG | ~500-2000ms | Per knowledge query |
| Dialect Switcher | ~50-100ms | Once, at call start |
Example: call using detection features plus an Agentic RAG query (a worked sketch follows this list):
- Detection overhead: ~400ms (one time)
- Agentic RAG: ~2000ms (when used)
- Total: ~2400ms for complex knowledge query
Example: call using detection features without Agentic RAG:
- Detection overhead: ~200ms (one time)
- Smart Call End: 0ms
- Total: Negligible impact
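The snippet below is a worked sketch of a per-call latency budget using the approximate figures from the table above; the enabled feature set is just an example.

```python
# Worked sketch of a per-call latency budget using the approximate figures
# from the table above. The enabled feature set is just an example.

LATENCY_MS = {
    "gender_detection": 200,        # once, early in call (upper bound)
    "dialect_switcher": 100,        # once, at call start (upper bound)
    "speaker_identification": 100,  # per speaker change (upper bound)
    "agentic_rag": 2000,            # per knowledge query (upper bound)
}

enabled = ["gender_detection", "dialect_switcher", "agentic_rag"]
total_ms = sum(LATENCY_MS[feature] for feature in enabled)
print(f"Added latency for a complex knowledge query: ~{total_ms} ms")  # ~2300 ms
```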
Cost Impact
Higher API Usage (a rough per-call estimate is sketched after this list):
- Gender Detection: +1-2 API calls per call
- Smart Call End: +0-1 API calls per call (minimal)
- Speaker Identification: +1-3 API calls per speaker
- Agentic RAG: +2-5 API calls per knowledge query
- Dialect Switcher: +1-2 API calls at start
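The sketch below adds up the extra API calls for one example call, using the upper bounds listed above; it is illustrative arithmetic only, not billing guidance.

```python
# Rough sketch of extra API calls per call, using the upper bounds listed above.
# Illustrative arithmetic only, not billing guidance.

EXTRA_CALLS = {
    "gender_detection": 2,        # per call
    "smart_call_end": 1,          # per call (minimal)
    "speaker_identification": 3,  # per speaker
    "agentic_rag": 5,             # per knowledge query
    "dialect_switcher": 2,        # at call start
}

# Example: two speakers on the call and three knowledge queries.
extra_calls = (EXTRA_CALLS["gender_detection"]
               + EXTRA_CALLS["smart_call_end"]
               + EXTRA_CALLS["speaker_identification"] * 2
               + EXTRA_CALLS["agentic_rag"] * 3
               + EXTRA_CALLS["dialect_switcher"])
print(f"Worst-case extra API calls: {extra_calls}")  # 2 + 1 + 6 + 15 + 2 = 26
```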
Troubleshooting Intelligence Features
Gender Detection Issues
Problem: Incorrect gender detected
Solution:
- Review prompt to handle uncertainty gracefully
- Use gender-neutral fallbacks
- Don’t over-rely on detection for critical decisions
- Check audio quality
- Verify feature is enabled
- Check for background noise interference
- Some voices are naturally ambiguous
Smart Call End Issues
Problem: Calls end prematurely
Solution:
- Review transcripts for patterns
- Simplify farewell detection
- Ensure agent asks confirming question before end
Speaker Identification Issues
Problem: Speakers confused or not distinguished
Solution:
- Ask speakers to identify themselves verbally
- Use names instead of relying solely on voice
- Check audio quality (speakerphone distance)
- Always ask “who am I speaking with now?”
- Confirm name when speaker changes
- Use verbal identification as primary method
Agentic RAG Issues
Problem: Very slow responses
Solution:
- Review knowledge base size (too large?)
- Reduce complexity of documents
- Consider disabling for simple questions
- Check for prompt encouraging excessive searching
- Review knowledge base content quality
- Ensure information is up to date
- Check for contradictory information
- Improve document structure and titles
Dialect Switcher Issues
Problem: Still not understanding caller
Solution:
- Verify caller’s dialect is supported
- Check for heavy accent vs standard dialect
- Review audio quality
- May need custom training for very specific accents
- First few seconds may be ambiguous
- System typically corrects after more speech
- Check for code-switching (mixing dialects)
Best Practices Summary
Do’s
✅ Test before enabling
- Test each feature with real calls
- Measure latency impact
- Verify value added
✅ Start simple
- Begin with Smart Call End + Dialect Switcher
- Add features one at a time
- Measure impact of each
✅ Monitor impact
- Track latency metrics
- Monitor cost increases
- Measure user satisfaction
✅ Plan for failures
- Handle feature failures elegantly
- Always have fallbacks
- Don’t assume 100% accuracy
Don’ts
❌ Don’t enable everything
- Only enable features you need
- More features = more latency
- Each feature has trade-offs
❌ Don’t skip testing
- Test with real target audience
- Verify benefits outweigh costs
- Check edge cases
❌ Don’t ignore compliance
- Review legal requirements
- Update privacy policy if needed
- Consider data retention
❌ Don’t assume perfect accuracy
- All ML features have error rates
- Build robust fallbacks
- Handle uncertainty gracefully
Feature-Specific Recommendations
Always Recommend
Smart Call End
- Minimal cost
- No latency
- Better UX
- Enable by default
Language Dialect Switcher
- Minimal cost (~50ms one-time)
- Improves understanding
- Broader reach
- Enable for multi-region
Situationally Recommend
Gender Detection
- If culturally appropriate
- If personalization adds value
- If prompt can use it effectively
- Check legal compliance
Agentic RAG
- If knowledge base is large/complex
- If accuracy > latency priority
- If questions are multi-faceted
- If budget allows
Speaker Identification
- If multi-person calls expected
- If call context benefits from it
- If willing to handle beta limitations
- If accuracy is acceptable
Next Steps
Write Effective Prompts
Leverage intelligence features in your prompts
Call Behavior Settings
Optimize timing and interaction controls
Voice Settings
Configure voice and language options
Knowledge Base
Build a robust knowledge base for Agentic RAG
Need more advanced capabilities? Explore Flow Agent Features for node-based intelligence configuration.