Learn about the models that power the Hamsa API.
Flagship models
Text to Speech
Hamsa TTS Standard
High-quality Arabic and English speech synthesis
- Natural-sounding output optimized for Arabic dialects
- Multiple Arabic dialects + English
- 10,000 character limit
- Low latency (~300-500 ms)
Hamsa TTS Realtime
Ultra-fast model optimized for real-time voice agents
- Ultra-low latency (~150-200 ms)
- Arabic dialects + English
- 5,000 character limit
- Optimized for conversational AI
Speech to Text
Hamsa STT Standard
Accurate Arabic speech recognition across dialects
- High-accuracy transcription for Arabic dialects
- Word-level timestamps
- Speaker diarization support
- Optimized for batch processing
Hamsa STT Realtime
Real-time speech recognition for live conversations
- Arabic dialects + English
- Real-time streaming transcription
- Low latency (~150-250 ms)
- Word-level timestamps
Models overview
The Hamsa API offers a range of audio models optimized for Arabic language processing, with support for multiple dialects and English.

| Model ID | Description | Languages |
|---|---|---|
| hamsa-tts-standard | High-quality Arabic and English speech synthesis | Arabic dialects (Egyptian, Gulf, Levantine, North African), English (US) |
| hamsa-tts-realtime | Ultra-fast TTS optimized for real-time applications | Arabic dialects, English (US) |
| hamsa-stt-standard | High-accuracy Arabic speech recognition | Arabic dialects (Egyptian, Gulf, Levantine, North African), English (US) |
| hamsa-stt-realtime | Real-time speech recognition for live conversations | Arabic dialects, English (US) |
Hamsa TTS Standard
Hamsa TTS Standard is our high-quality speech synthesis model optimized for Arabic dialects and English. It produces natural, lifelike speech with proper pronunciation of Arabic text across multiple dialects. This model works well in the following scenarios:
- Content Creation: Perfect for generating Arabic audio content, podcasts, and videos
- Accessibility: Generate audio versions of written Arabic content
- E-Learning: Create educational content in Arabic with natural pronunciation
- Media Production: Professional-quality voiceovers for Arabic media
Supported languages
The Hamsa TTS Standard model supports the following Arabic dialects and English:
- Egyptian Arabic (arz)
- Gulf Arabic (afb): Saudi, UAE, Kuwait, etc.
- Levantine Arabic (apc): Syrian, Lebanese, Jordanian, Palestinian
- North African Arabic (arq/ary): Moroccan, Algerian, Tunisian, Libyan
- Modern Standard Arabic (arb)
- English (US) (eng)
Hamsa TTS Realtime
Hamsa TTS Realtime is our fastest speech synthesis model, designed for real-time applications and the Voice Agents Platform. It delivers high-quality Arabic speech with ultra-low latency. This model is particularly well-suited for:
- Voice Agents Platform: Perfect for real-time voice agents and phone calls
- Interactive Applications: Ideal for chatbots requiring immediate voice response
- Live Conversations: Real-time TTS for conversational AI applications
Hamsa STT Standard
Hamsa STT Standard is our high-accuracy speech recognition model for transcribing Arabic speech across multiple dialects. It provides precise word-level timestamps and speaker diarization. This model excels in scenarios requiring accurate speech-to-text conversion:
- Transcription Services: Perfect for converting Arabic audio/video content to text
- Meeting Documentation: Ideal for capturing and documenting Arabic conversations
- Content Analysis: Well-suited for Arabic audio content processing and analysis
- Media Subtitling: Generate accurate subtitles for Arabic media content
Key features:
- Accurate transcription with word-level timestamps
- Speaker diarization for multi-speaker audio
- Support for multiple Arabic dialects
- Punctuation and formatting
Hamsa STT Realtime
Hamsa STT Realtime is our real-time speech recognition model, delivering accurate transcription of Arabic speech with ultra-low latency for live conversations. This model excels in conversational use cases:
- Live Call Transcription: Perfect for real-time Arabic call transcription
- AI Voice Agents: Ideal for live conversations in Arabic
- Live Meetings: Real-time transcription for Arabic meetings and conferences
Key features:
- Ultra-low latency: Get transcriptions in real time
- Streaming support: Send audio in chunks while receiving transcripts
- Multiple audio formats: Support for various audio encodings
- Dialect recognition: Automatic recognition of Arabic dialects
Model selection guide
Requirements

| Requirement | Recommendation | Notes |
|---|---|---|
| Quality | Use hamsa-tts-standard or hamsa-stt-standard | Best for high-quality audio output and accurate transcription |
| Low latency | Use hamsa-tts-realtime or hamsa-stt-realtime | Optimized for real-time applications |
| Arabic dialects | All models support multiple Arabic dialects | Choose based on latency vs. quality requirements |
| Balanced | Use Standard models for best balance | Good balance between quality and performance |
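The requirements above reduce to two questions: which task (text to speech or speech to text) and whether latency matters more than quality. A minimal sketch of that decision as a lookup follows. The model IDs come from this page; the `choose_model` function and its parameters are hypothetical helpers for illustration and are not part of the Hamsa API.

```python
# Model IDs are taken from the tables on this page. The mapping and the
# function below are illustrative helpers, not part of the Hamsa API.
MODEL_IDS = {
    ("tts", False): "hamsa-tts-standard",
    ("tts", True): "hamsa-tts-realtime",
    ("stt", False): "hamsa-stt-standard",
    ("stt", True): "hamsa-stt-realtime",
}


def choose_model(task: str, low_latency: bool = False) -> str:
    """Return a model ID for `task` ("tts" or "stt").

    Realtime models are chosen when low latency matters; otherwise the
    Standard models are used, following the selection guide.
    """
    key = (task.lower(), low_latency)
    if key not in MODEL_IDS:
        raise ValueError(f"unknown task {task!r}; expected 'tts' or 'stt'")
    return MODEL_IDS[key]


if __name__ == "__main__":
    print(choose_model("tts"))                    # content creation
    print(choose_model("stt", low_latency=True))  # live voice agent
```

Defaulting `low_latency` to `False` mirrors the guide's "Balanced" row, which recommends the Standard models when there is no strict latency requirement.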
Use case

| Use case | Recommendation | Notes |
|---|---|---|
| Content creation | Use hamsa-tts-standard | Ideal for professional Arabic content, media, and video narration |
| Voice Agents Platform | Use hamsa-tts-realtime and hamsa-stt-realtime | Perfect for real-time conversational applications in Arabic |
| Transcription | Use hamsa-stt-standard | Best accuracy for batch Arabic transcription |

Character limits
The maximum number of characters supported in a single text-to-speech request varies by model.

| Model ID | Character limit | Approximate audio duration |
|---|---|---|
| hamsa-tts-standard | 10,000 | ~10 minutes |
| hamsa-tts-realtime | 5,000 | ~5 minutes |
For longer content, consider splitting the input into multiple requests.
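One way to split long input is to break it at sentence boundaries so that each request stays under the model's limit. The sketch below is a client-side helper only; `split_text` is a hypothetical name, and it assumes the limit is counted the same way Python's `len()` counts characters, which this page does not confirm.

```python
import re

# Limits copied from the character limits table on this page.
CHARACTER_LIMITS = {
    "hamsa-tts-standard": 10_000,
    "hamsa-tts-realtime": 5_000,
}


def split_text(text: str, model_id: str) -> list[str]:
    """Split `text` into chunks no longer than the model's character limit.

    Chunks end after sentence punctuation (including the Arabic question
    mark) where possible. A single sentence longer than the limit is cut
    at the limit.
    """
    limit = CHARACTER_LIMITS[model_id]
    sentences = re.split(r"(?<=[.!?؟])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut any sentence that exceeds the limit on its own.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own text-to-speech request, and the resulting audio segments concatenated in order.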
Audio duration limits
The maximum audio duration supported for speech-to-text varies by model.

| Model ID | Audio duration limit | File size limit |
|---|---|---|
| hamsa-stt-standard | 60 minutes | 500 MB |
| hamsa-stt-realtime | Streaming (unlimited) | N/A |
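Checking a file against these limits before uploading avoids a wasted request. The following is a rough sketch of such a pre-check. `SttLimits` and `check_audio` are hypothetical names, and the code assumes 1 MB means 1,000,000 bytes, since this page does not say which definition applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SttLimits:
    max_minutes: Optional[float]    # None: no fixed duration limit (streaming)
    max_megabytes: Optional[float]  # None: no file size limit applies


# Values copied from the audio duration limits table on this page.
STT_LIMITS = {
    "hamsa-stt-standard": SttLimits(max_minutes=60, max_megabytes=500),
    "hamsa-stt-realtime": SttLimits(max_minutes=None, max_megabytes=None),
}

# Assumption: 1 MB = 1,000,000 bytes (the stricter of the two common
# definitions). The source does not specify this.
BYTES_PER_MB = 1_000_000


def check_audio(model_id: str, duration_minutes: float, size_bytes: int) -> list[str]:
    """Return a list of limit violations; an empty list means the audio fits."""
    limits = STT_LIMITS[model_id]
    problems = []
    if limits.max_minutes is not None and duration_minutes > limits.max_minutes:
        problems.append(
            f"duration {duration_minutes:g} min exceeds {limits.max_minutes:g} min"
        )
    if limits.max_megabytes is not None and size_bytes > limits.max_megabytes * BYTES_PER_MB:
        problems.append(
            f"size {size_bytes / BYTES_PER_MB:.1f} MB exceeds {limits.max_megabytes:g} MB"
        )
    return problems


if __name__ == "__main__":
    print(check_audio("hamsa-stt-standard", 45, 200_000_000))
    print(check_audio("hamsa-stt-standard", 75, 600_000_000))
```

Audio that exceeds the Standard limits would need to be split into shorter segments or streamed through the realtime model instead.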
Plans and Usage Limits
Your subscription plan determines your monthly usage limits and concurrent call capacity.
Plan Comparison
| Plan | Price | Credits | Voice Agent | Speech to Text | Text to Speech | Concurrency | KB Storage |
|---|---|---|---|---|---|---|---|
| Free | $0/mo | 50 | 9 min | 50 min | 25 min | 1 | 1 MB |
| Starter | $5/mo | 100 | 17 min | 100 min | 50 min | 1 | 5 MB |
| Creator | $15/mo | 500 | 84 min | 500 min | 250 min | 2 | 10 MB |
| Pro | $100/mo | 5,000 | 834 min | 5,000 min | 2,500 min | 5 | 50 MB |
| Business | $320/mo | 20,000 | 3,334 min | 20,000 min | 10,000 min | 10 | 100 MB |
| Enterprise | Custom | Custom | Unlimited | Unlimited | Unlimited | Unlimited | 300 MB |
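The minute columns in the table are consistent with fixed per-minute credit rates: about 1 credit per Speech to Text minute, 2 per Text to Speech minute, and 6 per Voice Agent minute, with the result rounded up. These rates are inferred from the table rather than stated anywhere on this page, so treat the sketch below as an estimate only. All names in it are hypothetical.

```python
import math

# Inferred from the plan comparison table, NOT documented rates:
# dividing each plan's credits by these values and rounding up
# reproduces the minute columns shown above.
CREDITS_PER_MINUTE = {
    "voice_agent": 6,
    "speech_to_text": 1,
    "text_to_speech": 2,
}

# Monthly credits per plan, copied from the table.
PLAN_CREDITS = {
    "Free": 50,
    "Starter": 100,
    "Creator": 500,
    "Pro": 5_000,
    "Business": 20_000,
}


def minutes_for(credits: int) -> dict[str, int]:
    """Estimate the minutes of each service that `credits` would cover."""
    return {
        service: math.ceil(credits / rate)
        for service, rate in CREDITS_PER_MINUTE.items()
    }


if __name__ == "__main__":
    for plan, credits in PLAN_CREDITS.items():
        print(plan, minutes_for(credits))
```

If the rates hold, the three minute figures for a plan look like alternative ways to spend the same credit pool rather than separate allowances, though the page does not say so explicitly.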
Plan Features
| Feature | Free | Starter | Creator | Pro | Business | Enterprise |
|---|---|---|---|---|---|---|
| Access to All Models | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fine-tuned AI Models | - | - | - | - | ✓ | ✓ |
| Basic Cloud Support | - | - | - | ✓ | - | - |
| Full Cloud Support | - | - | - | - | ✓ | ✓ |
| On-Premise Solution | - | - | - | - | - | ✓ |
To increase your usage limits and concurrent calls, upgrade your subscription plan. Enterprise customers can request custom limits by contacting sales.