
What is Text to Speech?

Text to Speech (TTS) converts written content into human-like audio using configurable voice models, with controls for voice selection, speed, expressiveness, and output quality.
With Text to Speech, you can:
  • Convert text to natural-sounding speech
  • Choose from 200+ voice options
  • Adjust voice characteristics (speed, expressiveness)
  • Generate long-form audio content
  • Download audio in multiple formats

Core Capabilities

Long-Form Text Input

Process large amounts of text efficiently:
  • Text length limit: Up to 3000 characters
  • Inline text editing: Edit text directly in the interface
  • Character count tracking: Monitor text length
  • Multi-language support: Arabic and English, each with multiple dialects
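
Text longer than the 3000-character limit has to be divided before it can be generated. The following is a minimal sketch of one way to do that, assuming one generation job is created per chunk. The helper name split_for_tts and the sentence-boundary heuristic are illustrative assumptions, not part of the product.

```python
import re

MAX_CHARS = 3000  # text length limit stated in the list above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers sentence boundaries and falls back to a hard cut when a single
    sentence is longer than the limit. Whitespace between sentences
    (including line breaks) is normalized to a single space.
    """
    # Split after sentence-ending punctuation, including the Arabic question mark.
    sentences = re.split(r"(?<=[.!?؟])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-cut any sentence that cannot fit in one chunk by itself.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be pasted or submitted as its own job. Cutting at sentence boundaries keeps pauses and intonation from breaking mid-sentence at the seams.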

Human-Like Voice Synthesis

Generate natural, expressive speech:
  • High-quality voices: Professional-grade voice models
  • Emotional expression: Control expressiveness and tone
  • Natural pauses: Automatic and manual pause insertion
  • Dialect support: Multiple Arabic dialects

Job-Based Generation

Organized generation workflow:
  • Job creation: Create jobs for each generation task
  • Job history: Access all previous generations
  • Job management: Edit, delete, and remix jobs

Text Input Capabilities

Text Editing

  • Inline editing: Direct text editing in the interface
  • Paste support: Easy text input from clipboard
  • Character counting: Real-time character count display
  • Multi-line support: Handle paragraphs and line breaks

Advanced Text Features

  • Emotion markers: Add emoji-based emotion indicators (😊, 😢, etc.)
  • Silence breaks: Insert pauses for natural pacing
    • Short breaks: Brief pauses between phrases
    • Long breaks: Extended pauses for emphasis
  • Fillers: Add natural filler words (Uh, Umm) for realism

Text Optimization Tools

  • Silence controls: Add strategic pauses
  • Emotion adjustment: Enhance emotional delivery

Voice Selection Capabilities

System Voices

Choose from a library of pre-trained voices:
  • 100+ voices: Extensive voice library
  • Multiple languages: Support for Arabic and English
  • Gender options: Male and female voices
  • Dialect variety: Multiple dialects per language
  • Style options: Narrator, Conversational, and more

Custom (Cloned) Voices

Use voices you have created:
  • Cloned voices: Voices created from audio samples
  • Voice library: Access to your custom voice collection
  • Voice preview: Test voices before generation
  • Favorite voices: Mark frequently used voices
  • Recent voices: Quick access to recently used voices

Voice Metadata

View detailed voice information:
  • Voice name: Display name that identifies the voice
  • Gender: Male or female
  • Dialect: Language and regional dialect
  • Style: Voice style (Narrator, Conversational)
  • Language code: Technical language identifiers

Voice Organization

  • Favorites: Mark voices for quick access
  • Recent usage: View recently used voices
  • Voice filtering: Filter by language, gender, style
  • Voice search: Search voices by name
  • Voice preview: Play sample audio before selection
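
The filtering and search behavior described above can be sketched as a small data model. This is an illustration only: the Voice fields mirror the metadata this page lists, but the field names, the filter_voices helper, and the sample catalog are assumptions rather than the product's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Voice:
    """Voice metadata; field names are assumed, not an official schema."""
    name: str
    gender: str
    language_code: str
    dialect: str
    style: str


def filter_voices(
    voices: list[Voice],
    language_code: Optional[str] = None,
    gender: Optional[str] = None,
    style: Optional[str] = None,
    name_query: Optional[str] = None,
) -> list[Voice]:
    """Return voices matching every filter that is set (None means 'any')."""

    def matches(v: Voice) -> bool:
        return (
            (language_code is None or v.language_code.lower() == language_code.lower())
            and (gender is None or v.gender.lower() == gender.lower())
            and (style is None or v.style.lower() == style.lower())
            # Name search is a case-insensitive substring match.
            and (name_query is None or name_query.lower() in v.name.lower())
        )

    return [v for v in voices if matches(v)]


# Placeholder catalog for illustration only; these are not real voice names.
SAMPLE_VOICES = [
    Voice("Sample Voice A", "female", "ar", "Dialect 1", "Narrator"),
    Voice("Sample Voice B", "male", "ar", "Dialect 2", "Conversational"),
    Voice("Sample Voice C", "female", "en", "Dialect 3", "Conversational"),
]

arabic_voices = filter_voices(SAMPLE_VOICES, language_code="ar")
```

Filters combine with a logical AND, so passing both gender and style narrows the result rather than widening it.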

Voice Control Capabilities

Speed Adjustment

Control how fast the voice reads:
  • Speed range: 0x to 2x (default: 1x)
  • Fine control: Adjust in 0.1 increments
  • Slower speeds: Values below 1x for clear, deliberate delivery
  • Faster speeds: Values above 1x for quicker playback
  • Real-time preview: Hear changes immediately

Expressiveness Adjustment

Control the emotional range and variation:
  • Expressiveness range: 0 to 2 (default: 1)
  • Lower values: More neutral, consistent delivery
  • Higher values: More dynamic, emotionally varied delivery
  • Fine-tuning: Precise control over emotional range
  • Voice-specific optimization: Adapts to selected voice
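
The documented ranges for speed (0x to 2x in 0.1 increments, default 1x) and expressiveness (0 to 2, default 1) can be expressed as a small validation sketch. The names VoiceSettings, snap_speed, and clamp_expressiveness are hypothetical, and clamping out-of-range values (rather than rejecting them) is an assumption about how an integration might behave.

```python
from dataclasses import dataclass

# Ranges as documented on this page.
SPEED_MIN, SPEED_MAX, SPEED_STEP = 0.0, 2.0, 0.1
EXPR_MIN, EXPR_MAX = 0.0, 2.0


def snap_speed(value: float) -> float:
    """Clamp to the speed range, then round to the nearest 0.1 step."""
    clamped = min(max(float(value), SPEED_MIN), SPEED_MAX)
    # The outer round() removes floating-point noise such as 1.3000000000000003.
    return round(round(clamped / SPEED_STEP) * SPEED_STEP, 1)


def clamp_expressiveness(value: float) -> float:
    """Clamp to the expressiveness range; no step size is documented."""
    return min(max(float(value), EXPR_MIN), EXPR_MAX)


@dataclass
class VoiceSettings:
    speed: float = 1.0            # documented default: 1x
    expressiveness: float = 1.0   # documented default: 1

    def __post_init__(self) -> None:
        # Normalize values on construction so instances are always in range.
        self.speed = snap_speed(self.speed)
        self.expressiveness = clamp_expressiveness(self.expressiveness)
```

For example, VoiceSettings(speed=1.26, expressiveness=3) would be normalized to a speed of 1.3 and an expressiveness of 2.0.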

Generation Management Capabilities

Job List View

  • Job history: View all generation jobs
  • Job metadata: Voice, language, creation date
  • Quick actions: View, download, delete from list

Job Operations

  • View job: Open job details and audio
  • Edit job: Modify the text and regenerate; editing resets the job status
  • Delete job: Remove completed or failed jobs
  • Remix job: Create a new job based on an existing one
  • Download audio: Download the audio file directly
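
The job operations above imply a simple lifecycle, sketched below. The status names (pending, completed, failed), the Job class, and its fields are assumptions made for illustration. The sketch also reads "remove completed or failed jobs" as meaning that jobs still in progress cannot be deleted, which this page does not state explicitly.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class JobStatus(Enum):
    # Hypothetical status names; the product may use different ones.
    PENDING = "pending"
    COMPLETED = "completed"
    FAILED = "failed"


_next_id = count(1)


@dataclass
class Job:
    text: str
    voice_name: str
    status: JobStatus = JobStatus.PENDING
    job_id: int = field(default_factory=lambda: next(_next_id))

    def edit(self, new_text: str) -> None:
        """Change the text and reset the status so the job regenerates."""
        self.text = new_text
        self.status = JobStatus.PENDING

    def can_delete(self) -> bool:
        """Only completed or failed jobs are treated as removable."""
        return self.status in (JobStatus.COMPLETED, JobStatus.FAILED)

    def remix(self) -> "Job":
        """Create a new, independent job seeded from this one."""
        return Job(text=self.text, voice_name=self.voice_name)
```

Remixing returns a fresh job with its own identifier, so changes to the copy leave the original job and its audio untouched.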

Output Capabilities

Audio Playback

  • In-browser playback: Listen directly in the interface
  • Playback controls: Play, pause, seek
  • Streaming playback: Start playback while generating
  • Audio quality: High-quality audio output
  • Multiple formats: Support for various audio formats

Audio Download

Export audio files:
  • MP3 format: Compressed audio format
  • One-click download: Direct download from interface

Dictionaries and Customization

Voice Dictionaries

Customize pronunciation:
  • Dictionary selection: Choose dictionaries for voices
  • Custom pronunciations: Override default pronunciations
  • Multiple dictionaries: Use multiple dictionaries per job
  • Dictionary management: Create and manage dictionaries
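
Conceptually, a pronunciation dictionary maps a written term to the form that should be spoken. The sketch below models that as plain text substitution applied before synthesis. How the product actually stores and applies dictionaries is not described here, so the apply_dictionaries helper, the term-to-respelling format, and the rule that later dictionaries override earlier ones are all assumptions.

```python
import re


def apply_dictionaries(text: str, dictionaries: list[dict[str, str]]) -> str:
    """Replace whole-word terms using several dictionaries.

    Assumed precedence: a later dictionary overrides an earlier one when
    both define the same term (compared case-insensitively).
    """
    merged: dict[str, str] = {}
    for dictionary in dictionaries:
        for term, spoken in dictionary.items():
            merged[term.lower()] = spoken
    if not merged:
        return text
    # Try longer terms first so a multi-word entry wins over a shorter prefix.
    terms = sorted(merged, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )
    # Fall back to the matched text if case folding produces an unknown key.
    return pattern.sub(lambda m: merged.get(m.group(1).lower(), m.group(1)), text)
```

For instance, combining a dictionary that maps "SQL" to "sequel" with one that maps "API" to "A P I" turns "The SQL API is fast." into "The sequel A P I is fast."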

Use Cases

Content Creation

Generate audio for various content:
  • Podcast intros: Create podcast introductions
  • Audiobooks: Convert text to audiobook format
  • Video narration: Generate voiceovers for videos
  • Educational content: Create learning materials

Marketing and Advertising

Create marketing audio:
  • Advertisement voiceovers: Commercial audio
  • Social media content: Audio for platforms
  • Brand voice: Consistent voice across content
  • Multilingual campaigns: Same content in multiple languages

Customer Service

Enhance customer interactions:
  • IVR systems: Automated phone systems
  • Voice prompts: System announcements
  • Notification audio: Alert sounds and messages
  • Training materials: Audio training content

Key Features

Real-Time Generation

  • Streaming generation: Start playback while generating
  • Progressive loading: Audio available as it generates
  • Status updates: Real-time progress indicators
  • Error handling: Clear error messages and recovery

Voice Quality

  • High fidelity: Professional-grade audio quality
  • Natural intonation: Human-like speech patterns
  • Emotion support: Expressive delivery options
  • Consistency: Stable voice characteristics

Getting Started

  1. Enter Your Text
    • Type or paste text into the editor
    • Add emotion markers and pauses if needed
    • Optimize text for better results
  2. Select a Voice
    • Browse available voices
    • Preview voice samples
    • Select your preferred voice
  3. Adjust Controls
    • Set speed and expressiveness
    • Fine-tune voice characteristics
    • Preview changes in real time
  4. Generate Audio
    • Click Generate and wait for processing to finish
    • Listen to your audio
    • Download or share as needed
