Understanding these core concepts will help you build better voice agents and make the most of Hamsa’s features.

Agent Types

Single Prompt Agent

A voice agent powered by a single comprehensive prompt (called a “preamble”) that defines all behaviors, personality, and logic. Key characteristics:
  • One unified prompt for all conversation stages
  • Simpler to build and maintain
  • Best for linear conversational flows
  • Limited conditional branching capability

Flow Agent

A voice agent built using visual node-based workflows, where each node represents a step in the conversation with its own logic and behavior. Key characteristics:
  • Visual conversation design
  • Unlimited conditional branching
  • 8 different node types
  • Advanced variable system
  • DTMF support

Key Components

Preamble (Single Prompt)

The core instructions that define your agent’s identity, personality, behavior, and task. Think of it as the agent’s “constitution” - the fundamental rules it follows. Best practices:
  • Keep under 500 words for optimal performance
  • Structure with clear sections (Identity, Style, Task)
  • Be specific about what the agent should and shouldn’t do
  • Include examples for complex scenarios

Nodes (Flow Agent)

Individual building blocks in a flow agent representing specific steps or actions. Common node types include:
  • Conversation Nodes: Natural dialogue with users
  • Tool Nodes: Execute functions and API calls
  • Router Nodes: Pure logic branching
  • Transfer Nodes: Call and agent transfers
  • End Call Nodes: Conversation termination

Transitions

Connections between nodes that define how conversations progress. Four types:
  1. Natural Language: LLM evaluates conditions described in prose
  2. Structured Equation: Mathematical/logical conditions with variables
  3. DTMF: Triggered by phone keypad presses
  4. Always: Default fallback path
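The four transition types above can be sketched as a small evaluation loop. This is an illustrative model only: the transition shapes, field names, and condition syntax are assumptions, not Hamsa's actual schema.

```python
# Hypothetical sketch of how a flow engine might pick the next node.
# Field names ("type", "key", "target", "condition") are illustrative.

def eval_condition(cond, variables):
    """Toy evaluator for structured-equation conditions like 'age >= 18'."""
    name, op, value = cond.split()
    left = variables[name]
    return {"==": str(left) == value,
            ">=": float(left) >= float(value),
            "<": float(left) < float(value)}[op]

def next_node(transitions, variables, dtmf_key=None, llm_says_yes=None):
    """Evaluate transitions in order; 'always' acts as the fallback path."""
    for t in transitions:
        if t["type"] == "dtmf" and dtmf_key == t["key"]:
            return t["target"]
        if t["type"] == "structured" and eval_condition(t["condition"], variables):
            return t["target"]
        if t["type"] == "natural_language" and llm_says_yes == t["intent"]:
            return t["target"]
    for t in transitions:
        if t["type"] == "always":
            return t["target"]
    return None
```

Note the ordering: specific conditions are tried first, and the Always transition only fires when nothing else matched, which is why it makes a safe default fallback.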

Variables

System Variables

Built-in variables automatically populated by Hamsa:
  • Time: current_time, current_date, current_datetime, current_timestamp, current_day, current_month, current_year, current_weekday
  • Call: call_id, call_type, call_start_time, direction
  • User: user_number, user_number_area_code
  • Agent: agent_name, agent_id, agent_number
Total: 17 system variables always available
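System variables are typically referenced from prompts and node text as placeholders. The `{{variable}}` templating syntax below is an assumption for illustration; the variable names mirror the list above.

```python
import re
from datetime import datetime

def render(template, variables):
    """Replace {{name}} placeholders; unknown names are left untouched."""
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: str(variables.get(m.group(1), m.group(0))),
                  template)

# A few system variables, populated here by hand for the sketch:
system_vars = {
    "current_date": datetime.now().strftime("%Y-%m-%d"),
    "agent_name": "Dana",
    "user_number": "+15551234567",
}

greeting = render("Hi, this is {{agent_name}} calling {{user_number}}.", system_vars)
```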

Extracted Variables

Data collected from conversations and stored for use throughout the flow:
  • Captured via “Extract Variables” in conversation nodes
  • Available to all downstream nodes
  • Fully integrated into variable system

Custom Variables

Workflow-level variables you define and populate via API when initiating calls:
  • Defined in global settings
  • Always available to all nodes
  • Useful for caller-specific data (account info, preferences)
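A sketch of passing custom variables when initiating a call via API. The endpoint URL, header names, and payload fields here are placeholders and assumptions, not Hamsa's documented API; check the API reference for the real request shape.

```python
import json
import urllib.request

def build_call_request(agent_id, to_number, custom_variables, api_key):
    """Build a hypothetical 'initiate call' request carrying custom variables."""
    payload = {
        "agentId": agent_id,
        "toNumber": to_number,
        "variables": custom_variables,   # available to every node in the flow
    }
    return urllib.request.Request(
        "https://api.example.com/v1/calls",   # placeholder URL
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_call_request("agent_123", "+15551234567",
                         {"account_tier": "gold", "preferred_language": "en"},
                         "YOUR_API_KEY")
# urllib.request.urlopen(req) would send it (not executed here).
```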

DTMF (Dual-Tone Multi-Frequency)

Phone keypad interaction using digits 0-9, *, and #. Hamsa provides three distinct DTMF features:

Simple DTMF Transitions

IVR menu navigation: “Press 1 for Sales, Press 2 for Support”

DTMF Input Capture

Collect sequences: account numbers, PINs, phone numbers
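Sequence capture typically buffers digits until a terminator key (often `#`) or a maximum length. A minimal sketch, with illustrative parameter names:

```python
def capture_dtmf(keys, terminator="#", max_digits=10):
    """Collect digits from an iterable of keypad presses.

    Stops at the terminator key or when max_digits is reached;
    non-digit keys (e.g. '*') are ignored.
    """
    buffer = []
    for key in keys:
        if key == terminator:
            break
        if key.isdigit():
            buffer.append(key)
        if len(buffer) == max_digits:
            break
    return "".join(buffer)
```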

Global DTMF Triggers

Universal shortcuts: “Press 0 for operator anytime”

Knowledge Base

A collection of information sources your agent can access during conversations.

Supported formats:
  • Documents: PDF, DOC, TXT, CSV, and more
  • Web pages: Scrape websites and sitemaps
  • Custom text: Manually added content
How it works:
  • Agent searches knowledge base when relevant
  • Retrieves information in real-time (<100ms)
  • Synthesizes answers from multiple sources
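The retrieve-then-answer loop above can be sketched with naive keyword overlap standing in for Hamsa's real retrieval. Purely illustrative: the example documents are made up, and production retrieval uses semantic search rather than word matching.

```python
def retrieve(query, documents, top_k=2):
    """Rank documents by word overlap with the query (toy retrieval)."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

docs = [
    "Our support line is open 9am to 5pm on weekdays.",
    "Refunds are processed within 5 business days.",
    "We ship internationally to over 40 countries.",
]
hits = retrieve("when is support open", docs)
# hits[0] would then be passed to the LLM as grounding context.
```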

Tools & Functions

Extend your agent’s capabilities by connecting to external systems:

Function Tools

Custom API integrations you define:
  • HTTP method, URL, parameters
  • Headers and authentication
  • Response mapping to variables
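A sketch of what a function tool definition and its response mapping might look like. Every field name here is illustrative; consult the tool editor for the real schema.

```python
# Hypothetical tool definition: method, URL, headers, and a mapping from
# JSON paths in the API response to flow variables.
tool = {
    "name": "lookup_order",
    "method": "GET",
    "url": "https://api.example.com/orders/{order_id}",   # placeholder URL
    "headers": {"Authorization": "Bearer YOUR_API_KEY"},
    "response_mapping": {"order_status": "status", "eta": "shipping.eta"},
}

def map_response(response_json, mapping):
    """Walk dotted paths in the response and store results as variables."""
    out = {}
    for var, path in mapping.items():
        value = response_json
        for part in path.split("."):
            value = value[part]
        out[var] = value
    return out

variables = map_response({"status": "shipped", "shipping": {"eta": "2024-06-01"}},
                         tool["response_mapping"])
```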

Web Tools

Simplified HTTP requests for quick integrations

MCP Tools

Model Context Protocol tools for advanced use cases

Call Settings

Configuration controlling conversation dynamics:

Response Delay

Time the agent waits before responding (100-1500ms)
  • Lower: More responsive, faster-paced
  • Higher: More thoughtful, better for complex topics

Interruption Handling

Whether users can interrupt the agent mid-speech
  • Enabled: Natural conversation flow
  • Disabled: Agent completes full message

Timeouts

  • Inactivity: How long to wait for user response (5-30s)
  • Max Duration: Maximum call length (30-600s)
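The ranges above can be enforced with a simple validation pass. The setting keys below are illustrative names, not Hamsa's exact config schema; only the numeric ranges come from the text above.

```python
# Allowed ranges taken from the Call Settings section above.
RANGES = {
    "response_delay_ms": (100, 1500),
    "inactivity_timeout_s": (5, 30),
    "max_duration_s": (30, 600),
}

def validate_settings(settings):
    """Raise ValueError if any setting falls outside its documented range."""
    for key, (lo, hi) in RANGES.items():
        value = settings[key]
        if not lo <= value <= hi:
            raise ValueError(f"{key}={value} outside allowed range {lo}-{hi}")
    return settings

settings = validate_settings({
    "response_delay_ms": 400,       # snappy but not abrupt
    "inactivity_timeout_s": 10,
    "max_duration_s": 600,
})
```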

LLM Configuration

Language model settings for your agent:

Provider

  • OpenAI: GPT-4, GPT-4o, GPT-4.1
  • Gemini: Gemini 2.5 Pro, Flash
  • DeepMyst: Optimized models for voice AI
  • Custom: OpenAI-compatible endpoints

Model Selection

Different models for different needs:
  • GPT-4.1: Best reasoning, higher cost
  • GPT-4o Mini: Fast, cost-effective
  • Gemini 2.5: Google’s latest, competitive pricing

Temperature

Creativity vs consistency (0.0-1.0):
  • 0.0-0.3: Deterministic, consistent responses
  • 0.4-0.7: Balanced (recommended)
  • 0.8-1.0: Creative, varied responses
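Put together, an LLM configuration might look like the object below. The key names and accepted values are assumptions based on the options listed above, not a documented schema.

```python
# Illustrative LLM configuration combining provider, model, and temperature.
llm_config = {
    "provider": "openai",
    "model": "gpt-4o-mini",   # fast, cost-effective (see Model Selection)
    "temperature": 0.5,       # in the balanced range recommended above
}
```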

Voice Settings

Text-to-speech configuration:

Provider

  • Deepgram: Fast, natural voices
  • ElevenLabs: Highly expressive, premium quality
  • PlayHT: Good balance of quality and speed

Voice Selection

Each provider offers dozens of voices:
  • Different genders, ages, accents
  • Preview before selecting
  • Can override per-node (Flow Agents)

Audio Processing

  • Noise Cancellation: Remove background noise
  • Background Noise: Add ambient sound for realism
  • Thinking Voice: Natural pauses and filler words

Testing

Verify agent behavior before deployment:

Browser Testing

  • Test immediately without phone
  • Real-time logs and debugging
  • Edit and test iteratively

Phone Testing

  • Real phone call experience
  • Test with actual telephony
  • Verify audio quality and latency

Test Call Logs

  • Full conversation transcript
  • Timing information
  • Tool call results
  • Error tracking

Global Nodes

Nodes accessible from anywhere in the flow (Flow Agents only).

Use cases:
  • “Speak to operator” - emergency escalation
  • “Repeat menu” - accessibility
  • “Return to start” - navigation reset
Trigger types:
  • Natural Language: User expresses intent in speech
  • DTMF: User presses specific keypad button
Limitations:
  • Cannot access extracted variables (only system & custom vars)
  • Not available for start nodes or web_tool nodes

Webhooks

Receive real-time notifications about call events.

Events:
  • Call started
  • Call ended
  • Agent response generated
  • Tool executed
  • Error occurred
Authentication:
  • Bearer token
  • Custom headers
  • HTTPS required
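A receiver sketch, assuming the bearer token arrives in the `Authorization` header and the JSON body carries an `event` field. The event name and payload shape are assumptions; wire this logic into your HTTPS endpoint of choice.

```python
import json

EXPECTED_TOKEN = "YOUR_WEBHOOK_SECRET"

def handle_webhook(headers, body):
    """Return (status_code, parsed_event) for an incoming webhook delivery."""
    if headers.get("Authorization") != f"Bearer {EXPECTED_TOKEN}":
        return 401, None           # reject deliveries with a bad/missing token
    event = json.loads(body)
    if event.get("event") == "call.ended":
        print(f"Call {event.get('call_id')} ended")   # replace with real handling
    return 200, event

status, event = handle_webhook(
    {"Authorization": "Bearer YOUR_WEBHOOK_SECRET"},
    '{"event": "call.ended", "call_id": "abc123"}',
)
```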

Outcome Parameters

Define a structured data schema for the information collected during calls.

Use cases:
  • Lead qualification data
  • Survey responses
  • Appointment details
  • Customer feedback
Format: JSON Schema with validation
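An illustrative outcome-parameter schema in JSON Schema form, here for lead qualification. The field names are example data, not a required layout, and the checker below is a naive stand-in for a real JSON Schema validator.

```python
schema = {
    "type": "object",
    "properties": {
        "qualified": {"type": "boolean"},
        "budget": {"type": "number", "minimum": 0},
        "callback_date": {"type": "string", "format": "date"},
    },
    "required": ["qualified"],
}

def check_required(record, schema):
    """Naive stand-in for a JSON Schema validator: checks required keys only."""
    return all(key in record for key in schema.get("required", []))

ok = check_required({"qualified": True, "budget": 5000.0}, schema)
```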

Best Practices

Prompt Engineering

  • Be specific and concrete
  • Use examples for complex behaviors
  • Test edge cases thoroughly
  • Iterate based on real conversations

Flow Design

  • Keep nodes focused (one purpose each)
  • Provide fallback paths
  • Test all branches
  • Use descriptive names

Testing

  • Test in browser first
  • Then test via phone
  • Test edge cases and errors
  • Get feedback from real users

Glossary

  • Agent: AI-powered voice assistant that handles phone calls
  • Preamble: Core instructions for single prompt agents
  • Node: Building block in flow agents representing a step
  • Transition: Connection between nodes defining conversation flow
  • Variable: Data stored and passed between conversation steps
  • DTMF: Phone keypad tones (0-9, *, #)
  • LLM: Large Language Model (GPT-4.1, Gemini, etc.)
  • TTS: Text-to-Speech synthesis
  • STT: Speech-to-Text recognition
  • IVR: Interactive Voice Response (menu system)
  • RAG: Retrieval-Augmented Generation (knowledge base search)

Next Steps