Data Collection Guide

Overview

Effective data collection is essential for voice agents that need to gather information from callers. Hamsa provides multiple methods to collect, validate, and use data throughout conversations, from simple name collection to complex multi-field forms.

Data Collection Methods:

Natural Language Extraction - AI extracts data from spoken conversation
DTMF Input Capture - Collect digits via keypad
Guided Prompting - Guide users to provide specific information
Variables - Store and reference collected data

Collection Methods

1. Natural Language Extraction

The most natural method - AI extracts information from conversation.

How It Works:

User speaks naturally
AI identifies and extracts specific data
Data stored in variables
Available for use throughout the flow

Example: Collecting Customer Information

Conversation Node: Collect_Info
  Message: "I'll need a few details to help you.
           What's your name and phone number?"

  Variable Extraction:
    - Variable: customer_name
      Instructions: "Extract the customer's full name"

    - Variable: phone_number
      Instructions: "Extract phone number in format XXX-XXX-XXXX"

User Response: "My name is John Smith and my number is 555-123-4567"

Extracted:
  customer_name: "John Smith"
  phone_number: "555-123-4567"

Best For:

Names, addresses, email addresses
Dates and times (flexible formats)
Free-form descriptions
Complex multi-field responses
Natural conversation flow

Advantages:

Natural user experience
Flexible input formats
Handles variations well
No learning curve for users

Disadvantages:

Potential transcription errors
Format inconsistencies
Requires validation
May need clarification

2. DTMF Input Capture

Collect precise numeric data via keypad.

How It Works:

Agent prompts for numeric input
User enters digits on keypad
System captures key presses
Stores in variable

Example: Account Number Collection

Conversation Node: Get_Account
  Message: "Please enter your 10-digit account number,
           followed by the pound key."

  DTMF Input Capture:
    Enabled: true
    Variable: account_number
    Digit Limit: 10
    Termination Key: #
    Timeout: 15 seconds

User Input: 1-2-3-4-5-6-7-8-9-0-#

Result:
  account_number: "1234567890"
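
The capture rules above (digit limit and termination key) can be sketched in Python. This is only an illustration of the stopping logic, not Hamsa's implementation: the function name capture_dtmf and its parameters are hypothetical, and the timeout is left out because it depends on real-time input.

```python
def capture_dtmf(key_presses, digit_limit=None, termination_key="#"):
    """Collect digits until the termination key or the digit limit is reached."""
    digits = []
    for key in key_presses:
        if key == termination_key:
            break  # the caller signalled the end of input
        if key.isdigit():
            digits.append(key)
        if digit_limit is not None and len(digits) >= digit_limit:
            break  # enough digits have arrived, stop listening
    return "".join(digits)


account_number = capture_dtmf("1234567890#", digit_limit=10)
print(account_number)  # 1234567890
```

Either completion condition ends the capture, which is why a node with only a digit limit still completes without the pound key.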

Best For:

Account numbers
Phone numbers
ZIP codes
PINs and numeric passcodes
Social security numbers (last 4 digits)
Confirmation codes
Numeric IDs

Advantages:

Captures exactly the digits keyed in (no transcription errors)
Works in noisy environments
Familiar to users
Secure for sensitive data

Disadvantages:

Numbers only (0-9)
Slower than speaking
Awkward for callers on hands-free devices
Not accessible to all users

3. Guided Prompting

Ask specific questions to collect structured data.

How It Works:

Ask focused, single questions
Extract one piece of information
Confirm understanding
Move to next question

Example: Appointment Scheduling

Node 1: Get_Date
  Message: "What date would you like to schedule?
           For example, January 15th."

  Extract: appointment_date

Node 2: Confirm_Date
  Message: "Got it, {{appointment_date}}.
           And what time works best for you?"

  Extract: appointment_time

Node 3: Verify_All
  Message: "Perfect! I have you scheduled for {{appointment_date}}
           at {{appointment_time}}. Is that correct?"

  Extract: confirmation (yes/no)

Best For:

Multi-step forms
Complex data collection
Situations requiring validation
When precision matters

Advantages:

Clear expectations
Easy to validate
Reduces errors
Good user experience

Disadvantages:

Takes more time
Multiple conversational turns
Can feel rigid
Requires good flow design

Variable System

Defining Variables

Choose Variable Type

Extracted Variables: Collected during conversation
Custom Variables: Passed via API when call starts
System Variables: Built-in (time, caller ID, etc.)

Name Your Variable

Use snake_case format:

customer_name
phone_number
appointment_date
order_number

Configure Extraction

Provide clear extraction instructions:

What to extract
Expected format
Examples if helpful

Use Throughout Flow

Reference variable anywhere:

Prompts: {{customer_name}}
Tool parameters
Routing conditions
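
As a rough illustration of how {{variable}} references resolve, the sketch below substitutes values into a prompt. The render function is hypothetical, and leaving unknown names untouched is an assumption: this guide does not say how Hamsa treats a variable that has no value.

```python
import re

# Matches {{name}}, tolerating spaces inside the braces
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, variables):
    """Replace each {{name}} with its value; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return PLACEHOLDER.sub(substitute, template)


message = render(
    "Got it, {{appointment_date}}. And what time works best for you?",
    {"appointment_date": "January 15th"},
)
print(message)
```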

Variable Naming Best Practices

Good Names:

✓ customer_name
✓ email_address
✓ appointment_date
✓ order_number
✓ shipping_address
✓ phone_number
✓ account_balance

Bad Names:

✗ name              (too vague)
✗ customerName      (use snake_case, not camelCase)
✗ customer-name     (no hyphens)
✗ Customer Name     (no spaces or capitals)
✗ var1              (not descriptive)
✗ temp              (unclear purpose)
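
The format rules above can be checked mechanically. The following is a minimal sketch with a hypothetical helper, is_snake_case. It only catches format problems (camelCase, hyphens, spaces, capitals); vague names such as var1 or temp still pass and need human review.

```python
import re

# Lowercase words made of letters and digits, joined by single underscores
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def is_snake_case(name):
    """Return True if the whole name follows snake_case."""
    return SNAKE_CASE.fullmatch(name) is not None


for candidate in ["customer_name", "customerName", "customer-name", "Customer Name"]:
    print(candidate, is_snake_case(candidate))
```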

Extraction Instructions

Clear Instructions:

✓ "Extract the customer's full name"
✓ "Extract email address in format name@example.com"
✓ "Extract appointment date in MM/DD/YYYY format"
✓ "Extract order number (starts with ORD-)"

✗ "Get the name"
✗ "Extract email"
✗ "Get the date"

With Examples:

Variable: phone_number
Instructions: "Extract 10-digit phone number.
              Examples: 555-123-4567, (555) 123-4567, 5551234567.
              Store in format: XXX-XXX-XXXX"

Variable: appointment_date
Instructions: "Extract date mentioned by caller.
              Examples: 'next Tuesday', 'January 15th', '1/15/2024'.
              Convert to YYYY-MM-DD format."
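
Extraction instructions ask the AI to store a fixed format, but it can be worth normalizing again on your side. The sketch below shows one way to bring the phone examples above into XXX-XXX-XXXX. The helper normalize_phone is hypothetical and assumes 10-digit numbers.

```python
import re


def normalize_phone(raw):
    """Return the number as XXX-XXX-XXXX, or None if it is not 10 digits."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, and parentheses
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


for example in ["555-123-4567", "(555) 123-4567", "5551234567"]:
    print(normalize_phone(example))
```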

Complete Collection Workflows

Example 1: Customer Registration

Collect comprehensive customer information.

Node 1: Welcome
  Message: "I'll help you create an account. This will just take a minute."

Node 2: Collect_Name
  Message: "First, what's your full name?"

  Extract Variables:
    - customer_name: "Extract full name (first and last)"

Node 3: Confirm_Name
  Message: "Thank you, {{customer_name}}. What's the best email
           address to reach you?"

  Extract Variables:
    - email_address: "Extract email in format name@example.com"

Node 4: Collect_Phone
  Message: "Great. And what's your phone number?"

  Extract Variables:
    - phone_number: "Extract 10-digit phone number"

Node 5: Verify_Information
  Message: "Let me confirm your information:
           Name: {{customer_name}}
           Email: {{email_address}}
           Phone: {{phone_number}}

           Is everything correct?"

  Extract Variables:
    - confirmation: "Extract yes/no confirmation"

  Transitions:
    - confirmation == "yes" → Create_Account
    - confirmation == "no" → What_To_Change

Node 6: Create_Account (Tool)
  Tool: create_customer_account
  Parameters:
    name: {{customer_name}}
    email: {{email_address}}
    phone: {{phone_number}}
    source: "phone"
    timestamp: {{current_datetime}}

Node 7: Success
  Message: "Your account is all set, {{customer_name}}!
           You'll receive a confirmation email at {{email_address}}."

Example 2: Secure Authentication

Collect sensitive information securely.

Node 1: Request_Account
  Message: "For security, I'll need to verify your account.
           Please enter your account number using your keypad,
           followed by the pound key."

  DTMF Input Capture:
    Variable: account_number
    Digit Limit: 10
    Termination Key: #

Node 2: Request_PIN
  Message: "Thank you. Now please enter your 4-digit PIN."

  DTMF Input Capture:
    Variable: pin_code
    Digit Limit: 4
    Timeout: 15s

Node 3: Verify_Credentials (Tool)
  Tool: verify_account
  Parameters:
    account: {{account_number}}
    pin: {{pin_code}}
    call_id: {{call_id}}

  Transitions:
    - API returns success → Authenticated_Menu
    - API returns failure → Retry_Authentication
    - After 3 failures → Transfer_Security

Node 4: Retry_Authentication
  Message: "I couldn't verify those credentials.
           Let's try again. Please enter your account number."

Node 5: Authenticated_Menu
  # Assumes customer_name was populated earlier, for example from the verify_account response
  Message: "Thank you for verifying your identity, {{customer_name}}.
           How can I help you today?"
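
The three transitions on Verify_Credentials amount to a small decision rule. A minimal sketch follows, assuming the flow keeps a count of failed attempts. The function next_node and the constant MAX_FAILURES are illustrative names, not part of Hamsa.

```python
MAX_FAILURES = 3


def next_node(verified, failures):
    """Pick the next node after a verify_account result.

    `failures` counts failed attempts so far, including the current one.
    """
    if verified:
        return "Authenticated_Menu"
    if failures >= MAX_FAILURES:
        return "Transfer_Security"
    return "Retry_Authentication"


print(next_node(False, 1))  # Retry_Authentication
print(next_node(False, 3))  # Transfer_Security
```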

Example 3: Hybrid Collection (DTMF + NL)

Combine DTMF and natural language for optimal UX.

Node 1: Collect_ZIP
  Message: "What's your ZIP code? You can say it or enter it
           on your keypad, followed by pound."

  DTMF Input Capture:
    Variable: zip_code
    Digit Limit: 5
    Termination Key: #

  Extract Variables:
    - zip_code: "Extract 5-digit ZIP code if spoken"

  # Either method populates zip_code variable

Node 2: Collect_Date
  Message: "What date would you like? You can say something like
           'next Tuesday' or 'January 15th'."

  Extract Variables:
    - appointment_date: "Extract date, convert to YYYY-MM-DD"

Node 3: Confirm_Details
  Message: "I have ZIP code {{zip_code}} and date {{appointment_date}}.
           Is that right?"
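
When both methods write to the same variable, it helps to think about precedence. The sketch below prefers keypad input and falls back to a spoken ZIP code. This ordering is an assumption for illustration (the guide only says either method populates zip_code), and resolve_zip is a hypothetical helper.

```python
import re


def resolve_zip(dtmf_digits, transcript):
    """Prefer a complete keypad entry; otherwise look for 5 digits in speech."""
    if dtmf_digits and len(dtmf_digits) == 5 and dtmf_digits.isdigit():
        return dtmf_digits
    # Speech-to-text may separate digits with spaces or hyphens: "9 0 2 1 0"
    joined = re.sub(r"(?<=\d)[\s-]+(?=\d)", "", transcript or "")
    match = re.search(r"(?<!\d)\d{5}(?!\d)", joined)
    return match.group(0) if match else None


print(resolve_zip("", "it's 9 0 2 1 0"))
```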

Example 4: Survey Data Collection

Structured survey with validation.

Survey Flow:

Node 1: Introduction
  Message: "This quick survey takes about 2 minutes.
           Your feedback helps us improve."

Node 2: Question_1
  Message: "On a scale of 1 to 5, with 5 being very satisfied,
           how satisfied are you with our service?
           You can say the number or press it on your keypad."

  DTMF Input Capture:
    Variable: satisfaction_score
    Digit Limit: 1

  Extract Variables:
    - satisfaction_score: "Extract number 1-5"

  Validation:
    - satisfaction_score must be 1-5

Node 3: Question_2
  Message: "Would you recommend us to a friend? Yes or no?"

  Extract Variables:
    - would_recommend: "Extract yes or no"

Node 4: Question_3 (Conditional)
  Condition: satisfaction_score < 3

  Message: "I'm sorry to hear that. Can you tell me what we could
           improve?"

  Extract Variables:
    - improvement_feedback: "Extract detailed feedback"

Node 5: Submit_Survey (Tool)
  Tool: submit_survey_results
  Parameters:
    satisfaction: {{satisfaction_score}}
    recommend: {{would_recommend}}
    feedback: {{improvement_feedback}}
    caller: {{user_number}}
    date: {{current_date}}

Node 6: Thank_You
  Message: "Thank you for your feedback, we really appreciate it!"
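
The validation rule on Question_1 and the condition on Question_3 can be expressed as two small functions. This is a sketch with hypothetical names. It assumes the score arrives as text, whether spoken or keyed.

```python
VALID_SCORES = {"1", "2", "3", "4", "5"}


def parse_score(raw):
    """Return the score as an int if it is a whole number from 1 to 5, else None."""
    text = str(raw).strip()
    return int(text) if text in VALID_SCORES else None


def node_after_question_2(score):
    """Question_3 is asked only when satisfaction_score < 3."""
    return "Question_3" if score < 3 else "Submit_Survey"


score = parse_score("2")
print(score, node_after_question_2(score))  # 2 Question_3
```

When Question_3 is skipped, improvement_feedback is never set, so the submit_survey_results tool should tolerate an empty feedback value.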

Validation Strategies

Format Validation

Ensure data meets expected format.

Email Validation:

Node: Collect_Email
  Message: "What's your email address?"

  Extract Variables:
    - email_address: "Extract email in format name@example.com"

Validation Node:
  Condition: email_address contains "@" AND email_address contains "."

  If valid → Continue
  If invalid → "That doesn't look like a valid email. Could you
                spell it out for me? For example, john at example dot com."
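
For illustration, the sketch below converts a spelled-out address such as "john at example dot com" and then applies the same check as the validation node. Both helpers are hypothetical. The check is deliberately as loose as the node condition, so it will accept some strings that are not real addresses.

```python
import re


def spoken_to_email(spoken):
    """Turn 'john at example dot com' into 'john@example.com'."""
    text = spoken.strip().lower()
    text = re.sub(r"\s+at\s+", "@", text)
    text = re.sub(r"\s+dot\s+", ".", text)
    return text.replace(" ", "")


def looks_like_email(value):
    """Mirror the node condition: contains '@' AND contains '.'."""
    return "@" in value and "." in value


email = spoken_to_email("john at example dot com")
print(email, looks_like_email(email))
```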

Phone Number Validation:

Validation Logic:
  - Length: Must be 10 digits
  - Format: (XXX) XXX-XXXX or XXX-XXX-XXXX or XXXXXXXXXX
  - Area code: First digit cannot be 0 or 1

Error Message: "I need a 10-digit phone number. For example, 555-123-4567.
  What's your phone number?"
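
The length and area-code rules above fit in a few lines. A minimal sketch, assuming 10-digit numbers as in the examples. The helper is_valid_phone is hypothetical.

```python
import re


def is_valid_phone(raw):
    """Check for exactly 10 digits with an area code not starting in 0 or 1."""
    digits = re.sub(r"\D", "", raw)  # accept any of the listed formats
    return len(digits) == 10 and digits[0] not in "01"


print(is_valid_phone("(555) 123-4567"))  # True
print(is_valid_phone("155-123-4567"))    # False
```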

Date Validation:

Validation Logic:
  - Must be future date (for appointments)
  - Must be valid calendar date
  - Must be within acceptable range

Error Message:
  "That date doesn't work. I can schedule appointments up to 6 months
  out. What date would you like?"
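
The three date rules can be sketched with the standard library. This example assumes the date was already extracted in YYYY-MM-DD format and approximates "6 months" as 183 days. Both the helper name and that constant are assumptions for illustration.

```python
from datetime import date, timedelta

MAX_DAYS_AHEAD = 183  # assumption: "6 months" approximated as 183 days


def validate_appointment_date(text, today):
    """Return (ok, reason) for a date string in YYYY-MM-DD format."""
    try:
        requested = date.fromisoformat(text)
    except ValueError:
        return False, "not a valid calendar date"
    if requested <= today:
        return False, "must be a future date"
    if requested > today + timedelta(days=MAX_DAYS_AHEAD):
        return False, "outside the booking window"
    return True, "ok"


print(validate_appointment_date("2024-01-15", today=date(2024, 1, 10)))
```

Passing today as a parameter keeps the check easy to test. In production it would typically come from the call's current date.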

Range Validation

Ensure values fall within acceptable ranges.

Node: Collect_Age
  Message: "For verification, how old are you?"

  Extract Variables:
    - age: "Extract age as number"

Validation:
  - age >= 18 AND age <= 120 → Valid
  - age < 18 → "I'm sorry, you must be 18 or older."
  - age > 120 → "That doesn't seem right. What's your age?"

Node: Collect_Quantity
  Message: "How many would you like to order?"

  Extract Variables:
    - quantity: "Extract number"

Validation:
  - quantity >= 1 AND quantity <= 100 → Valid
  - quantity < 1 → "I need at least 1 item."
  - quantity > 100 → "For orders over 100, please contact our
                       sales team directly."
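
A range check like the quantity rule above returns either "valid" or a message to speak. One possible shape, using a hypothetical check_quantity helper that returns None when the value is acceptable:

```python
def check_quantity(quantity, minimum=1, maximum=100):
    """Return an error message for out-of-range values, or None if valid."""
    if quantity < minimum:
        return "I need at least 1 item."
    if quantity > maximum:
        return "For orders over 100, please contact our sales team directly."
    return None  # None means the value is within range


print(check_quantity(0))
print(check_quantity(25))
```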

Existence Validation

Verify data exists in system.

Node: Collect_Order_Number
  Message: "What's your order number?"

  Extract Variables:
    - order_number: "Extract order number (format ORD-XXXXX)"

Validation Tool:
  Tool: check_order_exists
  Parameters:
    order_number: {{order_number}}

  Transitions:
    - Order found → Display_Order_Info
    - Order not found → "I couldn't find that order number.
                         Can you double-check and try again?"
    - After 3 attempts → "Let me transfer you to customer service."
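
A cheap format check before the lookup avoids calling the tool with obviously malformed input. The sketch below assumes ORD-XXXXX means five digits and uses an in-memory set as a stand-in for the real order system. KNOWN_ORDERS and check_order are hypothetical.

```python
import re

ORDER_FORMAT = re.compile(r"ORD-\d{5}")  # assumption: XXXXX is five digits
KNOWN_ORDERS = {"ORD-12345", "ORD-67890"}  # hypothetical stand-in for check_order_exists


def check_order(order_number, attempts):
    """Return the next step after `attempts` tries, including this one."""
    normalized = order_number.strip().upper()
    if ORDER_FORMAT.fullmatch(normalized) and normalized in KNOWN_ORDERS:
        return "Display_Order_Info"
    if attempts >= 3:
        return "transfer_to_customer_service"
    return "ask_again"


print(check_order("ord-12345", attempts=1))  # Display_Order_Info
```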

Handling Collection Errors

Transcription Errors

Speech recognition isn’t perfect.

Strategy: Confirmation

Node: Collect_Email
  Message: "What's your email address?"

  Extract: email_address

Node: Confirm_Email
  Message: "I heard {{email_address}}. Is that correct?"

  Extract: confirmation

  Transitions:
    - "yes" → Continue
    - "no" → "Let's try again. Can you spell it out?
               For example, J-O-H-N at G-M-A-I-L dot com."

Strategy: Phonetic Spelling

After Error: "I'm having trouble hearing that. Let me try a different way.
  Can you spell your email letter by letter?
  For example, J for John, O for Oscar, H for Hotel..."

Extract as: Phonetic sequence
Convert to: email_address
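
Converting a phonetic sequence back to text is mostly a matter of keeping the leading letter of each item. A rough sketch, assuming callers use the "X for Word" pattern from the prompt above. The helper decode_phonetic is hypothetical and ignores anything that does not fit that pattern.

```python
import re


def decode_phonetic(spoken):
    """Keep the leading letter of every 'X for Word' item, in order."""
    letters = re.findall(r"\b([A-Za-z]) (?:for|as in) [A-Za-z]+", spoken)
    return "".join(letters).lower()


print(decode_phonetic("J for John, O for Oscar, H for Hotel, N for November"))
```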

Ambiguous Input

User provides unclear information.

Node: Collect_Date
  Message: "What date would you like?"

User: "Soon"

Problem: Too vague

Response: "I can schedule appointments starting tomorrow through the
          next 6 months. What specific date works for you?
          For example, next Monday, or January 15th?"

User: "Monday"

Problem: Which Monday?

Response: "Did you mean Monday, January 15th, or Monday, January 22nd?"
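
To offer concrete choices for an answer like "Monday", the flow needs the next matching dates. A minimal sketch using the standard library follows. The helper upcoming_weekdays is hypothetical and counts weekdays as Python does (Monday is 0).

```python
from datetime import date, timedelta


def upcoming_weekdays(weekday, today, count=2):
    """Return the next `count` dates on `weekday` (Monday=0), strictly after today."""
    days_ahead = (weekday - today.weekday() - 1) % 7 + 1
    first = today + timedelta(days=days_ahead)
    return [first + timedelta(weeks=i) for i in range(count)]


# Wednesday, January 10th, 2024 -> the following two Mondays
for option in upcoming_weekdays(0, today=date(2024, 1, 10)):
    print(option.isoformat())
```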

Incomplete Information

User doesn’t provide all needed data.

Node: Collect_Address
  Message: "What's your street address?"

User: "123 Main Street"

Problem: No city, state, ZIP

Solution: Follow-up questions
  "And what city is that in?"
  "What's the ZIP code?"

Or: Structured prompting
  "I'll need your complete address. What's the street address?"
  [collect]
  "And the city?"
  [collect]
  "State?"
  [collect]
  "ZIP code?"

Advanced Techniques

Multi-Slot Extraction

Extract multiple fields from one response.

Node: Collect_All_At_Once
  Message: "To send you a quote, I'll need your email and phone number."

  Extract Variables:
    - email_address: "Extract email address"
    - phone_number: "Extract phone number"

User: "My email is john@example.com and you can reach me at 555-1234"

Extracted:
  email_address: "john@example.com"
  phone_number: "555-1234"

Validation Node:
  Check both variables:
    - If email exists AND phone exists → Continue
    - If only email → "Great! And your phone number?"
    - If only phone → "Got it. And your email address?"
    - If neither → "Let me ask separately. What's your email?"
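
The validation node's four branches map directly onto a function that returns the follow-up prompt, or nothing when both slots are filled. A sketch with a hypothetical follow_up_prompt helper:

```python
def follow_up_prompt(variables):
    """Return the next question to ask, or None when both slots are filled."""
    has_email = bool(variables.get("email_address"))
    has_phone = bool(variables.get("phone_number"))
    if has_email and has_phone:
        return None  # nothing missing, continue the flow
    if has_email:
        return "Great! And your phone number?"
    if has_phone:
        return "Got it. And your email address?"
    return "Let me ask separately. What's your email?"


print(follow_up_prompt({"email_address": "john@example.com"}))
```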

Conditional Collection

Collect different data based on context.

Router: Account_Type_Check
  Condition: {{account_type}} == "business"
  → Collect business-specific info (EIN, company name)

  Condition: {{account_type}} == "personal"
  → Collect personal info (SSN last 4, DOB)

Business Info Collection:
  - company_name
  - ein_number
  - business_address

Personal Info Collection:
  - ssn_last_four
  - date_of_birth
  - home_address

Progressive Profiling

Collect more data over multiple interactions.

First Call:
  - Collect: name, phone, email
  - Create basic profile

Second Call:
  - Already have: name, phone, email
  - Collect: preferences, interests

Third Call:
  - Already have: basic info, preferences
  - Collect: detailed requirements
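
One way to decide what to ask on a given call is to walk the stages in order and stop at the first one with gaps. The sketch below assumes the stored profile is a simple dictionary. The field names and the fields_for_this_call helper are illustrative.

```python
# Stages in the order they should be collected across calls
STAGES = [
    ["name", "phone", "email"],
    ["preferences", "interests"],
    ["detailed_requirements"],
]


def fields_for_this_call(profile):
    """Return the missing fields of the first incomplete stage."""
    for stage in STAGES:
        missing = [field for field in stage if not profile.get(field)]
        if missing:
            return missing
    return []  # profile is complete


print(fields_for_this_call({"name": "John Smith", "phone": "555-123-4567", "email": "john@example.com"}))
```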

Context-Aware Collection

Use available context to skip collection.

Router: Check_Existing_Data

Condition: {{user_number}} in customer_database
  → Lookup customer data
  → "Welcome back, {{customer_name}}! I have your email as
      {{email_address}}. Is that still correct?"

Condition: {{user_number}} NOT in customer_database
  → "I don't have your information yet. What's your name?"
  → Full collection flow

Data Storage & Usage

Storing Collected Data

During Call:

Variables stored in call context:
  - Available throughout conversation
  - Passed between nodes
  - Used in tools and routing
  - Included in call logs

After Call:

Method 1: Webhooks
  - Send data to your server
  - Store in your database
  - Trigger workflows

Method 2: Outcomes
  - Define outcome schema
  - Automatically extract at call end
  - Retrieve via API

Method 3: Tools
  - Call your API during conversation
  - Store data real-time
  - Return confirmation

Using Collected Data

In Prompts:

'Thank you, {{customer_name}}! Your order {{order_number}}
will be shipped to {{shipping_address}}.'

In Tools:

Tool: create_customer
Parameters:
  name: {{customer_name}}
  email: {{email_address}}
  phone: {{phone_number}}
  source: 'phone_call'
  collected_at: {{current_datetime}}

In Routing:

Router: Priority_Check
  Condition: {{customer_tier}} == "VIP"
  → VIP_Fast_Track

  Condition: {{issue_severity}} == "high"
  → Urgent_Support

  Default → Standard_Support
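
The router above reads naturally as an ordered list of checks with a default. The sketch below assumes conditions are evaluated top to bottom and the first match wins. That ordering is an assumption, since this guide does not state how ties are resolved.

```python
def route(variables):
    """Evaluate conditions in order; the first match wins."""
    if variables.get("customer_tier") == "VIP":
        return "VIP_Fast_Track"
    if variables.get("issue_severity") == "high":
        return "Urgent_Support"
    return "Standard_Support"


print(route({"customer_tier": "VIP", "issue_severity": "high"}))  # VIP_Fast_Track
```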

In Webhooks:

{
  "event": "call.ended",
  "data": {
    "customer_name": "{{customer_name}}",
    "email_address": "{{email_address}}",
    "phone_number": "{{phone_number}}",
    "issue_type": "{{issue_type}}",
    "resolution": "{{resolution_status}}"
  }
}
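
On the receiving side, a handler might parse the payload and check that the expected fields arrived before storing them. A minimal sketch using only the standard library follows. The function parse_call_ended and the choice of required fields are assumptions, and the payload shape is taken from the example above.

```python
import json

REQUIRED_FIELDS = ("customer_name", "email_address", "phone_number")


def parse_call_ended(body):
    """Return the data dict from a call.ended payload, or None for other events."""
    payload = json.loads(body)
    if payload.get("event") != "call.ended":
        return None
    data = payload.get("data", {})
    missing = [field for field in REQUIRED_FIELDS if not data.get(field)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return data


body = json.dumps({
    "event": "call.ended",
    "data": {
        "customer_name": "John Smith",
        "email_address": "john@example.com",
        "phone_number": "555-123-4567",
    },
})
print(parse_call_ended(body)["customer_name"])  # John Smith
```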

Best Practices

Ask One Question at a Time

Don’t overwhelm users with multiple questions

Confirm Critical Data

Always confirm important information such as email addresses and mailing addresses

Use Appropriate Method

Use DTMF for numbers, natural language for free-form text, and guided prompting when precision matters

Validate Early

Check format and validity immediately after collection

Provide Examples

Show users what format you expect

Handle Errors Gracefully

Give clear guidance when collection fails

Skip When Possible

Don’t ask for data you already have

Explain Why

Tell users why you need their information

Troubleshooting

Variable Not Extracting

Check:

Extraction instructions are clear
User actually provided the information
Variable name is correct (snake_case)
Extraction is enabled on the node

Solutions:

Make instructions more specific
Add examples to instructions
Ask more directly for the information

Wrong Data Extracted

Check:

Transcription accuracy
Extraction instructions specificity
User response clarity

Solutions:

Add format specifications to instructions
Confirm what was heard
Use DTMF for critical data

DTMF Not Capturing

Check:

DTMF Input Capture is enabled
Variable name is set
At least one completion condition (termination key or digit limit) is configured
You are testing with an actual phone call (not in the browser)

Solutions:

Enable DTMF Input Capture toggle
Set variable name
Add termination key or digit limit

Next Steps

Variables System

Learn more about the variable system

DTMF Features

Deep dive into DTMF input capture

API Integration

Send collected data to your systems

Call Routing

Route calls based on collected data
