Overview

A well-organized knowledge base improves agent accuracy, reduces hallucinations, and simplifies maintenance.
The practices below are drawn from real-world usage patterns and help your agents give accurate, helpful responses.

Content Quality

Use Clear, Concise Language

Good:
"Our return window is 30 days from purchase date. Items must be in original condition with tags attached. Refunds are processed within 5-7 business days after we receive the return."
Bad:
"Well, um, we have this policy where you can kinda return stuff, you know, within like 30 days or something, but it has to be in good shape and stuff, and then we'll give you your money back, probably within a week or so."
Why it matters:
  • Clear language is easier for the AI to interpret and to extract answers from
  • Concise content reduces token usage
  • Structured information improves retrieval accuracy

Structure Information Logically

Good:
Shipping Times:
- Standard: 5-7 business days
- Express: 2-3 business days
- Overnight: Next business day

Shipping Costs:
- Standard: $5.99
- Express: $12.99
- Overnight: $24.99
Bad:
We ship standard in about 5-7 days usually, and if you want express that's 2-3 days, oh and overnight gets there next day. Standard shipping costs $5.99, express is $12.99, and overnight is $24.99.
Why it matters:
  • Structured content is easier to parse
  • Lists and headings improve readability
  • Clear organization helps AI find relevant information

Include Specific Details

Good:
"Customer service hours: Monday-Friday 9 AM - 6 PM EST. Closed on weekends and holidays."
Bad:
"We're usually available during business hours."
Why it matters:
  • Specific details provide accurate answers
  • Vague information leads to generic responses
  • Precise data reduces follow-up questions

Avoid Ambiguity

Good:
"Free shipping applies to orders over $50. Applies to standard shipping only. Excludes international orders."
Bad:
"Free shipping on some orders."
Why it matters:
  • Clear conditions prevent confusion
  • Ambiguous statements lead to incorrect answers
  • Specific rules help agents make accurate decisions

Naming Conventions

Be Descriptive

Good:
  • “Return Policy - Electronics 2024”
  • “FAQ - Shipping Information”
  • “Product Catalog - Q1 2024”
Bad:
  • “Returns”
  • “Doc1”
  • “Policy”
  • “file_final_v2”
Why it matters:
  • Descriptive names make searching easier
  • Clear names help identify content quickly
  • Consistent naming improves organization

Include Dates for Versioned Content

Good:
  • “Employee Handbook - January 2024”
  • “Product Catalog - Q1 2024”
  • “Return Policy - 2024”
Why it matters:
  • Dates help identify current vs. outdated content
  • Version tracking prevents confusion
  • Dated names make it easy to see what needs updating

Use Consistent Naming

Good:
"FAQ - Shipping"
"FAQ - Returns"
"FAQ - Payments"
"FAQ - Warranty"
Bad:
"Shipping FAQ"
"Returns Information"
"Payment Questions"
"Warranty Details"
Why it matters:
  • Consistent patterns make organization clear
  • Easier to find related items
  • Better for filtering and searching
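A naming pattern like the one above can be checked mechanically. The sketch below is illustrative only: it assumes a hypothetical "Category - Topic" convention, and the regular expression and the function name `follows_convention` are not part of the product.

```python
import re

# Hypothetical convention: "<Category> - <Topic>", e.g. "FAQ - Shipping".
# The category must start with a capital letter; the topic must be non-empty.
NAME_PATTERN = re.compile(r"^[A-Z][\w ]* - \S.*$")


def follows_convention(name: str) -> bool:
    """Return True if the item name matches the "Category - Topic" pattern."""
    return bool(NAME_PATTERN.match(name.strip()))


names = ["FAQ - Shipping", "FAQ - Returns", "Shipping FAQ", "Doc1"]

# Names that should be renamed before they are added to the knowledge base.
nonconforming = [n for n in names if not follows_convention(n)]
```

Running a check like this during team reviews can catch names that drift from the agreed pattern.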

Include Context

Good:
  • “Return Policy - Electronics”
  • “Return Policy - Clothing”
  • “Return Policy - Software”
Why it matters:
  • Context helps agents choose the right information
  • Prevents confusion between similar topics
  • Improves retrieval accuracy

Organization

Separate Items by Topic

Strategy:
  • Create separate items for different topics
  • Don’t combine unrelated information
  • Use clear names to show relationships
Example:
Good Organization:
- "Shipping Policy"
- "Return Policy"
- "Warranty Information"
- "Refund Process"

Bad Organization:
- "All Policies and Information"
Why it matters:
  • Focused items improve retrieval accuracy
  • Easier to update specific topics
  • Better organization reduces maintenance effort

Keep Items Focused

Good:
  • One item per topic
  • Focused, specific content
  • Clear purpose for each item
Bad:
  • Everything in one large item
  • Multiple unrelated topics combined
  • Unclear item purpose
Why it matters:
  • Focused items are easier to retrieve
  • Specific content improves accuracy
  • Easier to maintain and update

Break Down Large Content

Strategy:
  • Split long documents into chapters/sections
  • Create separate items for major topics
  • Keep items under 2,000 words when possible; text items are further limited to 5,000 characters
Example:
Large Manual (100 pages), split into:
- "Product Manual - Chapter 1: Installation"
- "Product Manual - Chapter 2: Configuration"
- "Product Manual - Chapter 3: Troubleshooting"
Why it matters:
  • Smaller items are easier to process
  • Better retrieval accuracy
  • Faster processing times
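One way to apply the 2,000-word guideline is to pack paragraphs into chunks until the limit is reached. This is a minimal sketch under assumptions: paragraphs are separated by blank lines, and the function name `split_into_items` is hypothetical. How the product itself splits content is not described here.

```python
def split_into_items(text: str, max_words: int = 2000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words is kept whole in its own
    chunk, so it still needs to be split by hand.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in text.split("\n\n"):
        n = len(para.split())
        if n == 0:
            continue  # skip empty paragraphs
        if current and count + n > max_words:
            # Adding this paragraph would exceed the limit: close the chunk.
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Splitting on paragraph boundaries rather than at a fixed word offset keeps each item readable on its own. Splitting at chapter or section headings, as in the example above, is usually better still.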

Update Regularly

Schedule:
  • Review knowledge base quarterly
  • Remove outdated items
  • Update changed information
  • Archive old versions
Why it matters:
  • Outdated information leads to incorrect answers
  • Regular updates maintain accuracy
  • Clean knowledge base improves performance

Content Size Guidelines

Text Items

Recommended:
  • Optimal: 100-2,000 characters
  • Maximum: 5,000 characters
  • Minimum: 50 characters
Strategy:
  • Break long content into multiple items
  • Keep each item focused on one topic
  • Use multiple items for comprehensive coverage
Example:
Instead of one 5,000-character item:
- Create 5 items of ~1,000 characters each
- Each focused on a specific subtopic
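The character limits above can be checked before a text item is created. The sketch below uses the documented thresholds (50, 100, 2,000, and 5,000 characters). The function name and the returned labels are illustrative assumptions.

```python
MIN_CHARS = 50          # documented minimum
OPTIMAL_MIN_CHARS = 100  # lower bound of the optimal range
OPTIMAL_MAX_CHARS = 2000  # upper bound of the optimal range
MAX_CHARS = 5000        # documented maximum


def classify_text_length(text: str) -> str:
    """Classify a text item by length against the documented limits."""
    n = len(text)
    if n < MIN_CHARS:
        return "too_short"
    if n > MAX_CHARS:
        return "too_long"
    if OPTIMAL_MIN_CHARS <= n <= OPTIMAL_MAX_CHARS:
        return "optimal"
    return "acceptable"  # within limits, but outside the optimal range
```

Items classified as "too_long" are candidates for splitting into several focused items.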

File Items

Recommended:
  • Optimal: Under 5MB per file
  • Maximum: 21MB per file
  • Strategy: Split large documents into chapters
Best Practices:
  • Use well-formatted documents
  • Ensure text is extractable (not just images)
  • Avoid complex layouts
  • Include clear headings and structure
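File sizes can be checked locally before upload. This sketch assumes the limits are binary megabytes (1 MB = 1,048,576 bytes), which this guide does not specify. The function names and labels are hypothetical.

```python
import os

MB = 1024 * 1024          # assumption: limits are measured in binary megabytes
OPTIMAL_BYTES = 5 * MB    # documented optimal: under 5MB
MAX_BYTES = 21 * MB       # documented maximum: 21MB


def classify_file_size(size_bytes: int) -> str:
    """Classify a file size against the documented limits."""
    if size_bytes > MAX_BYTES:
        return "too_large"
    if size_bytes >= OPTIMAL_BYTES:
        return "allowed_but_large"
    return "optimal"


def classify_file(path: str) -> str:
    """Classify a file on disk by its size."""
    return classify_file_size(os.path.getsize(path))
```

Files reported as "too_large" need to be split into chapters or compressed before they can be added.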

URL Items

Recommended:
  • Optimal: Focused pages over entire websites
  • Maximum: 100 URLs per item
  • Strategy: Group related URLs together
Best Practices:
  • Choose content-heavy pages
  • Avoid pages with excessive navigation/ads
  • Use sitemaps for large-scale imports
  • Monitor failed URLs regularly

Content Types by Use Case

Customer Support

Best Content Types:
  • Text items for FAQs
  • File items for policy documents
  • URL items for support articles
Organization:
  • Group by topic (shipping, returns, payments)
  • Use consistent naming
  • Keep content current

Product Information

Best Content Types:
  • File items for product manuals
  • URL items for product pages
  • Text items for specifications
Organization:
  • One item per product or product line
  • Include version numbers
  • Update when products change

Technical Documentation

Best Content Types:
  • File items for technical manuals
  • URL items for API documentation
  • Text items for quick references
Organization:
  • Organize by topic or feature
  • Include version information
  • Link related documentation

Company Policies

Best Content Types:
  • Text items for key policies
  • File items for comprehensive handbooks
  • URL items for policy pages
Organization:
  • One policy per item
  • Include effective dates
  • Archive old versions

Maintenance Practices

Regular Reviews

Weekly:
  • Check for failed items
  • Review processing status
  • Monitor storage usage
Monthly:
  • Review unused items
  • Update outdated content
  • Clean up duplicates
Quarterly:
  • Comprehensive audit
  • Remove obsolete items
  • Optimize organization
  • Review naming conventions
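The monthly duplicate cleanup can be partly automated when item contents are available as text. The sketch below assumes a hypothetical export of items as a name-to-content mapping. It treats two items as duplicates when their content is identical after lowercasing and collapsing whitespace, so it will not catch near-duplicates with different wording.

```python
import hashlib
from collections import defaultdict


def find_duplicates(items: dict[str, str]) -> list[list[str]]:
    """Group item names whose normalized content is identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, content in items.items():
        # Normalize case and whitespace so trivial differences are ignored.
        normalized = " ".join(content.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(name)
    return [names for names in groups.values() if len(names) > 1]


items = {
    "FAQ - Returns": "Returns accepted within 30 days.",
    "Returns": "returns  accepted within 30 days.",
    "FAQ - Shipping": "Standard: 5-7 business days",
}
duplicate_groups = find_duplicates(items)
```

Each returned group lists items with the same content; keep the one with the most descriptive name and remove or archive the rest.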

Storage Management

Monitor Usage:
  • Check storage regularly
  • Delete unused items
  • Compress large files
  • Upgrade plan if needed
Optimization:
  • Remove failed items that can’t be fixed
  • Archive old items instead of deleting
  • Use inactive status for temporary content

Quality Assurance

Test Content:
  • Verify items process successfully
  • Test retrieval in agents
  • Check for accuracy
  • Review error messages
Monitor Performance:
  • Track which items are used
  • Identify unused items
  • Review agent responses
  • Update based on feedback

Error Prevention

Before Creating Items

Checklist:
  • ✅ Verify files open correctly
  • ✅ Check file sizes (max 21MB)
  • ✅ Ensure URLs are accessible
  • ✅ Remove password protection
  • ✅ Verify text length (50-5,000 chars)
  • ✅ Check storage quota

During Creation

Best Practices:
  • Use descriptive names
  • Validate content before saving
  • Monitor processing status
  • Review error messages immediately

After Creation

Verification:
  • Confirm PROCESSED status
  • Review item details
  • Test in agent if needed
  • Fix any errors promptly

URL Management

Single URLs

Best Practices:
  • Use focused, content-heavy pages
  • Avoid pages with lots of navigation
  • Choose pages that are likely to remain stable
  • Test URLs before adding

Sitemap Imports

Best Practices:
  • Use for large-scale imports
  • Review fetched URLs before adding
  • Group related URLs together
  • Monitor processing status
Limitations:
  • Maximum 100 URLs per item
  • Some sites may block scraping
  • Large sitemaps take time
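When a sitemap lists more than 100 URLs, the list has to be divided across several items. The sketch below is an illustration under assumptions: it parses a standard sitemap (sitemaps.org 0.9 namespace) with Python's standard library and batches the URLs locally. The function names are hypothetical and this is not how the product's own sitemap import works internally.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS_PER_ITEM = 100  # documented limit


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Extract every <loc> URL from a sitemap <urlset> document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def batch_urls(urls: list[str], batch_size: int = MAX_URLS_PER_ITEM) -> list[list[str]]:
    """Drop duplicate URLs (keeping order) and split into batches."""
    unique = list(dict.fromkeys(urls))
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]
```

Batching in sitemap order keeps neighboring pages together, which often approximates grouping related URLs. For stricter grouping, sort or filter the URLs by path prefix before batching.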

URL Maintenance

Regular Tasks:
  • Monitor failed URLs
  • Remove inaccessible URLs
  • Update changed URLs
  • Test URLs periodically
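A periodic URL check can be scripted outside the product. This is a minimal sketch using Python's standard library; the function names are hypothetical. It sends a HEAD request, which some servers reject even when the page is reachable, so treat a failure as a prompt to check manually rather than as proof the URL is dead.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def is_accessible(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to the URL succeeds."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors, malformed URLs, DNS failures, and timeouts all land here.
        return False


def find_failed_urls(
    urls: Iterable[str],
    check: Callable[[str], bool] = is_accessible,
) -> list[str]:
    """Return the URLs for which the check fails, in input order."""
    return [url for url in urls if not check(url)]
```

The `check` parameter lets you substitute a different test, for example one that issues a GET request for servers that do not support HEAD.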

Performance Optimization

Retrieval Accuracy

Improve Accuracy:
  • Use focused, specific items
  • Include relevant keywords in names
  • Structure content logically
  • Keep items appropriately sized

Processing Speed

Optimize Processing:
  • Keep files under 5MB when possible
  • Use well-formatted documents
  • Avoid complex layouts
  • Split large content into smaller items

Storage Efficiency

Use Storage Efficiently:
  • Delete unused items
  • Remove failed items
  • Compress files before uploading
  • Use inactive status for temporary content

Common Mistakes to Avoid

❌ Too Much Content in One Item

Problem:
  • Large items are harder to retrieve accurately
  • Slower processing times
  • Difficult to maintain
Solution:
  • Break into multiple focused items
  • Keep items under 2,000 words
  • One topic per item

❌ Vague or Ambiguous Content

Problem:
  • Leads to incorrect answers
  • Confuses agents
  • Requires follow-up questions
Solution:
  • Use specific, clear language
  • Include exact details
  • Avoid ambiguity

❌ Poor Naming

Problem:
  • Hard to find items
  • Difficult to organize
  • Confusing for team members
Solution:
  • Use descriptive names
  • Follow consistent patterns
  • Include context

❌ Outdated Content

Problem:
  • Incorrect information
  • Confusing responses
  • Poor user experience
Solution:
  • Regular reviews
  • Update promptly
  • Archive old versions

❌ Ignoring Failed Items

Problem:
  • Clutters knowledge base
  • Wastes storage
  • Confusing status
Solution:
  • Review failed items weekly
  • Fix or delete promptly
  • Monitor for patterns

Advanced Tips

Content Versioning

Strategy:
  • Include dates in names
  • Archive old versions
  • Keep current version active
  • Document changes
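If dates are part of item names, the current version and the archive candidates can be separated by name alone. The sketch below is a simplified illustration: it assumes each versioned name contains a four-digit year, compares by year only (so "Q1 2024" and "Q2 2024" are treated as different documents, not as versions of one another), and uses a hypothetical function name.

```python
import re
from collections import defaultdict

YEAR = re.compile(r"\b(20\d{2})\b")


def split_current_and_archive(names: list[str]) -> tuple[list[str], list[str]]:
    """Return (current, archive) name lists, keeping the newest year per document."""
    by_base: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for name in names:
        match = YEAR.search(name)
        if not match:
            continue  # undated names are left for manual review
        # The name without its year identifies the document.
        base = YEAR.sub("", name).strip(" -")
        by_base[base].append((int(match.group(1)), name))

    current: list[str] = []
    archive: list[str] = []
    for versions in by_base.values():
        versions.sort(reverse=True)  # newest year first
        current.append(versions[0][1])
        archive.extend(name for _, name in versions[1:])
    return current, archive
```

Names in the archive list are candidates for archiving or for inactive status, not for deletion.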

Multi-Language Support

Strategy:
  • Create separate items per language
  • Include the language in the item name
  • Organize by language
  • Keep translations synchronized

Seasonal Content

Strategy:
  • Use inactive status for off-season
  • Activate when needed
  • Don’t delete seasonal content
  • Update dates in names

Collaborative Management

Strategy:
  • Establish naming conventions
  • Document organization structure
  • Regular team reviews
  • Clear ownership

Measuring Success

Key Metrics

Usage Metrics:
  • Items used in agents
  • Items not used (candidates for cleanup)
  • Processing success rate
Quality Metrics:
  • Agent response accuracy
  • User satisfaction
  • Error rates
Performance Metrics:
  • Processing times
  • Storage usage
  • Retrieval accuracy
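Two of the usage metrics above, processing success rate and unused items, are simple to compute from an item list. The sketch below assumes a hypothetical record shape: the `Item` fields and the "FAILED" status value are illustrative, and only the PROCESSED status appears in this guide.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    status: str       # "PROCESSED" per this guide; other values are assumed
    agent_count: int  # hypothetical: number of agents that use the item


def usage_report(items: list[Item]) -> dict:
    """Compute the processing success rate and list items no agent uses."""
    total = len(items)
    processed = sum(1 for item in items if item.status == "PROCESSED")
    return {
        "processing_success_rate": processed / total if total else 0.0,
        "unused_items": [item.name for item in items if item.agent_count == 0],
    }


report = usage_report([
    Item("FAQ - Shipping", "PROCESSED", 3),
    Item("FAQ - Returns", "PROCESSED", 0),
    Item("Return Policy - 2023", "PROCESSED", 0),
    Item("Product Manual - Chapter 1: Installation", "FAILED", 0),
])
```

Tracking these numbers over time shows whether cleanup and review cycles are having an effect.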

Continuous Improvement

Process:
  1. Monitor metrics regularly
  2. Identify areas for improvement
  3. Implement changes
  4. Measure impact
  5. Iterate

Next Steps