ElevenLabs Knowledge Base Guide 2026: RAG for Voice Agents Explained

ElevenLabs Knowledge Base is the document and content store for ElevenAgents voice agents. It implements RAG — Retrieval-Augmented Generation — which is the technical approach of giving an AI model access to a specific set of documents it can search and retrieve from, rather than relying only on information encoded during training.

The practical difference between an agent with and without a knowledge base is significant. An agent without a knowledge base can only answer questions using the LLM’s general training knowledge — which may be outdated, inaccurate for your specific products, or simply unavailable for proprietary company information. An agent with a properly populated knowledge base can accurately answer questions about your specific products, policies, pricing, and procedures — drawing on your actual documentation rather than generalised AI knowledge.

Knowledge Base content is stored and indexed by ElevenLabs. When the agent receives a user query, the RAG system retrieves the most relevant chunks from the knowledge base documents and provides them as context to the LLM for response generation. The retrieval is semantic rather than keyword-based — similar content is retrieved even if the user’s phrasing does not exactly match the document language.
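The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not ElevenLabs' implementation: the three-dimensional vectors stand in for real embeddings, and the names top_k_chunks and build_prompt are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_chunks(query_vec, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are closest to the query vector."""
    ranked = sorted(
        indexed_chunks,
        key=lambda c: cosine(query_vec, c["vector"]),
        reverse=True,
    )
    return [c["text"] for c in ranked[:k]]

def build_prompt(question, chunks):
    """Place retrieved chunks ahead of the question as context for the LLM."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hand-made vectors standing in for embeddings (assumption, illustration only).
INDEX = [
    {"text": "Refunds are issued within 14 days.", "vector": [0.9, 0.1, 0.0]},
    {"text": "The Pro plan includes priority support.", "vector": [0.1, 0.9, 0.1]},
    {"text": "Our office is closed on public holidays.", "vector": [0.0, 0.1, 0.9]},
]

query_vector = [0.8, 0.2, 0.0]  # pretend embedding of the question below
retrieved = top_k_chunks(query_vector, INDEX, k=1)
prompt = build_prompt("Can I get my money back?", retrieved)
```

The question never uses the word "refund", yet the refund chunk ranks first because ranking is done on vector similarity rather than shared keywords.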

Content Source Types

File Upload

Supported file types include PDF, TXT, DOCX, and other common document formats. ElevenLabs processes the uploaded file by extracting text, chunking it into retrievable segments, and indexing it for semantic search. PDF processing handles both text-based PDFs (where text can be extracted directly) and scanned PDFs (where OCR is applied). Upload size limits apply, and processing time scales with file size: large PDFs may take several minutes to process and index.

Best practices for file upload: use PDFs with extractable text rather than scanned image PDFs where possible (text extraction is more accurate). Ensure your documents have clear headings and structured content — the chunking algorithm benefits from document structure when determining what constitutes a retrievable segment. Avoid uploading documents with significant non-text content (images, charts, diagrams) as the primary information carriers — text accompanying those elements is extracted, but the visual content itself is not searchable.

URL Scraping

ElevenLabs crawls and indexes the content of a provided URL. This is the fastest way to load existing web content: product documentation sites, help centres, FAQ pages, pricing pages, and any other publicly accessible pages. When the content changes, you do not need to prepare and upload a new file; update the web page, then re-scrape the URL to refresh the indexed content. The refresh is not automatic.

URL scraping has limitations: it indexes the content of the provided URL but not necessarily subpages linked from that URL. For multi-page documentation sites, you may need to add multiple URLs or consider using file upload with a compiled document instead. Pages that rely heavily on JavaScript rendering may not scrape completely, because the scraper captures the initial page HTML and not content that JavaScript loads afterwards.
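One way to spot a page that depends on JavaScript rendering is to check whether key phrases appear in the visible text of the raw HTML. The sketch below uses only the Python standard library; TextExtractor and missing_phrases are hypothetical helper names, and the HTML string is a made-up example rather than a fetched page.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def missing_phrases(raw_html, expected_phrases):
    """Return the expected phrases that do not occur in the page's visible text."""
    parser = TextExtractor()
    parser.feed(raw_html)
    text = " ".join(parser.parts).lower()
    return [p for p in expected_phrases if p.lower() not in text]

# Made-up page: the plan details only exist inside a script, not in the markup.
RAW_HTML = """<html><body><h1>Pricing</h1><div id="plans"></div>
<script>render("Pro plan costs $20");</script></body></html>"""

missing = missing_phrases(RAW_HTML, ["Pricing", "Pro plan"])
```

Any phrase reported as missing is a hint that the content is injected at runtime and may be better supplied through file upload or direct text input.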

Text

Direct text input allows you to paste or type content directly into the knowledge base without a file or URL. This is the most flexible source type — useful for content that is not available as a file or accessible URL, content that needs to be formatted specifically for the knowledge base, or quick additions of specific information without requiring document preparation.

Folder Organisation

The January 2026 API update introduced folder navigation for Knowledge Base documents: each document response now includes a folder_path field showing the path from root to the document’s parent folder. Folders allow large knowledge bases to be organised by topic, product, language, or any other taxonomy relevant to your content structure.

Practical folder organisation examples: a customer service knowledge base might use folders for ‘Billing’, ‘Technical Support’, ‘Product Features’, and ‘Policies’. A multilingual deployment might use language folders (‘English’, ‘Spanish’, ‘German’) containing equivalent documents in each language. A multi-product company might use product folders (‘Product A’, ‘Product B’) containing separate documentation sets. Folder organisation does not affect retrieval — the RAG system searches across all documents regardless of folder structure — but makes knowledge base management significantly more practical at scale.
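To audit how documents are distributed across folders, document responses can be grouped by their folder_path. The sketch below assumes folder_path arrives as a list of folder names ordered from the root; the actual shape of the field may differ, and the document dictionaries are invented sample data.

```python
from collections import defaultdict

def group_by_folder(documents):
    """Map an 'A/B' style folder label to the names of the documents inside it."""
    groups = defaultdict(list)
    for doc in documents:
        # Assumption: folder_path is a list of folder names, root first.
        label = "/".join(doc.get("folder_path") or []) or "(root)"
        groups[label].append(doc["name"])
    return dict(groups)

# Invented sample data shaped like simplified document responses.
SAMPLE_DOCS = [
    {"name": "refund-policy.pdf", "folder_path": ["Policies"]},
    {"name": "invoice-faq.txt", "folder_path": ["Billing"]},
    {"name": "late-fees.txt", "folder_path": ["Billing"]},
    {"name": "welcome.txt", "folder_path": []},
]

grouped = group_by_folder(SAMPLE_DOCS)
```

A listing like this makes it easy to see which folders are thin or overloaded before reorganising a large knowledge base.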

Knowledge Base Search API

The April 2026 API update added a knowledge base content search endpoint — GET /v1/convai/knowledge-base/{documentation_id}/search — allowing direct search of knowledge base content from external applications without routing through an agent conversation. This enables:

Testing and validation — search the knowledge base directly to verify that relevant content is retrievable before deploying the agent to production users.
External application integration — integrate knowledge base search into applications that are not ElevenLabs voice agents, using ElevenLabs as a document intelligence backend.
Quality assurance workflows — systematically test whether the knowledge base returns correct results for a battery of representative questions before launch.
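A request to the search endpoint named above could be assembled as follows. This is a minimal sketch with several assumptions: the API host, the query parameter name (query), and the xi-api-key authentication header are not specified in this article, and doc_123 is a placeholder ID. The code only builds the request; it does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.elevenlabs.io"  # assumed API host

def build_search_request(documentation_id, query, api_key):
    """Build (but do not send) a GET request for the knowledge base search endpoint."""
    # Assumption: the search text is passed as a `query` query-string parameter.
    params = urlencode({"query": query})
    url = f"{BASE_URL}/v1/convai/knowledge-base/{documentation_id}/search?{params}"
    # Assumption: authentication uses an xi-api-key header.
    return Request(url, headers={"xi-api-key": api_key}, method="GET")

request = build_search_request("doc_123", "refund policy", "YOUR_API_KEY")
```

Passing the request to urllib.request.urlopen would send it; check the official API reference for the real parameter names and the response format before relying on this.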

Optimising Knowledge Base Performance

Content quality determines retrieval quality

The Knowledge Base’s RAG system retrieves the most semantically relevant content for each query. If the knowledge base content is poorly written, uses inconsistent terminology, or answers questions in a roundabout way, retrieval quality suffers. Write knowledge base content the way a clear, direct FAQ is written — question and answer format where possible, with consistent terminology that matches how users phrase their questions. If your product documentation uses jargon that users would not use, add FAQ-style content that bridges the gap between user language and documentation language.

Chunk size and document structure

ElevenLabs’ chunking algorithm divides documents into retrievable segments. Documents with clear headings, short paragraphs, and organised structure produce better chunk boundaries than dense, unstructured prose. For optimal retrieval: use headings to separate distinct topics, keep individual paragraphs focused on a single point, and avoid very long continuous sections without internal structure. The shorter and more focused each chunk, the more precisely it can be retrieved for specific queries.
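To see why headings help, consider a simple heading-based splitter. ElevenLabs does not document its chunking algorithm here, so this is an illustrative stand-in: chunk_by_heading is a hypothetical function, and it assumes headings are marked with a leading "#".

```python
def chunk_by_heading(text, heading_prefix="#"):
    """Split text into (heading, body) chunks at lines starting with heading_prefix."""
    chunks = []
    heading, body = "Untitled", []
    for line in text.splitlines():
        if line.startswith(heading_prefix):
            # A new heading closes the chunk collected so far.
            if body:
                chunks.append((heading, " ".join(body)))
            heading, body = line.lstrip(heading_prefix).strip(), []
        elif line.strip():
            body.append(line.strip())
    if body:
        chunks.append((heading, " ".join(body)))
    return chunks

SAMPLE_TEXT = """# Refunds
Refunds are issued within 14 days.

# Shipping
Orders ship within 2 business days.
Express shipping is available.
"""

chunks = chunk_by_heading(SAMPLE_TEXT)
```

Each chunk covers exactly one topic, so a refund question can match the refund chunk without dragging in shipping details. A document with no headings would collapse into a single "Untitled" chunk.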

Keep content current

Knowledge base content that is out of date produces incorrect agent responses. Establish a content review cadence — monthly for product documentation that changes frequently, quarterly for stable policy documents — and re-upload or re-scrape updated content when changes occur. Stale pricing information, discontinued product references, or outdated policy descriptions in the knowledge base are a direct source of agent errors that damage user trust.
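The review cadence can be enforced with a small script that flags overdue documents. The sketch below assumes you keep your own inventory of documents with a content kind and a last-reviewed date; the 30-day and 90-day intervals approximate the monthly and quarterly cadence suggested above, and all names are hypothetical.

```python
from datetime import date, timedelta

# Approximate monthly and quarterly review intervals (assumption).
REVIEW_INTERVALS = {
    "product": timedelta(days=30),
    "policy": timedelta(days=90),
}

def overdue_documents(documents, today):
    """Return names of documents whose last review is older than their interval."""
    return [
        d["name"]
        for d in documents
        if today - d["last_reviewed"] > REVIEW_INTERVALS[d["kind"]]
    ]

# Invented inventory; in practice this would come from your own records.
INVENTORY = [
    {"name": "pricing.pdf", "kind": "product", "last_reviewed": date(2026, 1, 5)},
    {"name": "privacy-policy.pdf", "kind": "policy", "last_reviewed": date(2026, 1, 5)},
]

overdue = overdue_documents(INVENTORY, today=date(2026, 3, 1))
```

Here the pricing document is 55 days past its last review and is flagged, while the policy document is still within its 90-day window.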

Three Insights Most Knowledge Base Guides Miss

1. The System Prompt and Knowledge Base Must Be Designed Together

Most documentation treats the system prompt and knowledge base as separate concerns — set up the system prompt, then add the knowledge base separately. In practice, they must be designed together to work effectively. The system prompt should explicitly instruct the agent to use the knowledge base for factual questions: ‘When answering questions about our products, pricing, or policies, always refer to the knowledge base content rather than making assumptions.’ Without this instruction, agents may answer from LLM general knowledge even when knowledge base content is available, producing inconsistencies between the knowledge base content and agent responses.
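One way to keep the knowledge base instruction from being dropped during prompt edits is to assemble the system prompt from named parts. This is a sketch only: build_system_prompt is a hypothetical helper, and the persona and fallback sentences are invented examples rather than recommended wording.

```python
# The instruction quoted in the paragraph above.
KB_INSTRUCTION = (
    "When answering questions about our products, pricing, or policies, "
    "always refer to the knowledge base content rather than making assumptions."
)

def build_system_prompt(persona, fallback):
    """Join the persona, the knowledge base instruction, and a fallback rule."""
    return "\n\n".join([persona, KB_INSTRUCTION, fallback])

system_prompt = build_system_prompt(
    persona="You are a support agent for Example Co.",  # invented example
    fallback="If the knowledge base does not cover a question, say so.",  # invented example
)
```

Keeping the instruction in one constant means every agent variant built from the template carries the same rule.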

2. URL Scraping Refresh Is Manual — Build This Into Your Maintenance Workflow

URL-scraped content does not auto-refresh when the underlying web page changes. If you load your product documentation via URL scraping, updates to that documentation do not automatically update the knowledge base. Your agent will continue answering from the old content until you manually trigger a re-scrape. For any knowledge base content that changes regularly — pricing, feature documentation, policy pages — build a scheduled re-scrape process into your content update workflow. The ElevenLabs API supports programmatic knowledge base document deletion and re-creation, making this automatable.
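A scheduled job can avoid unnecessary re-scrapes by comparing a fingerprint of the page content between runs. In the sketch below, delete_document and create_from_url are hypothetical callables that would wrap whatever API client you use; they are not ElevenLabs SDK functions, and the dry run replaces them with stand-ins that only record calls.

```python
import hashlib

def content_fingerprint(page_text):
    """Stable fingerprint of page content, used to detect changes between runs."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()

def refresh_if_changed(url, page_text, last_fingerprint, delete_document, create_from_url):
    """Re-create the knowledge base document only when the page content changed.

    Returns (new_fingerprint, refreshed). The two callables are hypothetical
    wrappers around your API client.
    """
    fingerprint = content_fingerprint(page_text)
    if fingerprint == last_fingerprint:
        return fingerprint, False
    delete_document(url)
    create_from_url(url)
    return fingerprint, True

# Dry run with stand-in callables that just record what would happen.
calls = []
record_delete = lambda u: calls.append(("delete", u))
record_create = lambda u: calls.append(("create", u))

PAGE_URL = "https://example.com/pricing"
first_fp, changed_first = refresh_if_changed(
    PAGE_URL, "Pro plan: $20", None, record_delete, record_create
)
second_fp, changed_second = refresh_if_changed(
    PAGE_URL, "Pro plan: $20", first_fp, record_delete, record_create
)
```

The first run refreshes because no fingerprint is stored yet; the second run sees identical content and does nothing. A real delete wrapper would need to tolerate the case where no previous document exists.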

3. The Search API Enables Knowledge Base Quality Assurance Before Launch

The knowledge base content search endpoint — added in April 2026 — enables a quality assurance workflow that most knowledge base deployments skip: systematically testing whether the RAG retrieval returns correct results for representative user questions before the agent goes live. Create a test set of 20-30 representative questions users will ask. Run each through the knowledge base search API. Verify that the correct content is retrieved. Fix gaps — add missing content, improve existing content clarity, or add FAQ entries that bridge query phrasing to documentation language. This pre-launch testing dramatically reduces the frequency of agent errors from knowledge base retrieval failures in production.
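The pre-launch test described above can be automated with a small harness. This sketch assumes a search function that takes a question and returns a list of retrieved chunk texts; fake_search is a stand-in with canned results, and in practice you would replace it with a call to the search endpoint. The expected-phrase check is a simple heuristic, not a full relevance metric.

```python
def evaluate_retrieval(test_cases, search, top_k=3):
    """Return (hit_rate, failed_questions) for (question, expected_phrase) pairs."""
    failures = []
    for question, expected_phrase in test_cases:
        results = search(question)[:top_k]
        # A case passes if any of the top results contains the expected phrase.
        if not any(expected_phrase.lower() in r.lower() for r in results):
            failures.append(question)
    hit_rate = 1 - len(failures) / len(test_cases) if test_cases else 0.0
    return hit_rate, failures

def fake_search(question):
    """Stand-in for a real search call; returns canned chunk texts."""
    canned = {
        "How long do refunds take?": ["Refunds are issued within 14 days."],
        "Do you ship abroad?": ["Orders ship within 2 business days."],
    }
    return canned.get(question, [])

TEST_CASES = [
    ("How long do refunds take?", "14 days"),
    ("Do you ship abroad?", "international"),
]

hit_rate, failures = evaluate_retrieval(TEST_CASES, fake_search)
```

The failed question points to a content gap: nothing in the knowledge base mentions international shipping, so that topic needs a new FAQ entry before launch.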

Knowledge Base in 2027

The ElevenLabs Knowledge Base will likely add capabilities in three directions over the next 12 months. The first is auto-refresh for URL sources: scheduled automatic re-scraping of URL-based documents, which would eliminate the manual update requirement. The second is multi-modal content: the April 2026 changelog introduced asset transcription support, suggesting ElevenLabs is building toward audio and video content as knowledge base sources, not just text documents. The third is deeper RAG customisation: control over chunk size, retrieval strategy (dense vs sparse retrieval), and hybrid search approaches for more sophisticated knowledge base deployments.

Key Takeaways

ElevenLabs Knowledge Base is a RAG system that gives voice agents access to your specific documents, websites, and text — enabling accurate answers from your actual content rather than LLM general knowledge.
Three source types: file upload (PDF, DOCX, TXT), URL scraping (public web pages), and direct text input.
Folder organisation (January 2026) makes large knowledge bases manageable. The knowledge base search API (April 2026) enables pre-launch quality assurance testing.
Design system prompt and knowledge base together — explicitly instruct the agent to use knowledge base content for factual questions.
URL-scraped content does not auto-refresh — build a manual or automated re-scrape process into your content update workflow.

Conclusion

ElevenLabs Knowledge Base is the difference between a voice agent that knows about AI in general and a voice agent that knows about your products, your policies, and your customers’ actual questions. The setup investment is a few hours of content preparation and upload. The operational impact is agents that answer accurately, consistently, and from authoritative sources rather than AI general knowledge that may be wrong, outdated, or simply not specific enough for your business context. Start with your most important FAQ content, test with the knowledge base search API before launch, and expand the knowledge base as you identify gaps from production conversation transcripts.

Frequently Asked Questions

What is ElevenLabs Knowledge Base?

A RAG (Retrieval-Augmented Generation) document store for ElevenAgents voice agents. Load documents, web pages, or text into the knowledge base, and your agent searches and retrieves relevant content when answering user questions — drawing on your specific content rather than general LLM knowledge.

What file types does ElevenLabs Knowledge Base support?

PDF, TXT, and DOCX are confirmed supported file types. URL scraping supports publicly accessible web pages. Direct text input accepts any plain text content.

Does ElevenLabs Knowledge Base auto-update when my website changes?

No — URL-scraped content does not auto-refresh. When your web page content changes, you must manually delete the old knowledge base document and re-scrape the URL to update the indexed content. Build this into your content maintenance workflow.

How do I test my Knowledge Base before deploying the agent?

Use the knowledge base content search endpoint (GET /v1/convai/knowledge-base/{documentation_id}/search) added in April 2026 to directly search knowledge base content with representative user questions. Verify correct content is retrieved before the agent goes live.

Can I organise Knowledge Base documents into folders?

Yes. Folder navigation was added to the Knowledge Base API in January 2026. Documents include a folder_path field, and the dashboard supports folder-based organisation for managing large knowledge bases.

Methodology

Knowledge Base features are taken from ElevenLabs’ official ElevenAgents documentation. Folder navigation is from the ElevenLabs API changelog (January 2026, folder_path field added). The content search endpoint is from the ElevenLabs changelog (April 2026, knowledge base content search added). Asset transcription support is from the ElevenLabs API changelog (April 21, 2026). The RAG architecture is described from the ElevenLabs Conversational AI documentation and the ElevenLabs MCP server knowledge base tool documentation. This article was drafted with AI assistance and reviewed by the editorial team at ElevenLabsMagazine.com.

References

ElevenLabs. (2026). ElevenAgents Knowledge Base documentation. https://elevenlabs.io/docs/eleven-agents

ElevenLabs. (2026). Changelog — knowledge base folder navigation. https://elevenlabs.io/docs/changelog/2026/1/12

ElevenLabs. (2026). Changelog — knowledge base content search. https://elevenlabs.io/docs/changelog/2026/4/13
