Chat with documents and knowledge bases

Open WebUI supports retrieval-augmented generation (RAG) to help local AI models answer questions based on your uploaded documents or curated knowledge bases.

This guide explains how to analyze individual documents during a chat session and how to build persistent knowledge collections for reuse.

Learning objectives

In this guide, you will learn how to:

  • Configure an embedding model to process document text.
  • Upload and analyze individual documents in a chat session.
  • Build and manage a persistent knowledge base.
  • (Optional) Configure an advanced content extraction engine for complex document layouts.

Prerequisites

Before you begin, ensure you have the following in place:

  • Open WebUI installed and configured with at least one active model backend.
  • An embedding model application installed, such as Qwen3 Embedding 0.6B (Ollama).
  • Administrator privileges for the Open WebUI instance.

Configure embedding model

Document understanding requires an embedding model, which converts document text into numeric vectors that can be searched by similarity. To configure Open WebUI, you must first retrieve your embedding model details.
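To illustrate why RAG needs an embedding model at all, the sketch below compares tiny hypothetical vectors with cosine similarity. The three-dimensional vectors are illustrative stand-ins, not real output from qwen3-embedding:0.6b (which produces much higher-dimensional vectors); only the principle carries over: texts with related meaning map to nearby vectors.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity in [-1, 1]; higher means more semantically related."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Hypothetical 3-D embeddings standing in for real model output.
doc_vector = [0.9, 0.1, 0.3]     # e.g. "How do I reset my password?"
query_vector = [0.8, 0.2, 0.25]  # e.g. "password reset steps"
unrelated = [-0.1, 0.9, -0.4]    # e.g. "quarterly revenue report"

# The related document scores higher than the unrelated one.
print(cosine_similarity(query_vector, doc_vector) >
      cosine_similarity(query_vector, unrelated))
```

During retrieval, Open WebUI performs this kind of comparison between your question's embedding and the stored document embeddings, which is why the embedding model must be configured before documents can be indexed.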

Get embedding model details

  1. Open Qwen3 Embedding 0.6B (Ollama) from the Launchpad.

  2. Note down the exact model name displayed on the main page. For example, qwen3-embedding:0.6b.

    Qwen3 Embedding 0.6B

  3. Open Olares Settings, and then go to Applications > Qwen3 Embedding 0.6B (Ollama).

  4. Under Shared entrances, click Qwen3 Embedding 0.6B, and then copy the endpoint URL. For example, http://eae5afcf0.shared.olares.com.

Apply embedding settings in Open WebUI

  1. In Open WebUI, select your profile icon, and then go to Admin Panel > Settings > Documents.

  2. Under the Embedding section, specify the following settings:

    • Embedding Model Engine: Select Ollama.
    • API Base URL: Enter the embedding model endpoint URL you noted earlier.
    • Embedding Model: Enter the embedding model name you noted earlier.
  3. Scroll down to the bottom of the page, and then click Reindex in the lower-right corner to apply the changes.

  4. Select Save.
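Before relying on the endpoint in Open WebUI, you can sanity-check it from a terminal using Ollama's embedding API. The URL and model name below are the placeholders from this guide; substitute the values you noted earlier. A working setup returns a JSON object containing an `embeddings` array of floats.

```shell
# Verify the shared Ollama embedding endpoint responds.
# Replace the URL and model name with your own values.
curl http://eae5afcf0.shared.olares.com/api/embed \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3-embedding:0.6b", "input": "hello"}'
```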

Analyze individual documents

Attach documents directly to a chat session for one-off analysis and summarization.

  1. Start a new chat.

  2. Select the model.

  3. Click the + icon under the message input field, and then select Upload Files.

    Upload files in Open WebUI

  4. Upload a PDF or a text file.

  5. Enter a prompt asking the model to analyze the document. For example:

    Summarize the main points of this document.
  6. Submit the prompt. If the generated response includes file citations, Open WebUI successfully added the document to the context.

    File summary
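Behind the scenes, Open WebUI splits each uploaded file into overlapping text chunks before embedding them (chunk size and overlap are configurable under Admin Panel > Settings > Documents). A minimal sketch of that idea, with hypothetical character-based sizes:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character chunks that overlap,
    so a sentence cut at one boundary still appears whole in a neighbor."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "word " * 200  # stand-in for a document's extracted text (1000 chars)
chunks = chunk_text(doc)
print(len(chunks), len(chunks[0]))  # several 200-char chunks
```

Overlap matters because a fact split across a chunk boundary would otherwise never be retrieved intact; real deployments typically chunk by tokens rather than characters, but the principle is the same.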

Build a knowledge base

For documents you want to reuse across multiple chats, create a persistent knowledge base.

  1. In Open WebUI, click your profile icon, and then go to Workspace > Knowledge.

  2. Click New Knowledge.

  3. In the What are you working on field, enter a name for your knowledge base. For example: Product FAQs.

  4. In the What are you trying to achieve field, enter a description. For example: Frequently asked questions and support guides for Olares products.

    Create knowledge

  5. Click Create Knowledge to save the collection.

  6. Click the + icon > Upload files, and then upload your files to populate the knowledge base.

    Populate knowledge base

Attach a knowledge base to a chat

  1. Start a new chat.

  2. Select the model.

  3. Click the + icon under the message input field, and then select Attach Knowledge.

  4. Choose the knowledge collection you want to use.

    Attach knowledge base to chat

  5. Ask questions related to the knowledge base content. The model will retrieve relevant passages and cite them in its response.

    Search results from attached knowledge base
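The retrieval step above amounts to a nearest-neighbor search over the stored chunk embeddings. This toy example uses hypothetical two-dimensional vectors and made-up chunk texts in place of real model embeddings, but the ranking logic mirrors what happens when a knowledge base answers a question:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical (chunk text, embedding) pairs standing in for an indexed knowledge base.
index = [
    ("Reset your password from Settings > Account.", [0.9, 0.1]),
    ("Invoices are emailed on the 1st of each month.", [0.1, 0.9]),
    ("Contact support via the in-app chat widget.", [0.7, 0.4]),
]

def retrieve(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the k chunk texts most similar to the query embedding."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# A password-related query embedding ranks the password chunk first.
print(retrieve([0.95, 0.05]))
```

The retrieved chunks are then prepended to your prompt as context, which is why the model can cite specific passages from the knowledge base in its answer.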

(Optional) Configure an advanced extraction engine

By default, Open WebUI uses a simple text extraction engine. For complex document layouts containing tables or complicated formatting, switch to PaddleOCR for better accuracy.

Performance impact

PaddleOCR requires more GPU VRAM and processes documents more slowly than the default engine. Use it only when extraction quality for complex layouts is critical.

  1. Install the PaddleOCR app from Market.

    PaddleOCR installation

  2. Get the PaddleOCR endpoint URL:

    a. Open Olares Settings, and then go to Applications > PaddleOCR > Shared entrances > PaddleOCR API.

    b. Copy the endpoint URL. For example, http://6b2a6fc50.shared.olares.com.

  3. In Open WebUI, go to Admin Panel > Settings > Documents.

  4. In the General section, select PaddleOCR-vl for Content Extraction Engine.

  5. In API Base URL, enter the PaddleOCR endpoint URL.

  6. In API Token, enter any text. Do not leave this field empty.

    PaddleOCR config in Open WebUI

  7. Click Save.