RAG

RAG (Retrieval-Augmented Generation) helps models answer questions about long documents by breaking them into chunks and sending only the most relevant parts to the model.

For example, when given a long annual report and asked "Does this company have an ESG policy?", RAG will:

  1. Split the document into chunks, e.g. of ~2000 characters
  2. Find the chunks most relevant to "ESG" and "policy"
  3. Send only the top chunks to the model, e.g. top 10 chunks
  4. Generate an answer based on those chunks
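The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Playground's actual implementation: it uses a toy keyword-overlap score for relevance, whereas real RAG systems typically rank chunks with embedding similarity.

```python
def chunk_text(text, chunk_size=2000):
    # Step 1: split the document into fixed-size character chunks.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def score(chunk, query):
    # Step 2 (toy version): count occurrences of query words in the chunk.
    # Production systems usually compare embedding vectors instead.
    words = set(query.lower().split())
    return sum(chunk.lower().count(w) for w in words)

def top_chunks(text, query, chunk_size=2000, k=10):
    # Step 3: keep only the k highest-scoring chunks; these are what
    # gets sent to the model in step 4.
    chunks = chunk_text(text, chunk_size)
    return sorted(chunks, key=lambda c: score(c, query), reverse=True)[:k]
```

With the defaults shown (chunk_size=2000, k=10), a 200-page report becomes a few hundred chunks, of which only ten reach the model.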

This helps the model:

  • Handle documents longer than its context window
  • Focus on relevant information
  • Provide more accurate answers
  • Cite sources correctly

Using RAG

On the Playground, enable the RAG checkbox and:

  1. Paste your (long) document in the User message box
  2. Set your Task (e.g. "Does this company have an ESG policy?")
  3. Adjust settings if needed:
    • Chunk size: Length of each text segment in characters (default: 2000)
    • # chunks: Number of relevant chunks to use (default: 10)
    • System prompt: Instructions for the model (default: "Assist the user with the task using ONLY the context")
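To make the settings concrete, here is one plausible way the selected chunks, system prompt, and task could be assembled into the final request. The exact prompt format the Playground uses is an assumption; only the default system prompt text comes from the settings above.

```python
def build_prompt(chunks, task,
                 system_prompt="Assist the user with the task using ONLY the context"):
    # Join the top-ranked chunks into a single context block,
    # then append the user's task. The layout here is illustrative.
    context = "\n\n".join(chunks)
    return f"{system_prompt}\n\nContext:\n{context}\n\nTask: {task}"
```

Raising "# chunks" grows the Context block; raising "Chunk size" makes each piece of it longer. Both trade relevance for coverage against the model's context window.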

Screenshot showing the RAG interface with text chunks highlighted and settings panel

You can use this to:

  • Query long documents. Ask questions about books, reports, or documentation
  • Summarize sections. Get summaries of specific topics from a large document
  • Find relevant parts. Locate sections most relevant to your question
  • Compare sections. Ask about relationships between different parts of a document

Note: RAG disables system instructions and attachments since it uses its own prompting strategy.