File Preview in the Model’s Context Window

Users get the best results when the model can process the entire text of an uploaded knowledge file. That is why as much text as possible is sent to the model as a preview; for small files, this can be the entire document.

In addition, the model can use an embedding search. During upload, the file is first split into smaller sections (chunks), and each chunk is converted into an embedding: a vector, i.e. a numerical sequence that encodes various thematic dimensions of the text. When a question is asked, the system performs a vector search to identify the most relevant vectors and the sections behind them. This does not search for specific words (like a keyword search with Ctrl + F), but for sections that are semantically similar to the query.
Example: If the embedding search is looking for “bread,” it will also find sections about “baguette,” even if the word “bread” doesn’t appear.
Only these relevant sections are then passed to the model as context. This makes it possible to work with very large documents that exceed the model's context window.
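
To make the difference to a keyword search concrete, here is a minimal sketch of such a vector search using cosine similarity. The function names are illustrative assumptions; the actual system may use a different similarity measure or a dedicated vector index.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Semantic closeness is measured by the angle between the two
    # vectors, independent of any shared words.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def vector_search(query_vec: np.ndarray, chunk_vecs: list[np.ndarray], k: int) -> list[int]:
    # Score every chunk embedding against the query embedding and
    # return the indices of the k most similar chunks.
    scores = [cosine_similarity(query_vec, v) for v in chunk_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```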

Our Parameters

Embedding Dimension

The vector dimension is 1536.

Chunk Size

Documents are split into sections of 2,000 characters.
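
A minimal sketch of this splitting step, assuming a simple fixed-size cut with no overlap (the real splitter may use overlap or respect sentence boundaries; that detail is not specified here):

```python
def split_into_chunks(text: str, chunk_size: int = 2000) -> list[str]:
    # Cut the document into consecutive sections of at most
    # 2,000 characters each; the final chunk may be shorter.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```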

Retrieval Parameter (k-value)

Up to 50 chunks are retrieved per query.
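
Combined with the sketches above, retrieval then amounts to a single top-k call. Here, embed() is an assumed helper that returns a 1536-dimensional embedding for a piece of text, and document_text and question stand in for the uploaded file and the user's query:

```python
# Hypothetical usage: chunk_vecs holds one 1536-dimensional
# embedding per chunk, produced once at upload time.
chunks = split_into_chunks(document_text)
chunk_vecs = [embed(c) for c in chunks]
top_indices = vector_search(embed(question), chunk_vecs, k=50)
```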

Land and Expand

We send not only the relevant chunks themselves but also the chunks surrounding them, so the model receives sufficient context.
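
A sketch of this expansion, assuming chunks are kept in document order so neighbors can be looked up by index; the window of one chunk on each side is an illustrative assumption, not the system's actual setting:

```python
def expand_with_neighbors(indices: list[int], total_chunks: int, window: int = 1) -> list[int]:
    # For each retrieved chunk, also include its immediate neighbors
    # so the model sees the text surrounding every hit.
    expanded: set[int] = set()
    for i in indices:
        for j in range(i - window, i + window + 1):
            if 0 <= j < total_chunks:
                expanded.add(j)
    return sorted(expanded)
```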