1. Set up the models and the keys in Langdock
You need different models for the platform to work. To add them, enter the models and their corresponding keys in the workspace settings. A key can be shared by multiple models from the same provider; for example, GPT-5 and GPT-4.1 can use the same key if they both come from the same deployment in, e.g., Microsoft Azure. Here are the models necessary to cover all functionalities:

1.1 Embedding Model
- The embedding model processes uploaded documents and enables the model to search them
- We currently require ADA v2 (text-embedding-ada-002); a verification sketch follows below
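To verify a key and deployment before adding them in Langdock, you can call the embedding endpoint directly. Here is a minimal sketch using the OpenAI Python SDK; the key placeholder is yours to fill in, and for Azure deployments you would use the AzureOpenAI client with your endpoint and API version instead:

```python
# Minimal check that the required embedding model and key work.
from openai import OpenAI

client = OpenAI(api_key="<your-provider-key>")

resp = client.embeddings.create(
    model="text-embedding-ada-002",
    input="Langdock embedding smoke test",
)
# ada-002 returns 1536-dimensional vectors
print(len(resp.data[0].embedding))
```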
1.2 Backbone Model
- The backbone model has three purposes:
  - It generates chat titles in the sidebar on the left (a 3-word summary does not require the main model)
  - It defines and executes the planning steps for models that are not efficient at tool calling (e.g., Llama or Mistral)
  - If the main model fails, the backbone model steps in to finish the response for the user.
- We recommend GPT-4.1 mini (gpt-4.1-mini) for this purpose.
Important: The backbone model is a separate model you need to set up. If you already added one GPT-4.1 mini model, please set up another one. This model will be set as a backbone model afterward.
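To illustrate the kind of lightweight call a backbone model handles, here is a hedged sketch of generating a 3-word chat title. The system prompt is invented for illustration and is not Langdock's actual internal prompt:

```python
# Illustrative only: the kind of small task a backbone model handles.
from openai import OpenAI

client = OpenAI(api_key="<your-provider-key>")

resp = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        # Hypothetical prompt, not Langdock's internal one.
        {"role": "system", "content": "Summarize the user's message as a chat title of at most 3 words."},
        {"role": "user", "content": "How do I rotate my Azure OpenAI keys safely?"},
    ],
)
print(resp.choices[0].message.content)  # e.g. "Rotating Azure Keys"
```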
1.3 Image Generation Model
- We support dall-e-3, gpt-image-1, Google Imagen, and Flux models from Black Forest Labs (see the verification sketch after this list)
- For Google Imagen 3, follow the same setup process as Gemini, using the model ID imagen-3.0-generate-001
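As with the other models, you can verify the image deployment directly before adding it. A minimal sketch using dall-e-3 via the OpenAI Python SDK follows; swap in gpt-image-1 or your Imagen/Flux setup as appropriate (note that gpt-image-1 returns base64 data instead of a URL):

```python
# Quick check that the image model deployment and key work.
from openai import OpenAI

client = OpenAI(api_key="<your-provider-key>")

resp = client.images.generate(
    model="dall-e-3",
    prompt="A lighthouse at dusk, watercolor style",
    size="1024x1024",
    n=1,
)
print(resp.data[0].url)  # dall-e-3 returns a temporary image URL
```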
1.4 Completion Models
- For users to select different models in the chat, add the completion models you want to offer, such as GPT-4o, o1, Claude 3.5 Sonnet, or Gemini 1.5 Pro.
- Please also add the models needed for Deep Research (o3, o4 mini, and GPT-4.1 mini). These do not need two deployments like the backbone model does.
- We support all models hosted by Microsoft Azure, AWS, Google Vertex AI, and OpenAI.
- For quotas, anything between 200k and 500k tokens should cover the usage of ~200 users. For GPT-4o, the most-used model, you might need a quota of 500k to 1 million tokens (see the estimate sketch after this list).
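To sanity-check a quota against your own user base, a back-of-the-envelope estimate helps. The figures below are assumptions for illustration, not measurements; replace them with your own usage data:

```python
# Rough peak-load estimate for sizing a model quota.
active_users = 200               # from the guidance above
requests_per_user_per_min = 0.5  # assumption: one prompt every two minutes at peak
tokens_per_request = 3_000       # assumption: prompt + retrieved context + response

tokens_per_minute = active_users * requests_per_user_per_min * tokens_per_request
print(f"Estimated peak load: {tokens_per_minute:,.0f} tokens/minute")
# -> 300,000 tokens/minute, i.e. within the 200k-500k range suggested above
```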
For the main models, we recommend setting up multiple deployments in different regions. If a model errors in one region, Langdock automatically retries the call in a different region, as illustrated in the sketch below.
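The failover works roughly like the following sketch. It is purely illustrative, not Langdock's actual implementation, and the endpoints, key placeholders, and API version are assumptions:

```python
# Illustrative failover pattern: retry the same model in another region.
from openai import AzureOpenAI

# Hypothetical deployments of the same model in two Azure regions.
REGIONS = [
    {"endpoint": "https://myorg-westeurope.openai.azure.com", "key": "<key-1>"},
    {"endpoint": "https://myorg-swedencentral.openai.azure.com", "key": "<key-2>"},
]

def chat_with_failover(messages, deployment="gpt-4.1-mini"):
    last_error = None
    for region in REGIONS:
        try:
            client = AzureOpenAI(
                api_key=region["key"],
                azure_endpoint=region["endpoint"],
                api_version="2024-06-01",
            )
            return client.chat.completions.create(model=deployment, messages=messages)
        except Exception as err:  # e.g. rate limit or regional outage
            last_error = err
    raise last_error
```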
In summary, you need:
- 1x Embedding model (Ada)
- 2x GPT-4.1 mini (one as a completion model and one as a backbone model)
- 1 or more image generation models
- 1x o3
- 1x o4 mini
- Current major models from OpenAI, Anthropic and Google (and others) as Completion models
2. Reach out to the Langdock team
After you have set up all the models you need, reach out to the Langdock team. We will align with you on a timeslot to turn on BYOK on our side. Usually, this is done in the late afternoon or evening, when fewer users are active. There should not be any downtime; this is a precautionary step to ensure no disruptions during the switch. Please ensure that you, or someone who can set up the models, is available. We will make sure an engineer is also available on our side.

3. Test the models
Please make sure that all of the models work correctly. Here is how you can test them in the interface (a script-based smoke test follows the list):
- Completion models: Send a prompt to each model you can select in the interface (e.g., “write a story about dogs”).
- Embedding model: Upload a file and ask a question about it (e.g., “what is in the file”). The upload should work and you should receive an answer based on the file.
- Image model: Ask any model to generate an image. The image model is called in the background, and the generated image should appear in the chat.
- Backbone model: Write a message in a new chat and check whether a chat title is generated after sending the prompt. (Please ensure that strict mode is disabled for this model)
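If you also want to smoke-test the provider keys outside the Langdock UI before the switch, a small script can send a trivial prompt to each deployment. The model IDs and key placeholder below are examples; use the IDs you actually deployed:

```python
# Pre-switch smoke test: one trivial prompt per chat model.
from openai import OpenAI

client = OpenAI(api_key="<your-provider-key>")

for model_id in ["gpt-4.1-mini", "o3", "o4-mini"]:  # example IDs
    resp = client.chat.completions.create(
        model=model_id,
        messages=[{"role": "user", "content": "Reply with OK."}],
    )
    print(model_id, "->", resp.choices[0].message.content)
```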