To use your own models rather than the ones included in Langdock's flat fee, BYOK (Bring Your Own Key) needs to be activated. This section guides you through the process of adding your own models.
On top of these must-have models, you can add completion models for your users, such as GPT-4o, o1, Claude 3.5 Sonnet, or Gemini 1.5 Pro.
We support all models hosted by Microsoft Azure, AWS, Google Vertex, and OpenAI.
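Note that Azure-hosted models are addressed via a resource endpoint and deployment name rather than a plain model name. Here is a minimal sketch for checking that an Azure key responds, using the `openai` Python SDK; the endpoint, API version, and deployment name are placeholders for your own values:

```python
# Minimal check that an Azure-hosted deployment answers. The endpoint,
# API version, and deployment name below are placeholders -- use your own.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
    azure_endpoint="https://<your-resource>.openai.azure.com",
)
resp = client.chat.completions.create(
    model="<your-deployment-name>",  # Azure expects the deployment name here
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```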
For quotas, anything between 200k and 500k tokens should be enough to cover the usage of roughly 200 users. For GPT-4o, the most-used model, you might need a quota of 500k to 1 million tokens.
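As a rough sanity check, you can estimate the peak load yourself. Every figure in the sketch below is an illustrative assumption, not a Langdock recommendation; replace the numbers with your own workspace's usage data:

```python
# Back-of-envelope quota estimate. All figures are assumptions for
# illustration -- substitute your own workspace's usage data.
users = 200                  # seats in the workspace
active_share = 0.10          # fraction of users sending a message in the same minute
tokens_per_request = 2_000   # prompt + completion tokens for a typical message

requests_per_minute = users * active_share
tokens_per_minute = requests_per_minute * tokens_per_request

print(f"Estimated peak load: ~{tokens_per_minute:,.0f} tokens/minute")
# -> ~40,000 tokens/minute, comfortably inside a 200k quota
```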
After you have received all the keys you need, reach out to the Langdock team. We will agree on a time slot with you to turn on BYOK on our side. After that, you can add the different models. This is usually done in the evening when fewer users are active, as there might be a few minutes of downtime during the model switch.
Please make sure that all of the models work correctly. Here is how you can test each model (a scripted, API-level variant of these checks follows the list):
Completion models: Send a prompt to each model you can select in the interface (e.g., “write a story about dogs”).
Embedding model: Upload a file and ask a question about it (e.g., “what is in the file”). The upload should work and you should receive an answer based on the file.
Image model: Ask any model to generate an image. You should receive an image, which is created by the image model in the background.
Backbone model: Select a model that does not support native tool calling (like Llama or Mistral) and ask it to search the web (e.g., “search the web for Langdock”). If all of your models support tool calling, send a message in a new chat instead and check whether the conversation name in the bar on the left changes after a few seconds; this title is generated by the backbone model.
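If you also want to exercise the keys outside the Langdock interface, the sketch below runs the completion, embedding, and image checks directly against the provider API. It assumes the `openai` Python SDK (v1+), an OpenAI key in `OPENAI_API_KEY`, and the model names `gpt-4o`, `text-embedding-3-small`, and `dall-e-3`; substitute your own provider, endpoint, and model or deployment names. The backbone test has no direct API equivalent, since the title generation happens inside Langdock.

```python
# API-level smoke test for the completion, embedding, and image keys.
# Assumes the `openai` Python SDK and an OpenAI key in OPENAI_API_KEY;
# for Azure, AWS, or Vertex keys, swap in the matching client and models.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Completion model: mirrors the "write a story about dogs" chat test.
chat = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "write a story about dogs"}],
)
print("completion ok:", chat.choices[0].message.content[:60], "...")

# Embedding model: a non-empty vector means the key and model respond.
emb = client.embeddings.create(
    model="text-embedding-3-small",
    input="what is in the file",
)
print("embedding ok:", len(emb.data[0].embedding), "dimensions")

# Image model: mirrors the image-generation test in the interface.
img = client.images.generate(model="dall-e-3", prompt="a dog in a park", n=1)
print("image ok:", img.data[0].url)
```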
Please contact the Langdock team if there are any issues here.