POST /openai/{region}/v1/chat/completions
curl --request POST \
  --url https://api.langdock.com/openai/{region}/v1/chat/completions \
  --header 'Authorization: Bearer <YOUR_LANGDOCK_API_KEY>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Write a short poem about cats."
    }
  ]
}'
{
  "choices": [
    {
      "message": {
        "content": "In moonlit shadows soft they prowl,\nWith eyes aglow in night's dark cowl.",
        "role": "assistant"
      },
      "index": 0,
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "created": 1721722200,
  "id": "chatcmpl-8o4sq3sSzGVqS0aQyjlXuuEGVZnSj",
  "model": "gpt-4o-2024-05-13",
  "object": "chat.completion",
  "system_fingerprint": "fp_asd28019bf",
  "usage": {
    "completion_tokens": 34,
    "prompt_tokens": 14,
    "total_tokens": 48
  }
}

In dedicated deployments, api.langdock.com maps to <Base URL>/api/public
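
The same client libraries work against a dedicated deployment by swapping the base URL accordingly. A minimal sketch, where https://langdock.example.com is a hypothetical placeholder for your deployment's base URL:

from openai import OpenAI

# In dedicated deployments, "<Base URL>/api/public" replaces
# "https://api.langdock.com"; langdock.example.com is a hypothetical placeholder.
client = OpenAI(
  base_url="https://langdock.example.com/api/public/openai/eu/v1",
  api_key="<YOUR_LANGDOCK_API_KEY>"
)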

Creates a model response for the given chat conversation. This endpoint follows the OpenAI API specification, and requests are forwarded to the Azure OpenAI endpoint.

To use the API you need an API key. Admins can create API keys in the settings.

All parameters from the OpenAI Chat Completion endpoint are supported according to the OpenAI specifications, with the following exceptions:

  • model: Currently only the o3-mini, o1-preview, gpt-4o, gpt-4o-mini, gpt-4, and gpt-35-turbo models are supported.

    • The list of available models might differ if you are using your own API keys in Langdock (“Bring-your-own-keys / BYOK”, see here for details). In this case, please reach out to your admin to understand which models are available in the API.
  • n: Not supported.

  • service_tier: Not supported.

  • parallel_tool_calls: Not supported.

  • stream_options: Not supported (streaming itself via stream works; see the sketch after this list).
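
Everything outside this list follows the OpenAI specification. For example, streaming responses work by setting stream, as long as the unsupported stream_options parameter is left unset. A minimal sketch using the OpenAI Python library:

from openai import OpenAI

client = OpenAI(
  base_url="https://api.langdock.com/openai/eu/v1",
  api_key="<YOUR_LANGDOCK_API_KEY>"
)

# stream=True follows the OpenAI spec and is not in the exception list;
# the unsupported stream_options parameter is simply omitted.
stream = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[{"role": "user", "content": "Write a short poem about cats."}],
  stream=True
)

for chunk in stream:
  delta = chunk.choices[0].delta.content
  if delta:
    print(delta, end="", flush=True)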

Rate limits

The rate limit for the Chat Completion endpoint is 500 RPM (requests per minute) and 60,000 TPM (tokens per minute). Rate limits are defined at the workspace level, not at the API key level. Each model has its own rate limit. If you exceed your rate limit, you will receive a 429 Too Many Requests response.

Please note that the rate limits are subject to change; refer to this documentation for the most up-to-date information. If you need a higher rate limit, please contact us at support@langdock.com.
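
Since a 429 only means the current per-minute window is exhausted, a common client-side pattern is to wait and retry with exponential backoff. A minimal sketch (the retry counts and delays are illustrative, not part of the API):

import time

from openai import OpenAI, RateLimitError

client = OpenAI(
  base_url="https://api.langdock.com/openai/eu/v1",
  api_key="<YOUR_LANGDOCK_API_KEY>"
)

def create_with_retry(messages, max_retries=5):
  # Back off 1s, 2s, 4s, ... between attempts (illustrative values).
  for attempt in range(max_retries):
    try:
      return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
      )
    except RateLimitError:  # raised by the client on HTTP 429
      time.sleep(2 ** attempt)
  raise RuntimeError("Rate limit still exceeded after retries")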

Using OpenAI-compatible libraries

Because the request and response formats match the OpenAI API, you can use popular libraries such as the OpenAI Python library or the Vercel AI SDK with the Langdock API.

Example using the OpenAI Python library

from openai import OpenAI
client = OpenAI(
  base_url="https://api.langdock.com/openai/eu/v1",
  api_key="<YOUR_LANGDOCK_API_KEY>"
)

completion = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {"role": "user", "content": "Write a short poem about cats."}
  ]
)

print(completion.choices[0].message.content)
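
Tool calling also follows the OpenAI specification; only parallel_tool_calls is unsupported, so leave it unset. A minimal sketch with a hypothetical get_weather tool:

import json

from openai import OpenAI

client = OpenAI(
  base_url="https://api.langdock.com/openai/eu/v1",
  api_key="<YOUR_LANGDOCK_API_KEY>"
)

# A hypothetical tool definition; the tools parameter follows the OpenAI spec.
tools = [{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
      "type": "object",
      "properties": {"city": {"type": "string"}},
      "required": ["city"]
    }
  }
}]

completion = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[{"role": "user", "content": "What is the weather in Berlin?"}],
  tools=tools
  # parallel_tool_calls is not supported on this endpoint, so it stays unset
)

tool_call = completion.choices[0].message.tool_calls[0]
print(tool_call.function.name, json.loads(tool_call.function.arguments))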

Example using the Vercel AI SDK in Node.js

import { streamText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";

const langdockProvider = createOpenAI({
  baseURL: "https://api.langdock.com/openai/eu/v1",
  apiKey: "<YOUR_LANGDOCK_API_KEY>",
});

const result = await streamText({
  model: langdockProvider("gpt-4o-mini"),
  prompt: "Write a short poem about cats",
});

for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}

Headers

Authorization
string
required

API key as Bearer token. Format "Bearer YOUR_API_KEY"
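
When building the request by hand, the same header appears as in the curl example above; a sketch using the Python requests library:

import requests

response = requests.post(
  "https://api.langdock.com/openai/eu/v1/chat/completions",
  headers={
    # Format: "Bearer YOUR_API_KEY"
    "Authorization": "Bearer <YOUR_LANGDOCK_API_KEY>",
    "Content-Type": "application/json"
  },
  json={
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Write a short poem about cats."}]
  }
)
print(response.json()["choices"][0]["message"]["content"])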

Path Parameters

region
enum<string>
required

The region of the API to use.

Available options:
eu,
us

Body

application/json

Response

200 - application/json
OK

Represents a chat completion response returned by model, based on the provided input.