Google Completion Endpoint (v1beta)

This endpoint exposes Google Gemini models that are hosted in Google Vertex AI.
It mirrors the structure of the official Vertex generateContent API. To use it, you need to:
1. Get available models: call GET /{region}/v1beta/models to retrieve the list of Gemini models.
2. Pick a model and action: choose a model ID and decide between generateContent and streamGenerateContent.
3. Send your request: POST to /{region}/v1beta/models/{model}:{action} with your prompt in contents.
4. Handle the response: parse the JSON response for normal calls, or consume the SSE events for streaming.
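The four steps above can be sketched end to end in Python. This is an illustrative sketch, not official client code: the helper names are hypothetical, and it assumes the third-party `requests` package plus a valid key in the `LD_API_KEY` environment variable.

```python
import os

BASE = "https://api.langdock.com/api/public/google"


def model_url(region: str, model: str, action: str) -> str:
    # Step 3: build the generate path from region, model ID and action.
    return f"{BASE}/{region}/v1beta/models/{model}:{action}"


def demo() -> None:
    # Performs live API calls, so it is defined but not invoked on import.
    import requests  # third-party: pip install requests

    headers = {"Authorization": f"Bearer {os.environ['LD_API_KEY']}"}
    # Step 1: list the available Gemini models.
    models = requests.get(f"{BASE}/eu/v1beta/models", headers=headers).json()["models"]
    # Step 2: pick a model ID (the generate path needs it without the "models/" prefix).
    model_id = models[0]["name"].removeprefix("models/")
    # Step 3: send the prompt in `contents`.
    body = {"contents": [{"role": "user", "parts": [{"text": "Hello!"}]}]}
    resp = requests.post(model_url("eu", model_id, "generateContent"),
                         headers=headers, json=body)
    # Step 4: parse the JSON response.
    print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```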
The endpoint also provides:
• Region selection (eu or us)
• Optional Server-Sent Events (SSE) streaming with the same event labels used by the Google Python SDK (message_start, message_delta, message_stop)
• A model discovery endpoint

Base URL

https://api.langdock.com/api/public/google

Authentication

Authenticate by sending your Langdock API key in any one of the following headers. All three are treated identically; requests with a missing or invalid key return 401 Unauthorized.
Authorization header example:
curl -H "Authorization: Bearer $LD_API_KEY" \
     https://api.langdock.com/api/public/google/eu/v1beta/models
x-api-key header example:
curl -H "x-api-key: $LD_API_KEY" \
     https://api.langdock.com/api/public/google/eu/v1beta/models
x-goog-api-key header example:
curl -H "x-goog-api-key: $LD_API_KEY" \
     https://api.langdock.com/api/public/google/eu/v1beta/models
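In code, the three interchangeable header styles might be wrapped in a small helper (hypothetical name, a sketch only):

```python
def auth_headers(api_key: str, style: str = "authorization") -> dict:
    # All three headers are treated identically by the endpoint;
    # pick whichever style your existing tooling expects.
    return {
        "authorization": {"Authorization": f"Bearer {api_key}"},
        "x-api-key": {"x-api-key": api_key},
        "x-goog-api-key": {"x-goog-api-key": api_key},
    }[style]
```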

1. List available models

GET /{region}/v1beta/models

region must be eu or us.

Successful response

models (array)
List of objects with the following shape:
  • name – Fully-qualified model name (e.g. models/gemini-2.5-flash).
  • displayName – Human-readable name shown in the Langdock UI.
  • supportedGenerationMethods – Always ["generateContent", "streamGenerateContent"].
curl -H "Authorization: Bearer $LD_API_KEY" \
     https://api.langdock.com/api/public/google/eu/v1beta/models
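Given the response shape above, the model IDs needed for the generate path can be extracted like this (the `sample` payload below is illustrative, not a live response):

```python
def list_model_ids(models_response: dict) -> list[str]:
    # Strip the "models/" prefix so each ID can go straight into the generate path.
    return [m["name"].removeprefix("models/") for m in models_response["models"]]


# Illustrative payload matching the documented shape.
sample = {
    "models": [
        {
            "name": "models/gemini-2.5-flash",
            "displayName": "Gemini 2.5 Flash",
            "supportedGenerationMethods": ["generateContent", "streamGenerateContent"],
        }
    ]
}
```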

2. Generate content

POST /{region}/v1beta/models/{model}:{action}

model – The model ID as returned by the models endpoint (without the models/ prefix).
action – generateContent or streamGenerateContent, depending on whether you want a streamed response.
Example path: google/eu/v1beta/models/gemini-2.5-flash:streamGenerateContent

Request body

The request body follows the official GenerateContentRequest structure.

Required fields

contents (Content[], required)
Conversation history. Each object has a role (string) and parts array containing objects with text (string).
"contents": [
  {
    "role": "user", 
    "parts": [
      {
        "text": "What's the weather like?"
      }
    ]
  }
]
model (string, required)
The model to use for generation (e.g., "gemini-2.5-pro", "gemini-2.5-flash").

Optional fields

generationConfig (object, optional)
Configuration for text generation. Supported fields:
  • temperature (number): Controls randomness (0.0-2.0)
  • topP (number): Nucleus sampling parameter (0.0-1.0)
  • topK (number): Top-k sampling parameter
  • candidateCount (number): Number of response candidates to generate
  • maxOutputTokens (number): Maximum number of tokens to generate
  • stopSequences (string[]): Sequences that will stop generation
  • responseMimeType (string): MIME type of the response
  • responseSchema (object): Schema for structured output
"generationConfig": {
  "temperature": 0.7,
  "topP": 0.9,
  "topK": 40,
  "maxOutputTokens": 1000,
  "stopSequences": ["END", "STOP"]
}
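As a sketch, the required fields plus any of the generationConfig options above can be assembled with a small helper (hypothetical function name):

```python
def build_request(model: str, prompt: str, **generation_config) -> dict:
    # Required fields: model and a single-turn contents array.
    body = {
        "model": model,
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
    }
    # Any generationConfig fields (temperature, topP, maxOutputTokens, ...)
    # are passed through as keyword arguments.
    if generation_config:
        body["generationConfig"] = generation_config
    return body
```

For example, `build_request("gemini-2.5-pro", "Hi", temperature=0.7, maxOutputTokens=1000)` yields a body ready to POST to the generate path.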
safetySettings (SafetySetting[], optional)
Array of safety setting objects. Each object contains:
  • category (string): The harm category (e.g., "HARM_CATEGORY_HARASSMENT")
  • threshold (string): The blocking threshold (e.g., "BLOCK_MEDIUM_AND_ABOVE")
"safetySettings": [
  {
    "category": "HARM_CATEGORY_HARASSMENT",
    "threshold": "BLOCK_MEDIUM_AND_ABOVE"
  }
]
tools (Tool[], optional)
Array of tool objects for function calling. Each tool contains functionDeclarations array with:
  • name (string): Function name
  • description (string): Function description
  • parameters (object): JSON schema defining function parameters
"tools": [
  {
    "functionDeclarations": [
      {
        "name": "get_weather",
        "description": "Get current weather information",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name"
            }
          }
        }
      }
    ]
  }
]
toolConfig (object, optional)
Configuration for function calling. Contains functionCallingConfig with:
  • mode (string): Function calling mode ("ANY", "AUTO", "NONE")
  • allowedFunctionNames (string[]): Array of allowed function names
"toolConfig": {
  "functionCallingConfig": {
    "mode": "ANY",
    "allowedFunctionNames": ["get_weather"]
  }
}
systemInstruction (string | Content, optional)
System instruction to guide the model's behavior. Can be a string or a Content object with role and parts.
"systemInstruction": {
  "role": "system",
  "parts": [
    {
      "text": "You are a weather assistant. Use the weather tool when asked about weather."
    }
  ]
}
If toolConfig.functionCallingConfig.allowedFunctionNames is provided, mode must be ANY.

Minimal example

curl -X POST \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $LD_API_KEY" \
     https://api.langdock.com/api/public/google/us/v1beta/models/gemini-2.5-pro:generateContent \
     -d '{
       "contents": [{
         "role": "user",
         "parts": [{"text": "Write a short poem about the ocean."}]
       }]
     }'

Streaming

When action is streamGenerateContent the endpoint returns a text/event-stream with the following events:
  • message_start – first chunk that contains content
  • message_delta – subsequent chunks
  • message_stop – last chunk (contains finishReason and usage metadata)
Example message_delta event:
event: message_delta
data: {
  "candidates": [
    {
      "index": 0,
      "content": {
        "role": "model",
        "parts": [{ "text": "The ocean whispers..." }]
      }
    }
  ]
}
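A minimal consumer of this stream only needs to group `event:` and `data:` lines. The sketch below assumes each data payload is valid JSON and ignores SSE features such as comments, ids, and retry fields:

```python
import json


def parse_sse(lines):
    # Yield (event_name, payload) pairs from raw text/event-stream lines.
    event, data = None, []
    for line in lines:
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:"):
            data.append(line.split(":", 1)[1].strip())
        elif line == "" and data:
            # A blank line terminates one event; multi-line data fields
            # are joined with newlines per the SSE format.
            yield event, json.loads("\n".join(data))
            event, data = None, []
```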
Python SDK example with function calling:
import google.generativeai as genai

def get_current_weather(location):
    """Get the current weather in a given location"""
    return f"The current weather in {location} is sunny with a temperature of 70 degrees and a wind speed of 5 mph."

genai.configure(
    api_key="<YOUR_LANGDOCK_API_KEY>",
    transport="rest",
    client_options={"api_endpoint": "https://api.langdock.com/api/public/google/<REGION>/"},
)

model = genai.GenerativeModel("gemini-2.5-flash", tools=[get_current_weather])

response = model.generate_content(
    "Please tell me the weather in San Francisco, then tell me a story on the history of the city"
)

print(response)
Python SDK streaming example:
model = genai.GenerativeModel("gemini-2.5-flash")

response = model.generate_content(
    "Tell me an elaborate story on the history of the city of San Francisco",
    stream=True,
)

for chunk in response:
    if chunk.text:
        print(chunk.text)

Using Google-compatible libraries

The endpoint is fully compatible with official Google SDKs, including the Vertex AI Node SDK (@google-cloud/vertexai), the Google Generative AI Python library (google-generativeai), and the Vercel AI SDK for edge streaming.