# Generate Content
Native Gemini generateContent API for text generation. Point the Google GenAI SDK at RouterHub and your request is dispatched to any registered text model — Gemini, Claude, or GPT — from a single endpoint shape.
**Dispatch.** When the resolved model is backed by Vertex Gemini, the request is passed through to the SDK unchanged, preserving every top-level field (tools, toolConfig, safetySettings, systemInstruction, cachedContent, labels, serviceTier, and thinkingConfig.thinkingLevel). When the resolved model is Claude, GPT, or a generic OpenAI-compatible provider, the request is converted to the internal OpenAI shape before dispatch; some Gemini-only fields are dropped on that cross-provider path (see Cross-Provider Routing below).
## Available Models
All registered text models are reachable through this endpoint, regardless of the underlying provider. Use the same model IDs documented in Models.
The google/ prefix is optional when calling Gemini models. Both /v1beta/models/gemini-2.5-pro:generateContent and /v1beta/models/google/gemini-2.5-pro:generateContent resolve the same way. For non-Gemini models, use the full model ID (e.g. anthropic/claude-sonnet-4.5, openai/gpt-5).
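The prefix rule can be sketched as a tiny client-side helper. This is illustrative only, under the assumption stated above; `normalize_model_id` is a hypothetical name, not part of any SDK or the RouterHub API:

```python
def normalize_model_id(model_id: str) -> str:
    """Mirror the documented rule: the google/ prefix is optional for
    Gemini models, so bare gemini-* IDs resolve to google/gemini-*."""
    if model_id.startswith("gemini-"):
        return f"google/{model_id}"
    # Non-Gemini models must already carry their full provider prefix.
    return model_id
```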
## Authentication
This endpoint supports two authentication methods:
| Method | Header | Example |
|---|---|---|
| Bearer token | Authorization | Authorization: Bearer rh_your_api_key |
| Google-style API key | x-goog-api-key | x-goog-api-key: rh_your_api_key |
## Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| contents | array | Required | Array of Content objects describing the conversation history. |
| systemInstruction | object | Optional | System-level instruction as a Content object. Role is ignored — only parts is read. |
| tools | array | Optional | Array of Tool definitions the model may call. |
| toolConfig | object | Optional | ToolConfig controlling how the model uses the declared tools. |
| safetySettings | array | Optional | Gemini safety filter thresholds. Native Gemini only — dropped on the cross-provider path. |
| generationConfig | object | Optional | GenerationConfig: temperature, token limits, response format, thinking configuration, and more. |
| cachedContent | string | Optional | Resource name of a Gemini cached content entry (e.g. projects/my-project/cachedContents/abc123). Native Gemini only. |
| labels | object | Optional | Key–value string metadata for billing breakdown. Vertex Gemini only. |
| serviceTier | string | Optional | Gemini service tier (standard, flex, priority). Native Gemini only. |
### Content
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Optional | "user" or "model". RouterHub also accepts "function" for legacy tool-response turns. Omit for the first user message. |
| parts | array | Required | Array of Part objects. A part carries exactly one of text, inlineData, fileData, functionCall, or functionResponse. |
### Part
| Field | Type | Description |
|---|---|---|
| text | string | Plain text content. |
| inlineData | object | Inline media: {"mimeType": "image/png", "data": "<base64>"}. |
| fileData | object | URI-based media: {"mimeType": "image/png", "fileUri": "gs://..."}. |
| functionCall | object | Model-emitted tool invocation: {"name": "...", "args": {...}}. Appears on role: "model" turns. |
| functionResponse | object | Tool execution result: {"name": "...", "response": {...}}. Appears inside a role: "user" turn per Gemini SDK convention. |
| thought | boolean | true when the part is an extended-thinking block. Paired with text. |
| thoughtSignature | string | Opaque base64 signature required for thought round-trips. Echo it back on follow-up turns exactly as received. See Reasoning. |
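The exactly-one-payload invariant in the table above can be checked client side before a request is sent. A minimal sketch; `part_payload` is a hypothetical helper, not a RouterHub or SDK function:

```python
# Payload fields of a Part; a valid Part carries exactly one of these.
# (thought / thoughtSignature are modifiers paired with text, not payloads.)
PAYLOAD_KEYS = ("text", "inlineData", "fileData", "functionCall", "functionResponse")

def part_payload(part: dict) -> str:
    """Return which payload field this Part carries, enforcing the
    documented exactly-one invariant."""
    present = [k for k in PAYLOAD_KEYS if k in part]
    if len(present) != 1:
        raise ValueError(f"expected exactly one payload field, got {present}")
    return present[0]
```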
### Tool
| Field | Type | Description |
|---|---|---|
| functionDeclarations | array | Array of FunctionDeclaration: {name, description, parameters} where parameters is a Gemini-flavoured JSON Schema object. |
### ToolConfig
| Field | Type | Description |
|---|---|---|
| functionCallingConfig.mode | string | "AUTO" (default), "ANY" (must call a function), "NONE" (never call), or "VALIDATED" (call-or-text with schema validation). |
| functionCallingConfig.allowedFunctionNames | array | Restrict the model to this subset of declared function names. Required to have a non-empty intersection with tools; otherwise RouterHub returns INVALID_ARGUMENT. |
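The intersection rule above can be validated locally before a request is sent, saving a round-trip that would end in INVALID_ARGUMENT. A minimal sketch under the documented rule; `validate_tool_config` is a hypothetical helper, not part of the API:

```python
def validate_tool_config(declared_names, allowed_names):
    """Enforce the documented rule: allowedFunctionNames must share at
    least one name with the declared functionDeclarations."""
    if allowed_names is None:
        return  # no restriction requested
    if not set(allowed_names) & set(declared_names):
        raise ValueError(
            "INVALID_ARGUMENT: allowedFunctionNames has no overlap "
            "with the declared functionDeclarations"
        )
```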
### GenerationConfig
| Field | Type | Description |
|---|---|---|
| temperature | number | Sampling temperature (0.0 – 2.0). |
| topP | number | Nucleus sampling threshold. |
| topK | number | Top-K sampling. |
| maxOutputTokens | integer | Maximum tokens to generate per candidate. |
| stopSequences | array | Strings that stop generation when encountered. |
| candidateCount | integer | Number of response variants to return. Cross-provider path only accepts 1 — higher values are rejected with INVALID_ARGUMENT. Native Gemini accepts the provider's allowed range. |
| responseMimeType | string | Set to "application/json" together with responseSchema for structured output. See Structured Output. |
| responseSchema | object | Gemini Schema describing the expected JSON shape. Translated to OpenAI json_schema for cross-provider dispatch. |
| responseModalities | array | Must not contain "IMAGE" on this route. For image output use Image Generation. |
| thinkingConfig | object | Extended-thinking knobs: includeThoughts, thinkingBudget, thinkingLevel. thinkingLevel is preserved on the native path and dropped cross-provider (where only includeThoughts + thinkingBudget map to our internal reasoning config). See Reasoning. |
## Response Body
| Field | Type | Description |
|---|---|---|
| candidates | array | One candidate per candidateCount. Each has content (with parts), finishReason (STOP, MAX_TOKENS, SAFETY, OTHER), and optional safetyRatings. |
| modelVersion | string | Model identifier used to serve the request. |
| responseId | string | Server-issued request identifier. |
| usageMetadata | object | Token usage: promptTokenCount, candidatesTokenCount, thoughtsTokenCount, cachedContentTokenCount, totalTokenCount. |
| promptFeedback | object | Present if the prompt was blocked. Contains blockReason and safetyRatings. |
## Streaming
Call the :streamGenerateContent action to receive the response as Server-Sent Events. Each event is a line of the form data: <partial GenerateContentResponse> followed by a blank line.
Unlike the OpenAI SSE format, the Gemini stream has no [DONE] terminator. The stream simply ends when the connection closes. Detect completion via the presence of finishReason on the final chunk, or by the closed connection itself.
Chunk contents:

- Text chunks carry `candidates[0].content.parts[0].text` with the incremental text delta.
- Thought chunks carry a part with `thought: true` and the accumulated signature.
- Tool-call chunks emit a single part with a complete `functionCall` (arguments are not streamed incrementally).
- The final chunk carries `candidates[0].finishReason` and the full `usageMetadata`.
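Putting the chunk rules together, a raw SSE consumer can be sketched as follows. These are illustrative helpers (not SDK functions), assuming each event arrives as a `data: <json>` line with no `[DONE]` sentinel, as described above:

```python
import json

def parse_sse_events(raw_lines):
    """Parse Gemini-style SSE lines ("data: <json>") into chunk dicts.
    Unlike the OpenAI format, there is no [DONE] terminator to skip."""
    chunks = []
    for line in raw_lines:
        if line.startswith("data: "):
            chunks.append(json.loads(line[len("data: "):]))
    return chunks

def is_final_chunk(chunk):
    """The final chunk is the one whose first candidate carries finishReason."""
    candidates = chunk.get("candidates") or []
    return bool(candidates) and "finishReason" in candidates[0]
```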
### Example SSE stream
```
data: {"candidates":[{"content":{"role":"model","parts":[{"text":"Hel"}]}}],"modelVersion":"gemini-2.5-pro"}

data: {"candidates":[{"content":{"role":"model","parts":[{"text":"lo"}]}}],"modelVersion":"gemini-2.5-pro"}

data: {"candidates":[{"content":{"role":"model","parts":[{"text":"!"}]},"finishReason":"STOP"}],"modelVersion":"gemini-2.5-pro","usageMetadata":{"promptTokenCount":5,"candidatesTokenCount":3,"totalTokenCount":8}}
```

## Cross-Provider Routing
When the resolved model is not Gemini-backed, RouterHub converts your request into the internal OpenAI shape before dispatching. This lets you use the Google GenAI SDK against Claude, GPT, and generic backends — at the cost of a few Gemini-only fields:
| Field | Behavior |
|---|---|
| safetySettings | Dropped silently (no equivalent on Claude / GPT). |
| cachedContent | Dropped silently. See Prompt Caching for per-provider caching. |
| labels | Dropped silently. |
| serviceTier | Dropped silently. |
| generationConfig.thinkingConfig.thinkingLevel | Dropped. includeThoughts and thinkingBudget are mapped to the internal reasoning config. |
| generationConfig.candidateCount > 1 | Rejected with INVALID_ARGUMENT. |
| generationConfig.responseModalities containing "IMAGE" | Rejected with INVALID_ARGUMENT (use the image endpoint). |
| tools / toolConfig | Translated into OpenAI tools and tool_choice. allowedFunctionNames filters the tool list; see Tool Calling for the full mode mapping. |
| functionResponse parts | Accepted inside role: "user" or role: "function" turns. RouterHub mints stable synthetic tool_call_ids internally so function responses bind to their originating call even when the underlying provider requires an explicit ID. |
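The table above can be summarized as a client-side preflight that mimics the documented drop/reject behavior, useful for predicting what survives cross-provider dispatch. This is an illustrative sketch of the rules, not RouterHub's actual conversion code:

```python
DROPPED_TOP_LEVEL = ("safetySettings", "cachedContent", "labels", "serviceTier")

def to_cross_provider(request: dict) -> dict:
    """Apply the documented cross-provider rules: drop Gemini-only
    fields, reject candidateCount > 1 and IMAGE response modalities."""
    req = {k: v for k, v in request.items() if k not in DROPPED_TOP_LEVEL}
    gen = dict(req.get("generationConfig") or {})
    if gen.get("candidateCount", 1) > 1:
        raise ValueError("INVALID_ARGUMENT: candidateCount > 1 is not supported")
    if "IMAGE" in (gen.get("responseModalities") or []):
        raise ValueError("INVALID_ARGUMENT: use the image endpoint for IMAGE output")
    thinking = dict(gen.get("thinkingConfig") or {})
    thinking.pop("thinkingLevel", None)  # dropped on the cross-provider path
    if thinking:
        gen["thinkingConfig"] = thinking
    else:
        gen.pop("thinkingConfig", None)
    if gen:
        req["generationConfig"] = gen
    return req
```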
## Examples
### Non-Streaming — Gemini (native pass-through)
```bash
curl https://api.routerhub.ai/v1beta/models/gemini-2.5-pro:generateContent \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "What is 127 * 389?"}]}
    ],
    "generationConfig": {
      "temperature": 0.2,
      "thinkingConfig": {
        "includeThoughts": true,
        "thinkingBudget": 1024
      }
    }
  }'
```

```python
from google import genai
from google.genai import types

client = genai.Client(
    api_key="YOUR_API_KEY",
    http_options={"base_url": "https://api.routerhub.ai"},
)

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="What is 127 * 389?",
    config=types.GenerateContentConfig(
        temperature=0.2,
        thinking_config=types.ThinkingConfig(
            include_thoughts=True,
            thinking_budget=1024,
        ),
    ),
)

for part in response.candidates[0].content.parts:
    if part.thought:
        print("Thinking:", part.text)
    elif part.text:
        print("Answer:", part.text)
```

### Non-Streaming — Claude / GPT (cross-provider)
Same request shape, different model ID — RouterHub converts to the internal OpenAI shape and dispatches to the resolved backend.
```bash
curl https://api.routerhub.ai/v1beta/models/anthropic/claude-sonnet-4.5:generateContent \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Write a haiku about coding"}]}
    ],
    "generationConfig": {"temperature": 0.7, "maxOutputTokens": 128}
  }'
```

```python
from google import genai
from google.genai import types

client = genai.Client(
    api_key="YOUR_API_KEY",
    http_options={"base_url": "https://api.routerhub.ai"},
)

# Any non-Gemini model works — Claude, GPT, etc.
response = client.models.generate_content(
    model="anthropic/claude-sonnet-4.5",
    contents="Write a haiku about coding",
    config=types.GenerateContentConfig(temperature=0.7, max_output_tokens=128),
)

print(response.candidates[0].content.parts[0].text)
```

### Streaming — streamGenerateContent
```bash
curl https://api.routerhub.ai/v1beta/models/gemini-2.5-pro:streamGenerateContent \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  --no-buffer \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Count from 1 to 5."}]}
    ]
  }'
```

```python
from google import genai

client = genai.Client(
    api_key="YOUR_API_KEY",
    http_options={"base_url": "https://api.routerhub.ai"},
)

stream = client.models.generate_content_stream(
    model="gemini-2.5-pro",
    contents="Count from 1 to 5.",
)

for chunk in stream:
    if chunk.text:
        print(chunk.text, end="", flush=True)

# The final chunk also carries usage_metadata and finish_reason.
```

### Function Calling
Declare tools at the top level and use toolConfig.functionCallingConfig to constrain the model. Function responses are sent back inside a role: "user" turn with a functionResponse part.
```bash
curl https://api.routerhub.ai/v1beta/models/gemini-2.5-pro:generateContent \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "What is the weather in San Francisco?"}]}
    ],
    "tools": [{
      "functionDeclarations": [{
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }]
    }],
    "toolConfig": {
      "functionCallingConfig": {
        "mode": "ANY",
        "allowedFunctionNames": ["get_weather"]
      }
    }
  }'
```

```python
from google import genai
from google.genai import types

client = genai.Client(
    api_key="YOUR_API_KEY",
    http_options={"base_url": "https://api.routerhub.ai"},
)

get_weather = types.FunctionDeclaration(
    name="get_weather",
    description="Get the current weather for a location",
    parameters={
        "type": "OBJECT",
        "properties": {"location": {"type": "STRING"}},
        "required": ["location"],
    },
)

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="What is the weather in San Francisco?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(function_declarations=[get_weather])],
        tool_config=types.ToolConfig(
            function_calling_config=types.FunctionCallingConfig(
                mode="ANY",
                allowed_function_names=["get_weather"],
            ),
        ),
    ),
)

call = response.candidates[0].content.parts[0].function_call
print(call.name, call.args)  # get_weather {'location': 'San Francisco'}
```

### Sample Response
```json
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {"text": "127 * 389 = 49,403."}
        ]
      },
      "finishReason": "STOP"
    }
  ],
  "modelVersion": "gemini-2.5-pro",
  "responseId": "req_abc123",
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 14,
    "thoughtsTokenCount": 64,
    "totalTokenCount": 88
  }
}
```

## Error Format
Errors follow Google API format — distinct from the OpenAI and Anthropic formats used elsewhere.
```json
{
  "error": {
    "code": 400,
    "message": "candidateCount > 1 is not supported when routing to non-Gemini backends",
    "status": "INVALID_ARGUMENT"
  }
}
```

| HTTP | status | Meaning |
|---|---|---|
| 400 | INVALID_ARGUMENT | Validation error: bad field, rejected cross-provider feature, or mismatched allowedFunctionNames. |
| 401 | UNAUTHENTICATED | Missing or invalid API key. |
| 404 | NOT_FOUND | Unknown model, or action other than :generateContent / :streamGenerateContent. |
| 429 | RESOURCE_EXHAUSTED | Rate limit exceeded. |
| 502 | INTERNAL | Downstream provider error. |
| 503 | UNAVAILABLE | Provider temporarily unavailable, or the resolved Gemini model is behind a multi-provider priority chain (rare configuration; see callout below). |
| 504 | DEADLINE_EXCEEDED | Upstream timeout. |
The 503 “gemini-backed model in multi-provider chain not yet supported on /v1beta endpoint” only fires when a Gemini model is configured as one of multiple alternate providers for the same slug — an uncommon setup. Direct Vertex Gemini registrations and multi-account Gemini pools both work on the native path.
See Errors for the full retry guidance.