Chat Completions

POST /v1/chat/completions

Create a chat completion. Compatible with the OpenAI Chat Completions API format.


Request Body

Field Type Description
model string Required Model ID (e.g., anthropic/claude-sonnet-4, google/gemini-2.5-pro, openai/gpt-4.1)
messages array Required Array of message objects. See Message Object below.
max_tokens integer Optional Maximum number of tokens to generate.
max_completion_tokens integer Optional Alternative to max_tokens used by OpenAI reasoning models (e.g., o1), which do not accept max_tokens.
temperature number Optional Sampling temperature (0–2). Omit to use the model default.
top_p number Optional Nucleus sampling probability mass. Omit to use the model default.
stream boolean Optional Stream response via SSE. See the Streaming guide.
stream_options object Optional {"include_usage": true} to receive token usage in the final streamed chunk.
stop string or array Optional Up to 4 stop sequences.
tools array Optional List of tool definitions. See the Tool Calling guide.
tool_choice string or object Optional "auto", "none", "required", or {"type":"function","function":{"name":"..."}}
parallel_tool_calls boolean Optional Allow the model to make multiple tool calls in parallel.
response_format object Optional {"type":"json_object"} or {"type":"json_schema","json_schema":{...}}. See the Structured Output guide.
reasoning object Optional {"effort":"low"|"medium"|"high", "max_tokens": N, "exclude": bool}. See the Reasoning guide.
reasoning_effort string Optional Shorthand: "low", "medium", or "high".
n integer Optional Number of completions to generate.
seed integer Optional Seed for best-effort deterministic sampling.
user string Optional End-user identifier for abuse monitoring.
presence_penalty number Optional Presence penalty (−2 to 2).
frequency_penalty number Optional Frequency penalty (−2 to 2).
logit_bias object Optional Map of token IDs to bias values (−100 to 100).
logprobs boolean Optional Return log probabilities of output tokens.
top_logprobs integer Optional Number of most likely tokens to return at each position (0–20). Requires logprobs: true.
service_tier string Optional Service tier preference.
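As an illustration of how these fields combine, here is a sketch of a request payload that uses response_format to ask for schema-constrained JSON output. The "location" schema is a hypothetical example; any JSON Schema object works in its place.

```python
import json

# Hypothetical extraction schema for illustration.
payload = {
    "model": "openai/gpt-4.1",
    "messages": [
        {"role": "user", "content": "Extract the city and country from: 'I live in Paris, France.'"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
            },
        },
    },
    "max_tokens": 256,
}

print(json.dumps(payload, indent=2))
```

Send this dict as the JSON body of the POST request, exactly as in the examples below.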

Message Object

Field Type Description
role string Required "system", "user", "assistant", "tool", or "developer"
content string or array Required Text content or array of content parts (for multimodal inputs such as images).
name string Optional An optional name for the participant.
tool_calls array Optional Tool calls generated by the model (present in assistant messages).
tool_call_id string Conditional Required for role="tool" messages. The ID of the tool call this message is responding to.
reasoning_details array Optional Reasoning details from the model's extended thinking process.
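To show how tool_calls and tool_call_id pair up, here is a sketch of the message array for one tool-calling round trip. The get_weather tool and the "call_abc123" ID are placeholders; in practice the ID comes from the model's previous response.

```python
import json

# Hypothetical tool-call round trip for illustration.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",  # placeholder; returned by the model
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": json.dumps({"city": "Paris"}),
                },
            }
        ],
    },
    {
        "role": "tool",
        # Must match the id of the assistant's tool call it answers.
        "tool_call_id": "call_abc123",
        "content": json.dumps({"temp_c": 18, "conditions": "clear"}),
    },
]

# The tool message is linked to the assistant's tool call by ID.
assert messages[2]["tool_call_id"] == messages[1]["tool_calls"][0]["id"]
```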

Response Body

Field Type Description
id string Unique response identifier (e.g., chatcmpl-abc123).
object string Always "chat.completion".
created integer Unix timestamp of when the response was created.
model string The model used for the completion.
choices array Array of choice objects. See Choice Object below.
usage object Token usage statistics. See Usage Object below.
system_fingerprint string System fingerprint (if available).
service_tier string The service tier used for the request.

Choice Object

Field Type Description
index integer The index of this choice in the array.
message object The assistant's message, containing role, content, and, when present, tool_calls.
finish_reason string "stop", "length", "tool_calls", or "content_filter".
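A common use of finish_reason is detecting truncation. A minimal sketch, using a hand-built response fragment for illustration:

```python
# Hand-built response fragment for illustration only.
response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The capital of France is"},
            "finish_reason": "length",
        }
    ]
}

choice = response["choices"][0]
if choice["finish_reason"] == "length":
    # The output hit max_tokens; retry with a higher limit if you need the full answer.
    print("completion was truncated")
```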

Usage Object

Field Type Description
prompt_tokens integer Number of tokens in the prompt.
completion_tokens integer Number of tokens in the generated completion.
total_tokens integer Total tokens used (prompt_tokens + completion_tokens).
prompt_tokens_details object Breakdown of prompt tokens. Contains cached_tokens (number of tokens served from cache).
completion_tokens_details object Breakdown of completion tokens. Contains reasoning_tokens (tokens used for internal reasoning).
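Cached and reasoning tokens can be read from a parsed usage object like so. The numbers are illustrative, and the detail objects are read defensively since not all providers return them.

```python
# Illustrative usage object as it would appear in a parsed response.
usage = {
    "prompt_tokens": 2048,
    "completion_tokens": 512,
    "total_tokens": 2560,
    "prompt_tokens_details": {"cached_tokens": 1024},
    "completion_tokens_details": {"reasoning_tokens": 128},
}

# Detail objects may be absent, so fall back to 0.
cached = usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
reasoning = usage.get("completion_tokens_details", {}).get("reasoning_tokens", 0)
uncached_prompt = usage["prompt_tokens"] - cached

print(f"uncached prompt tokens: {uncached_prompt}, reasoning tokens: {reasoning}")
```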

Examples

Basic Request

cURL

curl https://api.routerhub.ai/v1/chat/completions \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "max_tokens": 256
  }'
Python (requests)

import requests

response = requests.post(
    "https://api.routerhub.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "anthropic/claude-sonnet-4",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "max_tokens": 256,
    },
)
data = response.json()
print(data["choices"][0]["message"]["content"])
Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
Python (LangChain)

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
    model="anthropic/claude-sonnet-4",
    max_tokens=256,
)

response = llm.invoke("What is the capital of France?")
print(response.content)

Multi-turn Conversation

cURL

curl https://api.routerhub.ai/v1/chat/completions \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "What is photosynthesis?"},
      {"role": "assistant", "content": "Photosynthesis is the process by which plants convert sunlight, water, and carbon dioxide into glucose and oxygen."},
      {"role": "user", "content": "What is the chemical equation for it?"}
    ],
    "max_tokens": 512
  }'
Python (requests)

import requests

response = requests.post(
    "https://api.routerhub.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "google/gemini-2.5-pro",
        "messages": [
            {"role": "user", "content": "What is photosynthesis?"},
            {"role": "assistant", "content": "Photosynthesis is the process by which plants convert sunlight, water, and carbon dioxide into glucose and oxygen."},
            {"role": "user", "content": "What is the chemical equation for it?"},
        ],
        "max_tokens": 512,
    },
)
print(response.json()["choices"][0]["message"]["content"])
Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[
        {"role": "user", "content": "What is photosynthesis?"},
        {"role": "assistant", "content": "Photosynthesis is the process by which plants convert sunlight, water, and carbon dioxide into glucose and oxygen."},
        {"role": "user", "content": "What is the chemical equation for it?"},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
Python (LangChain)

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage

llm = ChatOpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
    model="google/gemini-2.5-pro",
    max_tokens=512,
)

response = llm.invoke([
    HumanMessage(content="What is photosynthesis?"),
    AIMessage(content="Photosynthesis is the process by which plants convert sunlight, water, and carbon dioxide into glucose and oxygen."),
    HumanMessage(content="What is the chemical equation for it?"),
])
print(response.content)

System Message

cURL

curl https://api.routerhub.ai/v1/chat/completions \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a helpful math tutor. Explain concepts step by step."},
      {"role": "user", "content": "Solve: 2x + 5 = 13"}
    ],
    "temperature": 0.2,
    "max_tokens": 1024
  }'
Python (requests)

import requests

response = requests.post(
    "https://api.routerhub.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "openai/gpt-4.1",
        "messages": [
            {"role": "system", "content": "You are a helpful math tutor. Explain concepts step by step."},
            {"role": "user", "content": "Solve: 2x + 5 = 13"},
        ],
        "temperature": 0.2,
        "max_tokens": 1024,
    },
)
print(response.json()["choices"][0]["message"]["content"])
Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="openai/gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor. Explain concepts step by step."},
        {"role": "user", "content": "Solve: 2x + 5 = 13"},
    ],
    temperature=0.2,
    max_tokens=1024,
)
print(response.choices[0].message.content)
Python (LangChain)

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

llm = ChatOpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
    model="openai/gpt-4.1",
    temperature=0.2,
    max_tokens=1024,
)

response = llm.invoke([
    SystemMessage(content="You are a helpful math tutor. Explain concepts step by step."),
    HumanMessage(content="Solve: 2x + 5 = 13"),
])
print(response.content)

Sample Response

{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1709251200,
  "model": "anthropic/claude-sonnet-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris. It is the largest city in France and serves as the country's political, economic, and cultural center."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 32,
    "total_tokens": 46,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "system_fingerprint": null,
  "service_tier": "default"
}