Reasoning
Enable extended thinking for complex tasks. The model can reason through problems step-by-step before generating a final response.
OpenAI Format
Use the /v1/chat/completions endpoint with reasoning parameters to enable extended thinking.
Using reasoning object
Pass a reasoning object in the request body to control the model's thinking behavior:
- `reasoning.effort`: `"low"`, `"medium"`, or `"high"` — controls how much thinking the model does
- `reasoning.max_tokens`: integer — set a specific token budget for reasoning (mutually exclusive with `effort`)
- `reasoning.exclude`: boolean — if `true`, reasoning tokens are not included in the response
`reasoning.effort` and `reasoning.max_tokens` are mutually exclusive. Do not set both.
reasoning fields
| Field | Type | Description |
|---|---|---|
| effort | string | "low", "medium", or "high". Controls reasoning depth |
| max_tokens | integer | Maximum tokens for reasoning. Mutually exclusive with effort |
| exclude | boolean | Exclude reasoning from response (default: false) |
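Since `effort` and `max_tokens` cannot be combined, a client-side guard can catch the mistake before a request is sent. This is a hypothetical helper, not part of any SDK; the API enforces the same rule server-side:

```python
def validate_reasoning(reasoning):
    """Reject reasoning objects that set both effort and max_tokens.

    Hypothetical client-side check; the API rejects such requests anyway.
    """
    if "effort" in reasoning and "max_tokens" in reasoning:
        raise ValueError(
            "reasoning.effort and reasoning.max_tokens are mutually exclusive"
        )
    return reasoning
```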
Using reasoning_effort shorthand
Set reasoning_effort at the top level of the request body: "low", "medium", or "high". This is equivalent to reasoning.effort.
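The two forms produce equivalent requests. A sketch of both request bodies as Python dicts (illustrative only; the endpoint is not called here, and field names follow this page):

```python
# Nested reasoning object
body_object = {
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "reasoning": {"effort": "high"},
}

# Top-level shorthand, equivalent to the nested form above
body_shorthand = {
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "reasoning_effort": "high",
}
```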
Response
Reasoning appears in a reasoning_details array on the assistant message:
```json
{
  "role": "assistant",
  "content": "The answer is 42.",
  "reasoning_details": [
    {
      "type": "thinking",
      "text": "Let me work through this step by step..."
    }
  ]
}
```

Usage
Reasoning tokens are tracked in completion_tokens_details.reasoning_tokens.
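For example, given a usage payload shaped like the description above (the numbers are made up for illustration), the reasoning share can be read out directly:

```python
# Illustrative usage object from a chat completion response
usage = {
    "prompt_tokens": 25,
    "completion_tokens": 460,
    "completion_tokens_details": {"reasoning_tokens": 380},
}

# Reasoning tokens are counted inside completion_tokens
reasoning_tokens = usage["completion_tokens_details"]["reasoning_tokens"]
visible_tokens = usage["completion_tokens"] - reasoning_tokens
```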
Anthropic Format
Use the /v1/messages endpoint with the thinking parameter to enable extended thinking.
Using thinking object
Pass a thinking object in the request body:
- `thinking.type`: `"enabled"` or `"disabled"`
- `thinking.budget_tokens`: integer — required when type is `"enabled"`, sets the thinking token budget
thinking fields
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Required | "enabled" or "disabled" |
| budget_tokens | integer | Required when enabled | Token budget for thinking |
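The field rules above can be captured in a small builder. This is a hypothetical helper for illustration, not part of the Anthropic SDK:

```python
def make_thinking(enabled, budget_tokens=None):
    """Build a thinking parameter per the field rules above.

    budget_tokens is required when type is "enabled".
    """
    if not enabled:
        return {"type": "disabled"}
    if budget_tokens is None:
        raise ValueError('budget_tokens is required when type is "enabled"')
    return {"type": "enabled", "budget_tokens": budget_tokens}
```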
Response
Thinking appears as content blocks in the response:
```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me work through this step by step..."
    },
    {
      "type": "text",
      "text": "The answer is 42."
    }
  ]
}
```

Model Support
Reasoning is supported on most models. See the Models page for a full compatibility matrix.
| Family | Models with Reasoning |
|---|---|
| Claude | All Claude 4.x and 3.5 Sonnet models |
| Gemini | All Gemini 2.5+ and 3.x models |
| GPT | gpt-5, gpt-5.x series, o1 |
Examples
```shell
curl https://api.routerhub.ai/v1/chat/completions \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
    "reasoning": {
      "effort": "high"
    }
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[
        {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
    extra_body={
        "reasoning": {"effort": "high"}
    },
)

# Access reasoning
for detail in response.choices[0].message.reasoning_details or []:
    if detail.type == "thinking":
        print("Thinking:", detail.text)
print("Answer:", response.choices[0].message.content)
```

```python
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.routerhub.ai",
    api_key="YOUR_API_KEY",
)

response = client.messages.create(
    model="anthropic/claude-sonnet-4",
    max_tokens=8192,
    thinking={
        "type": "enabled",
        "budget_tokens": 4096,
    },
    messages=[
        {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
)

for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)
```

Streaming with Reasoning
Reasoning is fully supported in streaming mode for both API formats:
- OpenAI format: Reasoning details appear as `reasoning_details` deltas in streamed chunks.
- Anthropic format: Thinking appears as `thinking_delta` events in the SSE stream.
See the Streaming guide for full details on consuming streaming responses.
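The Anthropic-format delta flow can be sketched offline. The sample events below are illustrative stand-ins for real SSE chunks, not captured API output:

```python
# Accumulate thinking and answer text from Anthropic-format stream events
# (sample events are hand-written for illustration).
events = [
    {"type": "content_block_delta",
     "delta": {"type": "thinking_delta", "thinking": "Step 1..."}},
    {"type": "content_block_delta",
     "delta": {"type": "thinking_delta", "thinking": " Step 2..."}},
    {"type": "content_block_delta",
     "delta": {"type": "text_delta", "text": "The answer is 42."}},
]

thinking, answer = "", ""
for event in events:
    delta = event.get("delta", {})
    if delta.get("type") == "thinking_delta":
        thinking += delta["thinking"]
    elif delta.get("type") == "text_delta":
        answer += delta["text"]
```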