Reasoning

Enable extended thinking for complex tasks. The model reasons through problems step by step before generating its final response.


OpenAI Format

Use the /v1/chat/completions endpoint with reasoning parameters to enable extended thinking.

Using reasoning object

Pass a reasoning object in the request body to control the model's thinking behavior:
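For example, a request body that enables high-effort reasoning (the model and prompt here match the examples later on this page):

```json
{
  "model": "anthropic/claude-sonnet-4",
  "messages": [
    {"role": "user", "content": "What is 127 * 389? Think step by step."}
  ],
  "reasoning": {
    "effort": "high"
  }
}
```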

reasoning.effort and reasoning.max_tokens are mutually exclusive. Do not set both.

reasoning fields

Field       Type     Description
----------  -------  ------------------------------------------------------------
effort      string   "low", "medium", or "high". Controls reasoning depth.
max_tokens  integer  Maximum tokens for reasoning. Mutually exclusive with effort.
exclude     boolean  Exclude reasoning from the response (default: false).
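As a sketch of the other two fields, a request that caps reasoning by token count and omits the reasoning text from the response (the cap value is illustrative):

```json
{
  "model": "anthropic/claude-sonnet-4",
  "messages": [
    {"role": "user", "content": "What is 127 * 389? Think step by step."}
  ],
  "reasoning": {
    "max_tokens": 2048,
    "exclude": true
  }
}
```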

Using reasoning_effort shorthand

Set reasoning_effort at the top level of the request body: "low", "medium", or "high". This is equivalent to reasoning.effort.
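For example:

```json
{
  "model": "anthropic/claude-sonnet-4",
  "messages": [
    {"role": "user", "content": "What is 127 * 389? Think step by step."}
  ],
  "reasoning_effort": "high"
}
```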

Response

Reasoning appears in a reasoning_details array on the assistant message:

{
  "role": "assistant",
  "content": "The answer is 42.",
  "reasoning_details": [
    {
      "type": "thinking",
      "text": "Let me work through this step by step..."
    }
  ]
}

Usage

Reasoning tokens are tracked in completion_tokens_details.reasoning_tokens.
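As a sketch of reading that count from a response payload, assuming (as in OpenAI's usage accounting) that reasoning tokens are included in completion_tokens — the numbers below are illustrative, not real API output:

```python
# Illustrative usage object as it appears in the JSON response body.
usage = {
    "prompt_tokens": 24,
    "completion_tokens": 460,
    "completion_tokens_details": {"reasoning_tokens": 384},
}

# Reasoning tokens are nested under completion_tokens_details.
reasoning_tokens = usage["completion_tokens_details"]["reasoning_tokens"]

# Assumes reasoning tokens count toward completion_tokens, so the
# remainder is what was spent on the visible answer.
visible_tokens = usage["completion_tokens"] - reasoning_tokens
print(f"{reasoning_tokens} reasoning tokens, {visible_tokens} visible tokens")
```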


Anthropic Format

Use the /v1/messages endpoint with the thinking parameter to enable extended thinking.

Using thinking object

Pass a thinking object in the request body:
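For example, enabling thinking with a 4,096-token budget (the budget value is illustrative):

```json
{
  "model": "anthropic/claude-sonnet-4",
  "max_tokens": 8192,
  "thinking": {
    "type": "enabled",
    "budget_tokens": 4096
  },
  "messages": [
    {"role": "user", "content": "What is 127 * 389? Think step by step."}
  ]
}
```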

thinking fields

Field          Type     Required               Description
-------------  -------  ---------------------  ---------------------------
type           string   Required               "enabled" or "disabled".
budget_tokens  integer  Required when enabled  Token budget for thinking.

Response

Thinking appears as content blocks in the response:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me work through this step by step..."
    },
    {
      "type": "text",
      "text": "The answer is 42."
    }
  ]
}

Model Support

Reasoning is supported on most models. See the Models page for a full compatibility matrix.

Family  Models with Reasoning
------  -------------------------------------
Claude  All Claude 4.x and 3.5 Sonnet models
Gemini  All Gemini 2.5+ and 3.x models
GPT     gpt-5, gpt-5.x series, o1

Examples

cURL:

curl https://api.routerhub.ai/v1/chat/completions \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
    "reasoning": {
      "effort": "high"
    }
  }'

Python (OpenAI SDK):

from openai import OpenAI

client = OpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[
        {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
    extra_body={
        "reasoning": {"effort": "high"}
    },
)

# Access reasoning. reasoning_details is a passthrough field, so the SDK
# surfaces it as a list of plain dicts; use getattr in case it is absent.
for detail in getattr(response.choices[0].message, "reasoning_details", None) or []:
    if detail.get("type") == "thinking":
        print("Thinking:", detail.get("text"))

print("Answer:", response.choices[0].message.content)

Python (Anthropic SDK):

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.routerhub.ai",
    api_key="YOUR_API_KEY",
)

response = client.messages.create(
    model="anthropic/claude-sonnet-4",
    max_tokens=8192,
    thinking={
        "type": "enabled",
        "budget_tokens": 4096,
    },
    messages=[
        {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
)

for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)

Streaming with Reasoning

Reasoning is fully supported in streaming mode for both API formats.
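As a rough sketch of the OpenAI-format shape, reasoning deltas typically arrive on stream chunks before content deltas. The chunk payloads and the "reasoning" delta field name below are illustrative assumptions, not captured API output — consult the Streaming guide for the exact wire format:

```python
# Illustrative stream chunks: reasoning deltas arrive before content deltas.
# The "reasoning" delta field name is an assumption for this sketch.
chunks = [
    {"choices": [{"delta": {"reasoning": "Let me work through this"}}]},
    {"choices": [{"delta": {"reasoning": " step by step..."}}]},
    {"choices": [{"delta": {"content": "The answer is "}}]},
    {"choices": [{"delta": {"content": "49403."}}]},
]

# Accumulate the two streams separately as chunks come in.
reasoning, answer = [], []
for chunk in chunks:
    delta = chunk["choices"][0]["delta"]
    if delta.get("reasoning"):
        reasoning.append(delta["reasoning"])
    if delta.get("content"):
        answer.append(delta["content"])

print("Thinking:", "".join(reasoning))
print("Answer:", "".join(answer))
```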

See the Streaming guide for full details on consuming streaming responses.