Reasoning

Enable extended thinking for complex tasks. The model can reason through problems step by step before generating a final response.


OpenAI Format

Use the /v1/chat/completions endpoint with reasoning parameters to enable extended thinking.

Using reasoning object

Pass a reasoning object in the request body to control the model's thinking behavior:
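For example, using the model from the examples later on this page:

```json
{
  "model": "anthropic/claude-sonnet-4",
  "messages": [
    {"role": "user", "content": "What is 127 * 389? Think step by step."}
  ],
  "reasoning": {
    "effort": "high",
    "exclude": false
  }
}
```

To cap reasoning by token count instead, use max_tokens in place of effort.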

reasoning.effort and reasoning.max_tokens are mutually exclusive. Do not set both.

reasoning fields

| Field | Type | Description |
|---|---|---|
| `effort` | string | `"none"`, `"minimal"`, `"low"`, `"medium"`, `"high"`, or `"xhigh"`. Controls reasoning depth. |
| `max_tokens` | integer | Maximum tokens for reasoning. Mutually exclusive with `effort`. |
| `exclude` | boolean | Exclude reasoning from the response (default: `false`). |

Using reasoning_effort shorthand

Set reasoning_effort at the top level of the request body: "none", "minimal", "low", "medium", "high", or "xhigh". This is equivalent to reasoning.effort.
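For example:

```json
{
  "model": "anthropic/claude-sonnet-4",
  "messages": [
    {"role": "user", "content": "What is 127 * 389? Think step by step."}
  ],
  "reasoning_effort": "high"
}
```

This behaves identically to passing `"reasoning": {"effort": "high"}`.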

Response

Reasoning appears in a reasoning_details array on the assistant message:

```json
{
  "role": "assistant",
  "content": "The answer is 42.",
  "reasoning_details": [
    {
      "type": "thinking",
      "text": "Let me work through this step by step..."
    }
  ]
}
```

Usage

Reasoning tokens are tracked in completion_tokens_details.reasoning_tokens.
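An illustrative usage object (token counts here are made up, not from a real response):

```json
{
  "usage": {
    "prompt_tokens": 23,
    "completion_tokens": 215,
    "total_tokens": 238,
    "completion_tokens_details": {
      "reasoning_tokens": 192
    }
  }
}
```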


Anthropic Format

Use the /v1/messages endpoint with the thinking parameter to enable extended thinking.

Using thinking object

Pass a thinking object in the request body:
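For example, mirroring the Anthropic SDK example later on this page:

```json
{
  "model": "anthropic/claude-sonnet-4",
  "max_tokens": 8192,
  "thinking": {
    "type": "enabled",
    "budget_tokens": 4096
  },
  "messages": [
    {"role": "user", "content": "What is 127 * 389? Think step by step."}
  ]
}
```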

thinking fields

| Field | Type | Required | Description |
|---|---|---|---|
| `type` | string | Required | `"enabled"` or `"disabled"` |
| `budget_tokens` | integer | Required when `type` is `"enabled"` | Token budget for thinking |

Response

Thinking appears as content blocks in the response:

```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me work through this step by step..."
    },
    {
      "type": "text",
      "text": "The answer is 42."
    }
  ]
}
```

Gemini Format

Use the /v1beta/models/{model}:generateContent endpoint with generationConfig.thinkingConfig to enable extended thinking.

Using thinkingConfig object

| Field | Type | Description |
|---|---|---|
| `includeThoughts` | boolean | When `true`, the response includes thought parts with `thought: true` and a `thoughtSignature`. |
| `thinkingBudget` | integer | Maximum tokens the model may spend on thinking. |
| `thinkingLevel` | string | Thinking depth: `"LOW"`, `"MEDIUM"`, or `"HIGH"`. Native Gemini only; dropped on the cross-provider path. |
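For example, to surface thoughts with a token budget (the budget value here is illustrative):

```json
{
  "contents": [
    {"role": "user", "parts": [{"text": "What is 127 * 389? Think step by step."}]}
  ],
  "generationConfig": {
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingBudget": 1024
    }
  }
}
```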

Response

When includeThoughts is enabled, each thought appears as a part with thought: true, text, and an opaque thoughtSignature:

```json
{
  "candidates": [{
    "content": {
      "role": "model",
      "parts": [
        {
          "thought": true,
          "text": "Let me work through this step by step...",
          "thoughtSignature": "Aab..."
        },
        {"text": "The answer is 49,403."}
      ]
    },
    "finishReason": "STOP"
  }],
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 14,
    "thoughtsTokenCount": 64,
    "totalTokenCount": 88
  }
}
```

Thought round-trip contract. If you include a thought part on a follow-up turn, echo the thoughtSignature back exactly as received. Unsigned replayed function calls and thought parts are dropped silently on Gemini thinking models.
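For example, replaying a model turn like the response above in a follow-up request keeps the signature verbatim (the follow-up user message is hypothetical, and the signature is truncated here as in the response example):

```json
{
  "contents": [
    {"role": "user", "parts": [{"text": "What is 127 * 389? Think step by step."}]},
    {
      "role": "model",
      "parts": [
        {
          "thought": true,
          "text": "Let me work through this step by step...",
          "thoughtSignature": "Aab..."
        },
        {"text": "The answer is 49,403."}
      ]
    },
    {"role": "user", "parts": [{"text": "Now divide that by 7."}]}
  ]
}
```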

Cross-Provider Mapping

When the resolved model is Claude or GPT, RouterHub maps thinkingConfig into the internal reasoning config. includeThoughts and thinkingBudget carry over; thinkingLevel does not (it is native Gemini only, as noted above).
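A plausible sketch of the mapping, inferred from the field descriptions above rather than from a published spec: a Gemini-style request fragment such as

```json
{
  "generationConfig": {
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingBudget": 2048,
      "thinkingLevel": "HIGH"
    }
  }
}
```

would translate to roughly

```json
{
  "reasoning": {
    "max_tokens": 2048,
    "exclude": false
  }
}
```

with thinkingLevel dropped on the cross-provider path.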


Model Support

Reasoning is supported on most models. See the Models page for a full compatibility matrix.

| Family | Models with Reasoning |
|---|---|
| Claude | All Claude 4.x models |
| Gemini | All Gemini 2.5+ and 3.x models |
| GPT | gpt-5, gpt-5.x series |

Examples

```shell
curl https://api.routerhub.ai/v1/chat/completions \
  -H "Authorization: Bearer $ROUTERHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
    "reasoning": {
      "effort": "high"
    }
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routerhub.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[
        {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
    extra_body={
        "reasoning": {"effort": "high"}
    },
)

# reasoning_details is an extension field, so the SDK surfaces it as plain dicts
for detail in getattr(response.choices[0].message, "reasoning_details", None) or []:
    if detail.get("type") == "thinking":
        print("Thinking:", detail.get("text"))

print("Answer:", response.choices[0].message.content)
```

```python
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.routerhub.ai",
    api_key="YOUR_API_KEY",
)

response = client.messages.create(
    model="anthropic/claude-sonnet-4",
    max_tokens=8192,
    thinking={
        "type": "enabled",
        "budget_tokens": 4096,
    },
    messages=[
        {"role": "user", "content": "What is 127 * 389? Think step by step."}
    ],
)

for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)
```

Streaming with Reasoning

Reasoning is fully supported in streaming mode for all of the API formats above.

See the Streaming guide for full details on consuming streaming responses.