
Responses API

POST /v1/responses

The Responses API is compatible with OpenAI's Responses API format, providing a streamlined interface for generating AI responses. This endpoint accepts the input field (instead of messages) and automatically converts it to the internal format.

Input Format

The Responses API uses input instead of messages. You can provide either a string for simple queries or an array of message objects for conversations.
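Conceptually, a string `input` is normalized into the same message array you would pass explicitly. The sketch below illustrates this behavior; the actual conversion happens server-side, and the internal representation shown is an assumption:

```python
def normalize_input(value):
    """Sketch of how the `input` field may be normalized.

    A plain string becomes a single user message; an array of message
    objects is used as-is. Illustrative only, not Apertis internals.
    """
    if isinstance(value, str):
        return [{"role": "user", "content": value}]
    return list(value)

print(normalize_input("What is the capital of France?"))
```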

HTTP Request

curl https://api.apertis.ai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <APERTIS_API_KEY>" \
  -d '{
    "model": "gpt-4.1",
    "input": "What is the capital of France?"
  }'

Authentication

| Header | Format | Example |
|---|---|---|
| Authorization | Bearer token | `Authorization: Bearer sk-your-api-key` |

Parameters

Required Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | The model to use for generating the response |
| input | string/array | The input text or an array of message objects |

Optional Parameters

| Parameter | Type | Description |
|---|---|---|
| instructions | string | High-level instructions for model behavior |
| stream | boolean | Enable streaming responses. Default: false |
| temperature | number | Sampling temperature (0-2). Default: 1 |
| max_tokens | integer | Maximum tokens in the response |
| max_output_tokens | integer | Upper bound for tokens, including reasoning tokens |
| tools | array | List of tools the model can use (including web search) |
| tool_choice | string/object | Controls tool selection behavior |
| max_tool_calls | integer | Maximum number of tool calls allowed |
| parallel_tool_calls | boolean | Whether to allow parallel tool execution |
| reasoning | object | Reasoning configuration (see below) |
| text | object | Text output configuration (see below) |
| store | boolean | Whether to store the response for state management |
| previous_response_id | string | ID of a previous response, for multi-turn conversations |
| truncation | string | Context overflow handling (e.g., "auto") |
| metadata | object | Request metadata (up to 16 key-value pairs) |
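For example, a request body combining several optional parameters (the prompt and metadata values here are illustrative):

```json
{
  "model": "gpt-4.1",
  "input": "Summarize the plot of Hamlet.",
  "temperature": 0.7,
  "max_output_tokens": 256,
  "metadata": {"request_source": "docs-example"}
}
```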

Reasoning Parameter

The reasoning parameter configures the model's reasoning behavior:

| Option | Type | Description |
|---|---|---|
| effort | string | Reasoning effort level: low, medium, high |
| summary | string | Summary style: auto, concise, detailed |

response = client.responses.create(
    model="o1-preview",
    input="Solve this complex math problem...",
    reasoning={
        "effort": "high",
        "summary": "detailed"
    }
)

Text Parameter

The text parameter configures text output:

| Option | Type | Description |
|---|---|---|
| format | object | Output format configuration (e.g., {"type": "json_object"}) |
| verbosity | string | Output verbosity: concise, normal, verbose |
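For example, a request body asking for concise JSON output (the prompt here is illustrative):

```json
{
  "model": "gpt-4.1",
  "input": "List three prime numbers as JSON.",
  "text": {
    "format": {"type": "json_object"},
    "verbosity": "concise"
  }
}
```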

Input Formats

String Input (simple query):

{
  "model": "gpt-4.1",
  "input": "What is the capital of France?"
}

Array Input (conversation):

{
  "model": "gpt-4.1",
  "input": [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "What's 2+2?"}
  ]
}

Example Usage

Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apertis.ai/v1"
)

response = client.responses.create(
    model="gpt-4.1",
    input="Explain quantum computing in simple terms."
)

print(response.output_text)

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-your-api-key',
  baseURL: 'https://api.apertis.ai/v1'
});

const response = await client.responses.create({
  model: 'gpt-4.1',
  input: 'Explain quantum computing in simple terms.'
});

console.log(response.output_text);

Web Search

Use the tools parameter to enable web search:

response = client.responses.create(
    model="gpt-4.1",
    input="What are the latest news about AI?",
    tools=[{"type": "web_search_preview"}]
)

With Reasoning

response = client.responses.create(
    model="o1-preview",
    input="Prove the Pythagorean theorem step by step.",
    reasoning={
        "effort": "high",
        "summary": "detailed"
    }
)

Streaming

stream = client.responses.create(
    model="gpt-4.1",
    input="Write a short story about a robot.",
    stream=True
)

for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
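On the client side, the text deltas can be stitched back into the full output. A minimal sketch using stand-in event objects (in practice the events come from the stream above):

```python
from dataclasses import dataclass

@dataclass
class Event:
    """Stand-in for a streaming event; real events come from the SDK."""
    type: str
    delta: str = ""

def accumulate(events):
    # Concatenate only the text deltas, ignoring lifecycle events.
    return "".join(e.delta for e in events if e.type == "response.output_text.delta")

events = [
    Event("response.created"),
    Event("response.output_text.delta", "Once upon "),
    Event("response.output_text.delta", "a time..."),
    Event("response.completed"),
]
print(accumulate(events))  # Once upon a time...
```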

Multi-turn Conversation

response = client.responses.create(
    model="gpt-4.1",
    input=[
        {"role": "user", "content": "What is Python?"},
        {"role": "assistant", "content": "Python is a high-level programming language..."},
        {"role": "user", "content": "How do I install it?"}
    ]
)

With Instructions

Use instructions to provide high-level guidance for model behavior:

response = client.responses.create(
    model="gpt-4.1",
    instructions="You are a helpful coding assistant. Always provide code examples.",
    input="How do I read a file in Python?"
)

Stateful Conversations

Use store and previous_response_id for multi-turn conversations with state:

# First request - store the response
response1 = client.responses.create(
    model="gpt-4.1",
    input="My name is Alice.",
    store=True
)

# Follow-up request - reference the previous response
response2 = client.responses.create(
    model="gpt-4.1",
    input="What's my name?",
    previous_response_id=response1.id
)

With Function Calling

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.responses.create(
    model="gpt-4.1",
    input="What's the weather in Tokyo?",
    tools=tools,
    tool_choice="auto"
)
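When the model decides to call the function, the response's output contains a function-call item whose arguments you execute locally. The item shape below is a hypothetical stand-in for illustration, not a guaranteed wire format:

```python
import json

def get_weather(location):
    # Stand-in for a real weather lookup.
    return f"Sunny in {location}"

def handle_item(item):
    """Dispatch a hypothetical function-call output item to a local function."""
    if item.get("type") == "function_call" and item.get("name") == "get_weather":
        args = json.loads(item.get("arguments", "{}"))
        return get_weather(args["location"])
    return None

item = {
    "type": "function_call",
    "name": "get_weather",
    "arguments": '{"location": "Tokyo"}',
}
print(handle_item(item))  # Sunny in Tokyo
```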

Response Format

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1699000000,
  "model": "gpt-4.1",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 8,
    "total_tokens": 20
  }
}
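To pull the plain text out of a response body like the one above, walk the output array and join the output_text parts. A minimal helper, assuming the response has been parsed into a dict:

```python
sample = {
    "id": "resp_abc123",
    "object": "response",
    "created_at": 1699000000,
    "model": "gpt-4.1",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "The capital of France is Paris."}
            ],
        }
    ],
    "usage": {"input_tokens": 12, "output_tokens": 8, "total_tokens": 20},
}

def output_text(response):
    """Join all output_text parts across message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    parts.append(part.get("text", ""))
    return "".join(parts)

print(output_text(sample))  # The capital of France is Paris.
```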

Supported Models

All chat models available through Apertis are supported with the Responses API. The API intelligently routes requests based on model capabilities:

Models with Native /v1/responses Support

These models natively support the Responses API format on upstream providers:

| Model Series | Examples |
|---|---|
| o1 Series | o1, o1-preview, o1-2024-12-17 |
| o3 Series | o3, o3-mini, o3-2025-04-16 |
| o4 Series | o4-mini, o4-mini-2025-04-16 |
| GPT-5 Series | gpt-5, gpt-5.1, gpt-5.2, gpt-5-* |
| Codex Models | gpt-5-codex, gpt-5-codex-* |

Responses-Only Models

Some advanced models only support the /v1/responses endpoint and cannot be used with /v1/chat/completions or /v1/messages:

| Model | Description |
|---|---|
| gpt-5-pro | GPT-5 Pro variant |
| gpt-5-chat-latest | Latest GPT-5 chat model |
| gpt-5-mini | GPT-5 Mini |
| gpt-5-nano | GPT-5 Nano |
| gpt-5-codex-* | GPT-5 Codex variants |
| o1-pro | o1 Pro |
| codex-mini | Codex Mini |
Note: When using responses-only models, you must use the /v1/responses endpoint. Requests to /v1/chat/completions or /v1/messages will return an error for these models.

All Other Models

For models that don't natively support /v1/responses (like Claude, Gemini, GPT-4o, etc.), Apertis automatically:

  1. Routes the request via /v1/chat/completions internally
  2. Converts the response back to Responses API format
  3. Returns the response in the expected format

This means you can use any model with the Responses API - the conversion is handled transparently.
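This conversion happens server-side; conceptually it might resemble the sketch below. The field mapping is an illustrative assumption, not the actual Apertis implementation:

```python
def chat_to_responses(chat):
    """Map a Chat Completions response dict onto the Responses API shape."""
    choice = chat["choices"][0]
    return {
        "id": chat["id"].replace("chatcmpl", "resp"),
        "object": "response",
        "created_at": chat["created"],
        "model": chat["model"],
        "output": [
            {
                "type": "message",
                "role": "assistant",
                "content": [
                    {"type": "output_text", "text": choice["message"]["content"]}
                ],
            }
        ],
        "usage": {
            "input_tokens": chat["usage"]["prompt_tokens"],
            "output_tokens": chat["usage"]["completion_tokens"],
            "total_tokens": chat["usage"]["total_tokens"],
        },
    }

chat = {
    "id": "chatcmpl-abc123",
    "created": 1699000000,
    "model": "gpt-4.1",
    "choices": [{"message": {"role": "assistant", "content": "Paris."}}],
    "usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20},
}
print(chat_to_responses(chat)["output"][0]["content"][0]["text"])  # Paris.
```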

| Provider | Example Models | Native Support |
|---|---|---|
| OpenAI | gpt-4.1, gpt-4.1-mini | Via fallback |
| Anthropic | claude-sonnet-4.5, claude-opus-4 | Via fallback |
| Google | gemini-3-pro-preview, gemini-3-flash-preview | Via fallback |
| Meta | llama-3.1-70b, llama-3.1-8b | Via fallback |
| xAI | grok-3, grok-3-reasoning | Via fallback |

Differences from Chat Completions

| Feature | Responses API | Chat Completions |
|---|---|---|
| Input field | input | messages |
| String input | Supported | Not supported |
| Instructions | instructions parameter | System message in messages |
| Reasoning config | reasoning object | reasoning_effort string |
| Text config | text object | Not available |
| State management | store, previous_response_id | Manual message array |
| Token limit | max_output_tokens | max_tokens |
| Tool call limits | max_tool_calls, parallel_tool_calls | Not available |
| Truncation | truncation parameter | Not available |