If you're building an application that talks to a large language model, chances are you'll be integrating with either OpenAI's /v1/chat/completions or Anthropic's /v1/messages endpoint. While both serve the same fundamental purpose—sending a conversation to an LLM and getting a response—they differ in meaningful ways across authentication, request structure, response format, tool use, streaming, and more.
This guide covers every major difference so you can make informed decisions when choosing between them or building an abstraction layer that supports both.
Authentication
| Aspect | OpenAI | Anthropic |
|---|---|---|
| Auth header | Authorization: Bearer sk-... | x-api-key: sk-ant-... |
| Version header | None required | anthropic-version: 2023-06-01 (mandatory) |
| Org/project headers | OpenAI-Organization, OpenAI-Project (optional) | N/A |
Anthropic's mandatory anthropic-version header pins the API behavior to a specific version, decoupling API versioning from model names. OpenAI versions through model names and endpoint changes instead.
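The header differences above can be captured in a small helper. This is a hypothetical sketch (the function name and shape are mine, not from either SDK); the header names and the 2023-06-01 version value come from each provider's documented requirements.

```python
def auth_headers(provider: str, api_key: str) -> dict:
    """Build the provider-specific HTTP headers for an API request."""
    if provider == "openai":
        # OpenAI uses a standard Bearer token
        return {"Authorization": f"Bearer {api_key}"}
    if provider == "anthropic":
        # Anthropic uses a custom key header plus a mandatory version header
        return {
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
        }
    raise ValueError(f"unknown provider: {provider}")
```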
System Prompts
This is one of the most visible architectural differences.
OpenAI places the system prompt inside the messages array:
{
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello"}
]
}
Anthropic uses a dedicated top-level system parameter, separate from the messages array:
{
"system": "You are a helpful assistant.",
"messages": [
{"role": "user", "content": "Hello"}
]
}
Anthropic's system field also accepts an array of content blocks (not just a string), which enables features like prompt caching on the system prompt via cache_control.
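If you are translating requests from the OpenAI shape to the Anthropic shape, the system prompt has to be hoisted out of the messages array. A minimal sketch (the helper name is hypothetical; it assumes string content and joins multiple system messages, which is one reasonable policy among several):

```python
def to_anthropic_body(messages: list) -> dict:
    """Hoist OpenAI-style system messages into Anthropic's top-level field."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    body = {"messages": rest}
    if system_parts:
        # Join multiple system messages into one string; Anthropic also
        # accepts an array of content blocks here.
        body["system"] = "\n\n".join(system_parts)
    return body
```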
Message Roles and Ordering
| OpenAI | Anthropic |
|---|---|
| system / developer (o-series) | N/A (system is top-level) |
| user | user |
| assistant | assistant |
| tool (for tool results) | N/A (tool results go inside user messages) |
OpenAI has 4 distinct roles. Anthropic has only 2 (user and assistant). Anthropic strictly requires alternating user/assistant messages—you cannot have two consecutive messages of the same role. OpenAI is more flexible and will concatenate consecutive same-role messages.
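Since Anthropic rejects consecutive same-role messages, a translation layer typically merges them the way OpenAI effectively does. A sketch, assuming string content (real content can also be a block array, which would need deeper merging):

```python
def enforce_alternation(messages: list) -> list:
    """Merge consecutive same-role messages so roles strictly alternate."""
    merged = []
    for m in messages:
        if merged and merged[-1]["role"] == m["role"]:
            # Fold this message into the previous one of the same role
            merged[-1]["content"] += "\n\n" + m["content"]
        else:
            merged.append(dict(m))
    return merged
```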
Response Format
OpenAI wraps the response in a choices array (supporting the n parameter for multiple completions):
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"choices": [{
"index": 0,
"message": {"role": "assistant", "content": "Hello!"},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 7,
"total_tokens": 20
}
}
Anthropic returns content directly as an array of typed content blocks:
{
"id": "msg_abc123",
"type": "message",
"role": "assistant",
"content": [
{"type": "text", "text": "Hello!"}
],
"stop_reason": "end_turn",
"usage": {
"input_tokens": 13,
"output_tokens": 7,
"cache_creation_input_tokens": 0,
"cache_read_input_tokens": 0
}
}
| Aspect | OpenAI | Anthropic |
|---|---|---|
| Content type | message.content is a string | content is an array of typed blocks |
| Stop indicator | finish_reason: "stop" | stop_reason: "end_turn" |
| Length stop | "length" | "max_tokens" |
| Tool call stop | "tool_calls" | "tool_use" |
| Multiple completions | Yes (via n parameter) | No (always returns 1) |
| Total tokens | Provided | Must be calculated |
| Cache stats | Not in response | Built-in (cache_creation_input_tokens, cache_read_input_tokens) |
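An abstraction layer usually normalizes both response shapes into one. A sketch of two hypothetical helpers based on the fields shown above; note that Anthropic's total must be computed from input and output tokens:

```python
def extract_text(resp: dict) -> str:
    """Pull the assistant text out of either response shape."""
    if "choices" in resp:  # OpenAI: content is a plain string
        return resp["choices"][0]["message"]["content"]
    # Anthropic: content is an array of typed blocks
    return "".join(b["text"] for b in resp["content"] if b["type"] == "text")

def total_tokens(resp: dict) -> int:
    """OpenAI reports a total; Anthropic's must be summed."""
    u = resp["usage"]
    return u.get("total_tokens", u.get("input_tokens", 0) + u.get("output_tokens", 0))
```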
Key Parameters
| Parameter | OpenAI | Anthropic |
|---|---|---|
| Max output tokens | max_completion_tokens (optional) | max_tokens (required) |
| Temperature | 0–2 (default 1) | 0–1 (default 1) |
| Top P | top_p | top_p |
| Top K | Not available | top_k |
| Frequency penalty | frequency_penalty (-2 to 2) | Not available |
| Presence penalty | presence_penalty (-2 to 2) | Not available |
| Stop sequences | stop (string or array) | stop_sequences (array) |
| Seed (reproducibility) | seed | Not available |
| Log probabilities | logprobs | Not available |
| User ID | user | metadata.user_id |
| Extended thinking | N/A (o-series reason internally) | thinking object with budget_tokens |
Two things often trip up developers migrating between them: Anthropic requires max_tokens in every request (OpenAI defaults to the model maximum), and Anthropic's temperature range caps at 1.0 while OpenAI goes up to 2.0.
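A parameter-translation sketch covering those two pitfalls plus the stop-sequence and user-ID renames from the table. The helper name and the 1024-token fallback are my assumptions, not anything either API prescribes:

```python
def translate_params(openai_params: dict) -> dict:
    """Map OpenAI request parameters to their Anthropic equivalents."""
    # Anthropic requires max_tokens; pick an arbitrary fallback if unset
    out = {"max_tokens": openai_params.get("max_completion_tokens", 1024)}
    if "temperature" in openai_params:
        # Clamp to Anthropic's 0-1 range (OpenAI allows up to 2.0)
        out["temperature"] = min(openai_params["temperature"], 1.0)
    if "stop" in openai_params:
        # OpenAI accepts a string or array; Anthropic wants an array
        stop = openai_params["stop"]
        out["stop_sequences"] = [stop] if isinstance(stop, str) else list(stop)
    if "user" in openai_params:
        out["metadata"] = {"user_id": openai_params["user"]}
    return out
```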
Tool Use / Function Calling
This is one of the largest architectural divergences between the two APIs.
Tool Definition
OpenAI wraps tools in a type/function structure:
{
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a location",
"parameters": {
"type": "object",
"properties": {"location": {"type": "string"}},
"required": ["location"]
}
}
}]
}
Anthropic uses a flatter structure with input_schema:
{
"tools": [{
"name": "get_weather",
"description": "Get weather for a location",
"input_schema": {
"type": "object",
"properties": {"location": {"type": "string"}},
"required": ["location"]
}
}]
}
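Because the JSON Schema itself is identical in both formats, converting a definition is just unwrapping and renaming. A hypothetical one-way converter:

```python
def tool_to_anthropic(tool: dict) -> dict:
    """Unwrap an OpenAI tool definition into Anthropic's flat structure."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        # The JSON Schema is unchanged; only the key name differs
        "input_schema": fn["parameters"],
    }
```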
Tool Calls in Response
OpenAI returns tool calls on a separate tool_calls array with arguments as a JSON string that must be parsed:
"tool_calls": [{
"id": "call_abc123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\":\"Paris\"}"
}
}]
Anthropic returns tool calls as content blocks with input as a parsed JSON object:
"content": [{
"type": "tool_use",
"id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
"name": "get_weather",
"input": {"location": "Paris"}
}]
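The string-vs-object difference means OpenAI tool calls need a json.loads step that Anthropic's don't. A sketch of a hypothetical normalizer:

```python
import json

def tool_call_args(call: dict) -> dict:
    """Return tool arguments as a dict regardless of provider shape."""
    if call.get("type") == "function":
        # OpenAI: arguments arrive as a JSON-encoded string
        return json.loads(call["function"]["arguments"])
    # Anthropic ("tool_use" block): input is already a parsed object
    return call["input"]
```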
Returning Tool Results
OpenAI uses a dedicated tool role:
{"role": "tool", "tool_call_id": "call_abc123", "content": "Sunny, 22C"}
Anthropic places tool results as content blocks inside a user message, with an explicit is_error flag:
{
"role": "user",
"content": [{
"type": "tool_result",
"tool_use_id": "toolu_01D7...",
"content": "Sunny, 22C",
"is_error": false
}]
}
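Building the result message for each provider can be sketched like this (helper name is hypothetical; field names follow the examples above):

```python
def tool_result_message(provider: str, call_id: str, content: str,
                        is_error: bool = False) -> dict:
    """Wrap a tool result in the message shape each API expects."""
    if provider == "openai":
        # Dedicated 'tool' role, keyed by tool_call_id
        return {"role": "tool", "tool_call_id": call_id, "content": content}
    # Anthropic: a tool_result content block inside a user message
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": call_id,
            "content": content,
            "is_error": is_error,
        }],
    }
```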
Tool Choice
| Behavior | OpenAI | Anthropic |
|---|---|---|
| Model decides | "auto" | {"type": "auto"} |
| Must use a tool | "required" | {"type": "any"} |
| Specific tool | {"type": "function", "name": "X"} | {"type": "tool", "name": "X"} |
| No tools | "none" | {"type": "none"} |
Vision / Multimodal
OpenAI uses the data URL scheme for base64 images:
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64,{BASE64_DATA}",
"detail": "high"
}
}
Anthropic uses separate fields for media type and data:
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/jpeg",
"data": "BASE64_DATA"
}
}
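Converting a base64 image between the two shapes is mostly string assembly around the data URL. A hypothetical builder:

```python
def image_block(provider: str, media_type: str, b64_data: str) -> dict:
    """Produce an image content block in the given provider's format."""
    if provider == "openai":
        # OpenAI embeds the media type inside a data URL
        return {
            "type": "image_url",
            "image_url": {"url": f"data:{media_type};base64,{b64_data}"},
        }
    # Anthropic keeps media type and payload as separate fields
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": b64_data},
    }
```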
| Aspect | OpenAI | Anthropic |
|---|---|---|
| Detail control | detail: "high"/"low"/"auto" | None |
| PDF support | Not in Chat Completions | Native document content block |
| Audio support | Yes | No |
Anthropic has a unique first-class document block type for PDFs and text files with optional citation support—a feature OpenAI's Chat Completions endpoint doesn't offer.
Streaming
Both APIs use Server-Sent Events, but with fundamentally different structures.
OpenAI uses a flat stream of unnamed data: lines, ending with data: [DONE]:
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]
Anthropic uses named event types with a structured lifecycle:
event: message_start
data: {"type":"message_start","message":{...}}
event: content_block_start
data: {"type":"content_block_start","index":0,...}
event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: message_stop
data: {"type":"message_stop"}
Anthropic's streaming is more granular with 6+ named event types covering the full message lifecycle. This makes mixed content (text interleaved with tool calls) easier to handle but adds parsing complexity. OpenAI's approach is simpler—essentially one event type plus a sentinel.
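A unified stream parser can dispatch on the payload shape. This sketch handles only text deltas (real code would also track tool-call deltas and other block types) and takes the decoded data: payload of one SSE line:

```python
import json

def text_delta(raw: str):
    """Extract the text fragment from one SSE data payload, if any."""
    if raw == "[DONE]":  # OpenAI's end-of-stream sentinel
        return None
    event = json.loads(raw)
    if "choices" in event:  # OpenAI chunk
        return event["choices"][0]["delta"].get("content")
    if event.get("type") == "content_block_delta":  # Anthropic
        delta = event["delta"]
        return delta.get("text") if delta.get("type") == "text_delta" else None
    return None  # message_start, content_block_stop, etc.
```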
Error Handling
OpenAI includes a param field indicating which parameter caused the error:
{
"error": {
"message": "Incorrect API key",
"type": "invalid_request_error",
"param": null,
"code": "invalid_api_key"
}
}
Anthropic uses a type-based discrimination pattern:
{
"type": "error",
"error": {
"type": "authentication_error",
"message": "Invalid API key"
}
}
Anthropic distinguishes overloaded_error from api_error, making it easier to implement backoff logic specifically for capacity issues.
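A retry predicate can use that distinction. A minimal sketch, assuming you already have the HTTP status and parsed error body (the policy of retrying only 429s and overloaded_error is one reasonable choice, not a documented requirement):

```python
def should_retry(status: int, body: dict) -> bool:
    """Decide whether a failed request is worth retrying with backoff."""
    if status == 429:  # rate limited by either provider
        return True
    # Anthropic signals temporary capacity problems with overloaded_error
    err_type = body.get("error", {}).get("type")
    return err_type == "overloaded_error"
```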
Rate Limiting
| Aspect | OpenAI | Anthropic |
|---|---|---|
| Header prefix | x-ratelimit- (lowercase) | anthropic-ratelimit- |
| Reset format | Relative duration (1s, 6m0s) | ISO 8601 timestamp |
| Retry header | Not standard | Retry-After |
Both return HTTP 429 for rate limit errors and recommend exponential backoff with jitter.
Unique Features
OpenAI-only
- Multiple completions — the n parameter generates N alternative responses per request
- Log probabilities — logprobs returns token-level probability information
- Structured output — response_format with json_schema for guaranteed JSON structure
- Frequency/presence penalties for controlling repetition
- Audio input/output support in multimodal messages
- Seed parameter for reproducible outputs
Anthropic-only
- Extended thinking — explicit thinking parameter with budget_tokens, returns visible thinking blocks
- Prompt caching — cache_control on content blocks with TTL options, with cache hit/miss reporting in usage
- PDF/document processing — native document content blocks with citation support
- Top K sampling — top_k parameter for controlling token selection
- Built-in server tools — web_search, code_execution, text_editor, etc. that run on Anthropic's infrastructure
- Tool error flag — is_error field on tool results
SDK Quick Reference
# OpenAI
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)
# Anthropic
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
messages=[{"role": "user", "content": "Hello"}]
)
print(response.content[0].text)
Migration Checklist
If you're switching between the two or building a unified abstraction, here are the key things to watch for:
- Move system prompts — from inside messages (OpenAI) to the top-level system field (Anthropic), or vice versa
- Set max_tokens — it's required in Anthropic, optional in OpenAI
- Clamp temperature — Anthropic caps at 1.0; OpenAI allows up to 2.0
- Restructure tool definitions — parameters vs input_schema, wrapper object differences
- Handle tool results differently — tool role (OpenAI) vs content blocks in user message (Anthropic)
- Parse tool call arguments — JSON string (OpenAI) vs parsed object (Anthropic)
- Enforce message alternation — required for Anthropic, flexible in OpenAI
- Update auth headers — Authorization: Bearer vs x-api-key + anthropic-version
- Adapt streaming parsers — flat chunks vs named lifecycle events
- Unwrap responses — choices[0].message.content vs content[0].text
Both APIs are powerful and well-designed, but they reflect different philosophies. OpenAI's Chat Completions API leans toward flexibility and backwards compatibility, while Anthropic's Messages API favors explicitness and structured data. Understanding these differences will help you build robust integrations regardless of which provider you choose.