Claude API Complete Guide | Anthropic SDK Usage, Pricing & Function Calling
Key Takeaways
The Claude API is Anthropic's conversational AI API. It is comparable to the OpenAI API but offers a longer context window (200K tokens), powerful prompt caching, and strong code generation.
What is the Claude API?
The Claude API gives you programmatic access to Anthropic's Claude models. Alongside GPT-4 and Gemini, it is one of the most widely used LLM APIs in 2026, and it is especially strong at long-document processing, code generation, and safety.
This guide covers everything from API key setup to advanced features, with Python and TypeScript examples you can use in production right away.
# Claude API in 30 seconds
import anthropic
client = anthropic.Anthropic(api_key="your-api-key")
message = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
messages=[{"role": "user", "content": "Implement the Fibonacci sequence in Python"}]
)
print(message.content[0].text)
Table of Contents
- Model Comparison & Pricing
- Getting Your API Key
- Installing the SDK
- Basic Messages API
- Streaming
- System Prompts
- Multi-turn Conversations
- Tool Use (Function Calling)
- Vision (Image Input)
- Prompt Caching
- Production Pattern: Error Handling & Retries
- ChatGPT API vs Claude API
Model Comparison & Pricing
| Model | Context | Input ($/MTok) | Output ($/MTok) | Notes |
|---|---|---|---|---|
| claude-haiku-4-5 | 200K | $0.80 | $4.00 | Fastest, lowest cost |
| claude-sonnet-4-5 | 200K | $3.00 | $15.00 | Balanced performance |
| claude-opus-4-5 | 200K | $15.00 | $75.00 | Highest capability |
| claude-sonnet-4-6 | 200K | $3.00 | $15.00 | Latest, improved reasoning |
Production tip: run development and testing against Haiku, which costs a fraction of Sonnet, and reserve Sonnet (or Opus) for production traffic to cut costs significantly.
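As a quick illustration of the table above, here is a back-of-the-envelope cost estimator. The prices are hardcoded from the table and subject to change, so check Anthropic's pricing page before relying on them:

```python
# Rough per-request cost estimate from the pricing table above.
# Prices are in $/million tokens and may change over time.
PRICES = {
    "claude-haiku-4-5": (0.80, 4.00),
    "claude-sonnet-4-5": (3.00, 15.00),
    "claude-opus-4-5": (15.00, 75.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Example: a 10K-token prompt with a 1K-token answer on Sonnet
print(f"${estimate_cost('claude-sonnet-4-5', 10_000, 1_000):.4f}")  # $0.0450
```

Plugging the same request into Haiku instead gives $0.0120, which is where the "develop on Haiku" tip above comes from.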
Getting Your API Key
- Go to console.anthropic.com
- Create an account (Google/GitHub login supported)
- API Keys → Create Key
- Copy and store the key securely (it cannot be viewed again)
# Store in a .env file
ANTHROPIC_API_KEY=sk-ant-api03-...
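The SDK reads ANTHROPIC_API_KEY from the environment, so the .env file just needs to be loaded before the client is constructed. The python-dotenv package is the usual choice; as a dependency-free sketch, a minimal loader could look like this (load_env is a hypothetical helper written for illustration, not part of any SDK):

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    """Set KEY=VALUE lines from a .env file as environment variables.
    Existing variables are not overwritten; comments and blanks are skipped."""
    env_file = Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

load_env()  # after this, anthropic.Anthropic() finds ANTHROPIC_API_KEY
```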
Installing the SDK
Python
pip install anthropic
TypeScript / Node.js
npm install @anthropic-ai/sdk
# or
pnpm add @anthropic-ai/sdk
Basic Messages API
Python
Create a client and send a minimal request:
import anthropic
client = anthropic.Anthropic() # Reads ANTHROPIC_API_KEY from environment automatically
message = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
messages=[
{"role": "user", "content": "What is the capital of France?"}
]
)
# Response structure
print(message.content[0].text) # Paris.
print(message.model) # claude-sonnet-4-5
print(message.usage) # token usage
TypeScript
The same request with the TypeScript SDK:
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const message = await client.messages.create({
model: "claude-sonnet-4-5",
max_tokens: 1024,
messages: [{ role: "user", content: "What is the capital of France?" }],
});
console.log(message.content[0].type === "text" ? message.content[0].text : "");
Response Object Structure
The full response object is JSON like this:
{
"id": "msg_01XFDUDYJgAACzvnptvVoYEL",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "Paris."
}
],
"model": "claude-sonnet-4-5",
"stop_reason": "end_turn",
"usage": {
"input_tokens": 14,
"output_tokens": 5
}
}
Streaming
Streaming is essential for showing responses in real time in conversational apps.
Python (Synchronous)
Use messages.stream as a context manager and iterate over text_stream:
import anthropic
client = anthropic.Anthropic()
with client.messages.stream(
model="claude-sonnet-4-5",
max_tokens=1024,
messages=[{"role": "user", "content": "Explain Python's GIL"}]
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
# Access the final message object after completion
final_message = stream.get_final_message()
print(f"\n\nTotal tokens: {final_message.usage.input_tokens + final_message.usage.output_tokens}")
Python (Asynchronous)
The asynchronous client (AsyncAnthropic) follows the same pattern with async/await:
import asyncio
import anthropic
client = anthropic.AsyncAnthropic()
async def stream_response():
async with client.messages.stream(
model="claude-sonnet-4-5",
max_tokens=1024,
messages=[{"role": "user", "content": "What is async programming?"}]
) as stream:
async for text in stream.text_stream:
print(text, end="", flush=True)
asyncio.run(stream_response())
TypeScript (Streaming)
Streaming with the TypeScript SDK:
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const stream = await client.messages.stream({
model: "claude-sonnet-4-5",
max_tokens: 1024,
messages: [{ role: "user", content: "Explain JavaScript's event loop" }],
});
for await (const chunk of stream) {
if (
chunk.type === "content_block_delta" &&
chunk.delta.type === "text_delta"
) {
process.stdout.write(chunk.delta.text);
}
}
System Prompts
Define the model’s role and behavior.
Pass the role description in the system parameter:
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
system="""You are a senior backend engineer with 10 years of experience.
- Proficient in Python, Go, and Rust
- You prioritize performance optimization and system design
- You provide practical and specific feedback during code reviews
- You answer in English""",
messages=[
{"role": "user", "content": "Fix the N+1 query problem in this Django code:\n\n```python\ndef get_posts():\n posts = Post.objects.all()\n return [(p.title, p.author.name) for p in posts]\n```"}
]
)
print(message.content[0].text)
Multi-turn Conversations
Maintain conversation history for contextual dialogue.
import anthropic
client = anthropic.Anthropic()
conversation_history = []
def chat(user_message: str) -> str:
conversation_history.append({
"role": "user",
"content": user_message
})
response = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=2048,
system="You are a friendly programming tutor.",
messages=conversation_history
)
assistant_message = response.content[0].text
conversation_history.append({
"role": "assistant",
"content": assistant_message
})
return assistant_message
# Example conversation
print(chat("What is Python?"))
print(chat("Show me a Hello World example in Python"))
print(chat("Now add a variable to that example"))
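One caveat with this pattern: conversation_history grows without bound, and every call resends the entire list, so input tokens (and cost) grow with each turn. A simple sketch of one mitigation is to keep only the most recent turns. The cutoff of 20 messages and the trim_history helper here are illustrative choices, not SDK features:

```python
MAX_MESSAGES = 20  # arbitrary cutoff for illustration

def trim_history(history: list) -> list:
    """Keep only the most recent messages, dropping a leading assistant
    message so the trimmed list still starts with a user turn."""
    if len(history) <= MAX_MESSAGES:
        return history
    trimmed = history[-MAX_MESSAGES:]
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed

# Call this before each request, e.g.:
# response = client.messages.create(..., messages=trim_history(conversation_history))
```

For conversations where early context matters, a common alternative is to summarize older turns into a single message instead of dropping them.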
Tool Use (Function Calling)
Let Claude call external functions to process real-time information.
import anthropic
import json
client = anthropic.Anthropic()
# Tool definitions
tools = [
{
"name": "get_weather",
"description": "Get current weather for a city",
"input_schema": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "City name (e.g. London, Tokyo)"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit"
}
},
"required": ["city"]
}
},
{
"name": "get_stock_price",
"description": "Get stock price",
"input_schema": {
"type": "object",
"properties": {
"ticker": {
"type": "string",
"description": "Stock ticker symbol (e.g. AAPL, TSLA)"
}
},
"required": ["ticker"]
}
}
]
# Actual function implementations
def get_weather(city: str, unit: str = "celsius") -> dict:
# In production, call a real weather API
return {"city": city, "temperature": 22, "unit": unit, "condition": "Sunny"}
def get_stock_price(ticker: str) -> dict:
# In production, call a real stock API
return {"ticker": ticker, "price": 180.50, "currency": "USD"}
def process_tool_call(tool_name: str, tool_input: dict) -> str:
if tool_name == "get_weather":
result = get_weather(**tool_input)
elif tool_name == "get_stock_price":
result = get_stock_price(**tool_input)
else:
result = {"error": f"Unknown tool: {tool_name}"}
return json.dumps(result)
def chat_with_tools(user_message: str) -> str:
messages = [{"role": "user", "content": user_message}]
while True:
response = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
tools=tools,
messages=messages
)
# No tool calls — return final response
if response.stop_reason == "end_turn":
return response.content[0].text
# Handle tool calls
if response.stop_reason == "tool_use":
messages.append({"role": "assistant", "content": response.content})
tool_results = []
for block in response.content:
if block.type == "tool_use":
result = process_tool_call(block.name, block.input)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": result
})
messages.append({"role": "user", "content": tool_results})
# Run
print(chat_with_tools("What's the weather in London and what's Apple's stock price?"))
Vision (Image Input)
Claude can understand and analyze images.
import anthropic
import base64
from pathlib import Path
client = anthropic.Anthropic()
# Method 1: Base64 encoding
def analyze_image_file(image_path: str, question: str) -> str:
image_data = Path(image_path).read_bytes()
base64_image = base64.standard_b64encode(image_data).decode("utf-8")
ext = Path(image_path).suffix.lower()
media_types = {".jpg": "image/jpeg", ".png": "image/png",
".gif": "image/gif", ".webp": "image/webp"}
media_type = media_types.get(ext, "image/jpeg")
message = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "base64",
"media_type": media_type,
"data": base64_image,
},
},
{"type": "text", "text": question}
],
}
],
)
return message.content[0].text
# Method 2: URL (public images)
def analyze_image_url(image_url: str, question: str) -> str:
message = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "image",
"source": {"type": "url", "url": image_url},
},
{"type": "text", "text": question}
],
}
],
)
return message.content[0].text
# Usage
result = analyze_image_file("screenshot.png", "What does this error message mean?")
print(result)
Prompt Caching
Reduce costs by up to 90% for repeated long contexts.
import anthropic
client = anthropic.Anthropic()
# Cache a long system prompt or document
LARGE_DOCUMENT = """
[Very long technical document — 5000+ tokens...]
""" # In practice, thousands of tokens
def ask_about_document(question: str) -> str:
message = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
system=[
{
"type": "text",
"text": "You are a technical documentation expert. Answer questions based on the document below.",
},
{
"type": "text",
"text": LARGE_DOCUMENT,
"cache_control": {"type": "ephemeral"} # Enable caching (5 minutes)
}
],
messages=[{"role": "user", "content": question}]
)
# Check cache hit
usage = message.usage
print(f"Input tokens: {usage.input_tokens}")
print(f"Cache created: {getattr(usage, 'cache_creation_input_tokens', 0)}")
print(f"Cache read: {getattr(usage, 'cache_read_input_tokens', 0)}")
return message.content[0].text
# First call: creates cache
print(ask_about_document("Summarize the main points of this document"))
# Second call: cache hit (90% cost savings)
print(ask_about_document("What are the performance optimization techniques in this document?"))
Production Pattern: Error Handling & Retries
import anthropic
import time
from typing import Optional
client = anthropic.Anthropic()
def create_message_with_retry(
messages: list,
model: str = "claude-sonnet-4-5",
max_tokens: int = 1024,
system: Optional[str] = None,
max_retries: int = 3,
base_delay: float = 1.0,
) -> str:
"""Reliable API call with retry logic"""
for attempt in range(max_retries):
try:
kwargs = {
"model": model,
"max_tokens": max_tokens,
"messages": messages,
}
if system:
kwargs["system"] = system
response = client.messages.create(**kwargs)
return response.content[0].text
except anthropic.RateLimitError:
# Rate limit exceeded — exponential backoff
if attempt < max_retries - 1:
delay = base_delay * (2 ** attempt)
print(f"Rate limit hit. Retrying in {delay}s ({attempt + 1}/{max_retries})")
time.sleep(delay)
else:
raise
except anthropic.APIStatusError as e:
if e.status_code == 529: # Overloaded
if attempt < max_retries - 1:
delay = base_delay * (2 ** attempt)
print(f"Server overloaded. Retrying in {delay}s")
time.sleep(delay)
else:
raise
else:
raise # Propagate other errors immediately
except anthropic.APIConnectionError:
if attempt < max_retries - 1:
time.sleep(base_delay)
else:
raise
# Usage
result = create_message_with_retry(
messages=[{"role": "user", "content": "Hello!"}],
system="You are a helpful assistant.",
)
print(result)
ChatGPT API vs Claude API
| Feature | Claude API | ChatGPT API |
|---|---|---|
| Max Context | 200K tokens | 128K tokens |
| Prompt Caching | Yes (up to 90% savings) | Yes (automatic) |
| Code Generation | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Document Analysis | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| JSON Mode | Yes | Yes |
| Vision | Yes | Yes |
| Ecosystem | Growing | Very broad |
| Price (Sonnet level) | $3/$15 | $5/$15 |
Conclusion
The Claude API excels at long document processing, code review, and complex reasoning. Using prompt caching aggressively can significantly cut your costs.
Next steps: