How to use the ChatGPT API in Python

Learn how to use the ChatGPT API in Python. Explore different methods, tips, real-world applications, and how to debug common errors.

Published on: Tue, Feb 24, 2026
Updated on: Mon, Apr 6, 2026
The Replit Team

The ChatGPT API lets you integrate powerful language models into your Python applications. You can build advanced AI features for content generation, chatbots, and automated workflows with just a few lines of code.

Here, you'll learn essential techniques and practical tips for the API. You'll explore real-world applications and get debugging advice to help you overcome common challenges and build robust solutions.

Using the OpenAI client for a basic ChatGPT request

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)

print(response.choices[0].message.content)

--OUTPUT--
I'm doing well, thank you for asking! How can I assist you today?

This example uses the client.chat.completions.create() method to send a request. The two most important arguments you'll pass are model and messages.

The messages parameter is a list of dictionaries, which lets you build a conversation history. This is key for creating context-aware interactions. Each message object needs:

  • role: Who is speaking—typically "user", "assistant", or "system".
  • content: The text of the message itself.

The API's reply isn't just plain text. You access the generated message content through the response object at response.choices[0].message.content.

Foundation techniques

Building on that example, you'll need to handle API key authentication, parse response data correctly, and manage conversation context for more advanced interactions. These skills build on the fundamentals of calling APIs in Python.

Setting up authentication with API keys

import os
from openai import OpenAI

# Load API key from environment variable
api_key = os.environ.get("OPENAI_API_KEY")
client = OpenAI(api_key=api_key)

print(f"OpenAI client initialized with API key: {api_key[:5]}...")

--OUTPUT--
OpenAI client initialized with API key: sk-te...

Instead of writing your API key directly into your code (a major security risk), it's best practice to load it from an environment variable. This keeps your secret key separate from your source code, so you can safely share your project without exposing your credentials. Learn more about using environment variables in Python for secure configuration management.

  • The code uses Python's built-in os module to read system environment variables.
  • The os.environ.get("OPENAI_API_KEY") call specifically looks for and retrieves the value you've set for your key.
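One gotcha with this pattern: if the variable isn't set, os.environ.get returns None, and every request then fails with a confusing authentication error. A minimal fail-fast sketch (load_api_key is a hypothetical helper name, not part of the openai library) surfaces the problem at startup instead:

```python
import os

def load_api_key(var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, failing fast if it is missing."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell or add it to a .env file."
        )
    return key

# Illustrative only: set a dummy value so this demo runs without a real key.
os.environ.setdefault("OPENAI_API_KEY", "sk-dummy-key-for-demo")
print(load_api_key()[:5])
```

You would then pass the returned value straight to `OpenAI(api_key=load_api_key())`.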

Parsing response data from the API

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Tell me a joke"}]
)

content = response.choices[0].message.content
tokens_used = response.usage.total_tokens
print(f"Response: {content}\nTokens used: {tokens_used}")

--OUTPUT--
Response: Why don't scientists trust atoms? Because they make up everything!
Tokens used: 28

The response object is more than just the text reply; it's a structured object with useful metadata. Besides the message content, you can access important details like token usage.

  • The response.usage object contains the token count for your request.
  • Tracking this with response.usage.total_tokens is crucial for managing costs, since you're billed per token.
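Token counts translate directly into dollars, so it's worth turning them into a cost estimate. Here's a sketch of such a calculation; the per-million-token rates below are purely illustrative placeholders, not OpenAI's actual prices, so check the official pricing page before relying on the numbers:

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  prompt_rate=0.50, completion_rate=1.50):
    """Estimate request cost in USD. Rates are per 1M tokens and purely
    illustrative -- check OpenAI's pricing page for current numbers."""
    return (prompt_tokens * prompt_rate
            + completion_tokens * completion_rate) / 1_000_000

# Example with the token counts from a small request
print(f"${estimate_cost(15, 13):.8f}")
```

The response object also exposes `response.usage.prompt_tokens` and `response.usage.completion_tokens` separately, which is what makes this split-rate calculation possible.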

Managing conversation context with messages

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the World Cup in 2022?"},
]
response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "Who was the top scorer?"})
print(client.chat.completions.create(model="gpt-3.5-turbo", messages=messages).choices[0].message.content)

--OUTPUT--
Kylian Mbappé of France was the top scorer in the 2022 World Cup with 8 goals.

The messages list is the key to creating stateful conversations. By appending each new user query and the assistant's response back into the list, you build a running history. Understanding accessing dictionaries in Python is essential since each message is a dictionary with role and content keys. When you send this updated list in your next API call, the model has the full context to understand follow-up questions, like asking "Who was the top scorer?" after the initial query.

  • The system role sets the assistant's overall behavior from the start.
  • The user and assistant roles build the turn-by-turn dialogue.
  • This method is how you give the model a memory of the conversation.
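The append-as-you-go pattern can be wrapped in a small helper so every turn is recorded consistently. This is a sketch, not part of the openai library; you would pass `convo.messages` as the `messages` argument of each `create()` call:

```python
class Conversation:
    """Minimal helper that keeps the running messages list for you."""

    def __init__(self, system_prompt):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text):
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text):
        self.messages.append({"role": "assistant", "content": text})

convo = Conversation("You are a helpful assistant.")
convo.add_user("Who won the World Cup in 2022?")
convo.add_assistant("Argentina won the 2022 World Cup.")  # stand-in for a real API reply
convo.add_user("Who was the top scorer?")
print(len(convo.messages))  # 4 messages, ready to send as the next request
```

Keep in mind the whole history is re-sent (and re-billed) on every call, so long conversations may eventually need trimming or summarizing.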

Advanced features and optimizations

Beyond simple requests, you can improve user experience and application stability by streaming responses with stream=True, adjusting model parameters, and implementing smart error handling.

Streaming responses with stream=True

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

--OUTPUT--
1, 2, 3, 4, 5

By setting stream=True in your API call, you instruct the model to send its response in small pieces, or chunks, instead of making you wait for the full text. This approach dramatically improves the user experience by displaying text as it's generated.

  • The response object becomes an iterable, so you can loop through it to process each chunk as it arrives.
  • You access the new text in each piece using chunk.choices[0].delta.content.
  • It's perfect for building applications that need a real-time, typewriter-like effect.
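Often you want to display chunks as they arrive and still keep the complete reply for later (logging, appending to the conversation history). A sketch of that accumulation pattern is below; `fake_chunk` only mimics the shape of real streaming chunks so the example runs without an API key, and it assumes some chunks carry `content=None`, as the final chunk of a real stream typically does:

```python
from types import SimpleNamespace

def collect_stream(stream):
    """Print chunks as they arrive and return the full assembled reply."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # skip chunks that carry no text
            print(delta, end="")
            parts.append(delta)
    return "".join(parts)

# Illustrative stand-ins mimicking the shape of real streaming chunks.
def fake_chunk(text):
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

full = collect_stream([fake_chunk("1, "), fake_chunk("2, "), fake_chunk(None), fake_chunk("3")])
print("\nFull reply:", full)
```

With a real stream you would call `collect_stream(stream)` directly on the object returned by `create(..., stream=True)`.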

Customizing behavior with model parameters

creative_response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short poem about Python"}],
    temperature=0.9,
    max_tokens=50
)
print(creative_response.choices[0].message.content)

--OUTPUT--
In code's embrace, Python slithers bright,
Elegant, simple, a programmer's delight.
With indentation's grace, logic flows clear,
A language beloved, both far and near.

You can fine-tune the model's output using optional parameters in your API call. This gives you more control over the creativity and length of the response.

  • temperature: This parameter controls randomness. A higher value like 0.9 encourages more creative or unexpected outputs, which is great for tasks like writing a poem.
  • max_tokens: This sets a strict limit on the response length. Using max_tokens=50 ensures the reply is concise and helps you manage token usage and costs.
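Since the API rejects out-of-range values at request time, it can be handy to validate parameters locally first. This is a hypothetical helper (validate_params is not part of the openai library), using the documented ranges of temperature between 0 and 2 and a positive max_tokens:

```python
def validate_params(temperature=1.0, max_tokens=None):
    """Catch out-of-range values locally, before the API rejects the request."""
    if not 0 <= temperature <= 2:
        raise ValueError(f"temperature must be between 0 and 2, got {temperature}")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError(f"max_tokens must be a positive integer, got {max_tokens}")
    return {"temperature": temperature, "max_tokens": max_tokens}

params = validate_params(temperature=0.9, max_tokens=50)
print(params)
```

You could then splat the result into the call with `client.chat.completions.create(model=..., messages=..., **params)`.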

Implementing error handling and retries

import time
from openai import OpenAI, RateLimitError

client = OpenAI(api_key="your-api-key-here")
for attempt in range(3):
    try:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Hello"}]
        )
        print(response.choices[0].message.content)
        break
    except RateLimitError:
        wait_time = 2 ** attempt
        print(f"Rate limit hit. Retrying in {wait_time} seconds...")
        time.sleep(wait_time)

--OUTPUT--
Hello! How can I assist you today?

API calls can fail from network issues or rate limits, so building in resilience is crucial. This code wraps the request in a try...except block within a for loop to automatically retry a failed request. For more robust error handling, you can learn about handling multiple exceptions in Python. This simple pattern prevents temporary errors from crashing your application.

  • If an error occurs, the code waits before trying again using an exponential backoff strategy.
  • The delay, calculated with 2 ** attempt, doubles after each failure. It’s a standard and respectful way to handle API rate limits.
  • Once the request succeeds, break exits the loop.
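If several call sites need this behavior, the loop can be factored into a reusable helper. The sketch below (with_retries is a hypothetical name) adds a small random jitter to the backoff, a common refinement that spreads out retries from many clients; it's demonstrated with a stand-in flaky function rather than a real API call:

```python
import random
import time

def with_retries(fn, retries=3, base_delay=1.0):
    """Call fn, retrying on failure with exponential backoff plus jitter."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the original error
            delay = base_delay * 2 ** attempt + random.uniform(0, 0.1)
            time.sleep(delay)

# Demo: a stand-in function that fails once, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("transient failure")
    return "ok"

print(with_retries(flaky, base_delay=0.01))
```

In real code you would narrow the `except` clause to the transient errors you actually want to retry, such as `openai.RateLimitError`, so that bugs like an invalid request aren't retried pointlessly.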

Move faster with Replit

Replit is an AI-powered development platform that comes with all Python dependencies pre-installed, so you can skip setup and start coding instantly. This lets you move from learning individual techniques, like managing API calls, to building complete applications.

With Agent 4, you can take an idea to a working product by describing what you want to build. Instead of piecing together code, you can create entire tools that leverage the ChatGPT API, such as:

  • A customer support chatbot that maintains conversation history to provide context-aware replies.
  • A creative writing assistant that uses a high temperature for unique story ideas and stream=True to show text as it's generated.
  • A technical documentation summarizer that uses max_tokens to ensure concise, easy-to-read outputs.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

When using the ChatGPT API, you'll likely encounter a few common issues, but most have simple solutions you can quickly apply.

  • Handling invalid or expired api_key errors. This error typically pops up if your key is incorrect, has been revoked, or isn't being loaded properly by your code. You should first confirm the key is copied correctly and that your environment variable is set where your script can find it. It’s a frequent slip-up but an easy one to fix.
  • Resolving incorrect messages format errors. The API is strict about the messages parameter—it must be a list of dictionaries, where each dictionary needs both a role and a content key. If you see this error, check that you’re not just sending a plain string or a list with missing keys. Every message has to follow the `{"role": "user", "content": "..."}` pattern.
  • Fixing invalid model parameter errors. An invalid model error means the name you passed to the model parameter either doesn't exist or isn't available to your account. A simple typo is a common cause. Always cross-reference the model name with OpenAI's official list of available models to be sure.

Handling invalid or expired api_key errors


An authentication error is one of the first hurdles you'll likely face. It happens when your API key is incorrect, expired, or not loaded properly. This prevents your application from connecting to OpenAI's services, stopping your code before it even starts.

The code below shows what happens when you try to make a request with an invalid key, resulting in a clear authentication failure.

from openai import OpenAI

client = OpenAI(api_key="invalid-or-expired-key")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

The OpenAI client is initialized with the placeholder string "invalid-or-expired-key", which causes an authentication error because it's not a real key. The following code demonstrates how to catch this specific error and provide clearer feedback.

from openai import OpenAI
from openai import AuthenticationError

try:
    client = OpenAI(api_key="invalid-or-expired-key")
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello"}]
    )
    print(response.choices[0].message.content)
except AuthenticationError as e:
    print(f"Authentication error: {e}")

By wrapping the API call in a try...except block, you can gracefully handle authentication failures without crashing your app. This approach specifically catches the AuthenticationError from the OpenAI library, letting you print a clear error message instead.

  • This error often appears when your API key is wrong, has expired, or isn't loaded correctly from your environment variables—so always double-check those first when you see this issue.

Resolving incorrect messages format errors


The API is strict about the messages parameter's structure. It requires a list of dictionaries, not a single dictionary. This common mistake happens when you forget to wrap the message object in brackets, causing an error. The code below shows this in action.

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

# Incorrect message format
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages={"role": "user", "content": "Hello"}  # Wrong format
)

The messages parameter is given a dictionary directly, but the API is built to process a list of message objects. This mismatch is what triggers the error. The corrected implementation below shows how to structure the data correctly.

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

# Correct message format (list of dictionaries)
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}]  # Correct format
)

The fix is simple: the messages parameter must always be a list. Even when sending a single message, you need to wrap the dictionary in square brackets ([]). The API requires this structure to handle a sequence of messages for conversational context. This error is a common slip-up when starting a new chat with a single prompt, so always double-check that your message dictionary is inside a list.
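A lightweight guard can catch this mistake before the request is even sent. The helper below is a sketch (validate_messages is a hypothetical name) that enforces the same structural rules the API checks for:

```python
def validate_messages(messages):
    """Raise a descriptive error if messages won't pass the API's structure checks."""
    if not isinstance(messages, list):
        raise TypeError(f"messages must be a list, got {type(messages).__name__}")
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            raise TypeError(f"message {i} must be a dict, got {type(msg).__name__}")
        missing = {"role", "content"} - msg.keys()
        if missing:
            raise ValueError(f"message {i} is missing keys: {sorted(missing)}")
    return messages

validate_messages([{"role": "user", "content": "Hello"}])  # passes silently
try:
    validate_messages({"role": "user", "content": "Hello"})  # a dict, not a list
except TypeError as e:
    print(e)
```

Failing locally like this gives you a clear Python traceback pointing at your own code, rather than a generic HTTP 400 from the API.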

Fixing invalid model parameter errors


This error occurs when the model parameter specifies a model that doesn't exist or isn't available to your account. It's often caused by a simple typo or using an outdated model name. The code below shows this error in action.

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

response = client.chat.completions.create(
    model="gpt-5",  # Non-existent model
    messages=[{"role": "user", "content": "Hello"}]
)

The API call fails because the model parameter is set to "gpt-5", a model that doesn't exist. The API can only process requests for valid, available models. The corrected implementation below shows the simple fix.

from openai import OpenAI
from openai import BadRequestError

client = OpenAI(api_key="your-api-key-here")

try:
    response = client.chat.completions.create(
        model="gpt-4",  # Use valid model name
        messages=[{"role": "user", "content": "Hello"}]
    )
except BadRequestError as e:
    print(f"Model error: {e}")

The corrected code wraps the API call in a try...except BadRequestError block to gracefully handle model-related issues. This prevents your application from crashing if you use an invalid model name.

  • This error is often just a typo in the model parameter or an attempt to use a model that's been deprecated.
  • Always cross-reference your model name with OpenAI's official list to ensure it's valid and available to you.

Real-world applications

Now that you know how to handle common API errors, you can apply these techniques to build applications like chatbots and prompt generators.

Building a simple customer support chatbot with OpenAI

You can build a simple chatbot by using a system message to set the model's context, which guides it to respond like a customer support agent.

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

support_context = "You are a customer support agent for a software company."

def get_support_response(user_question):
    messages = [
        {"role": "system", "content": support_context},
        {"role": "user", "content": user_question}
    ]
    response = client.chat.completions.create(
        model="gpt-3.5-turbo", messages=messages
    )
    return response.choices[0].message.content

print(get_support_response("How do I reset my password?"))

This code wraps the API logic in a reusable function called get_support_response. It prepares the conversation by combining two key pieces of information in the messages list before sending the request:

  • A system message sets the chatbot's persona using the support_context string.
  • A user message provides the specific question for the model to answer.

This structure instructs the model on how to behave before it generates a reply. The function then calls the API and returns only the text content of the response, making it easy to integrate into an application or extend with vibe coding.
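To make the chatbot remember earlier turns, the message assembly can be pulled out into a pure helper that folds prior exchanges into the list. This is a sketch of one way to extend the example (build_messages is a hypothetical name); the returned list is what you would pass as the `messages` argument of `create()`:

```python
def build_messages(context, history, question):
    """Assemble the messages list from a persona, prior turns, and a new question.

    history is a list of (user_text, assistant_text) pairs from earlier turns.
    """
    messages = [{"role": "system", "content": context}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages

history = [("How do I reset my password?",
            "Go to Settings > Security and click Reset.")]  # stand-in prior turn
msgs = build_messages("You are a customer support agent for a software company.",
                      history, "I didn't get the reset email.")
print(len(msgs))  # system + one full turn + the new question
```

Because the helper is pure (no API call), it's also easy to unit-test the conversation structure separately from the network code.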

Creating a text-to-image prompt generator with client.chat.completions

You can also use the chat completions API to build creative tools, like a generator that turns simple concepts into detailed, JSON-formatted prompts for text-to-image models.

from openai import OpenAI
import json

client = OpenAI(api_key="your-api-key-here")

def generate_image_prompt(concept):
    system_message = "Convert concepts into detailed image prompts. Format as JSON with 'prompt' and 'style' fields."

    response = client.chat.completions.create(
        model="gpt-4o",  # JSON mode requires a model that supports response_format
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": f"Create an image prompt for: {concept}"}
        ],
        response_format={"type": "json_object"}
    )

    return json.loads(response.choices[0].message.content)

result = generate_image_prompt("a futuristic library")
print(json.dumps(result, indent=2))

This code defines a function, generate_image_prompt, that gets structured data from the API. By setting response_format={"type": "json_object"}, you instruct the model that it must return a valid JSON object. This demonstrates how AI coding with Python is far more reliable than simply asking for JSON in a normal prompt.

  • The system message guides the model on the JSON's structure, specifying the prompt and style keys.
  • Since the API returns a JSON string, json.loads parses it into a Python dictionary you can use programmatically. Understanding converting dictionaries to JSON helps when working with structured API responses.
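Even with JSON mode, defensive parsing is cheap insurance against a truncated or malformed reply (for example, one cut off by max_tokens). A sketch of a tolerant wrapper (parse_json_reply is a hypothetical name) that returns the error instead of raising:

```python
import json

def parse_json_reply(raw):
    """Parse the model's JSON string; return (data, None) on success or
    (None, error_message) on failure, so one bad reply can't crash the app."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as e:
        return None, str(e)

data, err = parse_json_reply('{"prompt": "a glowing library", "style": "sci-fi"}')
print(data["style"], err)

data, err = parse_json_reply("not json")
print(data, err is not None)
```

In `generate_image_prompt`, you could use this in place of the bare `json.loads` call and retry the request when `err` is not None.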

Get started with Replit

Turn your knowledge into a real tool. Tell Replit Agent to "build a Python app that summarizes articles" or "create a tool that generates JSON-formatted image prompts."

The Agent will write the code, test for errors, and deploy your application for you. Start building with Replit.

Build your first app today

Describe what you want to build, and Replit Agent writes the code, handles the infrastructure, and ships it live. Go from idea to real product, all in your browser.
