OpenAI Assistants API v2 2026: Files, Code Interpreter, and Threads
Build persistent AI assistants with built-in RAG, code execution, function calling
OpenAI Assistants API: Threads, Files, Code Interpreter — and the Migration You Now Need
Important status update first: OpenAI has deprecated the Assistants API in favor of the Responses API — after the Responses API reached feature parity, OpenAI announced the Assistants API's sunset (with a migration window into 2026). If you're starting fresh: build on the Responses API. If you run Assistants in production: this guide covers what you have, what maps to what, and how to migrate without losing the features you adopted it for (threads, file search, code interpreter).
What the Assistants API gave you (and why people used it)
The pitch was *managed state*: instead of resending conversation history each call, you created an Assistant (instructions + model + tools), opened a Thread per user, appended Messages, and triggered Runs. OpenAI persisted the conversation, auto-truncated context, and hosted two killer tools:
That convenience is exactly what the Responses API now provides with a cleaner design.
The migration map
previous_response_id chaining (server-side state) or pass messages yourselffile_search tool, same vector stores — carries overThe big win: the awkward create-run-then-poll loop disappears. Where Assistants made you create a run and poll its status, Responses returns (or streams) the answer directly:
python
from openai import OpenAI
client = OpenAI()Managed RAG, Responses-style
resp = client.responses.create(
model='gpt-5',
input='What does our refund policy say about partial refunds?',
tools=[{'type': 'file_search', 'vector_store_ids': [VECTOR_STORE_ID]}],
)
print(resp.output_text)Multi-turn: chain by ID instead of managing a thread object
followup = client.responses.create(
model='gpt-5',
previous_response_id=resp.id,
input='And for digital goods specifically?',
)
python
Code Interpreter, Responses-style
resp = client.responses.create(
model='gpt-5',
input='Analyze the attached CSV: monthly revenue trend + one chart.',
tools=[{'type': 'code_interpreter', 'container': {'type': 'auto', 'file_ids': [file_id]}}],
)
Migration plan that works in practice
file_search at the same store IDs.response_id per user where you previously stored a thread ID. (Decide your retention policy: chaining stores state with OpenAI; passing full messages yourself keeps state in your DB.)Consult OpenAI's official migration guide for the current sunset timeline — dates have shifted before.
Bigger-picture lesson
The Assistants→Responses transition is a reminder that managed-state APIs are vendor surfaces, not standards — worth abstracting behind your own service layer so the next migration is a module change, not an app rewrite. If you're rebuilding anyway, it's a natural moment to evaluate alternatives: self-managed RAG (pgvector + semantic search) for control, framework orchestration (LangGraph) for complex agents, or the Claude API's equivalents if you're diversifying providers.
FAQ
Do I have to migrate immediately? No — there's a deprecation window with continued support. But new feature work lands on Responses only, so the cost of waiting compounds.
Is Responses cheaper? Same per-token model pricing; you save the wasted polling calls and gain prompt-caching-friendly patterns. Tool usage (file search storage, code interpreter sessions) bills similarly — verify rates on the pricing page.
What about the ChatGPT "GPTs" I built? Different product surface (consumer ChatGPT), unaffected by this API migration.
*Last updated: June 2026. Verify the current deprecation timeline and Responses API parameters against OpenAI's official docs — this transition has moving dates.*
Also available in 中文.