OpenAI Assistants API Complete Tutorial 2026: Building Persistent AI Assistants

Thread Management, Code Interpreter, File Search — Full Hands-On with the Assistants API

Why Choose the Assistants API?

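| Feature                 | Chat API             | Assistants API      |
|-------------------------|----------------------|---------------------|
| Conversation memory     | Must manage yourself | Automatic (Threads) |
| Code execution          | Not supported        | Code Interpreter    |
| File search             | Not supported        | File Search (RAG)   |
| Asynchronous processing | Not supported        | Runs state machine  |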

Core Concepts

  • Assistant: Defines the AI persona; created once, reused many times.
  • Thread: A single conversation that stores all message history.
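  • Run: One execution of an Assistant on a Thread; its status normally moves queued → in_progress → completed, but it can also end as failed, cancelled, expired, or incomplete.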
Complete Example

    python
    from openai import OpenAI

    client = OpenAI()

    # Create an assistant
    assistant = client.beta.assistants.create(
        name="Data Analysis Assistant",
        instructions="You are a professional data analyst.",
        model="gpt-4o",
        tools=[{"type": "code_interpreter"}, {"type": "file_search"}],
    )

    # Create a thread and send a message
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content="Analyze the sales data and find the fastest-growing category",
    )

    # Run and get results
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id,
        assistant_id=assistant.id,
    )
    if run.status == "completed":
        messages = client.beta.threads.messages.list(thread_id=thread.id)
        print(messages.data[0].content[0].text.value)
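
The create_and_poll helper hides the Run state machine listed under Core Concepts. As a rough sketch of what manual polling might look like, the helper below loops until the run leaves its in-flight states. The name wait_for_run, the get_status callable, and the interval and timeout defaults are illustrative assumptions, not part of the SDK; the status strings are the ones the Runs API reports.

```python
import time
from typing import Callable

# Statuses in which a run is still being worked on. Anything else means the
# run has stopped or, for "requires_action", is waiting on the caller.
IN_FLIGHT = {"queued", "in_progress", "cancelling"}


def wait_for_run(
    get_status: Callable[[], str],
    interval: float = 1.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the run leaves its in-flight states.

    get_status is a hypothetical callable supplied by the caller, e.g.
    lambda: client.beta.threads.runs.retrieve(
        thread_id=thread.id, run_id=run.id).status
    """
    waited = 0.0
    while True:
        status = get_status()
        if status not in IN_FLIGHT:
            return status
        if waited >= timeout:
            raise TimeoutError(f"run still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

The caller would then branch on the returned status instead of checking only for "completed", for example logging run.last_error when the result is "failed".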

File Search (Built-in RAG)

    python
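    # Upload the document; the context manager closes the file handle afterwards
    with open("docs.pdf", "rb") as f:
        file = client.files.create(file=f, purpose="assistants")

    # Create a vector store and index the uploaded file in it.
    # Depending on the SDK version, vector stores may live at
    # client.vector_stores instead of client.beta.vector_stores.
    vs = client.beta.vector_stores.create(name="Knowledge Base")
    client.beta.vector_stores.files.create(vector_store_id=vs.id, file_id=file.id)

    # Attach the vector store to the assistant's file_search tool
    client.beta.assistants.update(
        assistant_id=assistant.id,
        tool_resources={"file_search": {"vector_store_ids": [vs.id]}}
    )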
    

Streaming Output

    python
    with client.beta.threads.runs.stream(thread_id=thread.id, assistant_id=assistant.id) as stream:
        for text in stream.text_deltas:
            print(text, end="", flush=True)
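
If the full reply is also needed once streaming ends (for logging or storage, say), the deltas can be accumulated while they are echoed. This is a minimal sketch: stream_to_text is a hypothetical helper, not an SDK function, and it accepts any iterable of strings, such as stream.text_deltas.

```python
import sys
from typing import Iterable, TextIO


def stream_to_text(deltas: Iterable[str], out: TextIO = sys.stdout) -> str:
    """Echo each text delta as it arrives and return the joined reply."""
    parts = []
    for delta in deltas:
        out.write(delta)
        out.flush()  # show partial output immediately
        parts.append(delta)
    return "".join(parts)
```

Inside the with block it could be called as reply = stream_to_text(stream.text_deltas).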
    

Production Considerations

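  • Set max_completion_tokens (and, if needed, max_prompt_tokens) on the run to cap token usage and cost; a run that hits a limit ends with status incomplete.
  • Runs can take tens of seconds, so avoid blocking on them: poll the run status or stream the output.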
  • Regularly clean up long-inactive threads.
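
For the thread cleanup point above, one possible approach is to record each thread's ID and last-activity time in your own storage and periodically select the ones idle past a cutoff. The sketch assumes you keep such a record yourself (here a plain dict named last_active, which is hypothetical, as are the thread IDs and the 30-day default); the selection logic is pure Python and the deletion call is left to the caller.

```python
from datetime import datetime, timedelta, timezone


def find_stale_threads(last_active, now, max_idle=timedelta(days=30)):
    """Return IDs of threads idle longer than max_idle, oldest first."""
    cutoff = now - max_idle
    stale = [(ts, tid) for tid, ts in last_active.items() if ts < cutoff]
    return [tid for ts, tid in sorted(stale)]


# Illustrative data only: thread IDs mapped to their last-activity time.
now = datetime(2026, 1, 31, tzinfo=timezone.utc)
last_active = {
    "thread_a": datetime(2025, 11, 1, tzinfo=timezone.utc),
    "thread_b": datetime(2026, 1, 30, tzinfo=timezone.utc),
    "thread_c": datetime(2025, 12, 15, tzinfo=timezone.utc),
}
stale_ids = find_stale_threads(last_active, now)
print(stale_ids)
# Each stale ID could then be passed to client.beta.threads.delete(thread_id).
```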