OpenAI Assistants API in Production: Building Reliable AI Features for SaaS Applications
Engineering guide to running Assistants API at scale — thread management, tool use, file handling, and cost optimization
Production guide for OpenAI Assistants API — thread lifecycle management, function calling, file search, code interpreter integration, streaming responses, and cost optimization strategies for SaaS products.
OpenAI Assistants API: Production Engineering Guide
When to Use Assistants API vs. Chat Completions
Bottom line: the Assistants API trades control for convenience. Use it when you want rapid development with managed threads, tool calls, and file search; switch to Chat Completions when you need direct control over context, latency, and token cost.
Architecture Pattern: Assistants API in SaaS
User login → Create/retrieve Thread for user
↓
User message → Add to Thread → Create Run
↓
Poll/stream Run status
↓
If requires_action → Execute tools → Submit results
↓
Run completes → Retrieve messages
↓
Return to user
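The flow above can be sketched as one request handler. This is a minimal sketch under stated assumptions, not a definitive implementation: `handleUserMessage` and the `handleTools` callback are hypothetical names, the thread ID is assumed to have been looked up already, and the client is passed in as a parameter so the function can be exercised with a stub. It assumes the openai-node v4 helpers `createAndPoll` and `messages.list`.

```javascript
// Sketch of the request flow. `openai` is an OpenAI client (or a stub).
// `handleTools` is a hypothetical callback that executes tools, submits
// their outputs, and resolves to the updated run.
async function handleUserMessage({ openai, assistantId, threadId, text, handleTools }) {
  // User message -> add to thread
  await openai.beta.threads.messages.create(threadId, { role: "user", content: text });

  // Create the run and wait until it stops progressing
  let run = await openai.beta.threads.runs.createAndPoll(threadId, {
    assistant_id: assistantId,
  });

  // If requires_action -> execute tools -> submit results
  if (run.status === "requires_action" && handleTools) {
    run = await handleTools(threadId, run);
  }
  if (run.status !== "completed") {
    throw new Error(`Run ended with status: ${run.status}`);
  }

  // Run completed -> retrieve only the messages this run produced
  const page = await openai.beta.threads.messages.list(threadId, {
    run_id: run.id,
    order: "asc",
  });

  // Return the assistant's text to the user
  return page.data
    .filter((message) => message.role === "assistant")
    .flatMap((message) => message.content)
    .filter((part) => part.type === "text")
    .map((part) => part.text.value)
    .join("\n");
}
```

Throwing on any non-completed status keeps failed, expired, and cancelled runs from being returned to the user as an empty reply.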
Thread Management Best Practices
Thread Lifecycle
javascript
// Create thread on first session
const thread = await openai.beta.threads.create();
await db.users.update({ threadId: thread.id }, { where: { userId } });

// Reuse for subsequent conversations
const { threadId } = await db.users.findOne({ where: { userId } });
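The create-once, reuse-afterwards pattern above can be wrapped in a single helper. The sketch below is one possible shape: `getOrCreateThreadId` and the `store` object (with `getThreadId` and `saveThreadId`) are hypothetical names standing in for the database calls. It also shares one in-flight lookup per user, so two simultaneous first requests do not each create a thread. That guard only works within a single process; a multi-instance deployment would need a database constraint or lock instead.

```javascript
// In-flight lookups keyed by user ID, so concurrent first requests share one thread.
const pendingLookups = new Map();

// `openai` is an OpenAI client (or a stub); `store` is a hypothetical
// persistence layer exposing getThreadId(userId) and saveThreadId(userId, id).
async function getOrCreateThreadId(openai, store, userId) {
  if (pendingLookups.has(userId)) {
    return pendingLookups.get(userId);
  }

  const lookup = (async () => {
    // Reuse the stored thread for returning users
    const existing = await store.getThreadId(userId);
    if (existing) return existing;

    // First session: create a thread and remember it
    const thread = await openai.beta.threads.create();
    await store.saveThreadId(userId, thread.id);
    return thread.id;
  })().finally(() => pendingLookups.delete(userId));

  pendingLookups.set(userId, lookup);
  return lookup;
}
```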
Thread Cost Management
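Threads store every message, and the stored history is billed as input tokens on each run. Pass a truncation strategy when creating a run to cap how much history is sent:
truncation_strategy: { type: "last_messages", last_messages: 10 }
Function Calling (Tool Use)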
Define Tools
javascript
const assistant = await openai.beta.assistants.create({
  model: "gpt-4o",
  tools: [{
    type: "function",
    function: {
      name: "get_account_balance",
      description: "Get the current balance for a user account",
      parameters: {
        type: "object",
        properties: {
          account_id: { type: "string", description: "The account ID" }
        },
        required: ["account_id"]
      }
    }
  }]
});
Handle Tool Calls
javascript
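async function handleRun(threadId, runId) {
  // Wait until the run needs tool output or reaches a terminal state
  let run = await openai.beta.threads.runs.poll(threadId, runId);
  while (run.status === "requires_action") {
    const toolCalls = run.required_action.submit_tool_outputs.tool_calls;
    const outputs = [];
    for (const toolCall of toolCalls) {
      if (toolCall.function.name === "get_account_balance") {
        const { account_id } = JSON.parse(toolCall.function.arguments);
        const balance = await db.accounts.getBalance(account_id);
        outputs.push({ tool_call_id: toolCall.id, output: JSON.stringify({ balance }) });
      } else {
        // Every tool call needs an output, so report unknown tools to the model
        outputs.push({ tool_call_id: toolCall.id, output: JSON.stringify({ error: "unknown tool" }) });
      }
    }
    // Submit and wait again; the run may request further tool calls
    run = await openai.beta.threads.runs.submitToolOutputsAndPoll(threadId, runId, {
      tool_outputs: outputs
    });
  }
  return run; // Caller should check run.status (completed, failed, expired, ...)
}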
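The name check inside `handleRun` grows by one branch per tool. One way to keep it flat is a lookup table of handlers, sketched below. `buildToolOutputs` and the `handlers` map are hypothetical names, not part of the SDK. The sketch returns exactly one output per tool call, reporting unknown tools, malformed JSON arguments, and handler errors back to the model as an `error` field instead of throwing.

```javascript
// Build the tool_outputs array for a run in the requires_action state.
// `handlers` maps a tool name to an async function taking the parsed
// arguments (hypothetical shape, for illustration only).
async function buildToolOutputs(toolCalls, handlers) {
  return Promise.all(
    toolCalls.map(async (toolCall) => {
      const { name, arguments: rawArgs } = toolCall.function;
      let result;
      try {
        if (!Object.hasOwn(handlers, name)) {
          throw new Error(`Unknown tool: ${name}`);
        }
        // Arguments arrive as a JSON string and may not be valid JSON
        result = await handlers[name](JSON.parse(rawArgs));
      } catch (err) {
        // Answer the call with an error rather than leaving it without output
        result = { error: err.message };
      }
      return { tool_call_id: toolCall.id, output: JSON.stringify(result) };
    })
  );
}
```

Inside the loop, the result would be passed as `tool_outputs` when submitting, with a handler entry such as `get_account_balance: ({ account_id }) => db.accounts.getBalance(account_id)`.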
Streaming Responses
javascript
const stream = openai.beta.threads.runs.stream(threadId, {
  assistant_id: assistantId
});

for await (const event of stream) {
  if (event.event === "thread.message.delta") {
    // Delta content may be absent or non-text, so guard each step
    const delta = event.data.delta.content?.[0]?.text?.value;
    if (delta) {
      // Server-Sent Events frame: "data: <json>" followed by a blank line
      res.write(`data: ${JSON.stringify({ text: delta })}\n\n`);
    }
  }
}
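The loop above writes Server-Sent Events frames, which the browser only parses if the response carries the right headers and the stream signals when it ends. The sketch below separates the framing into a pure function. `SSE_HEADERS`, `toSseFrame`, and the `done` and `run_failed` event names are assumptions made for illustration; the event types `thread.message.delta`, `thread.run.completed`, and `thread.run.failed` come from the Assistants streaming API.

```javascript
// Headers an SSE endpoint typically sends before the first frame
const SSE_HEADERS = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  Connection: "keep-alive",
};

// Convert one Assistants stream event into an SSE frame, or null to skip it.
function toSseFrame(event) {
  switch (event.event) {
    case "thread.message.delta": {
      // Join every text part in the delta; ignore non-text parts
      const text = (event.data.delta.content ?? [])
        .filter((part) => part.type === "text")
        .map((part) => part.text?.value ?? "")
        .join("");
      return text ? `data: ${JSON.stringify({ text })}\n\n` : null;
    }
    case "thread.run.completed":
      // Hypothetical event name telling the client to stop listening
      return "event: done\ndata: {}\n\n";
    case "thread.run.failed": {
      const message = event.data.last_error?.message ?? "run failed";
      return `event: run_failed\ndata: ${JSON.stringify({ message })}\n\n`;
    }
    default:
      return null;
  }
}

// Usage inside a Node HTTP handler (res is the response object):
//   res.writeHead(200, SSE_HEADERS);
//   for await (const event of stream) {
//     const frame = toSseFrame(event);
//     if (frame) res.write(frame);
//   }
//   res.end();
```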
File Search (RAG Built-In)
javascript
import fs from "node:fs";

// Create vector store with documents
const vectorStore = await openai.beta.vectorStores.create({
  name: "Company Docs"
});

// Upload the files and wait until they are processed
await openai.beta.vectorStores.fileBatches.uploadAndPoll(vectorStore.id, {
  files: [fs.createReadStream("handbook.pdf"), fs.createReadStream("faq.pdf")]
});

// Attach to assistant
const assistant = await openai.beta.assistants.create({
  model: "gpt-4o", // model is required; same model as the function-calling example
  tools: [{ type: "file_search" }],
  tool_resources: {
    file_search: { vector_store_ids: [vectorStore.id] }
  }
});
Cost Optimization
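Because each run re-sends the thread history as input tokens, the cost of an uncapped thread grows roughly quadratically with conversation length. The arithmetic below illustrates that effect and the impact of a `last_messages` cap. It is a rough model only: `cumulativeInputTokens` is a hypothetical helper, the 200 tokens per message is an assumed average, and instructions, tool definitions, and file search context are ignored.

```javascript
// Estimate the input tokens billed across a whole conversation.
// tokensPerMessage is an assumed average; real counts vary per message.
function cumulativeInputTokens(turns, tokensPerMessage, lastMessages = Infinity) {
  let total = 0;
  for (let turn = 1; turn <= turns; turn++) {
    // Before run `turn`, the thread holds this turn's user message plus one
    // user and one assistant message for every earlier turn.
    const messagesInThread = 2 * (turn - 1) + 1;
    const messagesSent = Math.min(messagesInThread, lastMessages);
    total += messagesSent * tokensPerMessage;
  }
  return total;
}

// 50 turns at ~200 tokens per message
console.log(cumulativeInputTokens(50, 200));     // 500000 (no truncation)
console.log(cumulativeInputTokens(50, 200, 10)); // 95000 (last 10 messages only)
```

Under these assumptions the cap cuts billed input tokens by about 80% over 50 turns, at the price of the model no longer seeing older messages.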