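The Complete 2026 Guide to OpenAI Function Calling and Structured Outputs: Getting LLMs to Return JSON Reliably

Stop fighting malformed AI responses: use the official Structured Outputs feature to build reliable AI applications

Advanced · about 16 min read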

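Function Calling and Structured Outputs are among the most underrated features of the OpenAI API. Used correctly, they constrain the model's output to the JSON Schema you define, which removes most parsing failures and format drift. The exceptions are refusals and responses cut off by the token limit, so those two cases still need handling.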

Tags: Function Calling, Structured Outputs, OpenAI API, Pydantic, JSON Schema, AI engineering

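One of the most common bugs in AI applications: the LLM returns something other than the JSON you expected.

OpenAI launched Structured Outputs in August 2024, and by 2026 it has become standard in production-grade AI applications.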

1. Why You Need Structured Outputs

The unreliable approach, prompting for a format:

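python
from openai import OpenAI

client = OpenAI()

# Bad practice: relying on the prompt alone to control the output format
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Analyze this review and return JSON in the form {sentiment, score, category}"
    }]
)
# Problem: the reply is sometimes wrapped in a markdown code block, sometimes
# carries extra fields, and sometimes is not valid JSON at all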
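To make the failure modes concrete, here is a minimal sketch of the defensive parsing that prompt-only formatting tends to force on you. The helper `extract_json` and the three sample replies are hypothetical and written for illustration; they are not output from a real model call.

```python
import json
import re

# Matches an optional ```json ... ``` wrapper around the payload.
_FENCE = re.compile(r"^```(?:json)?\s*(.*?)\s*```$", re.DOTALL)


def extract_json(text: str) -> dict:
    """Best-effort parse of a free-form model reply (hypothetical helper)."""
    cleaned = text.strip()
    match = _FENCE.match(cleaned)
    if match:
        cleaned = match.group(1)
    return json.loads(cleaned)  # raises json.JSONDecodeError on bad input


# Illustrative replies: clean JSON, JSON inside a markdown fence, JSON buried in prose.
replies = [
    '{"sentiment": "positive", "score": 0.9, "category": "product"}',
    '```json\n{"sentiment": "positive", "score": 0.9, "category": "product"}\n```',
    'Sure! Here is the JSON: {"sentiment": "positive"}',
]

outcomes = []
for reply in replies:
    try:
        extract_json(reply)
        outcomes.append("ok")
    except json.JSONDecodeError:
        outcomes.append("failed")

print(outcomes)
```

Even with the fence-stripping workaround, the third reply still fails to parse, and nothing here checks that the expected fields are present. That gap is what schema-constrained output is meant to close.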

The reliable approach, Structured Outputs:

python
from typing import Literal

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

# Define the expected output structure
class SentimentAnalysis(BaseModel):
    sentiment: Literal["positive", "neutral", "negative"]
    score: float  # confidence between 0 and 1
    category: str  # review category

# parse() constrains the reply to the schema derived from the model
response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Analyze this review: 'This product is fantastic!'"
    }],
    response_format=SentimentAnalysis  # pass the Pydantic model
)

result = response.choices[0].message.parsed
print(result.sentiment)  # e.g. "positive"
print(result.score)       # e.g. 0.95
print(result.category)    # e.g. "product_review"
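If you are not using Pydantic, the same constraint can be written as a plain JSON Schema and passed as `response_format` to `chat.completions.create`. The sketch below builds that payload by hand for the `SentimentAnalysis` fields. The schema name `sentiment_analysis` is an assumption, and the schema the SDK derives from the Pydantic model may differ in detail (for example, it may include titles).

```python
import json

# Hand-written JSON Schema mirroring the SentimentAnalysis fields in this guide.
sentiment_schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "score": {"type": "number"},
        "category": {"type": "string"},
    },
    # Strict mode expects every property to be required and no extra keys.
    "required": ["sentiment", "score", "category"],
    "additionalProperties": False,
}

response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "sentiment_analysis",  # assumed name, chosen for illustration
        "strict": True,
        "schema": sentiment_schema,
    },
}

print(json.dumps(response_format, indent=2))
```

With this route the reply arrives as a JSON string in `message.content`, so you call `json.loads` yourself instead of reading `message.parsed`.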

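2. Function Calling: Letting the Model Call Tools

Function Calling lets an LLM request calls to external functions, and it is the core mechanism behind AI agents. The model never executes anything itself: it returns a function name and arguments, and your code runs the function and sends the result back.

2.1 Defining Tools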

python
from openai import OpenAI
import json

client = OpenAI()

# Define the available tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, such as Beijing or Shanghai"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                # Strict mode requires every property to be listed here
                "required": ["city", "unit"],
                "additionalProperties": False
            },
            "strict": True  # strict mode: arguments must match the schema exactly
        }
    }
]
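Strict mode places two requirements on a tool's parameter schema that are easy to miss: every object must set `additionalProperties` to false, and every property must appear in `required`. The helper below is a hypothetical lint, not part of the OpenAI SDK, and it checks only those two rules. Use it as a local sanity check before sending a request, not as a full validator.

```python
def strict_schema_problems(schema: dict, path: str = "parameters") -> list[str]:
    """Return violations of the two strict-mode rules checked here."""
    problems = []
    if schema.get("type") == "object":
        properties = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(properties) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        # Recurse into nested objects.
        for name, sub_schema in properties.items():
            problems.extend(strict_schema_problems(sub_schema, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_schema_problems(schema["items"], f"{path}[]"))
    return problems


# A schema that leaves "unit" optional and omits additionalProperties.
loose_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

# The same schema adjusted to satisfy both rules.
strict_schema = dict(loose_schema, required=["city", "unit"], additionalProperties=False)

loose_problems = strict_schema_problems(loose_schema)
strict_problems = strict_schema_problems(strict_schema)
print(loose_problems)
print(strict_problems)
```

When a parameter really is optional, a common pattern under strict mode is to keep it in `required` and allow null in its type, for example `"type": ["string", "null"]`.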

2.2 Running the Tool-Call Loop

python
def run_agent(user_message):
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )

        message = response.choices[0].message

        # No tool calls means the model has produced its final answer
        if not message.tool_calls:
            return message.content

        # Keep the assistant message that requested the tool calls
        messages.append(message)

        for tool_call in message.tool_calls:
            function_name = tool_call.function.name
            function_args = json.loads(tool_call.function.arguments)

            # Run the actual function; get_weather is supplied by your application
            if function_name == "get_weather":
                result = get_weather(**function_args)
            else:
                # Report unknown tools instead of leaving result unset
                result = {"error": f"unknown tool: {function_name}"}

            # Send the result back to the model
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })

result = run_agent("What's the weather like in Beijing today?")
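As the number of tools grows, an if/elif chain inside the loop becomes hard to maintain. One option is a name-to-function registry, sketched below. The `get_weather` body is a stand-in that returns fixed data, and `TOOL_REGISTRY` and `execute_tool` are hypothetical names introduced for this example.

```python
import json


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Stand-in implementation; a real one would query a weather service."""
    return {"city": city, "temperature": 21, "unit": unit}


# Maps tool names (as declared in the tools list) to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name: str, raw_arguments: str) -> str:
    """Run one tool call and always return a JSON string for the tool message."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        arguments = json.loads(raw_arguments)
        return json.dumps(handler(**arguments), ensure_ascii=False)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or arguments that do not match the function signature.
        return json.dumps({"error": str(exc)})


print(execute_tool("get_weather", '{"city": "Beijing", "unit": "celsius"}'))
print(execute_tool("delete_database", "{}"))
```

Returning an error payload rather than raising lets the model see what went wrong and either retry with corrected arguments or explain the failure to the user.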

3. Complex Structured Outputs

3.1 Nested Structures

python
from pydantic import BaseModel
from typing import List, Literal, Optional

class Step(BaseModel):
    step_number: int
    action: str
    expected_result: str

class TroubleshootingGuide(BaseModel):
    problem_summary: str
    root_cause: str
    severity: Literal["low", "medium", "high", "critical"]
    steps: List[Step]
    estimated_time_minutes: int
    requires_restart: bool
    additional_notes: Optional[str] = None

# Usage (example input; replace with your own problem description)
problem_description = "The service returns HTTP 502 after every deploy"

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": f"Write a troubleshooting guide for the following problem: {problem_description}"
    }],
    response_format=TroubleshootingGuide
)

guide = response.choices[0].message.parsed
for step in guide.steps:
    print(f"Step {step.step_number}: {step.action}")

3.2 Batch Processing and Concurrency

python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

# SentimentAnalysis is the Pydantic model defined in section 1
async def analyze_single(text: str) -> SentimentAnalysis:
    response = await client.beta.chat.completions.parse(
        model="gpt-4o-mini",  # use mini for batch jobs to save cost
        messages=[{"role": "user", "content": f"Analyze: {text}"}],
        response_format=SentimentAnalysis
    )
    return response.choices[0].message.parsed

async def batch_analyze(texts: list[str]) -> list[SentimentAnalysis]:
    # Process concurrently, capped at 10 requests in flight
    semaphore = asyncio.Semaphore(10)

    async def process_with_limit(text):
        async with semaphore:
            return await analyze_single(text)

    results = await asyncio.gather(*[process_with_limit(t) for t in texts])
    return results

# 1,000 reviews can finish within minutes, rate limits permitting
texts = load_reviews()  # load the review data (supplied by your application)
results = asyncio.run(batch_analyze(texts))
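The semaphore pattern can be checked offline by swapping the API call for a stand-in coroutine that records how many calls overlap. In this sketch `bounded_map` and `fake_analyze` are hypothetical names, and the 10 ms sleep simulates network latency.

```python
import asyncio


async def bounded_map(worker, items, limit: int):
    """Run worker(item) for every item with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(item):
        async with semaphore:
            return await worker(item)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(run_one(item) for item in items))


in_flight = 0
peak = 0


async def fake_analyze(text: str) -> str:
    """Stand-in for an API call; tracks the highest number of overlapping calls."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)  # simulated latency
    in_flight -= 1
    return text.upper()


reviews = [f"review {i}" for i in range(25)]
results = asyncio.run(bounded_map(fake_analyze, reviews, limit=10))
print(len(results), peak)
```

Because `asyncio.gather` raises as soon as any task fails, a production batch job may prefer `return_exceptions=True` so that one bad item does not discard the rest of the results.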

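4. Best Practices

4.1 Choosing the Right Model

Scenario | Recommended model
Accuracy first | gpt-4o
Speed/cost first | gpt-4o-mini
Batch processing | gpt-4o-mini + concurrency
Local/privacy | Ollama + Qwen (supports tool calling)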

4.2 Error Handling

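python
import logging

logger = logging.getLogger(__name__)

# handle_refusal, fallback_handler, and user_input come from your application
try:
    response = client.beta.chat.completions.parse(...)
    result = response.choices[0].message.parsed

    if result is None:
        # The model refused the request (for example, a safety refusal)
        handle_refusal(response.choices[0].message.refusal)

except Exception as e:
    # Log the error and fall back to a degraded path
    logger.error(f"Structured output failed: {e}")
    result = fallback_handler(user_input)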
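Transient failures such as timeouts are often worth retrying before you fall back. Below is a minimal sketch of retry with exponential backoff followed by a fallback. `call_with_retry` is a hypothetical helper, and `flaky_call` simulates an operation that fails twice and then succeeds; real code would probably retry only on specific exception types.

```python
import time


def call_with_retry(operation, fallback, attempts: int = 3, base_delay: float = 0.5):
    """Try operation() up to `attempts` times, then return fallback()."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Back off before the next try: base_delay, then 2x, 4x, ...
            if attempt < attempts - 1:
                time.sleep(base_delay * 2 ** attempt)
    return fallback()


attempts_seen = []


def flaky_call():
    """Simulated API call that fails on the first two attempts."""
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise TimeoutError("simulated transient failure")
    return {"sentiment": "positive"}


result = call_with_retry(
    flaky_call,
    fallback=lambda: {"sentiment": "neutral"},
    base_delay=0.0,  # no waiting in this demo
)
print(result, len(attempts_seen))
```

Refusals are a different case: the request succeeded and the model declined, so retrying the same input is unlikely to help, and routing to the fallback directly is usually the better choice.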
