
LangGraph State Machine Agent 2026: Building Controllable Complex AI Workflows

Beyond Simple Chains: Using Graph Structures for Truly Debuggable and Maintainable AI Pipelines

Writing an agent as a "loop calling the model" quickly spirals out of control: you can't pause for human approval, a failure restarts the run from scratch, and you can't pinpoint where it got stuck. LangGraph's answer is to build the agent explicitly as a state machine: nodes are steps, edges are transition rules, and state is persisted. This article walks through the core API with a complete, practical example (an expense approval agent with human review). For framework selection and concept comparison, see LangChain vs LangGraph; for the full English guide, see LangGraph Complete Guide.

Scenario: Expense Approval Agent

Flow: Receive expense request → AI extracts and validates → If amount below threshold, auto-approve; otherwise, pause for human approval → output result. This "wait for human mid-way" requirement covers LangGraph's three core features: state, conditional edges, and interrupt.

Step 1: Define State

python
from typing import TypedDict, Literal
from langgraph.graph import StateGraph, START, END

class ExpenseState(TypedDict):
    raw_text: str   # Original expense description
    amount: float   # AI-extracted amount
    category: str   # Category
    decision: Literal['approved', 'rejected', 'pending']
    reason: str

State is simply a TypedDict—each node in the graph reads it and returns the fields to update. All nodes share this single state, which is the fundamental difference from "chained parameter passing."
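To make the shared-state idea concrete, the plain-Python sketch below mimics how a partial update returned by a node is folded into the state. It does not use LangGraph: `apply_update` is a hypothetical helper, and it assumes the simplest behavior, in which a returned key overwrites the existing value.

```python
def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: return a new state with the node's keys overwritten."""
    return {**state, **update}


# A node returns only the fields it changed, never the whole state.
state = {'raw_text': 'Taxi to airport, 86 yuan'}
state = apply_update(state, {'amount': 86.0, 'category': 'Transport'})
state = apply_update(state, {'decision': 'approved', 'reason': 'below threshold'})
print(state)
```

Every later step sees the fields written by earlier ones without any of them being passed along explicitly.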

Step 2: Write Nodes (Plain Functions)

python
import json
from openai import OpenAI

client = OpenAI()

def extract(state: ExpenseState):
    prompt = (
        'Extract JSON {"amount": number, "category": "Transport|Meals|Office|Other"} '
        f'from expense description: {state["raw_text"]}'
    )
    resp = client.chat.completions.create(
        model='gpt-4o-mini',
        response_format={'type': 'json_object'},
        messages=[{'role': 'user', 'content': prompt}],
    )
    data = json.loads(resp.choices[0].message.content)
    return {'amount': data['amount'], 'category': data['category']}

def auto_approve(state: ExpenseState):
    return {'decision': 'approved', 'reason': f'Amount {state["amount"]} below threshold, auto-approved'}

def human_review(state: ExpenseState):
    from langgraph.types import interrupt
    # Graph pauses here, throws context to external; on resume, gets human's decision
    verdict = interrupt({'amount': state['amount'], 'category': state['category']})
    if verdict == 'approve':
        return {'decision': 'approved', 'reason': 'Approved by human review'}
    return {'decision': 'rejected', 'reason': 'Rejected by human review'}
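The `extract` node above passes the model's JSON straight into the state. A minimal validation sketch, using only the standard library, could sit between `json.loads` and the return statement. `parse_extraction` and `ALLOWED_CATEGORIES` are hypothetical names, and the rules (positive, finite amount; category from the prompt's four options) are assumptions about what this workflow would want to enforce.

```python
import json
import math

ALLOWED_CATEGORIES = {'Transport', 'Meals', 'Office', 'Other'}


def parse_extraction(raw_json: str) -> dict:
    """Hypothetical helper: parse and validate the model's JSON reply."""
    data = json.loads(raw_json)  # malformed JSON raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, dict):
        raise ValueError('expected a JSON object')

    amount = data.get('amount')
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ValueError(f'amount must be a number, got {amount!r}')
    if not math.isfinite(amount) or amount <= 0:
        raise ValueError(f'amount must be positive and finite, got {amount!r}')

    category = data.get('category')
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        raise ValueError(f'unknown category: {category!r}')

    return {'amount': float(amount), 'category': category}
```

Raising `ValueError` keeps the failure inside the node, where it can be retried or routed to a human instead of letting a negative amount slip through to auto-approval.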

Step 3: Conditional Edges + Build Graph

python
def route(state: ExpenseState):
    return 'auto_approve' if state['amount'] < 500 else 'human_review'

builder = StateGraph(ExpenseState)
builder.add_node('extract', extract)
builder.add_node('auto_approve', auto_approve)
builder.add_node('human_review', human_review)
builder.add_edge(START, 'extract')
builder.add_conditional_edges('extract', route)   # Branch by amount
builder.add_edge('auto_approve', END)
builder.add_edge('human_review', END)

Conditional edges are plain Python functions returning the next node name—all business rules (thresholds, branches, retry limits) live in testable code, not buried in prompts hoping the model behaves.
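Because the routing rule is an ordinary function, it can be unit-tested without a model call or a compiled graph. The sketch below restates `route` with the threshold pulled into a constant (`AUTO_APPROVE_LIMIT` is a name introduced here for illustration) and checks the boundary explicitly.

```python
AUTO_APPROVE_LIMIT = 500  # same threshold as in route() above


def route(state: dict) -> str:
    return 'auto_approve' if state['amount'] < AUTO_APPROVE_LIMIT else 'human_review'


def test_route():
    assert route({'amount': 86.0}) == 'auto_approve'
    assert route({'amount': 499.99}) == 'auto_approve'
    # The comparison is strict, so exactly 500 goes to a human.
    assert route({'amount': 500}) == 'human_review'
    assert route({'amount': 2380.0}) == 'human_review'


test_route()
```

The boundary case is the one worth pinning down: whether an expense of exactly 500 is auto-approved is a business decision, and the test documents it.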

Step 4: Persistence + Run + Resume

python
from langgraph.checkpoint.memory import MemorySaver   # Replace with PostgresSaver in production
from langgraph.types import Command

graph = builder.compile(checkpointer=MemorySaver())
cfg = {'configurable': {'thread_id': 'expense-1024'}}

# Submit a large expense → graph pauses at human_review
result = graph.invoke({'raw_text': 'Last week business trip: train plus hotel total 2380 yuan'}, cfg)
# result contains __interrupt__ with the thrown context → push to approver (IM/email/dashboard)

# --- Possibly hours later, in another process ---
final = graph.invoke(Command(resume='approve'), cfg)   # Resume via thread_id
print(final['decision'], final['reason'])               # approved Approved by human review

This snippet captures LangGraph's core value: pausing can span processes and days—state lives in the checkpointer, and resume requests can come from anywhere. Building this capability with a while-loop agent would take a week of wheel-reinventing.
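To see why keying state by `thread_id` is what makes cross-process resume possible, here is a deliberately tiny stand-in for a checkpointer. It is an illustration only and not how LangGraph's checkpointers are implemented: `FileCheckpointStore` is a hypothetical class that writes one JSON file per thread.

```python
import json
import tempfile
from pathlib import Path


class FileCheckpointStore:
    """Toy store: one JSON file per thread_id. Illustration only."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, thread_id: str, state: dict) -> None:
        (self.root / f'{thread_id}.json').write_text(json.dumps(state))

    def load(self, thread_id: str) -> dict:
        return json.loads((self.root / f'{thread_id}.json').read_text())


root = Path(tempfile.mkdtemp())

# "Process A": the run pauses and its state is written out.
FileCheckpointStore(root).save('expense-1024', {'amount': 2380.0, 'decision': 'pending'})

# "Process B", later: a fresh store object finds the same state by thread_id.
resumed = FileCheckpointStore(root).load('expense-1024')
resumed['decision'] = 'approved'
print(resumed)
```

Nothing in the second half depends on objects created in the first half; the only shared things are the storage location and the thread ID. A real checkpointer adds what this toy lacks, such as versioned checkpoints, concurrency safety, and knowledge of which node to resume.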

Production Checklist

  • Use PostgresSaver as checkpointer (in-memory version loses state on restart); each business session gets a unique thread_id
  • Validate AI-extracted structured output (negative amount? out-of-range category?)—see schema validation approaches in Zod vs Pydantic
  • Node-level timeout and retry: network/model hiccups shouldn't crash the entire graph; retry failed nodes individually (state persists)
  • Integrate tracing to monitor each node's duration and input/output (LangSmith vs Langfuse); a state machine without tracing is impossible to debug
  • Wrap with FastAPI for external service, using SSE streaming (implementation recipe)
FAQ

Q: When should I NOT use this?
A: When the flow is a straight line (extract → transform → output) and doesn't need pause/resume; plain function calls are simpler. Decision criterion: does the run need to change path based on intermediate results, or wait for external input?

Q: How to handle multiple agents?
A: A supervisor is also a graph: a routing node decides which sub-graph to delegate work to, and each sub-graph maintains its own state. For trade-offs with frameworks like CrewAI, see Multi-Agent Framework Comparison.

Q: How to choose models?
A: Use mini-tier models for extraction/classification nodes and flagship models for reasoning/decision nodes. Assigning models per node is a hidden benefit of the state machine architecture (Model Library).
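Returning to the production checklist, the node-level retry item can be sketched as a framework-independent decorator around a node function. `with_retry` and its parameters are hypothetical names introduced here; LangGraph may offer its own retry configuration, so check the official documentation before hand-rolling this.

```python
import time
from functools import wraps


def with_retry(max_attempts: int = 3, base_delay: float = 0.5, retry_on=(Exception,)):
    """Hypothetical decorator: retry a node function with exponential backoff."""
    def decorator(node):
        @wraps(node)
        def wrapper(state):
            for attempt in range(1, max_attempts + 1):
                try:
                    return node(state)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: let the caller see the error
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


calls = {'n': 0}


@with_retry(max_attempts=3, base_delay=0.0, retry_on=(ConnectionError,))
def flaky_extract(state):
    """Stand-in node that fails twice before succeeding."""
    calls['n'] += 1
    if calls['n'] < 3:
        raise ConnectionError('simulated network hiccup')
    return {'amount': 86.0, 'category': 'Transport'}


result = flaky_extract({'raw_text': 'Taxi to airport, 86 yuan'})
```

Restricting `retry_on` to transient errors matters: retrying a validation failure would only repeat the same bad input.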


*Last updated: June 2026. APIs may change; the official LangGraph documentation is authoritative.*
