Loop Engineering: The New Paradigm for AI Coding Agents
From Manual Prompting to Automated Loops: Systematically Building Agent Collaboration
From Manual Prompting to Automated Loops
Over the past two years, the mainstream way of using coding agents has been simple: craft detailed prompts, provide full project context, send instructions, review results, and issue new commands based on the output. Throughout this process, humans remain in charge of every interaction, and AI is merely a tool.
But this model is being disrupted. Now, you can build an automated system that autonomously discovers tasks, assigns work, verifies results, archives completed items, and plans next steps—replacing manual AI agent orchestration entirely. This is Loop Engineering—the new paradigm for AI coding agent collaboration.
"I don't manually prompt Claude anymore. I have loops running that prompt Claude and decide what to do next. My job is to write loops."
— Boris Cherny, Head of Claude Code at Anthropic
At its core, Loop Engineering shifts engineers from "writing prompts" to "designing loop systems." These systems automatically discover tasks, assign them, check results, record status, and decide next steps, while the engineer becomes the architect and maintainer of the loop.
The Six Core Components of a Loop System
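A complete automated loop consists of five core functional modules plus one essential persistent memory capability: automated scheduling, worktrees, project skills, plugins and connectors, sub-agents, and persistent memory.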
Both Claude Code and Codex support all six capabilities. The underlying logic is largely the same; the differences lie mainly in naming and operation.
Automated Scheduling: The Heart of the Loop
Automation is what distinguishes a "one-off manual run" from a "continuous loop."
In Codex, you can create scheduled tasks in the automation panel, select a target project, configure execution instructions and frequency, and choose to run in the local directory or an isolated worktree. After execution, valuable issues are added to the todo list, while normal runs are automatically archived. Internally, OpenAI uses such automation for daily issue triage, CI error summaries, writing commit messages, and debugging regressions introduced in historical iterations.
Claude Code uses the /loop command for periodic repeated execution, supports Cron scheduling, can trigger shell commands at certain points in the agent lifecycle, and can offload tasks to GitHub Actions for background execution even when your computer is off.
Both tools also offer /goal as a core primitive: the agent keeps iterating until a preset termination condition is met. After each iteration, a separate lightweight model, not the main agent that executes the code, checks whether the condition holds. You can set a rule such as "all tests in test/auth pass and the linter reports no errors," then let the system run autonomously.
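The control flow behind a goal-driven loop can be sketched in a few lines. This is a minimal illustration, not how either tool implements /goal: `run_until_goal`, `step`, and `verify` are hypothetical names, and the verifier is a plain function standing in for the separate lightweight model.

```python
from typing import Callable


def run_until_goal(
    step: Callable[[int], str],
    verify: Callable[[str], bool],
    max_iterations: int = 10,
) -> tuple[bool, int]:
    """Call `step` repeatedly until `verify` accepts its output.

    `step` plays the main agent and `verify` plays the independent checker;
    both are hypothetical stand-ins. Returns (goal_met, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        output = step(iteration)
        # The check is made by a different callable than the one that did the work.
        if verify(output):
            return True, iteration
    # Hard cap so a goal that can never be met does not loop forever.
    return False, max_iterations


# Toy example: the "agent" fixes one failing test per iteration.
failing = ["test_login", "test_logout", "test_refresh"]


def fix_one_test(iteration: int) -> str:
    if failing:
        failing.pop()
    return f"{len(failing)} failing"


def all_tests_pass(report: str) -> bool:
    return report == "0 failing"


result = run_until_goal(fix_one_test, all_tests_pass)
print(result)
```

The iteration cap is a design choice of this sketch: an unattended loop needs some stopping rule besides the goal itself, otherwise an unreachable condition keeps consuming runs.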
Worktrees: Enabling Parallel Agent Collaboration
When multiple AI agents work at the same time, file editing conflicts are common. Git worktrees address this: each worktree is a separate working directory with its own checked-out branch, backed by the same repository history, so one agent's edits do not interfere with another's.
Codex natively integrates worktree support, allowing multiple threads to operate on the same codebase concurrently. Claude Code uses the --worktree flag to start a session in an isolated directory, or you can enable isolation: worktree in sub-agent configuration to assign each helper agent a fresh workspace that is cleaned up after task completion.
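Outside either tool, the same isolation can be set up by hand with `git worktree add`. The sketch below only builds and prints the command; the directory layout (`<repo>-worktrees/<agent>`), the branch naming (`agent/<agent>`), and the function names are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, agent: str) -> list[str]:
    """Build the `git worktree add` command for one agent.

    The directory layout and branch naming are assumptions of this sketch.
    """
    path = repo.parent / f"{repo.name}-worktrees" / agent
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{agent}",  # create a new branch for this agent
        str(path),               # directory the branch is checked out into
    ]


def create_worktrees(repo: Path, agents: list[str]) -> None:
    """Create one worktree per agent; raises if git reports an error."""
    for agent in agents:
        subprocess.run(worktree_command(repo, agent), check=True)


# Show the command without touching any repository.
print(worktree_command(Path("/tmp/myrepo"), "reviewer"))
```

Calling `create_worktrees` on a real repository gives each agent its own directory and branch. `git worktree remove <path>` deletes a worktree once its branch has been merged or discarded.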
Project Skills: Embedding Knowledge, Eliminating Repetition
Project Skills eliminate the inefficiency of having to re-explain project context every time you start a session.
Both tools use a consistent format: a folder containing a SKILL.md main file (with instructions and metadata), plus optional scripts, reference documents, and resource files. Skills essentially solidify project conventions, development workflows, historical lessons, and forbidden rules into a reusable asset that every loop iteration can call upon.
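The folder layout described above can be scaffolded with a short script. This is a sketch under assumptions: the `name` and `description` frontmatter fields, the `scripts/` and `references/` subfolder names, and the `scaffold_skill` helper are illustrative choices, so check each tool's documentation for the exact metadata it expects.

```python
from pathlib import Path
import tempfile


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create <root>/<name>/ with a SKILL.md and empty optional subfolders."""
    skill_dir = root / name
    for sub in ("scripts", "references"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    skill_md = skill_dir / "SKILL.md"
    skill_md.write_text(
        "---\n"
        f"name: {name}\n"                # assumed metadata field
        f"description: {description}\n"  # assumed metadata field
        "---\n\n"
        "# Instructions\n\n"
        "Describe the conventions, workflow steps, and forbidden actions here.\n",
        encoding="utf-8",
    )
    return skill_md


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill(
        Path(tmp),
        "standup-post",
        "Post a standup that lists only incremental changes.",
    )
    first_line = created.read_text(encoding="utf-8").splitlines()[0]
    print(created.name, first_line)
```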
Anthropic internally categorizes Skills into 9 types, covering the full software workflow from knowledge supplementation to coding, verification, deployment, troubleshooting, and operations:
- How to correctly use the internal auth SDK
- Run a full signup flow in a headless browser
- Standard SQL for querying user retention data
- Generate a standup that only outputs incremental changes
- Create a new microservice template
- Adversarial-review sub-agent finds bugs
- Babysit-pr monitors the entire PR process
- When an alert comes in, give a structured conclusion
- Clean up unused cloud resources
Plugins & Connectors: Bridging the Full Tool Ecosystem
A loop that can only read local files is severely limited. Connectors built on the MCP protocol allow AI agents to interact with issue trackers, query databases, call staging environment APIs, push messages to Slack, and more. Both Codex and Claude Code support MCP, so connectors written for one tool can often be used directly in the other.
Plugins package connectors and project skills together, enabling team members to install the entire configuration with one click. Connectors make loops truly integrated into existing development workflows, rather than just producing "paper solutions."
Sub-agents: Separating Execution and Verification for Quality Control
Sub-agents are the most valuable architectural design in the loop system. The core idea is to split code writing and code review into two independent roles.
AI models that write code often struggle to find their own errors. A verification sub-agent with independent instructions—or even a different model—can accurately catch the main agent's omissions and defects.
Codex supports creating sub-agents on demand, running several agents in parallel and merging their results automatically. You can define sub-agents in .codex/agents/ using TOML files that set the name, role, and instructions, and can even select a different model and reasoning effort for each. Claude Code manages sub-agents and agent teams in .claude/agents/, supporting multi-role collaboration.
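The split between writing and reviewing can be illustrated without either tool. In this sketch the writer and reviewer are plain functions standing in for two agents; `Review`, `write_then_review`, and the toy stand-ins are hypothetical names, and a real setup would call separate agents with separate instructions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Review:
    approved: bool
    findings: list[str] = field(default_factory=list)


def write_then_review(
    write: Callable[[list[str]], str],
    review: Callable[[str], Review],
    max_rounds: int = 3,
) -> tuple[str, Review]:
    """Alternate between a writer and an independent reviewer.

    `write` receives the reviewer's previous findings and returns a new draft;
    `review` sees only the draft, never the writer's reasoning.
    """
    findings: list[str] = []
    draft = ""
    verdict = Review(approved=False)
    for _ in range(max_rounds):
        draft = write(findings)
        verdict = review(draft)
        if verdict.approved:
            break
        findings = verdict.findings
    return draft, verdict


# Toy stand-ins: the writer omits the zero check until the reviewer flags it.
def toy_writer(findings: list[str]) -> str:
    body = "def div(a, b):\n"
    if "missing zero check" in findings:
        body += "    if b == 0:\n        raise ValueError('b must be non-zero')\n"
    return body + "    return a / b\n"


def toy_reviewer(draft: str) -> Review:
    if "b == 0" in draft:
        return Review(approved=True)
    return Review(approved=False, findings=["missing zero check"])


final_draft, final_verdict = write_then_review(toy_writer, toy_reviewer)
print(final_verdict.approved)
```

Passing only the draft to the reviewer mirrors the point made above: the checker judges the output on its own terms instead of inheriting the writer's assumptions.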
Persistent Memory: Giving Loops Long-Term Recall
Memory may seem trivial, but it is indispensable for long-running AI agents. A model loses all context once a session ends, so task records cannot live in temporary conversations. They must be persisted outside the session, for example in local Markdown files or on a Linear board.
Anthropic's internal practice is to have Skills log their own history. For example, a standup-post Skill can write each output to standups.log, and on the next run, it reads the history first to determine what changed compared to yesterday. This memory can be simple—append-only text or JSON—or more complex, using SQLite directly.
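The standup example can be sketched as an append-only log that is read before each run. This is a minimal sketch, not Anthropic's implementation: the JSON Lines format, the `date` and `items` fields, and the function names are assumptions.

```python
import json
import tempfile
from pathlib import Path


def load_history(log_path: Path) -> list[dict]:
    """Read every earlier entry from a JSON Lines log (empty if no log yet)."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def post_standup(log_path: Path, date: str, done_items: list[str]) -> list[str]:
    """Return only items not reported in any earlier entry, then append today's entry."""
    history = load_history(log_path)
    already_reported = {item for entry in history for item in entry["items"]}
    new_items = [item for item in done_items if item not in already_reported]
    # Append-only: earlier entries are never rewritten.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"date": date, "items": done_items}) + "\n")
    return new_items


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "standups.log"
    day1 = post_standup(log, "2025-01-06", ["fix login bug"])
    day2 = post_standup(log, "2025-01-07", ["fix login bug", "add retry to uploader"])
    print(day1, day2)
```

On the second run only the new item is reported, because the first day's entry is read back from the log before the new one is written.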
How to Write High-Quality Skills
Anthropic has shared 5 key details for writing Skills internally:
stuck-jobs.md when a task is stuck, or split API usage examples into references/api.md.
config.json. If configuration is not yet built, Claude should ask the user first. For structured, multi-choice questions, you can even call the AskUserQuestion tool directly.
Limitations and Challenges of Loop Engineering
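Despite its promise, Loop Engineering currently has clear limitations: AI cannot perform absolute verification, so errors may propagate through automation; long-term reliance can lead to project knowledge gaps and "understanding debt"; and there is a risk of cognitive laziness, where agent output is accepted without scrutiny.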
Practitioners must balance automation with human review and keep control over architecture and quality. Richard Sutton's "bitter lesson" has an agent version: stop trying to solve everything yourself and focus on what scales with more agents, such as goal setting and orchestration, so that one person's capability expands into the execution power of a swarm of agents.
From Loop Engineering to Agent Orchestration
Loop Engineering builds on earlier agent tool engineering (building runtime environments for single agents) and software factory models (systematically constructing software projects), operating at a higher level. It moves us from "manually driving AI" to "system-driven AI," shifting the engineer's core competency from prompt writing to loop system design.
If you want to explore more patterns of agent collaboration, refer to the tutorials on AI Agents and Multi-Agent Systems and Workflow Orchestration.
FAQ
What is Loop Engineering?
Loop Engineering is a new collaboration paradigm for AI coding agents. Instead of manually typing prompts to command agents, engineers design an automated loop system. This system autonomously discovers, assigns, executes, and verifies tasks, driving agents to iteratively work toward a goal. The engineer's primary work shifts from writing prompts to building and optimizing the loop.
What is the main purpose of deploying sub-agents in a loop system?
The core purpose is to separate execution from verification. The main coding agent struggles to self-check its own output, while an independent sub-agent can take on code review and rule validation, catching defects early and improving output quality. This also underlies the /goal command, where a separate model judges task completion, avoiding bias from the executor's self-assessment.
What issues should practitioners watch out for when using Loop Engineering, and how should they respond?
Three major issues: First, AI cannot perform absolute verification, so errors may propagate through automation. Second, long-term reliance can lead to project knowledge gaps and "understanding debt." Third, there is a risk of cognitive laziness, blindly accepting agent output. Responses: Insist on human final review and periodically inspect agent outputs; use automation judiciously, balancing loop runs with manual interaction; design loops with an engineer's mindset, not as a mere trigger operator.