Loop Engineering: The New Paradigm for AI Coding Agents

From Manual Prompting to Automated Loops: Systematically Building Agent Collaboration

By AI Skill Navigation Editorial Team

From Manual Prompting to Automated Loops

Over the past two years, the mainstream way of using coding agents has been simple: craft detailed prompts, provide full project context, send instructions, review results, and issue new commands based on the output. Throughout this process, humans remain in charge of every interaction, and AI is merely a tool.

But this model is being disrupted. Now, you can build an automated system that autonomously discovers tasks, assigns work, verifies results, archives completed items, and plans next steps—replacing manual AI agent orchestration entirely. This is Loop Engineering—the new paradigm for AI coding agent collaboration.

"I don't manually prompt Claude anymore. I have loops running that prompt Claude and decide what to do next. My job is to write loops."

— Boris Cherny, Head of Claude Code at Anthropic

At its core, Loop Engineering shifts engineers from "writing prompts" to "designing loop systems." These systems automatically discover tasks, assign them, check results, record status, and decide next steps, while the engineer becomes the architect and maintainer of the loop.

The Six Core Components of a Loop System

A complete automated loop consists of five core functional modules plus one essential persistent memory capability:

Automated Scheduling: Timed task initiation, autonomous problem discovery and task classification

Worktrees: Support parallel agent operations while avoiding file conflicts

Project Skills: Capture project-specific knowledge, eliminating repetitive context explanations

Plugins & Connectors: Bridge AI agents with existing development and office tools

Sub-agents: Split "executor" and "verifier" roles for specialized duties

Persistent Memory: Store task progress, results, and todos outside of sessions

Both Claude Code and Codex support all six capabilities. The underlying logic is largely the same; the differences are mainly in naming and operation.

Automated Scheduling: The Heart of the Loop

Automation is what distinguishes a "one-off manual run" from a "continuous loop."

In Codex, you can create scheduled tasks in the automation panel, select a target project, configure execution instructions and frequency, and choose to run in the local directory or an isolated worktree. After execution, valuable issues are added to the todo list, while normal runs are automatically archived. Internally, OpenAI uses such automation for daily issue triage, CI error summaries, writing commit messages, and debugging regressions introduced in historical iterations.
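The routing rule described above (runs with findings go to the todo list, clean runs are archived) can be sketched in a few lines. This is an illustrative sketch, not Codex's implementation: `RunResult`, `Inbox`, and `triage` are hypothetical names, and the run IDs are made up.

```python
from dataclasses import dataclass, field


@dataclass
class RunResult:
    """Outcome of one scheduled run (hypothetical structure)."""
    run_id: str
    findings: list[str] = field(default_factory=list)


@dataclass
class Inbox:
    """Where the loop records what needs attention and what does not."""
    todo: list[str] = field(default_factory=list)
    archived: list[str] = field(default_factory=list)


def triage(result: RunResult, inbox: Inbox) -> None:
    """Route a run: findings become todo items, clean runs are archived."""
    if result.findings:
        inbox.todo.extend(f"{result.run_id}: {f}" for f in result.findings)
    else:
        inbox.archived.append(result.run_id)


inbox = Inbox()
triage(RunResult("nightly-ci-1", ["flaky test in test/auth"]), inbox)
triage(RunResult("nightly-ci-2"), inbox)
print(inbox.todo)      # ['nightly-ci-1: flaky test in test/auth']
print(inbox.archived)  # ['nightly-ci-2']
```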

Claude Code uses the /loop command for periodic repeated execution, supports Cron scheduling, can trigger shell commands at certain points in the agent lifecycle, and can offload tasks to GitHub Actions for background execution even when your computer is off.

Both tools also offer a core primitive, /goal, which keeps iterating until a preset termination condition is met. After each iteration, a separate lightweight model checks whether the condition holds, rather than the main agent that wrote the code judging its own work. You can set a condition such as "all tests in test/auth pass and the linter reports no errors," then let the system run autonomously.
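The iterate-until-verified pattern can be sketched as a small loop in which the worker and the checker are separate callables. This is only a model of the idea under stated assumptions: `run_until_goal`, `fake_agent`, and `fake_checker` are hypothetical stand-ins, and the iteration cap is an added safety limit, not something the tools are known to expose under that name.

```python
from typing import Callable


def run_until_goal(
    step: Callable[[int], str],
    goal_met: Callable[[str], bool],
    max_iterations: int = 10,
) -> tuple[bool, int]:
    """Call `step` repeatedly until `goal_met` accepts its output.

    `step` stands in for the main agent; `goal_met` stands in for the
    separate checker. Returns (success, iterations_used).
    """
    for i in range(1, max_iterations + 1):
        output = step(i)
        if goal_met(output):
            return True, i
    return False, max_iterations


def fake_agent(iteration: int) -> str:
    """Stub executor: 'fixes' one of three failing tests per iteration."""
    failing = max(0, 3 - iteration)
    return f"{failing} failing"


def fake_checker(report: str) -> bool:
    """Stub verifier: looks only at the reported status, never edits code."""
    return report == "0 failing"


print(run_until_goal(fake_agent, fake_checker))  # (True, 3)
```

Keeping the checker as its own function mirrors the design choice in the text: the component that decides "done" is not the component that did the work.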

Worktrees: Enabling Parallel Agent Collaboration

When multiple AI agents operate on the same checkout simultaneously, file editing conflicts are common. Git worktrees address this: each worktree is a separate working directory attached to the same repository, typically checked out on its own branch, so each agent's edits stay isolated until they are merged.

Codex natively integrates worktree support, allowing multiple threads to operate on the same codebase concurrently. Claude Code uses the --worktree flag to start a session in an isolated directory, or you can enable isolation: worktree in sub-agent configuration to assign each helper agent a fresh workspace that is cleaned up after task completion.
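For readers who want to see what worktree isolation looks like at the Git level, the sketch below builds the `git worktree add` command for one agent. The branch prefix `agent/` and the sibling-directory naming are assumptions made for illustration; neither tool is known to use this exact scheme.

```python
from pathlib import Path


def worktree_add_command(repo: Path, agent: str, base: str = "HEAD") -> list[str]:
    """Build `git worktree add -b <branch> <path> <base>` for one agent.

    The branch and directory naming scheme here is hypothetical.
    """
    branch = f"agent/{agent}"
    path = repo.parent / f"{repo.name}-{agent}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


cmd = worktree_add_command(Path("/work/shop"), "fix-auth")
print(" ".join(cmd))
# To actually create the worktree, pass the list to subprocess.run(cmd, check=True).
```

Each agent then works in its own directory on its own branch, and `git worktree remove <path>` cleans it up afterwards.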

Project Skills: Embedding Knowledge, Eliminating Repetition

Project Skills eliminate the inefficiency of having to re-explain project context every time you start a session.

Both tools use a consistent format: a folder containing a SKILL.md main file (with instructions and metadata), plus optional scripts, reference documents, and resource files. Skills essentially solidify project conventions, development workflows, historical lessons, and forbidden rules into a reusable asset that every loop iteration can call upon.
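A minimal sketch of that folder layout, written as a scaffolding helper. It assumes the metadata lives in a YAML-style front matter block with `name` and `description` fields, and that supporting files go in `scripts/` and `references/`; the skill name and text are placeholders.

```python
import tempfile
from pathlib import Path


def scaffold_skill(root: Path, name: str, description: str, body: str) -> Path:
    """Create <root>/<name>/SKILL.md plus empty scripts/ and references/ dirs."""
    skill_dir = root / name
    (skill_dir / "scripts").mkdir(parents=True, exist_ok=True)
    (skill_dir / "references").mkdir(exist_ok=True)
    skill_md = skill_dir / "SKILL.md"
    # Metadata first (assumed front matter format), instructions after it.
    skill_md.write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n\n{body}\n",
        encoding="utf-8",
    )
    return skill_md


with tempfile.TemporaryDirectory() as tmp:
    path = scaffold_skill(
        Path(tmp),
        "auth-sdk",
        "Use when code touches the internal auth SDK.",
        "Gotchas:\n- (project-specific notes go here)",
    )
    text = path.read_text(encoding="utf-8")
    print(text)
```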

Anthropic internally categorizes Skills into 9 types, covering the full software workflow from knowledge supplementation to coding, verification, deployment, troubleshooting, and operations:

| Category | Description | Example |
| --- | --- | --- |
| library and API reference | Explain internal usage and gotchas of libraries/CLIs/SDKs | How to correctly use the internal auth SDK |
| product verification | Determine if output actually works | Run a full signup flow in a headless browser |
| data fetching and analysis | Encapsulate data retrieval methods and common analysis paths | Standard SQL for querying user retention data |
| business process and team automation | Compress repetitive team processes into a single command | Generate a standup that only outputs incremental changes |
| code scaffolding and templates | Generate fixed skeletons with natural language constraints | Create a new microservice template |
| code quality and review | Ensure code meets team quality standards | Adversarial-review sub-agent finds bugs |
| CI/CD and deployment | Move from development to production | Babysit-pr monitors the entire PR process |
| runbooks | Map symptoms to troubleshooting paths | When an alert comes in, give a structured conclusion |
| infrastructure operations | Resource cleanup, dependency management, cost investigation | Clean up unused cloud resources |

Plugins & Connectors: Bridging the Full Tool Ecosystem

A loop that can only read local files is severely limited. Connectors built on the MCP protocol allow AI agents to interact with issue trackers, query databases, call staging environment APIs, push messages to Slack, and more. Both Codex and Claude Code support MCP, so connectors written for one tool can often be used directly in the other.

Plugins package connectors and project skills together, enabling team members to install the entire configuration with one click. Connectors make loops truly integrated into existing development workflows, rather than just producing "paper solutions."

Sub-agents: Separating Execution and Verification for Quality Control

Sub-agents are the most valuable architectural design in the loop system. The core idea is to split code writing and code review into two independent roles.

AI models that write code often struggle to find their own errors. A verification sub-agent with independent instructions—or even a different model—can accurately catch the main agent's omissions and defects.

Codex supports creating sub-agents on demand, running multiple agents in parallel and merging their results automatically. You can define sub-agents in .codex/agents/ using TOML files, configuring the name, role, and instructions, and even choosing a different model and reasoning effort for each. Claude Code manages sub-agents and agent teams in .claude/agents/, supporting multi-role collaboration.
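The fan-out-and-merge shape of sub-agent review can be illustrated with ordinary functions standing in for agents. This is a toy sketch: `security_reviewer` and `style_reviewer` are hypothetical string checks, not real model calls, and the thread pool only demonstrates that independent reviewers can run side by side on the same input.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Reviewer = Callable[[str], list[str]]


def security_reviewer(diff: str) -> list[str]:
    """Stub verifier with its own narrow instructions."""
    return ["hard-coded secret"] if "password=" in diff else []


def style_reviewer(diff: str) -> list[str]:
    """Second stub verifier looking for a different class of problem."""
    return ["tab indentation"] if "\t" in diff else []


def review_in_parallel(diff: str, reviewers: dict[str, Reviewer]) -> dict[str, list[str]]:
    """Run every reviewer on the same diff concurrently and merge by name."""
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        futures = {name: pool.submit(fn, diff) for name, fn in reviewers.items()}
        return {name: fut.result() for name, fut in futures.items()}


diff = 'password="hunter2"\n\treturn True'
report = review_in_parallel(diff, {"security": security_reviewer, "style": style_reviewer})
print(report)  # {'security': ['hard-coded secret'], 'style': ['tab indentation']}
```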

Persistent Memory: Giving Loops Long-Term Recall

Memory may seem trivial, but it is indispensable for long-running AI agents. A model retains nothing once a session ends, so task records cannot live in transient conversations; they must be persisted outside the session, for example in Markdown files or on a Linear board.

Anthropic's internal practice is to have Skills log their own history. For example, a standup-post Skill can write each output to standups.log, and on the next run, it reads the history first to determine what changed compared to yesterday. This memory can be simple—append-only text or JSON—or more complex, using SQLite directly.
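The append-only variant of this pattern can be sketched as follows, assuming one JSON object per line in standups.log. The record fields (`day`, `items`) and the function names are hypothetical, and the dates are placeholders.

```python
import json
import tempfile
from pathlib import Path


def read_history(log: Path) -> list[dict]:
    """Return all prior entries; an absent log means no history yet."""
    if not log.exists():
        return []
    lines = log.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line]


def post_standup(log: Path, day: str, items: list[str]) -> list[str]:
    """Append today's items and return only those not in the previous entry."""
    history = read_history(log)
    previous = set(history[-1]["items"]) if history else set()
    new_items = [item for item in items if item not in previous]
    with log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"day": day, "items": items}) + "\n")
    return new_items


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "standups.log"
    day_one = post_standup(log, "2025-01-06", ["fix login bug"])
    day_two = post_standup(log, "2025-01-07", ["fix login bug", "add rate limit"])
    print(day_two)  # ['add rate limit']
```

Because the file is only ever appended to, a crashed run cannot corrupt earlier entries, which is one reason to start with a plain log before reaching for SQLite.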

How to Write High-Quality Skills

Anthropic has shared 5 key details for writing Skills internally:

Don't restate the obvious: A Skill is not a summary for humans; it should fill in information that the model cannot easily access or tends to get wrong. The most valuable content is often gotchas—details that "everyone on the team knows, but the model doesn't know by default."

Treat SKILL.md as a table of contents, not a dump: Make SKILL.md an index and signpost, and move detailed material into other files that are read only when needed. For example, read stuck-jobs.md when a task is stuck, or split API usage examples into references/api.md.

Don't write Skills too rigidly: Give Claude key rules, but also enough flexibility to adapt; otherwise, a Skill may get stuck in specific contexts when reused.

Plan setup in advance: Put user context in config.json. If configuration is not yet built, Claude should ask the user first. For structured, multi-choice questions, you can even call the AskUserQuestion tool directly.

Description should directly serve triggering: The description is not a summary but a trigger condition. What keywords might the user say? What files might they upload? Under what scenarios should this Skill be activated? Write these directly.
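The "plan setup in advance" advice above (keep user context in config.json and ask when it is missing) might look like the sketch below when a Skill ships a helper script. The function name, the `ask` callback, and the `slack_channel` key are hypothetical; in practice the asking would be done by the agent rather than a Python callback.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_create_config(path: Path, ask: Callable[[str], str], keys: list[str]) -> dict:
    """Load config.json, asking for (and saving) any missing keys."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    missing = [k for k in keys if k not in config]
    for key in missing:
        config[key] = ask(key)
    if missing:
        path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    asked: list[str] = []

    def fake_ask(key: str) -> str:
        """Stand-in for prompting the user; records what was asked."""
        asked.append(key)
        return f"<{key}>"

    cfg_first = load_or_create_config(cfg_path, fake_ask, ["slack_channel"])
    cfg_second = load_or_create_config(cfg_path, fake_ask, ["slack_channel"])
    print(cfg_second, asked)  # the user is asked once; the second run reads the file
```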

Limitations and Challenges of Loop Engineering

Despite its promise, Loop Engineering currently has clear limitations:

AI cannot perform absolute verification: Errors can propagate through automation, requiring human final review.

Project knowledge gaps: Long-term reliance on automation can lead to "understanding debt" for developers.

Cognitive laziness: Blindly accepting agent output without critical thinking.

Token consumption: Loops consume significant tokens, and usage patterns vary by individual.

Practitioners must balance automation with human review and keep control over architecture and quality. An agent-era reading of Richard Sutton's "bitter lesson" applies here: stop solving everything yourself, and focus on what scales with more agents, such as goal setting and orchestration, so that one person's capability expands into the execution power of a swarm of agents.

From Loop Engineering to Agent Orchestration

Loop Engineering builds on earlier agent tool engineering (building runtime environments for single agents) and software factory models (systematically constructing software projects), operating at a higher level. It moves us from "manually driving AI" to "system-driven AI," shifting the engineer's core competency from prompt writing to loop system design.

If you want to explore more patterns of agent collaboration, refer to the tutorials on AI Agents and Multi-Agent Systems and Workflow Orchestration.

FAQ

What is Loop Engineering? Loop Engineering is a new collaboration paradigm for AI coding agents. Instead of manually typing prompts to command agents, engineers design an automated loop system. This system autonomously discovers, assigns, executes, and verifies tasks, driving agents to iteratively work toward a goal. The engineer's primary work shifts from writing prompts to building and optimizing the loop.

What is the main purpose of deploying sub-agents in a loop system? The core purpose is to separate execution from verification. The main coding agent struggles to check its own output, while an independent sub-agent can take on code review and rule validation, catching defects early and improving output quality. The same principle underlies the /goal command, where a separate model judges task completion, avoiding bias from the executor's self-assessment.

What issues should practitioners watch out for when using Loop Engineering, and how should they respond? Three major issues: First, AI cannot perform absolute verification, so errors may propagate through automation. Second, long-term reliance can lead to project knowledge gaps and "understanding debt." Third, there is a risk of cognitive laziness, blindly accepting agent output. Responses: Insist on human final review and periodically inspect agent outputs; use automation judiciously, balancing loop runs with manual interaction; design loops with an engineer's mindset, not as a mere trigger operator.
