AI Coding Agents: A Practical Workflow Guide for Real Projects
11 March 2026 · AI
Six months ago I started using Claude Code on a real Symfony project — not a tutorial, not a toy, but a production codebase with routing, DI wiring, Twig templates, and a Markdown-based content pipeline. The first week was rough. I'd give the agent a task, it would explore half the codebase, make changes I didn't ask for, and I'd spend more time reviewing than I would have spent coding. I nearly gave up.
Then I changed how I worked with it. Not the tool — my workflow. The agent didn't get smarter. I got better at using it. This article is everything I learned about working with AI coding agents productively, drawn from months of daily use on a real project.
How Coding Agents Actually Work
Before optimising your workflow, you need to understand the loop. Every AI coding agent — Claude Code, Cursor, Windsurf, Copilot Workspace — follows the same fundamental cycle:
Read context → Reason about the task → Take an action → Observe the result → Repeat
Read context means the agent examines files, searches the codebase, reads error messages, or inspects tool output. This is where it builds understanding.
Reason means the model processes everything in its context window and decides what to do next. This is the expensive part — every token in the context costs compute.
Take an action means calling a tool: editing a file, running a command, searching for something, creating a file. Each action is a full round-trip through the model.
Observe means reading the result of the action and incorporating it into the next reasoning step.
The critical insight: every iteration of this loop costs tokens. A task that takes 5 iterations is dramatically cheaper than one that takes 20. Your job as the human in the loop is to help the agent complete tasks in fewer iterations — not by doing the work yourself, but by providing better inputs.
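The loop above can be sketched in a few lines of Python. Everything here is illustrative: the model call, the tool names, and the stopping condition are placeholders standing in for a real LLM API and real file/shell/search tools, not any particular agent's implementation.

```python
def run_agent(task, tools, model, max_iterations=20):
    """Minimal sketch of the read-reason-act-observe loop.

    `model` and `tools` are placeholders: a real agent wires these
    to an LLM API and to file, shell, and search tools.
    """
    context = [f"Task: {task}"]           # read context: start with the task
    for _ in range(max_iterations):
        decision = model(context)         # reason: decide the next action
        if decision["action"] == "done":
            return decision["result"]
        tool = tools[decision["action"]]  # take an action: call a tool
        observation = tool(**decision["args"])
        context.append(observation)       # observe: feed the result back in
    raise RuntimeError("Iteration budget exhausted")
```

Each pass through the `for` loop is one full round-trip through the model, which is why shaving iterations matters more than shaving any single prompt.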
The Most Important Habit: Clear Task Scoping
The single biggest factor in agent productivity is task clarity. Vague tasks produce expensive exploration. Specific tasks produce direct action.
Vague (expensive):
```
Fix the article system
```
The agent doesn't know what's broken. It will read multiple files, grep for patterns, explore the directory structure, and maybe ask you what you mean. That's 10+ tool calls before any actual work.
Specific (efficient):
```
In ArticleRepository::findBySlug, the method returns null
when the file exists but has a future publishedAt date.
It should return the article regardless of publishedAt -
the controller handles access control.
```
The agent knows exactly which file to read, what method to look at, what the current behaviour is, and what the desired behaviour is. That's 2-3 tool calls: read the file, make the edit, done.
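To make the before/after concrete, here is the shape of the fix the prompt describes. The original is a PHP Symfony repository; this is a Python stand-in written purely for illustration, with invented names and storage.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Article:
    slug: str
    published_at: datetime

class ArticleRepository:
    """Illustrative stand-in for the PHP repository in the prompt."""

    def __init__(self, articles):
        self._articles = {a.slug: a for a in articles}

    def find_by_slug(self, slug: str) -> Optional[Article]:
        # Fixed behaviour: return the article whenever it exists.
        # The buggy version additionally required
        #   article.published_at <= now
        # which made future-dated articles invisible here; that
        # access-control check belongs in the controller instead.
        return self._articles.get(slug)
```

A prompt that pins down this level of detail (the method, the condition, where the check should live) is what lets the agent skip the discovery phase entirely.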
The rule: the more context you provide upfront, the fewer tokens the agent spends discovering it.
When to Delegate vs When to Direct
Not every interaction with an agent should be "do this for me." I've found three distinct modes that work:
Full delegation
Give the agent a complete task and let it figure out the approach.
```
Add a new route /tools/base32 that works exactly like
the existing /tools/base64 page but for Base32 encoding.
Follow the same controller, template, and translation patterns.
```
Best for: tasks where the codebase has clear patterns to follow. The agent reads an existing example, copies the pattern, and adapts it.
Directed implementation
You decide the approach; the agent executes it.
```
Create ArticleParser with these methods:
- parse(string $path): array{frontmatter: array, body: string}
- discoverPaths(): array
- buildPath(string $category, string $slug): string
Extract splitFrontmatter() from ArticleRepository as a
private method. Constructor takes string $contentDir.
```
Best for: architectural changes where you've already planned the design. You know what you want; the agent is faster at typing it.
Pair programming
Work through the problem together, reviewing each step.
```
Let's refactor ArticleRepository. First, read the current
file and tell me what responsibilities it has. Then we'll
decide how to split them.
```
Best for: complex tasks where you're not sure of the approach yet. The agent's analysis helps you plan, and your review prevents it from going off track.
The mistake I made early on was using full delegation for everything. That works for simple, pattern-following tasks. For anything architectural or unfamiliar, directed implementation or pair programming produces better results in fewer tokens.
Context Window Management
The context window is the agent's working memory. Everything — system instructions, conversation history, file contents, tool outputs — competes for space in it. When the window fills up, older context gets compressed or dropped, and the agent loses track of earlier decisions.
Practical strategies:
Start new conversations for new tasks. Don't reuse a conversation that already has 50 messages about a different feature. The old context adds noise and costs tokens on every subsequent message.
Read files through the agent, not around it. When the agent reads a file, that content enters its context and informs subsequent decisions. If you paste file contents manually, you control what it sees, but you also lose the agent's ability to navigate back to that file later.
Don't dump entire files when a section is enough. If you know the problem is in lines 80-120, say so. The agent will read just that section instead of loading 500 lines into context.
Keep custom instructions lean. Your project's instruction file is loaded on every single request. A 2,000-token instruction file on a 100-message conversation costs 200,000 tokens just for instructions. As the ETH Zurich research confirmed, lean instructions don't just save tokens — they improve quality.
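The instruction-file arithmetic is worth making explicit, because the cost is linear in conversation length. A two-line sketch:

```python
def instruction_overhead(instruction_tokens: int, messages: int) -> int:
    """Instructions are re-sent with every request, so their total cost
    grows linearly with the number of messages in the conversation."""
    return instruction_tokens * messages

# The example from the text: a 2,000-token instruction file
# over a 100-message conversation.
print(instruction_overhead(2_000, 100))  # 200000
```

Halving the instruction file halves that overhead on every conversation you ever have, which is why trimming it pays off so quickly.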
Multi-Step Tasks: Plan First, Execute Second
For tasks with more than 2-3 steps, I always have the agent plan before executing. This is the most consistently valuable pattern in my workflow.
```
I want to add a sitemap index that splits sitemaps by
content type (tools, articles, pages). Enter plan mode
and outline the approach before changing any code.
```
The plan serves two purposes:
- You catch mistakes early. If the agent's plan misunderstands the requirement, you correct it before it writes any code. Fixing a plan costs one message. Fixing implemented code costs reverting changes and starting over.
- The agent stays focused. With a plan in its context, each step is clearly scoped. Without a plan, the agent might start implementing, discover a complication, change approach mid-stream, and leave half-finished code behind.
What Agents Are Good At (and What They're Not)
After months of daily use, I've developed a clear sense of what works:
Agents excel at
Pattern replication. "Add a new tool page following the same pattern as the existing UUID generator." The agent reads the example, understands the pattern, and replicates it accurately. This is where agents save the most time.
Boilerplate and wiring. Controllers, routes, translations, service configuration, template scaffolding — anything that follows framework conventions. The agent does this faster than you and rarely makes mistakes.
Refactoring with clear rules. "Extract this method into a new service" or "rename all occurrences of X to Y." Mechanical transformations are the agent's sweet spot.
Codebase exploration. "How does the article caching work?" or "Find all places where this service is injected." The agent can search, read, and summarise faster than you can grep.
Writing tests for existing code. Given a clear function signature and expected behaviour, agents generate solid test cases — including edge cases you might not think of.
Agents struggle with
Ambiguous requirements. If you don't know what you want, the agent can't guess. It will produce something, but it'll be the average of all possible interpretations.
Novel architecture. Agents follow patterns; they don't invent them. For a new architectural approach that doesn't exist in the codebase, you need to design it and let the agent implement it.
Cross-cutting concerns. Changes that touch many files with different patterns are harder. The agent handles each file individually and can lose consistency across a large changeset.
Knowing when to stop. Agents will keep going until the task seems done. If you don't define "done" clearly, you'll get over-engineering — extra error handling, unnecessary abstractions, unwanted documentation.
The Review Habit
Every change the agent makes should be reviewed before you move on. Not because the agent is unreliable — but because catching issues early is exponentially cheaper than catching them late.
My review checklist:
- Does this change only what was asked? Agents sometimes "improve" nearby code. If I asked for a bug fix and the agent also reformatted the file, I revert the formatting.
- Are there new dependencies? Check for added imports, new packages, or service injections that weren't discussed.
- Does the code match project conventions? Even with good instructions, agents occasionally deviate — wrong naming, different patterns from the rest of the codebase.
- Is there unnecessary complexity? Agents tend toward defensive code. Extra null checks, try-catch blocks, and fallback values for scenarios that can't happen in practice.
This takes 30 seconds for a simple change and 2-3 minutes for a larger one. It's the cheapest quality gate in the entire workflow.
Tool Integration with MCP
One of the biggest workflow improvements I've made is connecting the agent to external tools through MCP (Model Context Protocol). Instead of the agent constructing shell commands to query a database or call an API, it uses typed tools with proper schemas.
The practical benefit is reliability. `postgres.query(sql: "SELECT ...")` is less error-prone than the agent constructing a psql one-liner with correct quoting and escaping. And when something fails, the error message comes from the MCP server, not from a shell parsing error.
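The core idea — a declared schema validated before anything executes — can be sketched without the MCP SDK. The tool name, schema format, and validation logic below are invented for illustration; a real MCP server declares its parameters as JSON Schema and the protocol handles the validation.

```python
from typing import Any

# Hypothetical schema for a typed query tool (not the real MCP format).
QUERY_TOOL_SCHEMA = {
    "name": "postgres.query",
    "parameters": {
        "sql": {"type": "string", "required": True},
        "params": {"type": "list", "required": False},
    },
}

def call_tool(schema: dict, arguments: dict[str, Any]) -> dict[str, Any]:
    """Validate arguments against the declared schema before executing.

    A malformed call fails here with a structured error the agent can
    read and correct, instead of surfacing later as a shell-quoting
    or psql parsing failure.
    """
    for name, spec in schema["parameters"].items():
        if spec.get("required") and name not in arguments:
            return {"error": f"missing required argument: {name}"}
    unknown = set(arguments) - set(schema["parameters"])
    if unknown:
        return {"error": f"unknown arguments: {sorted(unknown)}"}
    return {"ok": True, "call": (schema["name"], arguments)}
```

The structured error is the point: "missing required argument: sql" gives the agent something it can act on in one iteration, where a garbled shell error often triggers several rounds of guessing.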
Comparison: Workflow Patterns
| Pattern | When to use | Token cost | Quality |
|---|---|---|---|
| Full delegation | Pattern-following tasks | Medium | High if pattern exists |
| Directed implementation | Architectural changes | Low | High (you control design) |
| Pair programming | Unclear requirements | Higher | Highest |
| Plan-then-execute | Multi-step tasks | Medium (plan saves re-work) | High |
| Single-shot | Trivial changes | Lowest | High |
The Habits That Matter
Working effectively with AI coding agents is a skill, not a product feature. The agent is the same regardless of how you use it — what changes is the quality of your inputs and the structure of your workflow.
Three habits make the biggest difference: scope tasks clearly before starting, plan multi-step work before executing, and review every change before moving on. Everything else — context management, delegation modes, tool integration — builds on those fundamentals.
The goal isn't to automate yourself out of the loop. It's to spend your time on decisions and design while the agent handles implementation and exploration. That split is where the real productivity gain lives.