Beyond AI Slop
Five Practices for Building Real Products with Agentic Code
TL;DR
After a year of building with agentic coding tools, I’ve distilled five practices from hundreds of fast failures. Together they dramatically improved my code quality:
Problem Statement Speed Bump - Use a 6-question framework before building new features to avoid solution bias and ensure you’re solving the right problem
Atomic Breakdown Workflow - Familiarize AI with codebase → define outcome → interview phase (one question at a time) → generate plan → document as GitHub issue → enforce TDD
Git Discipline + PR Security Review - Commit frequently for rollback points, then run automated security agents (Claude Code’s GitHub Action, OpenAI’s Aardvark) on pull requests before merging
MCP Servers for Context - Use Supabase, MongoDB, GitHub, Context7, and Netlify MCPs to give AI real-time access to actual systems, eliminating copy-paste debugging cycles
Claude Skills for Repeatable Processes - Codify hard-won lessons (TDD, Git Worktrees, Problem Statements) and domain workflows (funding research, email replies) into persistent instructions
Meta-practice: Run AI-assisted post-mortems on every dead end by asking “where did our path go wrong?” to extract portable lessons.
These aren’t rules - they’re scar tissue from rapid experimentation. Share what’s working for you so we can all learn faster.
A year ago, my attempts at coding mostly involved trying to wrap my head around the endless layers of setup - configuring environments, understanding build tools, debugging dependency conflicts before I could even start on the actual problem I wanted to solve. Today, I’m shipping real applications and actually solving problems instead of fighting with tooling.
The difference isn’t that I suddenly learned to code the traditional way. It’s that I learned how to fail productively with agentic coding tools.
Here’s what nobody tells you about building with Claude Code or Cursor: you’ll hit dead ends constantly. The difference is that these dead ends happen in hours instead of weeks. A path that would have taken a traditional developer three days to realize was wrong? You’ll know in two hours. The trick isn’t avoiding the dead ends - it’s recognizing the patterns in them fast enough to extract the lesson before you repeat the mistake.
This post shares five practices I’ve distilled from hundreds of those small failures over the past year. These aren’t rules handed down from on high. They’re scar tissue - patterns that kept emerging every time I looked at a mess of brittle code and asked myself “how did I get here again?”
I’m sharing this for two reasons: to help you avoid some of the dead ends I’ve already explored, and selfishly, to learn what’s working for you. Because here’s the truth - we’re all figuring this out in real time. The best practices for agentic coding are being written right now, in conversations like this one.
So consider this the start of a conversation, not a prescription. What follows is what’s working for me today. Tomorrow, I’ll probably learn something that makes me rethink half of it.
Practice 1: The Problem Statement Speed Bump
The dead end that taught me this:
Early on, I’d get excited about a feature idea and immediately start prompting the AI to build it. “Create a dashboard that shows user analytics.” “Build a notification system for overdue tasks.” The AI would happily comply, generating clean-looking code that worked... technically.
The problem surfaced days later. The analytics dashboard showed metrics that looked impressive but weren’t actually what users needed to make decisions. The notification system sent alerts, but at frequencies that annoyed rather than helped. The code worked, but I’d built the wrong thing.
The issue wasn’t the AI’s implementation - it was that I’d given it a solution to build instead of a problem to solve. And agentic tools are perfectly happy to build whatever you ask for. They won’t push back and ask “but why?”
What I do now:
For any new feature or product, I force myself to pause before any code gets written. I use a structured framework that asks six specific questions:
Who is the person with this problem? (Detailed enough that someone could objectively identify if they fit)
What problem are they experiencing? (From their perspective, not mine)
What negative impact do they experience? (Observable and verifiable)
How frequently does this happen? (Quantitative if possible)
What’s the root cause? (Can’t be “lack of solution X”)
What outcome would be achieved if solved? (Not “they’d have the tool” - what changes for them?)
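To make the output concrete, here’s the shape of the artifact those six answers produce, sketched as a TypeScript type. This is purely illustrative - the Skill works in prose, and the field names here are mine:

// Illustrative only: the shape of a completed problem statement.
// The actual Skill produces prose, not data - this just shows what each answer pins down.
interface ProblemStatement {
  who: string;             // specific enough that someone could objectively say "that's me"
  problem: string;         // described from their perspective, not mine
  negativeImpact: string;  // observable and verifiable
  frequency: string;       // quantitative if possible, e.g. "multiple times per week"
  rootCause: string;       // cannot be "lack of solution X"
  outcomeIfSolved: string; // what changes for them, not "they'd have the tool"
}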
I built this as a Claude Skill that actually interviews me, validates my answers, and won’t let me slip into solution-mode. It produces a concise problem statement (under 150 words) that becomes an artifact I can reference throughout development.
Why it matters:
This feels like a speed bump at first - and that’s exactly the point. When the AI asks “what should we build?” and you have a clear problem statement instead of a vague feature idea, everything downstream gets better. The AI can propose implementations, you can evaluate them against the actual problem, and you catch “technically correct but contextually wrong” solutions before you build them.
Problem statements also prevent solution bias. When I’m forced to articulate the root cause without saying “they don’t have software X,” I often realize I was jumping to a preconceived solution. Sometimes the real problem is workflow-based, not technical at all.
When I skip it:
I don’t go through all six questions for everything. Bug fixes and enhancements to existing features usually have enough context already - the problem is obvious, and we’re building on established workflows. This is specifically for new features or products where there’s a risk of building the wrong thing entirely.
Example:
Here’s what a completed problem statement looks like:
Small business owners with 5-20 employees who manually process invoices in spreadsheets multiple times per week experience a recurring problem: they spend 2-3 hours per session reconciling invoice data, often discovering errors days later when following up with clients. This happens because invoice data lives in multiple disconnected systems (email, spreadsheets, banking apps) with no single source of truth, requiring manual data entry that’s prone to human error. If solved, they would reduce invoice processing time to under 30 minutes per session and catch discrepancies immediately, improving cash flow predictability.
That’s 100 words that prevent me from building “invoice software” when what they really need might be better email filtering, or a simple reconciliation checklist, or something else entirely.
Questions for readers:
How do you ensure you’re solving the right problem before diving into implementation? Do you have a similar pause mechanism, or do you find your way to the right solution through iteration? What’s worked for you?
Practice 2: Atomic Breakdown - Teaching the AI to Think With You
The dead end that taught me this:
I’d describe what I wanted in a paragraph or two, the AI would generate a few hundred lines of code, and I’d run it. Sometimes it worked. More often, it would break in subtle ways - edge cases I hadn’t mentioned, conflicts with existing code patterns, or assumptions the AI made that didn’t match reality.
The problem was context mismatch. I had all this knowledge in my head about how the existing codebase worked, what patterns we followed, what edge cases mattered. The AI had none of it. I was essentially asking it to make decisions without the information needed to make them well.
The breakthrough came when I realized I needed to deliberately load context before asking for implementation. Not just “here’s what I want,” but a structured process that got us on the same page first.
The workflow I developed:
I now follow a five-step process for any feature work, whether it’s new or an enhancement:
1. Familiarization - I ask the AI to review the relevant parts of the codebase and summarize what exists today. “Look at the authentication flow and explain how it currently works.” This loads the patterns and conventions into its working memory.
2. Outcome Definition + Interview - This is where I combine two critical pieces in one prompt: I describe the outcome I want to achieve, then immediately tell the AI to interview me before analyzing anything.
The exact language matters: “I want to add auto-save for drafts so users don’t lose work if they close the browser. I want you to ask me any questions you need to produce the best possible user experience. Ask me one question at a time.”
That “one question at a time” part is crucial. When the AI asks everything at once, you get a generic list. When it asks one at a time, it can build on your answers. Your response to question one shapes question two. You end up with a conversation that surfaces considerations neither of you would have thought of upfront.
This typically runs 3-5 rounds of back and forth, and what comes out is dramatically better than if I’d just said “figure it out.”
3. Plan Generation - NOW the AI analyzes the codebase with all the context from the interview and proposes what changes would be needed. Not implementation - just research and planning about what would need to change and why. The quality of this analysis is directly dependent on the interview that just happened.
4. GitHub Issue - We document everything. Here’s what changed my output quality: I have the AI write the issue as if it’s teaching another developer who will implement this. Not “what I think needs to happen” but “what another person would need to know.” This forces much more detailed thinking.
5. TDD Enforcement - The issue includes test requirements as part of the definition of done. Tests are written first, watched to fail, and only then does implementation happen.
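To make that last step concrete, here’s roughly what “tests first” looks like for a draft auto-save feature (the running example below) - a minimal sketch assuming Vitest and a drafts module that doesn’t exist yet. That’s the point: this file is written and watched to fail before any implementation:

import { describe, it, expect } from "vitest";
// saveDraft and loadDraft don't exist yet - this test is written first and watched to fail.
import { saveDraft, loadDraft } from "./drafts";

describe("draft auto-save", () => {
  it("persists the latest content so nothing is lost when the browser closes", async () => {
    await saveDraft("doc-123", "first pass at the intro");
    const draft = await loadDraft("doc-123");
    expect(draft?.content).toBe("first pass at the intro");
  });

  it("keeps only the newest version when saves happen in quick succession", async () => {
    await saveDraft("doc-123", "older text");
    await saveDraft("doc-123", "newer text");
    const draft = await loadDraft("doc-123");
    expect(draft?.content).toBe("newer text");
  });
});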
The insight about interview-first:
The order here is critical. If you let the AI analyze first and ask questions after, it’s already committed to an approach. It’ll ask questions to validate its thinking, not to discover better options.
When you force the interview first, the AI doesn’t have a plan yet. It’s genuinely trying to understand before it commits. The analysis that follows is informed by real clarity, not assumptions.
Why this matters:
This separates thinking from doing. The interview and planning phases are where you catch “technically correct but contextually wrong” before any code exists. You’re teaching the AI to think with you, not just execute for you.
It also creates a documentation trail. When I start a new session (especially for larger implementations), I can point the AI to the GitHub issue and say “read this and ask any clarifying questions” - and we’re immediately aligned.
Example flow:
Here’s a condensed version for adding a “draft auto-save” feature:
Me: Review the current document editing flow and summarize how saves work today.
AI: [Summarizes - manual save button, validation on save, no interim persistence]
Me: I want to add auto-save for drafts so users don’t lose work if they close the browser. I want you to ask me any questions you need to produce the best possible user experience. Ask me one question at a time.
AI: How should we handle conflicts if someone has the doc open in two browsers?
Me: [Answer]
AI: Should draft auto-saves be visible to collaborators immediately, or only when the author explicitly saves?
Me: [Answer]
AI: What frequency feels right for auto-save - every 30 seconds, every 2 minutes, or triggered by pausing typing?
Me: [Answer - let’s say typing pause]
AI: Got it. Now let me analyze the codebase with this context... [Proceeds to propose changes that account for multi-tab conflicts, collaborator visibility, and typing-pause detection]
Me: Write a GitHub issue that another developer could implement from.
AI: [Creates detailed issue incorporating all the interview insights]
That interview-first approach meant the analysis already knew about multi-tab scenarios, collaborator expectations, and UX preferences. Without it, the AI might have proposed a simple timer that created problems we’d only discover in testing.
Questions for readers:
How do you handle context loading for longer implementations? Do you do planning and implementation in the same session or break them up? Have you found ways to keep the AI aligned with your codebase patterns, or do you catch divergence through code review?
Practice 3: Git as Your Safety Net
The dead end that taught me this:
I’d be an hour into implementing a feature when I’d realize the approach wasn’t working. Maybe the architecture was getting too complex. Maybe I’d discovered a constraint that invalidated the whole direction. Maybe the code technically worked but felt wrong.
The problem: I had no good rollback point. I could try to manually undo changes, but which changes? I’d made dozens of small edits across multiple files. I could start over completely, but that felt like throwing away an hour of learning. So I’d keep going, trying to salvage the code, making it messier in the process.
What I do now:
I create a branch at the start of every session. Then I commit after completing each logical chunk - not when something is “done” in the traditional sense, but when it “makes sense” as a restore point.
There’s no specific frequency rule. It’s more intuitive than that. Did I just get the database schema working? Commit. Did I finish the API endpoint even though the frontend isn’t connected yet? Commit. Did I refactor a helper function that was getting messy? Commit.
The mental model shifted from “commits are for finished work” to “commits are checkpoints.” Lower the friction. Trust that you can always undo.
The evolution of this practice:
This was MORE critical six months ago, before Claude Code and Cursor added built-in rollback features. Now these tools let you time-travel through recent changes without even touching git commands.
But I still commit regularly, because git gives me something the AI’s undo button doesn’t: the ability to say “that entire direction was wrong, go back to this morning” without losing the ability to reference what I tried. Sometimes failed experiments contain useful pieces, or at minimum, teach you what doesn’t work.
Database changes are different:
Most code changes are easy to roll back. You revert the commit, run the app, and you’re back to a previous state. Database migrations and schema changes are trickier - you can’t always undo them cleanly, especially in production.
For database work, I’m more deliberate. I think through the change more carefully upfront. I test the migration on a separate branch. I make sure I understand the rollback path before I execute. The risk is higher, so the planning is more thorough.
Security review at the pull request level:
Once a feature is complete and I’m ready to merge, I’ve added a critical gate: automated security review on the pull request. With AI generating code quickly, it’s easy to introduce vulnerabilities without realizing it - especially for those of us without years of security experience.
I use security agents that analyze the entire set of changes before they reach main:
Claude Code’s GitHub Action for security review - This automatically runs when I open a pull request. It analyzes all the changes in the PR and posts inline comments on specific lines of code where it finds security concerns. The system checks for common vulnerability patterns:
Injection attacks (SQL, command, NoSQL, XXE)
Authentication and authorization issues
Hardcoded secrets or exposed credentials
Weak cryptography or insecure random number generation
Missing input validation
Business logic flaws like race conditions
Insecure configurations and CORS settings
The action integrates directly into the GitHub workflow and can be customized to match specific security policies. It’s built on Claude Code’s extensible architecture - the security rules are just markdown documents that can be modified.
Aardvark (OpenAI’s security agent) - I’m also in the private beta for OpenAI’s Aardvark, a GPT-5-powered autonomous security researcher. Unlike traditional scanners, Aardvark thinks like a human security expert. It:
Analyzes the full repository to build a threat model
Scans commit-level changes against that threat model
Validates vulnerabilities by attempting to exploit them in sandboxed environments
Proposes patches that integrate with the codebase
Aardvark catches complex issues that only surface under specific conditions - the kind of vulnerabilities that are hard to spot in manual review. In OpenAI’s benchmarks, it identifies 92% of known vulnerabilities with low false positives.
Why security review at the PR level matters:
When you’re committing frequently (as you should for rollback capability), running security review on every commit would be noisy and slow. Most individual commits are incremental and incomplete.
The pull request is where you’re saying “this complete feature is ready to merge.” That’s the right time to evaluate the security posture of all the changes together. The security agents can see the full picture - how different parts interact, what the complete attack surface looks like, whether the feature as a whole introduces risks.
The workflow in practice:
Create feature branch
Implement feature with frequent commits (restore points)
Tests pass (from TDD practice)
Create pull request
Security agents automatically analyze all changes
Review findings - address critical issues, document accepted risks
Human approval for architecture/business logic
Merge to main
The security review adds maybe 5-10 minutes to the PR process, but it’s automated - I just need to review the findings. It catches issues that would take hours or days to fix in production.
Why this matters for agentic coding:
With traditional coding, you’re moving slowly enough that you remember what you changed. You can mentally track “I modified these three files in these ways.”
With agentic tools, you generate a lot of code quickly. The AI might touch ten files to implement one feature. And it’s just as easy to go down the wrong path quickly - or introduce security vulnerabilities that seem fine at first glance.
The frequent commits give you rollback points for when experiments fail. The PR-level security review ensures that when you do merge, you’re not introducing vulnerabilities at scale.
The practical workflow:
When I realize something isn’t working, I don’t immediately start fixing. I pause and ask: “Is this fixable, or should I restart from the last good checkpoint?”
If it’s fixable - a bug, a missing edge case, a small logic error - I keep going.
If the architecture is wrong, or I’m fighting the code, or it’s getting more complex instead of simpler - I roll back. Either using Claude Code’s built-in rewind, or by checking out the previous commit.
Then I start fresh with what I learned. The second attempt is almost always cleaner because I know what doesn’t work.
Questions for readers:
What’s your rollback strategy when AI takes you down the wrong path? Do you try to salvage the code or start fresh? How are you handling security review for AI-generated code - at the commit level, PR level, or post-merge? And how do you handle database changes that can’t easily be reverted - do you have strategies for testing migrations safely?
Practice 4: MCP Servers - Context is Everything
The dead end that taught me this:
Pre-MCP, my debugging workflow looked like this: AI generates database query. I copy it, paste into database client, run it, get an error. Copy error message back to AI. AI suggests fix. Repeat. Each cycle took minutes, and the AI was essentially flying blind - guessing at solutions without seeing actual state.
Worse, the AI would suggest queries that technically worked but violated constraints I forgot to mention. Or it would ignore row-level security policies. Or it wouldn’t account for triggers that modified data. I’d discover these issues only after deploying, because the AI had no way to verify its work against the actual system.
The code worked in isolation. It broke in context.
What changed with MCP:
MCP servers aren’t just a convenience feature - they fundamentally changed the quality of what gets built. Each MCP server brings its own documentation, best practices, and direct access to verify against real systems.
The AI isn’t just generating code anymore. It’s querying actual databases, checking real configurations, investigating live state. The difference is like asking someone to fix your car by describing it over the phone versus handing them the keys and letting them look under the hood.
The ones I use constantly:
Supabase - For database operations, this is transformative. The AI can see the actual schema, respect row-level security policies, understand triggers and constraints. It’s not guessing what the database looks like - it knows.
MongoDB - For non-relational data, same principle. The AI understands the actual document structure, can verify indexes exist, can test queries against real data.
GitHub - The AI can create issues, manage branches, review actual code in the repository. It’s not just generating suggestions about git commands - it’s doing the work directly.
Context7 - This one prevents a specific kind of failure: the AI suggesting code based on outdated or hallucinated APIs. Context7 fetches current, version-specific documentation from official sources and injects it directly into the AI’s context. When I ask “How do I set up authentication in Next.js? Use context7,” it pulls the latest official docs for the exact version I’m using - not what the AI was trained on months ago. This eliminates the frustrating cycle of getting code examples that reference deprecated methods or functions that don’t exist.
Netlify - For deployment and hosting configuration. The AI can verify environment variables are set, check build configurations, understand the actual deployment pipeline.
Why this matters for inexperienced builders:
Here’s the thing: experienced developers have years of accumulated best practices. They know that you need to check for existing indexes before adding new ones. They remember that foreign key constraints matter. They’ve learned the hard way what happens when you ignore row-level security.
We don’t have that experience yet. But MCP servers encode those best practices into the tools themselves. The AI isn’t just generating code from a generic playbook - it’s seeing your specific system and applying patterns appropriate to what actually exists.
The experience gap it fills:
Example: I ask for a database query to find all users who haven’t logged in for 30 days.
Without MCP: AI generates a query based on assumptions about my schema. Maybe it assumes a last_login column exists. Maybe it doesn’t account for users who have never logged in (null values). Maybe it ignores that some users are test accounts that should be filtered out.
With Supabase MCP: AI examines the actual users table. Sees the column is called last_login_at. Notices there’s a user_type column and test accounts should be excluded. Sees there’s an index on last_login_at so the query will be fast. Generates a query that works against the real schema and follows existing patterns.
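For illustration, here’s roughly what that schema-aware version ends up looking like - a sketch assuming the Supabase JS client, placeholder environment variables, and the column names above:

import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

// Users inactive for 30+ days, excluding test accounts and including users who never logged in.
// Compare against a plain date (YYYY-MM-DD) to keep the or-filter syntax simple.
const cutoff = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000).toISOString().slice(0, 10);

const { data: inactiveUsers, error } = await supabase
  .from("users")
  .select("id, email, last_login_at")
  .neq("user_type", "test")
  .or(`last_login_at.lt.${cutoff},last_login_at.is.null`);

if (error) throw error;

The null check is exactly the part the assumption-based version usually misses.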
The difference is context. Real, specific, actual context about your system.
Before vs. After debugging:
Before MCP:
“Here’s an error message, here’s my code, what’s wrong?”
Copy-paste cycle between terminals
AI guessing at solutions based on error messages alone
Debugging sessions that took hours
After MCP:
AI investigates directly, sees actual state
“Let me check the database... I see the issue, the foreign key constraint is failing because...”
Suggests fixes based on real data, not assumptions
Debugging cycles that take minutes
The speed increase is real, but more importantly, the solutions are better. They’re informed by what actually exists, not what the AI imagines might exist.
The practical difference:
I used to spend significant time translating between contexts. Copying data from one place to another. Explaining to the AI what the current state actually was. Now, the AI can see for itself.
It’s the difference between describing a room to someone over the phone versus letting them walk into the room and look around. They’ll notice things you forgot to mention. They’ll see relationships between elements you didn’t think to explain. Their understanding will be richer and more accurate.
Questions for readers:
Which MCP servers have changed your workflow most? Are there domains where you’re still doing manual copy-paste that could be MCP-ified? And for those building MCP servers - what problems are you solving that existing ones don’t address?
Practice 5: Claude Skills - Codifying Your Hard-Won Lessons
The dead end that taught me this:
Three months into using agentic coding tools, I noticed a pattern. I’d start a new chat session and find myself typing the same instructions I’d given in a previous session. “Make sure to write tests first.” “Document your reasoning before implementing.” “Use worktrees for feature isolation.”
The quality would vary based on how well I explained things each time. Some sessions, I’d remember to emphasize the important parts. Other sessions, I’d forget a critical detail and get suboptimal results. My lessons weren’t portable - each new conversation started from zero.
Worse, I was learning valuable patterns through failure, but those patterns weren’t being captured anywhere systematic. I’d hit a dead end, extract the lesson, apply it once, then forget the nuance a week later when I faced a similar situation.
What Skills enable:
Claude Skills changed this completely. They let you create extremely detailed procedures that the AI follows consistently - think of them as persistent instructions that travel with you across all conversations. But unlike simple saved prompts, Skills can include documentation, code examples, decision trees, validation rules, and multi-step processes.
I use Skills in two main categories:
Coding Best Practices - These are the disciplines that prevent brittle code:
Test-Driven Development
Git Worktrees
Problem Statement Definer
Domain Workflows - These are processes specific to my work:
Funding program research (for our startup portfolio)
Email reply assistant (with 30 examples of my writing style)
The Skills that made the biggest difference:
Test-Driven Development:
This one is rigid by design. It enforces: write the test first, run it and watch it fail, write minimal code to pass, refactor. The Skill includes explicit rules like “If you wrote code before the test, delete it and start over. No exceptions.”
Why does this matter? Because “I’ll test it later” never happens. And tests written after code often test the implementation you built, not the behavior you wanted. The Skill prevents me from rationalizing shortcuts that create technical debt.
It includes a complete checklist for definition of done, guidance on what to mock versus what to test with real systems, and specific patterns for database cleanup. Every test leaves the system clean - no pollution between runs.
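Here’s a sketch of what that cleanup discipline looks like in practice - again assuming Vitest, with hypothetical helper functions standing in for whatever the project actually uses:

import { describe, it, expect, beforeEach, afterEach } from "vitest";
// Hypothetical helpers - the Skill's rule is the pattern, not these exact names.
import { createTestUser, deleteTestUser, deactivateUser, getUser } from "./test-helpers";

describe("user deactivation", () => {
  let userId: string;

  beforeEach(async () => {
    // Each test creates its own data against the real database...
    userId = await createTestUser({ email: "cleanup-check@example.com" });
  });

  afterEach(async () => {
    // ...and removes it afterwards, so no run pollutes the next one.
    await deleteTestUser(userId);
  });

  it("marks the user inactive instead of deleting the row", async () => {
    await deactivateUser(userId);
    const user = await getUser(userId);
    expect(user.status).toBe("inactive");
  });
});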
Problem Statement Definer:
The 6-question framework I mentioned in Practice 1, formalized as a Skill. It validates my responses to prevent common pitfalls - like defining problems as “lack of solution X” or being too vague about who has the problem.
It interviews me through the questions, provides feedback if my answers don’t meet the criteria, and outputs a formatted problem statement under 150 words that I can reference throughout development.
Git Worktrees:
This Skill handles the setup for proper branch isolation. It verifies safety before making changes, selects appropriate directories, and ensures I’m working in isolated environments for feature work. Small thing, but it prevents “oops, I’m on main” moments.
Email Reply Assistant:
This one demonstrates that Skills work beyond coding. It contains 30 examples of my actual writing style - how I structure emails, my typical phrasing, my level of formality. When I need to draft a response, the AI references these examples and produces replies that sound like me.
The editing required drops dramatically. Instead of rewriting paragraphs to match my voice, I’m making minor tweaks. It’s the difference between “this is 60% right” and “this is 90% right.”
The pattern for creating Skills:
I create a new Skill when I notice one of these signals:
Repetition - I’m giving the same instructions across multiple sessions
Quality variance - Results vary based on how well I explain something each time
Hard-won lessons - I learned something through painful trial and error that I don’t want to relearn
The process I use: I have the AI interview me about what the Skill should do. Same technique as the problem statement - one question at a time, building context before it generates the Skill itself. This produces much better results than me trying to write the procedure directly.
Why this matters:
Skills are how your experience compounds. Each dead end you hit can become a Skill that prevents that same dead end in the future. Not just for you - for everyone you share the Skill with.
They also raise the baseline quality. Even on days when I’m tired or distracted, the Skills ensure certain standards are met. The AI follows the procedure whether I’m sharp or foggy.
The meta-benefit:
Creating Skills forces you to articulate what you’ve learned. The act of documenting “here’s how to do this well” clarifies your own thinking. I’ve often realized I didn’t fully understand a pattern until I tried to write it as a Skill.
And because Skills can be shared, they’re a way to accelerate others’ learning. The Test-Driven Development Skill I use represents hundreds of hours of trial and error. Someone else can adopt it in five minutes.
Questions for readers:
What instructions do you find yourself repeating across sessions? Have you created custom Skills yet? If so, what for? And what patterns have you learned the hard way that you wish you’d codified earlier?
The Meta-Practice: AI-Assisted Post-Mortems
The overarching lesson:
None of these five practices came from reading blog posts or taking courses. They emerged from failures - hundreds of them. The difference between those failures becoming wasted time versus valuable lessons comes down to one habit: pausing to extract the insight.
When I hit a dead end now, I don’t just restart or try a different approach. I run a mini post-mortem with the AI.
My process when I hit a wall:
I stop and ask: “Where did our path go wrong? Walk me through the decision points that led us here.”
The AI reviews what we did - not just the last action, but the sequence of choices. It might say: “We started implementing before clarifying the edge cases. That led to assumptions about how data validation should work. When we hit the foreign key constraint error, we were fixing symptoms instead of reconsidering the approach.”
That’s the insight. Not “we got an error” but “we implemented before understanding the requirements fully enough.”
Then I ask: “What should we have done differently at each decision point?”
The AI might respond: “We should have asked about the relationship between users and organizations before writing the schema. That question would have surfaced that one user can belong to multiple organizations, which changes the data model fundamentally.”
Now I have a portable lesson: for database work, map relationships before schema. That becomes either a mental checklist or, eventually, a Skill.
Why this works:
Dead ends happen fast now. A path that would take a traditional developer three days to realize was wrong? I know in two hours. That’s 12x faster feedback, which means I can run 12x more experiments in the same timeframe.
But speed only matters if you extract the lesson. Without the post-mortem, I’d just be failing fast without learning fast.
The AI is surprisingly good at this analysis because it has the full context of what we tried. It can see patterns I might miss - like “the last three times we hit this type of error, it was because we hadn’t verified the data state first.”
The compounding effect:
These five practices all came from dozens of post-mortems that revealed similar patterns:
Problem statements emerged from repeatedly building the wrong thing
Atomic breakdown came from context mismatches that caused subtle bugs
Git discipline came from wishing I could roll back more easily
MCP servers came from the copy-paste debugging cycle being so painful
Skills came from repeating the same instructions and getting variable results
Each post-mortem made one specific type of mistake less likely. Over time, entire categories of dead ends disappeared.
The honest truth:
I still hit dead ends constantly. The practices don’t prevent all mistakes - they prevent repeating mistakes. There’s always a new way for things to go wrong, especially as tools evolve and I tackle more complex problems.
But the surface area of “mistakes I’ve already made” keeps growing. And each one that gets codified into a practice or Skill is one I don’t have to relearn.
The invitation:
This is why I’m sharing these practices openly. Not because they’re the definitive answer, but because they represent one path through the learning curve. Your dead ends might reveal different patterns. Your post-mortems might surface insights I’ve missed.
The faster we share what we’re learning, the faster we all level up. These tools are evolving rapidly - Claude Code didn’t exist a year ago, Skills just got a major update, new MCP servers launch weekly. The best practices are being written right now, in real time, by everyone building with these tools.
So when you hit your next dead end - and you will - pause and ask where it went wrong. Extract the lesson. Share what you learned. That’s how we turn hundreds of individual dead ends into collective wisdom.
Closing: The Invitation
Recap the journey:
A year ago, I was stuck in setup hell - fighting with environments and dependencies instead of solving real problems. Today, I’m building production applications because I learned to fail fast and extract lessons faster.
The path wasn’t avoiding mistakes. It was hitting hundreds of dead ends in hours instead of weeks, then running post-mortems to capture what went wrong.
The five practices as scaffolding:
These practices emerged from those post-mortems:
Problem statements prevent solution bias by forcing clarity before code
Atomic breakdown loads context deliberately through interview-first workflows
Git discipline provides rollback safety, with PR-level security review preventing vulnerabilities at scale
MCP servers encode best practices by giving AI direct access to real systems
Skills codify hard-won lessons into persistent, shareable procedures
The real insight:
These tools let you fail fast enough to learn from it. The question isn’t “how do I avoid mistakes” - it’s “how do I extract the lesson before I repeat it?”
Every dead end is either wasted time or a lesson. The difference is whether you pause to ask “where did our path go wrong?”
Call to action:
What practices have emerged from YOUR dead ends? What am I missing that you’ve learned? Which of these resonates with your experience, and which feels off?
The best practices for agentic coding aren’t written yet. Or rather, they’re being written right now - by all of us, through our experiments and failures.
Share your lessons. Let’s accelerate each other’s learning.
Final thought:
We’re all figuring this out in real-time. The curriculum is written in dead ends. The faster we share ours, the faster we all level up.
What dead end taught you something valuable this week?

