My AI-assisted workflow

How I built a structured AI-assisted development workflow as a Tech Lead, where the real work happens before a single line of code is written.

You open a chat, describe what you want, iterate on the output, and ship something that more or less works. It feels fast. The features work, technically. But nobody, including me, fully understood what was there: edge cases nobody thought to handle, architecture that made sense in the moment but didn’t survive contact with the next feature, a growing sense that I was building faster and understanding less.

The problem I kept running into was this: how do you get the speed benefits of AI assistance without losing the clarity and intentionality that makes software maintainable? The short answer is that the real work happens before the coding starts.

The core idea: thinking in writing, not in code

What is AI actually good at? Implementation. What is it genuinely bad at? Figuring out what you actually want, catching the assumptions you forgot to make explicit, and telling you when your mental model of the problem is wrong. That’s your job. It will always be your job.

The single most valuable shift I made was treating every feature as a thinking problem first and an implementation problem second. The workflow is designed to force that thinking to happen before any code is written, and to use AI to stress-test it, not to skip it.

This workflow is adapted from Mark Pocock’s skills, tailored to the way I work.

The workflow

Step 1: Free-form plan

Everything starts with a document I write myself, in plain language, with no required structure. I describe the problem, my initial thinking about the solution, the constraints I’m aware of, and the things I’m uncertain about. This is not a deliverable; nobody reads it but me. Its only purpose is to get the thinking out of my head and into a form I can examine.

The quality of everything downstream depends entirely on the quality of this step. A vague plan produces a vague PRD, which produces vague issues, which produces code that technically runs but doesn’t do what you meant.

Step 2: PRD via write-a-prd

The free-form plan becomes the input to a structured interview process. The skill explores the codebase to understand the current state of things, then interviews me relentlessly about every aspect of the plan, walking down each branch of the design tree, resolving dependencies between decisions one by one. This is the step where bad ideas get caught. Not because the AI is smarter than me, but because being forced to answer specific questions about your own plan reveals the places where you were hand-waving. “How does this behave when the user isn’t authenticated?” “What happens if this operation partially fails?” “You said this replaces the existing feature; what happens to users who depend on the current behavior?”

The output is a structured PRD file containing a problem statement, a solution description, an extensive list of user stories, implementation decisions (modules, interfaces, schema changes, API contracts), module design, testing decisions, and explicit out-of-scope items. Everything is explicit. The user stories are the backbone of everything that follows. They need to be specific enough that acceptance criteria can be derived from them unambiguously downstream, not at this stage, but when scope is defined at the issue level.
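As a sketch, a PRD produced by this step might be laid out like this. The section names follow the list above, but the exact template is illustrative, not the skill’s actual output:

```
# PRD: <feature name>

## Problem statement
One paragraph: what is broken or missing, and for whom.

## Solution
What we will build, described as behavior, not implementation.

## User stories
- US-1: As a <role>, I want <capability>, so that <outcome>.
- US-2: ...

## Implementation decisions
Modules, interfaces, schema changes, API contracts.

## Module design

## Testing decisions

## Out of scope
- Things explicitly not being built, so nobody re-litigates them later.
```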

Step 3: Issues via prd-to-issues

The PRD becomes a set of issues using vertical slices, tracer bullets that cut through every integration layer end-to-end rather than horizontal slices of a single layer. A slice that only touches the database, or only touches the UI, is not a valid slice. Each issue should deliver a narrow but complete path that is demoable or verifiable on its own. Each issue is classified as either AFK (the AI can implement and the change can be merged without human interaction) or HITL (a human decision is required at some point during implementation). Preferring AFK over HITL wherever possible keeps the work moving without my attention becoming the bottleneck.

Before anything is written, the skill presents the proposed breakdown and asks: does the granularity feel right, are the dependency relationships correct, should anything be merged or split? Issues are written in dependency order so that cross-references between them use real numbers.

Each issue contains a concise description of the end-to-end behavior, a “how to verify” section describing exactly how to confirm the slice is complete, acceptance criteria in Given/When/Then format including error cases, a list of blockers, and references back to the user stories it addresses. Everything lives in files. I work across different platforms, sometimes GitHub, sometimes GitLab, and keeping the workflow file-based means it doesn’t depend on any particular tool.
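A single issue file under these conventions might look roughly like this. The numbers, feature, and contents are made up for illustration:

```
# Issue 7: Signed-in user can export their data as CSV

End-to-end: button in settings -> API endpoint -> CSV generation -> download.

## How to verify
Sign in, open Settings, click "Export", confirm a CSV containing the
user's rows downloads.

## Acceptance criteria
- Given a signed-in user with data, When they request an export,
  Then a CSV containing their rows is returned.
- Given a signed-out user, When they request an export,
  Then the request is rejected with 401.

## Blockers
- Issue 5 (auth middleware)

## User stories
- US-3, US-4

Mode: AFK
```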

Step 4: Tasks via issues-to-tasks

Each issue is broken down into concrete, ordered tasks, one task per focused AI session. The constraint is deliberate: if a task can’t be completed in a single session, it’s too large. The skill explores the specific parts of the codebase touched by the issue, identifies existing patterns to follow, and produces a task list with types (WRITE, TEST, MIGRATE, CONFIG, REVIEW), explicit outputs, and dependency order. Schema before logic, logic before API, API before UI, tests interleaved rather than batched at the end.

The key design decision in the task descriptions: they are written as instructions to the AI that will execute them, not as notes to a human developer. Each task specifies which files to touch, which existing patterns to follow, and what the output looks like when done. No code snippets: intent, not implementation.
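A task list for an issue of that kind might look like this sketch. The types come from the list above; the file names and steps are invented:

```
# Tasks: Issue 7

1. MIGRATE - add `exports` audit table. Output: migration file. Deps: none.
2. WRITE   - CSV serializer in `lib/export.ts`, following the pattern
             in `lib/report.ts`. Deps: 1.
3. TEST    - unit tests for the serializer, including the empty-data case.
             Deps: 2.
4. WRITE   - `GET /api/export` endpoint wiring auth -> serializer ->
             response. Deps: 2.
5. TEST    - integration test: 401 for signed-out, CSV for signed-in.
             Deps: 4.
6. REVIEW  - human decision: rate-limit policy for exports. Deps: 4.
```

Note how the ordering follows the rule above: schema first, then logic, then API, with tests interleaved rather than batched at the end.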

Step 5: The handoff to code

Each task description is a self-contained prompt. When I’m ready to implement a task, I open a fresh session and paste the task description along with the parent issue for context. The task description was written for this purpose: it specifies scope, references the right files and patterns, and defines what done looks like. Fresh context per task is intentional. Long sessions with accumulated context tend to drift: the model starts making decisions based on what it already did rather than what the task requires. Starting clean with a well-scoped task consistently produces better output than continuing a long session. For REVIEW tasks, the ones flagged as requiring a human decision, I stop, make the decision, update the task file with the outcome, and continue. These are the moments where the workflow earns its keep: the decision is made deliberately, in context, not buried in a long generation.

Step 6: Code review via code-review

Every PR goes through a structured six-pass review before merge. The passes cover logic errors, operation ordering, bad practices, security, magic strings and values, and pattern improvements.

Operation ordering deserves particular attention in AI-generated code. Models tend to produce code that does the right things but sometimes in the wrong sequence: sending a notification before committing a transaction, writing an audit log after the action it should record, mutating state before validating input. These bugs are easy to miss in review because the code looks correct at a glance.
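As a minimal sketch of that bug class, here is a validate-before-mutate ordering bug and its fix. The function and names are hypothetical, chosen only to make the ordering visible:

```python
def apply_update_wrong(state: dict, update: dict) -> None:
    # BUG: state is mutated before the input is validated, so a rejected
    # update still leaves partial changes behind.
    state.update(update)
    if "id" not in update:
        raise ValueError("update missing 'id'")


def apply_update_right(state: dict, update: dict) -> None:
    # Validate first; mutate only once the input is known to be acceptable.
    if "id" not in update:
        raise ValueError("update missing 'id'")
    state.update(update)
```

Both versions “do the right things”, and both raise the same error; only the ordering differs, which is why this class of bug survives a quick read.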

The review runs on a file or diff, not on the whole feature. Scope is deliberately narrow; catching issues at the PR level is far cheaper than finding them later.

Step 7: Final audit via final-audit

At the end of a feature, a cross-cutting audit looks at things that can only be evaluated across the whole implementation. Not individual bugs (those should have been caught per PR), but systemic issues: inconsistencies between modules, patterns that were introduced early and replicated incorrectly everywhere, security assumptions that hold in isolation but break down across the full surface area.

The audit reads the full implementation before flagging anything, which is the point. It groups findings by severity, gives an explicit overall verdict on whether the feature is safe to leave in production, and asks for approval before making any changes. Unsupervised fixes on already-merged code are riskier than fixes made during development.

What this workflow is not

It is not fast. The planning and PRD steps take real time, and the temptation to skip them in favor of going straight to code is constant. The workflow only pays off if you genuinely believe that thinking time before coding is cheaper than debugging time after.

It is also not a replacement for engineering judgment. The AI will suggest reasonable things at every step that are wrong for your specific situation. The review steps, where you evaluate the breakdown before anything is created, exist precisely because the AI’s output needs to be validated against knowledge it doesn’t have: your team’s conventions, your users’ actual behavior, the parts of the codebase that have hidden complexity.

The underlying principle

Every step in this workflow has the same structure: AI produces something, you review it with full context, then it gets created. The AI accelerates the production. The review is yours, always.

The workflow is designed to make that review as effective as possible, by ensuring that when you’re evaluating an issue, you have a PRD to check it against; when you’re evaluating a task, you have an issue to check it against; and when you’re reviewing code, you have acceptance criteria to check it against.

The skills

If you read this far, you at least deserve a link to my skills: check my GitHub repo.
