The alignment loop
The practical workflow behind alignment before autonomy: grill sessions, atomic PRDs, vertical slices, and fresh-context implementation. Every step is context engineering.
I wrote about why alignment matters before autonomy — agents fill every gap with training-data defaults when you skip upfront understanding. That post was the argument. This one is the workflow. Here’s how I actually do it, end to end.
This builds on Matt Pocock’s skill-driven approach, stripped down to what I use day-to-day. I used a heavier spec system before (OpenSpec), but the models are good enough now that a PRD and well-sliced issues are all the structure you need.
The grill session
Every project, feature, or meaningful change starts here. I run a grill-me session — the agent asks me 10 to 50 questions about what I’m building. Not a requirements interview. A design unpacking.
This works in two directions. I surface decisions I haven’t consciously made — constraints, trade-offs, scope boundaries. The agent surfaces approaches I wouldn’t have considered. LLMs default to the most common representation in their training data. The back-and-forth conversation adds enough entropy — my constraints, my edge cases, my opinions — that the model’s responses diverge from generic defaults into something actually useful.
Often the agent’s recommendation is good and I accept it. Just as often I add details that push the conversation somewhere the model wouldn’t go alone. The result is genuine alignment between two parties that started with different assumptions.
Building shared language
The extended version — grill-with-docs — does something extra. As terms get resolved during the conversation, it updates a CONTEXT.md glossary in the project. “Account” means the billing entity, not the user. “Cancellation” means the order is voided, not that the subscription ends.
This matters for future conversations. When the agent picks up the project in a new session, the shared language is already there. No re-explaining. No drift. The grill session isn’t disposable — it leaves artifacts that compound.
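To make the glossary idea concrete, here is a minimal Python sketch of how resolved terms could be rendered into a glossary file. The `Term` class, `render_glossary`, and the Markdown layout are all hypothetical, chosen for illustration; this is not the format the grill-with-docs skill actually writes. Only the two example definitions come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str
    meaning: str
    not_meaning: str  # the reading that was explicitly ruled out


def render_glossary(terms: list[Term]) -> str:
    """Render resolved terms as a Markdown glossary (hypothetical layout)."""
    lines = ["# Glossary", ""]
    # Sort case-insensitively so the file stays stable as terms are added.
    for term in sorted(terms, key=lambda t: t.name.lower()):
        lines.append(f"- **{term.name}**: {term.meaning}. Not: {term.not_meaning}.")
    return "\n".join(lines) + "\n"


TERMS = [
    Term("Cancellation", "the order is voided", "the subscription ends"),
    Term("Account", "the billing entity", "the user"),
]

GLOSSARY = render_glossary(TERMS)
```

Recording the rejected meaning next to the accepted one is a deliberate choice in this sketch: the ambiguity is what caused the question in the first place, so it seems worth keeping.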
Capture: the PRD
When the grill session reaches shared understanding, I convert it to a PRD (product requirements document) using the to-prd skill. The conversation becomes a structured document — what we’re building, why, what’s in scope, what’s not.
Keep PRDs atomic. What feels like one feature from the user’s perspective is often three from the implementation side. Smaller PRDs produce sharper issues. Sharper issues produce better agent output. It is very easy to over-scope a PRD — resist the temptation.
The PRD captures the alignment so it survives beyond the conversation. The grill session might be 50 messages long — the PRD is the distilled version that another agent, in another session, can act on without the original context.
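As a rough sketch of what "structured document" can mean here, the following Python models a PRD with the four parts mentioned above: what, why, in scope, out of scope. The `PRD` class and its Markdown layout are assumptions for illustration, not the output of the to-prd skill, and the example content is made up.

```python
from dataclasses import dataclass, field


@dataclass
class PRD:
    """Hypothetical PRD shape: what, why, in scope, out of scope."""
    title: str
    what: str
    why: str
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        def bullets(items: list[str]) -> list[str]:
            # An empty section is stated explicitly rather than left blank.
            return [f"- {item}" for item in items] or ["- (none)"]

        parts = [
            f"# {self.title}", "",
            "## What", self.what, "",
            "## Why", self.why, "",
            "## In scope", *bullets(self.in_scope), "",
            "## Out of scope", *bullets(self.out_of_scope),
        ]
        return "\n".join(parts) + "\n"


# Made-up example content, only to show the shape.
prd = PRD(
    title="Order cancellation",
    what="Let a customer void an open order from the order page.",
    why="Support currently voids orders by hand.",
    in_scope=["Void an open order", "Show the voided status"],
    out_of_scope=["Refunds", "Ending subscriptions"],
)
PRD_TEXT = prd.to_markdown()
```

The out-of-scope list does as much work as the in-scope one: it is where an over-scoped PRD gets cut back to something atomic.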
Plan: vertical slices
From the PRD, still in the same conversation, I generate issues using the to-issues skill. Issues are discrete work items, like Jira tickets or GitHub issues, each describing one thing to build. Because the grill session and the PRD are still in context, the agent has full understanding of the design when it creates the slices. This is the last step before I start fresh. The issues are sliced vertically, not horizontally.
Horizontal slicing builds in layers: database first, then backend, then frontend. The problem is you don’t get anything testable until all three layers exist. Integration problems hide until the end.
Vertical slicing builds thin end-to-end features: one pass that touches the database, API, and UI together. Each slice is independently testable. Each slice delivers something you can see and validate. The complexity increases gradually across slices rather than arriving all at once at the end.
This matters because each issue becomes a self-contained prompt for a fresh agent. If the issue is a thin vertical slice, the agent has everything it needs — clear scope, clear boundaries, clear success criteria. If the issue is “build the entire database layer,” the agent will make hundreds of decisions you haven’t aligned on.
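One way to picture the difference is a small check on each issue. This is a sketch under simplifying assumptions: the three layer names, the `Issue` class, and the rule in `is_vertical_slice` are hypothetical, and a real project may have different layers or a looser definition.

```python
from dataclasses import dataclass

# Assumed layers for illustration; adjust to the project's actual stack.
LAYERS = frozenset({"database", "api", "ui"})


@dataclass
class Issue:
    title: str
    layers: frozenset[str]
    success_criteria: list[str]


def is_vertical_slice(issue: Issue) -> bool:
    """Vertical here means: touches every layer and says how to verify it."""
    return issue.layers >= LAYERS and bool(issue.success_criteria)


horizontal = Issue(
    title="Build the entire database layer",
    layers=frozenset({"database"}),
    success_criteria=[],
)
vertical = Issue(
    title="Void an open order from the order page",
    layers=LAYERS,
    success_criteria=["A voided order shows status 'voided' after reload"],
)
```

Under this rule `vertical` passes and `horizontal` does not: it touches one layer and gives the agent nothing to validate against.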
Implementation
Fresh context
Context management is critical. I don’t continue the planning conversation into implementation. The planning produced artifacts — the PRD, the issues. Those artifacts are the handoff.
Each issue gets implemented in a fresh agent session. The agent picks up project-level context from skills and project files, reads the issue, and builds. No accumulated drift from a 200-message planning conversation clogging the window.
In my experience, models stay sharp up to about 100k tokens of context. Past that, output quality degrades. This workflow is designed to never need more.
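A quick sanity check on how much of that budget the handoff artifacts consume might look like the sketch below. It assumes roughly four characters per token, which is a crude rule of thumb for English text and not a real tokenizer; the function names and the example sizes are hypothetical.

```python
CHARS_PER_TOKEN = 4      # crude approximation, not a real tokenizer
BUDGET_TOKENS = 100_000  # the rough ceiling mentioned above


def estimate_tokens(text: str) -> int:
    """Estimate the token count by character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def remaining_budget(artifacts: dict[str, str], budget: int = BUDGET_TOKENS) -> int:
    """Tokens left for the actual work after loading the given artifacts."""
    used = sum(estimate_tokens(body) for body in artifacts.values())
    return budget - used


# Placeholder artifact bodies standing in for real files.
artifacts = {
    "CONTEXT.md": "x" * 2_000,
    "issue": "x" * 1_200,
}
left = remaining_budget(artifacts)
```

With these placeholder sizes the artifacts cost about 800 estimated tokens, which leaves nearly the whole window for reading code and building.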
Interactive, not AFK
I stay present during implementation. AFK loops — handing the agent a task list and walking away — tend to go sideways for anything complex. The agent builds confidently in the wrong direction and you discover it too late.
Instead, I validate each issue as it’s built. I have the context. I can catch a wrong turn immediately rather than unwinding three hours of autonomous drift.
TDD as verification
The agent uses test-driven development: write a failing test, make it pass, refactor. The failing test is the spec. The passing test is the proof. TDD gives the agent a tight feedback loop and gives me confidence that alignment is maintained at the code level — not just the document level.
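As an illustration of that loop, here is a small Python sketch in the state it would reach after the "make it pass" step. The `Order` class and `cancel` function are hypothetical; the behavior under test borrows the glossary definition from earlier, where cancellation means the order is voided. In a real session the two tests would be written first and would fail until `cancel` exists.

```python
from dataclasses import dataclass


@dataclass
class Order:
    id: int
    status: str = "open"


def cancel(order: Order) -> Order:
    """Cancellation voids the order; voiding twice is treated as an error."""
    if order.status == "voided":
        raise ValueError(f"order {order.id} is already voided")
    return Order(id=order.id, status="voided")


def test_cancel_voids_order() -> None:
    assert cancel(Order(id=1)).status == "voided"


def test_cancel_twice_is_rejected() -> None:
    voided = cancel(Order(id=1))
    try:
        cancel(voided)
    except ValueError:
        return  # expected: a voided order cannot be voided again
    raise AssertionError("expected ValueError for an already voided order")
```

The test names read as statements from the issue, which is the point: if a test name does not match something we agreed on, that is a drift signal before any code review happens.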
Prototyping
Sometimes after a grill session I prototype instead of going to the PRD. This is useful for visualizing frontend ideas — spin up a throwaway to validate an interaction pattern before committing to building it properly. The grill session clarified what I want; the prototype confirms it looks right.
Delegation
I recently moved from Claude Code to pi agent for better context management. One pattern that fits naturally is delegation.
From the coordinator agent, I can spawn a fresh pi agent in a new terminal pane (via cmux) with an enriched task description. The coordinator holds the plan — it knows all the issues, the overall architecture, the relationships between pieces. Each delegate gets one focused task with the context it needs, works in isolation, and reports back.
The coordinator monitors progress and flags if a delegate looks stuck. It’s a natural extension of the “fresh context per issue” principle — each delegate is a clean session focused on one slice.
I’ll write a dedicated post on the pi agent and cmux setup. For now, the key insight is that the same principle applies: alignment travels through artifacts (the issue description), not through shared conversation history.
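Until then, here is a minimal sketch of what an "enriched task description" could contain. It only assembles text; it does not call pi agent or cmux, and the function name, section headings, and example content are all hypothetical rather than taken from my actual setup.

```python
def build_delegate_task(
    issue_title: str,
    issue_body: str,
    glossary: dict[str, str],
    sibling_issues: list[str],
) -> str:
    """Assemble a self-contained task description for one delegate."""
    lines = [f"# Task: {issue_title}", "", issue_body.strip(), ""]
    lines.append("## Shared language")
    lines += [f"- {term}: {meaning}" for term, meaning in sorted(glossary.items())]
    # Naming the neighbouring slices tells the delegate where to stop.
    lines += ["", "## Out of scope (handled by other delegates)"]
    lines += [f"- {title}" for title in sibling_issues]
    return "\n".join(lines) + "\n"


task = build_delegate_task(
    issue_title="Void an open order from the order page",
    issue_body="Add a cancel action that voids the order and shows the new status.\n",
    glossary={"Cancellation": "the order is voided", "Account": "the billing entity"},
    sibling_issues=["Email the customer when an order is voided"],
)
```

Everything the delegate needs is in the returned string, so nothing depends on the coordinator's conversation history.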
The discipline
This workflow isn’t a one-shot setup. The discipline is running the loop for every increment:
Grill → PRD → Issues → Implement. Then again. And again.
It’s tempting to skip the grill after the first iteration goes well. The PRD was great, the issues were sharp, the implementation landed cleanly — surely the next feature can skip straight to coding? That’s when drift starts.
Each increment gets its own grill, its own atomic PRD, its own vertical slices. The loop stays tight. The alignment stays fresh.
For a bug fix or a quick script, I skip the ceremony and talk to the agent directly. This workflow is for anything complex enough that I’d regret not thinking it through first.
Every step is context engineering
Looking at this workflow end to end, every step is context engineering.
The grill session builds context — forces alignment into existence through conversation. Grill-with-docs captures shared language that persists across sessions. The PRD compresses that context into a document that survives beyond the conversation that created it. The issues compress further — each one is a self-contained unit of context that fits cleanly in a fresh agent window. Fresh sessions keep the token window clean. TDD provides verification context at the code level.
Models are capable within their effective context window. Past about 100k tokens they get worse, not better. This workflow is designed to keep every agent interaction well inside that boundary while preserving all the alignment that matters.
Context engineering isn’t a separate concern. It’s what this entire workflow is.