The plan mode I wanted — Cody Salmond

Claude Code has a plan mode. I used it a lot in Claude Desktop. I think it really eases people into the workflow. You don’t trust the agent. You shouldn’t, either. Especially for complex problems. My problem with plan mode is that it’s just so awkward. Especially in the CLI. You get this giant hunk of markdown that might go off the rails after the first paragraph, then you enter a bizarre back and forth with the agent where all you really did was solve the problem yourself for your agent.

Some experience and a basic knowledge layer can help with this. Applying them can get a lot of the obvious things out of the way early or automatically. Plan mode doesn’t seem as necessary. The agent has a wealth of information to draw on about how you like to work, and can make a lot of assumptions. Naivete falls away into competence. And competence usually means design calls.

Agents are bad at those.

Suddenly the first 10 messages of each session become a flood of open-ended questions, handed to you in one big unmanageable lump to answer all at once. Your eyes glaze over, you answer the important parts, and you try to chip away at your agent’s confusion until things finally resolve. Over time, your context bloats as you flip-flop on decisions after being exposed to more information and suggestions from your agent. Your agent forgets to re-ask a question you didn’t answer right away. You might get confused. You might get anxious. Your agent is definitely confused (even if it insists it isn’t).

It works. And it’s exhausting.

What I actually wanted was an interactive plan mode. Dump everything into digestible, parallel threads. I can read each one, check the relevant information, confirm or push back on a recommendation from the agent, and continue or revisit points until everything is resolved.

That’s workdoc.

How it works

A workdoc is really just a specific type of markdown file. An agent is given a task and told to create a workdoc. The agent orients itself, looks at the problem, and generates a file to confirm all of its assumptions and get answers to its open-ended questions. One “fork” per point. Each fork has:

A description of what is being decided (with a number for offhand reference)
Some options to consider
The agent’s recommendation, paired with some reasoning
An input slot for my answer, usually a paragraph or a simple “I Agree.”
A resolution (once everything is actually resolved)
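
To make the shape concrete, here is a minimal Python sketch of one fork rendered to markdown. The post only pins down two layout details: the status emoji sits on the fork's heading, and the answer slot is a real `#### Input` heading. Everything else here is an assumption for illustration: the `Fork` class, `render_fork`, the `###` level for the fork heading, and the `Options`, `Recommendation`, and `Resolution` subheadings are all hypothetical names, not the skill's actual format.

```python
from dataclasses import dataclass


@dataclass
class Fork:
    """One decision point in a workdoc (hypothetical structure)."""
    number: int              # for offhand reference ("see fork 3")
    title: str               # what is being decided
    options: list[str]       # choices to consider
    recommendation: str      # the agent's pick, with its reasoning
    status: str = "\u2757"   # "❗" pending a decision; assumed default
    answer: str = ""         # filled in by the human under "#### Input"
    resolution: str = ""     # written by the agent once settled


def render_fork(fork: Fork) -> str:
    """Render a fork as a markdown section (assumed layout)."""
    lines = [f"### {fork.status} {fork.number}. {fork.title}", "", "#### Options", ""]
    lines += [f"- {option}" for option in fork.options]
    lines += ["", "#### Recommendation", "", fork.recommendation]
    lines += ["", "#### Input", "", fork.answer]
    # The resolution only appears once the fork is actually resolved.
    if fork.resolution:
        lines += ["", "#### Resolution", "", fork.resolution]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    fork = Fork(
        number=3,
        title="Where does the cache live?",
        options=["In memory", "On disk"],
        recommendation="On disk, so it survives restarts.",
    )
    print(render_fork(fork))
```

Keeping every part of the fork under its own heading is what makes the file navigable from an outline view.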

This might feel like an extension of the “User Questions” interface that most users are familiar with. Except not limited by space, and non-linear. This is intentional.

At the start of each fork’s heading, there’s a small emoji to keep things glanceable. We’ve got:

✅ Settled
⚠️ Agent is actively working on it
❗ Pending a decision from me
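
Because the status lives on the heading, a few lines of Python can read it back without touching the rest of the file. This is a sketch, not part of the skill: it assumes fork headings are `###` level and begin with the marker, and `fork_statuses` is a hypothetical helper. The variation selector that some editors attach to ⚠️ is stripped before the lookup so both spellings match.

```python
import re

# Marker -> meaning, as listed in the post. Keys are stored without the
# U+FE0F variation selector so "⚠" and "⚠️" are treated the same.
STATUS = {
    "\u2705": "settled",             # ✅
    "\u26a0": "agent working",       # ⚠️
    "\u2757": "needs my decision",   # ❗
}

# Assumed convention: "### <emoji> <rest of heading>".
# "#### Input" does not match, because the fourth "#" is not a space.
FORK_HEADING = re.compile(r"^###[ \t]+(\S+)[ \t]+(.*)$")


def fork_statuses(text: str) -> list[tuple[str, str]]:
    """Return (status, heading text) for every fork heading in the file."""
    found = []
    for line in text.splitlines():
        match = FORK_HEADING.match(line)
        if not match:
            continue
        marker = match.group(1).replace("\ufe0f", "")
        if marker in STATUS:
            found.append((STATUS[marker], match.group(2)))
    return found


if __name__ == "__main__":
    sample = (
        "### \u2705 1. Storage\n\n#### Input\n\nI agree.\n\n"
        "### \u26a0\ufe0f 2. Naming\n\n"
        "### \u2757 3. License\n"
    )
    for status, heading in fork_statuses(sample):
        print(f"{status:>18}  {heading}")
```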

I open the file in Obsidian and edit it however I want. Answer one section, answer five, leave them all open, leave notes between them. Mention how one answer may be contingent on an in-flight answer in another section. When I’m ready, I ping the agent in chat. The agent re-reads the whole file, writes up the ones I answered, marks them ✅, adds any new questions that fell out of what I said, and writes the file back. Keep going until nothing’s open.
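
In practice the agent does this pass by reading the file, not by running a script. Still, the first step of each pass is mechanical enough to sketch: find the forks that have something written in their input slot but are not yet marked ✅. The Python below is only an illustration under assumed conventions (`###` fork headings that start with the status emoji, answers under `#### Input`); `split_forks`, `input_text`, and `answered_but_open` are hypothetical helpers.

```python
import re

FORK_HEADING = re.compile(r"^### (.+)$")  # assumed fork heading level
SETTLED = "\u2705"                        # ✅


def split_forks(text: str) -> dict[str, list[str]]:
    """Map each fork heading to the lines that follow it."""
    forks: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = FORK_HEADING.match(line)
        if match:
            current = match.group(1).strip()
            forks[current] = []
        elif current is not None:
            forks[current].append(line)
    return forks


def input_text(body: list[str]) -> str:
    """Return whatever was written under '#### Input', up to the next heading."""
    collected, inside = [], False
    for line in body:
        if line.startswith("#"):
            # Any heading ends the previous slot; only "#### Input" opens one.
            inside = line.strip() == "#### Input"
            continue
        if inside:
            collected.append(line)
    return "\n".join(collected).strip()


def answered_but_open(text: str) -> list[str]:
    """Forks with a non-empty answer that are not yet marked settled."""
    return [
        heading
        for heading, body in split_forks(text).items()
        if not heading.startswith(SETTLED) and input_text(body)
    ]
```

Forks with an empty input slot are simply skipped, which matches the idea that any number of them can stay open between passes.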

Some complications/solutions that fell out of this

Some calls along the way that turned out to matter more than I expected.

The answer slot is a real heading, not a bold label or HTML comment. The first draft used **Input:** followed by a placeholder comment. It looked clean in the raw markdown and was frustrating to use in Obsidian. Hard to spot, easy to leave the placeholder behind.

Switching to a real heading (#### Input) made every answer slot show up in the file outline. Now I can click “Input” in the outline and jump straight to where I’m meant to write.

No explicit mechanism for tracking dependencies on other forks. At one point, my agent suggested adding a field to track which questions blocked which. Too much bookkeeping. The agent reads the whole doc each pass and figures it out. When something I answer turns out to contradict something I’d already settled, the agent writes a new question about the contradiction instead of quietly ignoring it. The process is recorded on purpose.

Chat is still allowed to settle things. Some questions resolve faster by talking them through. The agent writes the answer into the doc on the same turn, so the file is always current. What’s not allowed is using chat to remind me which questions are still open. That’s what the emoji markers are for. Listing them in chat turns chat into a todo list, which is exactly what I wanted to stop doing.

Settled questions stay where they are. I considered crossing them out, moving them into a “done” section, deleting them. Leaving them in place with a ✅ on the heading reads cleanest. I can skim past the green ones and the file carries the whole history. That history has value for future me and for my agents down the line.

Where it stands

Workdoc was set up as a skill, and since then, my pain is back down to a manageable level. It’s not perfect, and I’m sure I’ll revisit it at some point. Right now, I’m happy with it. The workdoc that designed the workdoc had thirteen forks in it. We settled them across three passes. As a test, I had a fresh agent session (one that had never seen any of this) read the file and pick up where I left off. It worked flawlessly.

The change I’m noticing most is the pacing. Plan mode and chat both cost a turn for every back-and-forth, and every turn loses a little of what was in my head. With the file holding the plan, I can think for a few minutes, answer five questions at once, and ping. With the context anchored in place instead of flowing past me like a river, I find it easier to multitask as well.