Gavin Vickery


How I'm orchestrating Claude Code workflows that actually work

Nov 24, 2025 · 12 min read

I’ve been using Claude Code daily for months. Not casual use - actual production work across multiple projects. And the uncomfortable truth I keep coming back to: most people are using it wrong.

Not wrong in a gatekeepy “you’re holding it incorrectly” way. Wrong in a “you’re missing the entire point” way. The guides tell you to add subagents, configure skills, install MCP servers. They don’t tell you why these features exist or how to make them work together. They definitely don’t mention that adding more stuff often makes things worse.

It’s all about context

Claude’s 200K token context window sounds massive until you actually use it. That window is your most precious resource, and it fills up way faster than you’d expect. Every file Claude reads, every search result, every back-and-forth - it all piles up. When context fills, quality tanks. Claude starts forgetting things, missing instructions, and making bizarre decisions.

Reddit is full of people complaining about Claude “forgetting what it was doing two steps ago” or “ignoring explicit instructions.” Most of this isn’t Claude being stupid. It’s context pollution.

Once that clicked for me, the whole feature set made sense. Subagents? They’re isolated context windows. Skills? Pre-packaged project or company specific context that loads on demand. MCP servers? External context sources and actions. Slash commands? Templated context injection.

Every single feature is about managing those 200K tokens. That’s always what it comes back to.

Quick feature rundown

The terminology is confusing, so here’s the short version:

Subagents are specialized AI assistants that Claude delegates to. They get their own context window, totally separate from your main conversation. Built-in ones include “Explore” for searching your codebase and “Plan” for read-only research. When a subagent finishes, it hands the refined results back to the main agent and throws away its own context.

Skills are capabilities that Claude decides when to use. You don’t invoke them directly - Claude picks them up automatically based on what you’re doing. Define them in .claude/skills/ folders. For me, these are project-specific: how React components are structured, how API endpoints are named, coding conventions, that sort of thing. You could put all of this in the CLAUDE.md file, but skills let you organize it better and keep it out of the main prompt - again, freeing up the precious context window.
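As a sketch, a project-specific skill might look like this (the folder layout and frontmatter fields match Claude Code’s skill format at the time of writing; the skill name and the conventions themselves are made up):

```markdown
<!-- .claude/skills/react-conventions/SKILL.md -->
---
name: react-conventions
description: Use when creating or modifying React components in this project. Covers component structure, file naming, and prop conventions.
---

# React conventions

- Components live in src/components/<Name>/<Name>.tsx with a co-located test file.
- Use function components with typed props; no default exports.
- Shared hooks go in src/hooks/ and are prefixed with `use`.
```

The description is what Claude reads when deciding whether the skill applies, so it deserves more care than the body.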

Slash commands are prompts you save as Markdown files. Drop commit.md in .claude/commands/ and now you’ve got /commit. Simple. I think of these as common commands or macros - shit I type all the time and got sick of typing. They’re the steps where I’ve done some human-in-the-loop review and I’m giving the agent its next instruction (usually git commit, run an audit, create tests, etc.).
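For example, a minimal commit command (the message format here is invented; `$ARGUMENTS` is the placeholder Claude Code substitutes with whatever you type after the command):

```markdown
<!-- .claude/commands/commit.md -->
Stage and commit the current changes.

- Message format: <type>(<scope>): <summary>, e.g. fix(auth): handle expired tokens
- Keep the summary under 72 characters; no trailing period.
- Extra instructions from me, if any: $ARGUMENTS
```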

MCP servers connect Claude to external stuff - databases, APIs, web search, your local browser, whatever.

Hooks are shell commands that fire at specific moments: before a tool runs, after a tool runs, when a session starts. Pure automation. I don’t use these much, honestly, but some good use cases are running linters and code formatters, kicking off unit tests, posting PR links to Slack, etc.
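A sketch of a formatter hook in .claude/settings.json (the event name and overall shape follow Claude Code’s hooks config as I understand it; the prettier command is just an example of something to run after edits):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npx prettier --write ." }
        ]
      }
    ]
  }
}
```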

The thing that tripped me up for weeks

Descriptions drive routing.

I cannot stress this enough. Claude decides which subagent to use, which skill to invoke, entirely based on the descriptions you write. Write a bad description and Claude will never touch your carefully crafted tool. I watched people spend hours building the perfect agent, slap on a description like “A code review tool”, then wonder why it never got used.

Bad: “Reviews code.”

Good: “Invoke when reviewing code for security vulnerabilities, authentication issues, and data exposure risks. Use after completing feature implementation and before creating pull requests.”

On the flip side, stuffing descriptions with every possible detail, or with comprehensive code and command examples, is just a waste of context tokens. This is an art more than a science: you need just enough to guide the agent while keeping the context tight.

The routing system needs to know when to use something. Telling it what something does isn’t enough.
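Concretely, that description lives in the agent file’s frontmatter. A sketch (the security-reviewer name and body are made up; the frontmatter fields match Claude Code’s subagent format as I understand it):

```markdown
<!-- .claude/agents/security-reviewer.md -->
---
name: security-reviewer
description: Invoke when reviewing code for security vulnerabilities, authentication issues, and data exposure risks. Use after completing feature implementation and before creating pull requests.
---

Review the provided files for injection, broken auth, and sensitive data
exposure. Report findings as a prioritized list - do not rewrite the code.
```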

What I actually use

Enough background. Here’s my setup.

Research subagent

This is the customization that changed everything for me. I have a “research” subagent wired up to context7, websearch, and fetch MCP servers. When the main agent needs to look something up - documentation, examples, what other people have done - it hands off to this research agent.

The magic is what happens after. The research agent doesn’t dump everything back into my main conversation. It reads a bunch of stuff, figures out what matters, and returns a summary. Just the relevant bits.

Why bother? Because if the main agent did the research itself, all those search results and docs would clog up its context. By using a subagent with its own context window, the main conversation stays clean. The answer comes back. The noise doesn’t.

It’s like having someone go to the library, read ten papers, and come back with a one-page brief instead of dropping a stack of books on your desk.
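A sketch of what that agent file can look like (the exact tool names depend on which MCP servers you’ve installed and how they expose tools, so treat the tools line as illustrative):

```markdown
<!-- .claude/agents/research.md -->
---
name: research
description: Invoke for any external lookup - library docs, API references, examples, prior art. Use instead of doing web searches or doc fetches in the main conversation.
tools: WebSearch, WebFetch
---

Research the question you're given using search, fetch, and context7.
Read as much as you need, but return ONLY:

1. A direct answer (a few sentences)
2. The minimal relevant code snippets or API signatures
3. Links to the sources you relied on

Never paste full pages or full docs back to the caller.
```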

Slash commands I actually use

/do - My workhorse. Takes whatever I type after it and tacks on my standard stuff: which agents to use, formatting preferences, patterns to follow. Maybe 20 lines total. Saves me from repeating “use the research agent for lookups” on every request.

/commit - Git commits with my message format. I type /commit, it does the thing. No explaining my commit conventions every time, and it’s more consistent than trusting Claude to always check the related skill.

/audit - This one’s more involved. It tells the main agent to walk through files one by one, handing each to an “audit” subagent. That audit agent checks against my standards - patterns, style, whatever I’ve defined. Doing it file-by-file prevents the context from exploding. Asking Claude to review your whole codebase at once is a recipe for garbage output.

/test - Same idea, but for testing. Dedicated test subagent that knows our frameworks.

Pattern: anything I type more than twice becomes a command.
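My /audit command is roughly this shape (a reconstruction, not the literal file; the “audit” subagent it references is a custom agent with my standards baked in):

```markdown
<!-- .claude/commands/audit.md -->
Audit the files matching: $ARGUMENTS

For each file, one at a time:

1. Hand the file to the "audit" subagent.
2. Collect its findings (pattern violations, style issues, risky code).
3. Append them to a running summary - do not paste file contents back here.

When every file is done, present the summary grouped by severity.
```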

Skills for stuff I’m tired of explaining

I have a Docker skill for one project. It knows the compose setup, the environment variables, which ports go where. Without it, Claude tries to figure out Docker fresh every session. Gets it wrong half the time. Asks me questions I’ve answered dozens of times before.

With the skill, it just knows. I stopped explaining. The skill explains for me.

Skills are basically onboarding docs that Claude actually reads.
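My Docker skill is roughly this (the paths, ports, and variable names here are invented placeholders - yours will differ):

```markdown
<!-- .claude/skills/docker-setup/SKILL.md -->
---
name: docker-setup
description: Use when running, debugging, or modifying anything Docker-related in this project - compose services, environment variables, port mappings.
---

# Docker setup

- Start everything with `docker compose up -d`; the compose file lives at the repo root.
- The `api` service reads env vars from .env (DATABASE_URL, REDIS_URL).
- Ports: api on 3000, postgres on 5432, redis on 6379.
- Never edit containers directly; change the compose file and recreate.
```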

MCP setup

Mine’s focused on research:

  • context7 - pulls in docs and code context, examples
  • websearch - google that shit
  • fetch - grabs content from URLs
  • chrome tools - lets Claude use my local browser for anything tricky or visual verification

MCP servers let Claude know things it wouldn’t otherwise know. They’re external brains, not just external tools.
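Project-scoped servers go in a .mcp.json at the repo root. A sketch (package names change, so check each server’s own install docs rather than copying these):

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    },
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  }
}
```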

What works

Write a plan before you code. Everyone says this. It’s annoying advice. It’s also correct. One person I read about spent 2 hours on a 12-step implementation doc, then let Claude execute step by step. Saved 6-10 hours. The upfront planning feels slow. It’s not.

Delegate anything that would pollute context. Research, code review, testing - hand it to a subagent. Keep your main conversation focused on one thing.

Return summaries, not data dumps. When a subagent finishes, it should give you conclusions. “Auth uses JWT with refresh tokens in httpOnly cookies” beats pasting the entire auth module.

Review one file at a time. Don’t ask Claude to audit your whole codebase. It’ll either refuse, phone it in, or fill its context with so much code it can’t think. One file. Dedicated review agent. Repeat.

Clear often. Use /clear between unrelated tasks. Start a new conversation when you’re past 60% context. Don’t let your conversation become a junk drawer.

How much autonomy is too much?

The community calls full autonomous mode “YOLO mode” - running --dangerously-skip-permissions and letting Claude do whatever. Some people swear by it.

I don’t go that far. My approach:

  • I decide what needs to happen
  • Claude does the tedious parts
  • I check before moving on
  • Nothing ships without me looking at it

There’s a post called “30 Days of Claude Code” that nails the tension: “I shipped more projects in 30 days than in the previous six months, but I was becoming a manager of AI output rather than a programmer.”

That’s the deal. You ship faster. You’re also not really coding anymore - you’re reviewing. Whether that’s a win depends on what you care about.

I treat Claude like a fast junior dev with great memory but questionable judgment. It does a lot. I’m still responsible for what goes out.

What doesn’t work

Adding tools without good descriptions. You can install 50 MCP servers and 20 agents. If your descriptions suck, Claude ignores all of it. More features + bad routing = wasted effort.

Letting conversations get bloated. “Just one more thing” until Claude is confused and contradicting itself. Clear aggressively. Fresh conversations are cheap.

Auto-accepting everything. The speed is addictive. But I’ve caught bugs that would’ve shipped if I’d clicked accept without looking. Happens more than I’d like to admit.

Giant prompts. “Build me authentication with OAuth, JWT, password reset, and 2FA” will get you garbage. Break it down. One piece at a time.

Expecting Claude to remember. It doesn’t. Not between sessions. That’s what CLAUDE.md is for. If you’re repeating yourself, write it down.
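The shape I aim for with CLAUDE.md is short and operational - commands and conventions, not essays. A made-up example:

```markdown
# CLAUDE.md

## Commands
- `npm run dev` - start the dev server
- `npm test` - run the test suite (always run before committing)

## Conventions
- TypeScript strict mode; no `any`.
- Commit messages: <type>(<scope>): <summary>

## Don't
- Don't touch the generated files in src/gen/.
- Don't add dependencies without asking.
```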

Over-engineering before you start. I’ve watched people spend days perfecting their Claude setup before writing any code. Start simple. Add stuff when you hit friction. Version five will be better than version one no matter what.

The stuff that still sucks

Claude hallucinates. Makes confident claims about things that aren’t true. Sometimes says it did something when it didn’t. Better config doesn’t fix this. You have to stay alert.

Context loss happens even when you’re careful. I do all the subagent delegation, all the context hygiene. Claude still loses the thread sometimes. The compact feature can make it worse.

Quality varies day to day. Some sessions Claude feels sharp. Others it feels like it forgot how to code. Anthropic has confirmed bugs. Model updates change behavior. Something that worked last week might not work today.

And the uncomfortable question: are you learning less? If Claude does all the hard stuff, do your skills atrophy? I don’t have a good answer. It’s something I think about.

If you’re just starting

Don’t overthink it.

  1. Set up CLAUDE.md first. Document your project. Coding conventions. Common commands. Run /init to generate a starting point, then edit it. This one file makes a bigger difference than any other configuration.

  2. Add one slash command. Whatever you type most often, make it a command.

  3. Build a research subagent. Even a basic one that just does web searches in its own context window will clean up your main conversations.

  4. Use what’s built in. The Explore agent is already there. Get comfortable with it before building custom stuff.

  5. Add config when something annoys you. Not before. Friction tells you what to fix.

Where this goes

The tools will get better. Context windows will expand. Routing will improve. Half of what I’m manually configuring now will be automatic eventually.

But the core tension - automation versus oversight - isn’t going anywhere. As these tools get more capable, the question of how much to trust them gets harder, not easier.

The developers who do well with this stuff won’t be the ones who write the most code. Won’t be the ones who hand everything to AI either. It’ll be the people who figure out when to delegate and when to verify. Who treat AI as a tool, not a replacement for judgment.

Writing code isn’t the hard part anymore.


Need some help with this stuff? Hit me up on LinkedIn and let’s nerd out together.