The Project Under The Project
I have many small goals and curiosities lined up for my batch at the Recurse Center, but also two big themes of proposed work: build Chromaculture and get better at wielding LLMs. While I didn’t set out to develop intense scaffolding for an AI, I did end up learning a lot about the project under my project.
Chromaculture has a database with real data in it, a deploy pipeline, secrets, and a metered hosting plan. I couldn’t just unleash Claude on all that in the hope it wouldn’t delete something valuable or take prod in some surprise direction.
A Repo Claude Can Actually Work In
I want to be super involved in planning and design for Chromaculture and own the code that ships, but don’t want to babysit a robot (or slow it down) in between. I didn’t want to second-guess every long shell command, and I knew about approval fatigue. I didn’t want to push breaking changes (though I still did). I wanted a repo structure that was agent-friendly, but still followed coding patterns and engineering practices that I’d decided on.
How do you assign a coding agent real work without having to look over its shoulder the whole time?
Don’t treat guardrails and autonomy as a single dial, with more of one meaning less of the other. They’re different layers, so put each concern on the layer that fits. My Chromaculture project uses skills and detailed plans to enable agent autonomy. Hooks, permissions, memory, and more provide constraints.
I’ll go through how each of these developed, and you can see sample files for what I’m describing in this repo.
Skills: The Repeatable Playbooks
A skill is a saved procedure the agent can invoke, like /complete-ticket, /create-plan, or /review-pr. Working early issues end-to-end was repetitive: load the conventions, read the issue, ask me the questions that block progress, plan it, branch, implement, verify, hand off a draft PR. So I wrote it down once as /complete-ticket, and now that whole arc is one command.
Skills are more than glorified macros. They put the guidance where the action is, which is the highest-salience prose available. A skill step running right now gets read far more reliably than a memory loaded an hour ago. And they bake the human checkpoints into the flow. /create-plan interrogates me until every real decision is locked in a plan document or GitHub issue before any code gets written. A settled plan is exactly what lets a later session run a whole phase on its own, since the hard calls are already made. /complete-ticket knows to pause at the genuine gates, like running a migration, and not at the routine stuff.
The agent is only waiting on me for the handful of decisions that are mine to make.
Mechanism over Memo
The obvious way to make a rule stick (writing it down) barely works. Even the most thorough rulebook in the world still has to compete for attention with everything else in the prompt, and can lose. Compliance is a probability, not a promise, weakest exactly when you’d want it most: a fresh session, a long context, a rule buried thousands of tokens back.
So I started structuring my constraints tooling according to this hierarchy:
A deny hook. The agent physically cannot do the thing. Holds every time, every session.
An ask gate. A human decides at the moment of action.
The allow-list. Make the right path the frictionless one, so it’s the path of least resistance.
Prose, like CLAUDE.md and AGENTS.md. In context every session, moderate pull.
Memory. Loaded, but in the background. The lowest tier, and the one I would watch get ignored.
For anything that has to hold, push it toward 1 and 2. Use 3 to make the correct path the easy one. Recognize 4 and 5 aren’t promised. When I catch the agent sailing past something I wrote in a memo, rather than more memo or annoyed correction, I consider whether the thing should move to a different tier.
Everything below is that idea, applied.
CLAUDE.md and AGENTS.md: The House Rules
These two files are the prose tier, and they load first each session. The README tells human contributors how to get started; AGENTS.md and CLAUDE.md are the same instinct pointed at agents. AGENTS.md covers what’s true about the stack. We’re on Next 16, which has breaking changes from what the model thinks it knows, Prisma 7 with a driver adapter, and server components by default. The model’s training data will lead it confidently astray if I don’t state these up front.
CLAUDE.md covers how I want to work: code style, the error-handling tiers, the git workflow, the pre-commit checklist. Self-documenting code over comments. Design choices. Open PRs as drafts and never merge.
This is tier 4. It persuades, but doesn’t enforce. So the test for what belongs here is whether it matters if the agent occasionally forgets. Naming conventions, sure. “Never wipe the production database,” no.
Hooks: The Hard Stops
This is the enforcement tier, where the real guardrails live. The allow-list, the deny rules, and the hook wiring all live in .claude/settings.json. Hooks themselves are small scripts that run before a tool call and can let it through, block it, or send it back for a human yes or no. They don’t persuade. They intercept.
I leaned on deny-as-reroute. A blocked command is a signpost, not a wall. A recursive grep -r that could cough up a dotfile secret gets denied, with a message pointing at rg, which skips hidden files and is auto-approved. A git -C <some-path> gets rerouted to bare git, which matches the allow-list. The deny doesn’t block the goal. It blocks the method and names a better one.
For example, the agent can commit, push, and open draft PRs. It cannot merge, and it cannot push to main without me lifting a one-time escape hatch. How branches come together stays my call: git merge and rebase get rerouted to the PR and git pull to a fetch plus a fresh branch.
A couple other hooks I have:
The database can’t be wiped. prisma migrate reset, db push --accept-data-loss, and friends are denied across every way you might invoke them. There’s no safe equivalent, so there’s no reroute. Just no.
Secrets are unreadable. Every path to a .env file is denied (Read tool, printenv, recursive grep, cat) so reading one is blocked outright rather than left to a prompt I might wave through. The variable names live in a checked-in TypeScript file. If the agent needs a value, it asks me.
Local work is safe from blunt instruments. Denies include: rm, git reset --hard, git clean, branch deletion, history rewrites. The agent can stage, commit, and branch, but can’t destroy uncommitted work or rewrite what’s already there.
Lessons learned along the way:
Initially, I’d written the push-to-main gate as an ask, to stop and check with me before any push to main. It did nothing. A hook’s ask is silently overridden by an explicit allow-list entry, and git push was already on the allow-list, so the prompt never fired. The precedence is deny, then allow, then ask, which means for a rule that has to hold on an already-allowed tool, only deny holds. I only knew because I asked Claude to try the push.
Claude can’t see what I was asked, only the result of the deny-ask-allow. So it might believe, and confidently claim, something didn’t request approval when it did. You gotta watch it actually behave or misbehave.
Pushing to main requires a sentinel file in /tmp: if it’s there, the gate opens until I remove it; if it’s not, the deny stands. The first version let the agent create that file, which meant the hard block was actually “please don’t elevate yourself.” The fix was to deny every way the agent could create it (Write, Edit, Bash), so the hatch only opens from outside the session, by me.
Separate hook scripts for each guard meant I was paying for six process spawns on every Bash command the agent ran. So I folded them into one dispatcher that loads each guard and composes their verdicts: first deny wins, and an ask never gets demoted to an allow. That made the precedence explicit instead of a byproduct of hook order.
The pieces don’t keep working together on their own. You write a guard, you add a second, and you have to go back and make them one coherent thing, each still in its own file with its own test.
MCPs: The Scoped Capabilities
The styling conventions in CLAUDE.md say to verify UI work in the browser before calling it done. For a while that was a lie of omission, because the agent couldn’t see the browser, so “verified” meant “looked plausible in the code.”
An MCP is plug-and-play tooling for the agent. Now I have both the Chrome and Playwright MCPs available, so the agent can drive a real browser: navigate, click, screenshot, read the console. The mechanics of it live in a runbook so the agent has step-by-step guidance: compare the screen against the design mocks, watch the console for errors and warnings, confirm role-based views, and check the work is responsive.
Browser verification is a capability that needs scope. The browser starts signed out. When the agent needs to test auth-gated UI, it signs in as the lowest role that proves the point. Signing in as an admin is itself gated behind a hook, because admin elevation is the kind of thing I want to be asked about. Least privilege isn’t only for the app’s users. It’s for the agent too.
Memories: The Best-Effort Notes
Memory is the bottom of the hierarchy. It’s a folder of small files, one fact each, that persist across sessions: who I am, corrections I’ve given, project state that isn’t derivable from the code. Things like “Dev QA users are already seeded, try logging in before re-seeding,” or “preview app sign-in isn’t available to Claude, hand that part to me.”
Memory is good at killing repeated friction. The agent stops re-asking the question I answered last week. But it’s the most ignorable text in the project, so nothing I’m counting on goes there. Resist the urge to add a memory that says “never do XYZ.”
Docs: The Durable Stuff
Not everything is a rule or a fact. Some things are explanations: runbooks, configuration write-ups, post-incident notes, the reasoning behind a decision I’ll otherwise re-litigate in three months. Those go in my docs directory.
I keep multi-phase implementation plans here too, so a fresh session can pick up one phase of work without me re-explaining the whole arc.
It’s also where the agent’s own behavior gets written down, including the enforcement hierarchy this whole post is about, so it’s something concrete I can point at and revise.
asked.md: The Feedback Loop
I also keep a running log of every command that prompted me for approval during a session in asked.md. Each row records what ran, what it was for, why it wasn’t auto-safe, and which rule caught it. When I get a handful, I run a plan-work-review-refine session to evaluate the ask and then improve the guardrails. That session is itself a skill, /harden-llm-permissions. Skills don’t only ship features; one of them exists to turn this log into tighter guardrails.
A good approval request is evidence my constraints are working: a judgment call that can’t be mechanized, failing safe into a human checkpoint. A dumb request is friction I can design away, like a read-only command that should have been auto-approved, or a safe path I hadn’t allow-listed yet. The good ones I leave. The dumb ones become a new allow-list entry or a smarter hook.
Going through each approval request with Claude can shed light on a class of behavior you can reroute. For example, I was constantly being prompted for Bash approval, even for simple reads the agent should do. Appending | head or ; echo $? to an allow-listed read forced a prompt. The trailing pipe shifts the command off the literal allow-list match, so tidying up the output made a safe command less safe. The fix wasn’t code, it was a habit: run reads bare.
Similarly, the agent kept validating JSON with node -e, and it prompted every single time. Rightly so, because “run arbitrary Node” is the one thing you can’t write a safe pattern for. Swapping to jq empty, which does one job, made it allow-listable and the prompts stopped. A wide tool you have to gate, a narrow tool you can bless. Reach for the narrow one.
There’s a limit: you can’t hook an omission. A hook fires on a wrong command. It can’t fire on a command that never ran, like the test that didn’t get written or the browser check that got skipped. There’s no string to intercept. For those I’m back to making the right path frictionless and letting a downstream gate catch the miss.
This adds up. Every session the agent’s autonomy goes up a notch as the dumb prompts disappear, and the guardrails stay just as tight, because the good prompts survive. The setup gets more permissive and no less safe at once, since I’m only ever removing friction that wasn’t protecting anything. Because I keep tuning what waits on me, the questions I do get are higher-impact ones.
On We Go
Sanitized examples of what I’ve described above are in this repo, so you can see how things fit together. It is a snapshot of what I had as of this writing, and isn’t alive like my real Chromaculture tooling.
We have to keep iterating and improving. If I’ve learned anything during my agentic adventures, it is that everything AI-related is immediately obsolete. Like the new car driving off the lot, I expect the techniques in this post and the work in that repo have already deteriorated in value.
My next step is to build on this with agent orchestration and loops.
The color app is coming along, too!