Recursive Copilot Instructions: A Practical Way to Make AI Consistent Across a Team
I stopped treating .github/copilot-instructions.md like a rules dump—and started treating it like an architecture.
I hit a wall recently on a large-scale enterprise modernization effort. Not the “I need a nap” wall. The grinding kind: I knew exactly what needed to happen—sequencing, refactors, boundary enforcement, repeatable patterns—but I couldn’t get the tooling to behave consistently enough to help at scale.
AI assistants were “available,” sure. But without the right context and constraints, they kept producing the same category of output: generic, inconsistent, and occasionally subtly wrong in ways that cost me more time to clean up than they saved. It felt like having a junior engineer with infinite confidence and zero situational awareness.
That night, I vented to my husband. He’s a domain architect—non-coding most days, but relentlessly good at systems thinking. I told him, verbatim, that I didn’t have the right tools to do my job. He didn’t argue. He went searching. Later, he sent me an MIT write-up on recursive LLM processing with a note like: “This feels like what you’re missing.”
Timing matters. Having the right person beside you matters, too.
The problem wasn’t the AI. It was how I was instructing it.
What “Recursive” Actually Changed for Me
MIT’s “recursive language models” idea (as summarized in VentureBeat and expanded in the paper) reframes long-context reasoning as a systems problem. Don’t force a model to hold everything in one giant prompt. Give it a way to inspect, summarize, and retrieve the right slices of context as it works.
If you’ve ever fed an assistant a giant blob of documentation and watched it degrade into vagueness, you already understand the failure mode: the model tries to pay attention to everything at once. The recursive approach builds a hierarchy—global summary → more specific summaries → even more specific shards—and uses that hierarchy to pull what’s relevant at the moment of need.
The paper gets into massive token scales, but what mattered to me day-to-day was the pattern: hierarchy, retrieval, and scope control.
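To make the hierarchy-and-retrieval pattern concrete, here is a minimal sketch in Python. Keyword-overlap scoring stands in for a real model's relevance judgment, and the shard names and summaries are invented for illustration; the point is the shape — global summary on top, scoped slices underneath, and only the relevant slice pulled in at the moment of need.

```python
# Minimal sketch of the hierarchy-then-retrieve pattern.
# Keyword overlap stands in for a real model's relevance judgment;
# shard names and summaries are invented for illustration.

def score(query: str, text: str) -> int:
    """Count how many query words appear in the text (toy relevance metric)."""
    words = set(query.lower().split())
    return sum(1 for w in words if w in text.lower())

# Layered context: one global summary, then more specific shards.
context = {
    "global": "Monolith being split into bounded contexts; persistence behind repositories.",
    "shards": {
        "billing": "Invoicing rules, payment gateway adapters, ledger tables.",
        "identity": "User accounts, authentication flows, token issuance.",
        "catalog": "Product data, search indexing, pricing snapshots.",
    },
}

def retrieve(query: str) -> str:
    """Return the global summary plus the single most relevant shard,
    instead of handing the model the entire blob at once."""
    best = max(context["shards"], key=lambda name: score(query, context["shards"][name]))
    return context["global"] + "\n" + context["shards"][best]

print(retrieve("where do payment gateway calls live?"))
```

A real system would replace `score` with model-driven summarization and retrieval, but the scope-control idea is the same: the model reasons over a named slice, not everything at once.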
Consistency isn’t a model problem—it’s a context architecture problem.
I read it and immediately thought: I already have a mechanism like this in every repo—and I'm using it like a sticky note.
The Translation: From Research Pattern to Copilot Instructions
Most teams treat GitHub Copilot instructions as a flat list of rules: “Use X. Don’t do Y. Follow Z.” That’s a decent starting point, but it behaves like a single prompt stuffed with policies. Over time, the file accumulates exceptions, edge cases, and “also, don’t forget…” until it becomes a grab bag. The assistant can’t tell what’s universal versus situational, so it guesses. And when it guesses, you pay for it in review time.
So I borrowed the recursive idea and mapped it into a four-layer structure inside .github/copilot-instructions.md:
Layer 0 is global invariants: always-true constraints such as security posture, language/runtime constraints, and baseline engineering principles.
Layer 1 is architectural principles: dependency direction, module boundaries, cross-cutting concerns, and “what belongs where.”
Layer 2 is a domain context map: bounded contexts, ownership, hard “must not” boundaries, and how contexts communicate.
Layer 3 is pattern anchors: named patterns that can be invoked (“when you see X, apply Y”), each pointing to canonical implementations.
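The four layers above can be sketched as a skeleton of the file itself. The headings and example rules here are illustrative assumptions, not prescribed syntax — use whatever structure your team already reads naturally:

```markdown
# Copilot Instructions

## Layer 0: Global Invariants
- Never log secrets, tokens, or PII.
- All new code targets <your runtime/language version>.

## Layer 1: Architectural Principles
- Dependencies point inward: domain code never imports infrastructure.

## Layer 2: Domain Context Map
- billing owns invoices and payments; it must not read identity's tables directly.
- Contexts communicate via published events, never shared databases.

## Layer 3: Pattern Anchors
- PATTERN: REPOSITORY_PATTERN (trigger terms, intent, do/don't, canonical path)
```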
The key shift is subtle but powerful: lower layers don’t repeat higher layers—they refine them. If Layer 0 says “don’t leak infrastructure concerns,” Layer 2 doesn’t restate it. Layer 2 says where the boundary is and what the leak looks like in this codebase.
That’s the DRY part. If a Layer 0 rule shows up copy-pasted in Layer 2, the instruction system is already rotting.
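That rot is checkable. Here is a toy Python sketch that flags any rule declared verbatim under more than one layer; it assumes, purely for illustration, that layers are `## Layer N` headings and rules are `- ` bullets, as in a markdown-style instructions file:

```python
# Toy DRY check for a layered instructions file: flag any rule line that
# appears verbatim under more than one layer. Assumes (illustratively)
# that layers are "## Layer N" headings and rules are "- " bullets.

import re
from collections import defaultdict

SAMPLE = """
## Layer 0: Global Invariants
- Do not leak infrastructure concerns into the domain.

## Layer 2: Domain Context Map
- Do not leak infrastructure concerns into the domain.
- billing must not call identity's database directly.
"""

def duplicated_rules(text: str) -> dict[str, list[str]]:
    layers = defaultdict(list)
    current = None
    for line in text.splitlines():
        m = re.match(r"##\s*(Layer \d+)", line)
        if m:
            current = m.group(1)
        elif current and line.startswith("- "):
            layers[line.strip()].append(current)
    # A rule declared in more than one layer is a DRY violation.
    return {rule: where for rule, where in layers.items() if len(where) > 1}

print(duplicated_rules(SAMPLE))
```

Run against the sample, it reports the leak rule as declared in both Layer 0 and Layer 2 — exactly the copy-paste smell described above. Wiring a check like this into CI keeps the instruction file honest the same way a linter keeps code honest.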
Treating instructions as a layered system stopped Copilot from improvising my architecture.
Below is a generic Layer 3 “pattern anchor” entry—concrete enough to reuse, abstract enough to stay safe.
PATTERN: REPOSITORY_PATTERN
TRIGGER: "data access", "repository", "query", "persistence", "db call"
INTENT: Keep persistence concerns isolated and testable; no business logic in repositories.
DO:
- Define an interface I<Aggregate>Repository in the domain or application boundary.
- Implement it in infrastructure/persistence.
- Use constructor injection; keep methods narrowly scoped and async where applicable.
- Return domain-friendly shapes (avoid leaking ORM/driver types).
DON'T:
- Add logging noise in repositories; bubble meaningful errors and handle higher.
- Let repositories call external services.
ANCHOR:
- "Canonical example lives in /src/<your-solution>/... (point to the repo’s real path)"

Notice what’s missing: agency/client/system names, architecture diagrams, project details. The structure does the work. The anchor points to evidence inside the repository.
The SOLID Meta-Layer Nobody Talks About
This is where it clicked for me: the instructions file has to eat its own cooking.
We expect engineers to write code that’s maintainable, composable, and extendable. But many teams treat AI configuration like a junk drawer: all rules at the same level, repeated everywhere, updated by whoever is most annoyed that week.
A recursive structure forces an engineering discipline:
Single Responsibility shows up naturally. Layer 0 defines invariants. Layer 2 defines domain boundaries. Layer 3 defines reusable patterns. Each layer has one job.
DRY becomes enforceable. Constraints are declared once, then specialized—never duplicated.
Open/Closed becomes practical. You extend by adding a new domain section or a new pattern anchor, not by rewriting the global rules every time a new edge case appears.
What surprised me was the downstream effect: applying SOLID to instructions reduced architectural arguments during code reviews. Not because humans stopped disagreeing, but because the assistant stopped injecting chaos. “Where does this belong?” becomes less of a debate when suggestions consistently respect boundaries and patterns before a human even sees the diff.
This also changes onboarding. New engineers don’t just inherit tribal knowledge. They get a living “how we build software here” contract—applied at suggestion-time.
If you want fewer debates in review, stop letting the assistant freestyle your boundaries.
The “Cheap Model” Reality Check
Now for the operational constraint: tokens, request budgets, and model access are not infinite. Whatever your team uses—Copilot, Claude, ChatGPT, an internal agent—you’ll eventually hit some version of the same management question: how do you scale assistance across a team without scaling cost or friction?
I’ve seen teams answer that with either (a) “we’ll just use the best model everywhere,” or (b) “we’ll ban premium models and hope for the best.” In practice, both approaches fail—one on budget, the other on usefulness.
My answer has been: stop asking the model to guess your architecture.
With recursive instructions, a smaller or cheaper model doesn’t need to infer as much. It’s handed a hierarchy of context that’s scoped, named, and consistent. For well-defined tasks—routine migration steps, repetitive refactors, standardized patterns—the quality gap narrows because the assistant is no longer inventing structure. You’re telling it which structure already exists and where to find proof.
If your tooling claims certain models are “free” within a plan or that some requests count differently, treat that as a moving target and verify in the official docs before you make a policy decision. The point isn’t a specific plan detail. The point is that instruction quality is a cost-control lever: better constraints reduce wasted cycles regardless of which model you’re running.
Good instructions make cheaper models behave like responsible teammates instead of confident interns.
One File, Team-Level Consistency
Here’s the payoff that matters most to me as a Principal: consistency isn’t about one person being brilliant. It’s about a system making the right behavior cheap and repeatable.
A single .github/copilot-instructions.md, applied across the repo, shows up everywhere: in supported IDE sessions, in PRs, in “quick refactor” moments, in those 5-minute chats that silently shape code quality.
Done right, it becomes a compounding effect. New team members get an executable playbook. Senior engineers stop re-answering the same boundary questions. Reviews focus on intent and correctness, not preventable structural drift. Architectural constraints get enforced earlier—before code exists—because the assistant stops proposing changes that violate the rules you’ve already agreed on.
I’m not claiming it eliminates bad suggestions. It doesn’t. But it dramatically reduces the same categories of bad suggestions that waste senior time: misplaced responsibilities, boundary leakage, duplicated patterns, and “helpful” code that violates conventions.
The real win is moving governance earlier—before the diff is even born.
This is also why the security constraint matters. A layered instruction system can be concrete while staying anonymous: it teaches behavior by pointing to repo-local examples and naming patterns, not by narrating the engagement.
Closing: A Practical Prompt to Get Started
You don’t need a research lab to try this. You need one afternoon and the discipline to ground instructions in your repo, not your opinions.
Start small. Write Layer 0 as non-negotiables. Draft Layer 1 as boundary rules. Sketch Layer 2 as a rough domain map (rough is fine). Then add three to five pattern anchors you keep seeing in PRs.
Iterate weekly like you would any engineering system: adjust wording, prune duplication, strengthen anchors, and delete rules that can’t be justified by evidence in the codebase.
Here’s a copy/paste prompt you can use with a coding agent (Copilot Chat, Claude Code, or whatever you prefer). Keep it generic and repo-grounded:
Prompt (copy/paste):
“Scan this repository’s structure and documentation. Propose a .github/copilot-instructions.md that follows a recursive 4-layer hierarchy: Layer 0 global invariants, Layer 1 architectural principles, Layer 2 domain context map, Layer 3 named pattern anchors. Do not invent architecture; only state rules you can justify from evidence in the repo (folder structure, existing patterns, README/docs, code conventions). Keep it DRY: no rule should be repeated across layers—lower layers refine. For Layer 3, define 5 pattern anchors and point each to a real canonical example path in the repo.”
Close the loop the way my husband helped me close mine: the right tools were already on the table. We just needed to learn how to instruct them.
If you want consistent AI output, stop writing rules and start designing an instruction system.
