
From Docs-as-Code to Docs-in-Code

Docs-as-code put documentation in version control. Docs-in-code puts it where the AI works, so the docs become persistent context for both the human and the model.

Jun 26, 2026 | 8 min read | Ai Llm Automation Devops Lessons-Learned

If you've worked in a modern engineering org, you already know docs-as-code. You write documentation in markdown, you keep it in version control, it renders nicely in GitHub or GitLab, and it gets reviewed in the same pull request as the change it describes. The win is that docs live next to the code and move through the same gates. No separate publishing step, no wiki that drifts out of date the moment someone merges.

I bought into that years ago (even though my org didn't). What I didn't expect is how much further the idea would go once I started doing most of my building alongside an LLM. Docs-as-code was about where the docs live in your process. What I've landed on now is something I think of as docs-in-code: the docs live where the AI works, and they become persistent context for both me and the model.

The friction that pushed me there

The thing that finally moved me wasn't a clever idea. It was annoyance.

At work I'd been using the Atlassian MCP to let an AI assistant read and write Confluence pages directly. On paper that's the dream: the model can pull the page it needs and update it when something changes. In practice it was slooowwwww. Every read was a round-trip, every write was a round-trip, and when you're in the middle of a coding session, that latency breaks your flow and the model's. You feel each call. The knowledge was technically reachable, but reaching for it cost enough that I stopped reaching.

So I flipped it. Now the source of truth lives in the codebase as markdown. The AI reads and writes those files at local-disk speed, because they're just files. Then, periodically, I use the Atlassian MCP to push the relevant markdown into Confluence so the rest of the org has it where they expect it. Both tasks are simple and easily handled by a cheaper model (SWE-1.6 for Cognition/Windsurf, or Haiku for Anthropic).

That ordering matters. Confluence stopped being the live working copy and became a downstream publishing target. The repo is the single source of truth (a value I'll defend in almost any system), and the wiki is a render of it, not the other way around. Single source of truth has always been the rule I reach for first, and this is just that rule applied to documentation.
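To make the "downstream publishing target" idea concrete, here is a minimal sketch of how the periodic push could decide what is worth publishing. It only covers the selection step: it hashes each markdown file and compares it with a record of what was last published. The function name `pending_pages` and the shape of the `published` record are my own assumptions for illustration, and the actual upload (through the Atlassian MCP or anything else) is deliberately left out.

```python
import hashlib
from pathlib import Path


def pending_pages(docs_root: Path, published: dict) -> dict:
    """Return {relative path: sha256} for markdown that changed since the last publish.

    `published` is a hypothetical record of {relative path: sha256} saved
    after the previous push. The repo stays the source of truth; this only
    answers "which files does the wiki not have yet?"
    """
    pending = {}
    for md in sorted(docs_root.rglob("*.md")):
        rel = md.relative_to(docs_root).as_posix()
        digest = hashlib.sha256(md.read_bytes()).hexdigest()
        if published.get(rel) != digest:
            pending[rel] = digest
    return pending
```

After a successful push, merging the returned mapping into the `published` record means the next run skips anything unchanged, which keeps the slow round-trips limited to pages that actually moved.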

Docs as persistent context

Here's the part that surprised me: the biggest beneficiary of docs-in-code isn't the human reader. It's the model.

When documentation lives in the repo in a predictable place, it stops being something the AI has to go find. It becomes context the model can load before it acts. Working on the database layer? It reads the SQL docs first. Picking up a half-finished feature? The project notes tell it where things stand. And when it learns something worth keeping, it writes that back into the same files. The knowledge survives the session instead of evaporating when the context window resets.

That last point is the whole game. An LLM session is amnesiac by default. Anything it figured out last week is gone unless it got written down somewhere durable. Docs-in-code is how you give it a memory you can also read. Every doc the model maintains is context for the next session, and it's the same context a human teammate would read. You're not maintaining two artifacts. You're maintaining one, and both readers benefit.

There's a quieter benefit too: nobody has to know where the docs are. Not me, not the model. They're in the repo, in a structure that's the same across every project. No searching, no "check the wiki, no wait, check the other wiki."
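Here is a rough sketch of what "load the docs before acting" can look like when the layout is predictable. The mapping from source folders to docs buckets (`AREA_DOCS`), the rule that the project notes are always read, and both function names are assumptions I made up for the example, not how any particular tool does it.

```python
from pathlib import Path

# Hypothetical mapping from a top-level source folder to the docs bucket
# that should be read before touching it.
AREA_DOCS = {
    "db": "sql",
    "migrations": "sql",
    "deploy": "maintainer",
}

# Assumption: the living project notes are always worth loading.
ALWAYS_READ = ("project",)


def docs_for(changed_path: str) -> list:
    """Return the docs/ buckets to load for a file, in reading order."""
    parts = Path(changed_path).parts
    top = parts[0] if parts else ""
    buckets = list(ALWAYS_READ)
    bucket = AREA_DOCS.get(top)
    if bucket and bucket not in buckets:
        buckets.append(bucket)
    return buckets


def load_context(repo: Path, changed_path: str) -> str:
    """Concatenate the markdown from the relevant buckets into one context string."""
    chunks = []
    for bucket in docs_for(changed_path):
        for md in sorted((repo / "docs" / bucket).glob("*.md")):
            header = md.relative_to(repo).as_posix()
            chunks.append(f"## {header}\n{md.read_text(encoding='utf-8')}")
    return "\n\n".join(chunks)
```

The point of the sketch is that nothing here searches. Because the structure is the same everywhere, picking the right context is a dictionary lookup and a directory listing.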

Structuring for both readers

That predictability only works if the structure actually exists, which is why I baked it into my tooling. The setup skill in my bmad-lite-skills library now scaffolds a docs/ tree on day one, with separate buckets for different kinds of knowledge:

maintainer: how the project is run - build, deploy, release, the operational stuff a future maintainer (or model) needs.
setup: environment and platform guidance, including the best-practice reference docs the dev and review skills read before they touch code.
sql: schema, migrations, query conventions. The data layer is exactly where an AI most needs ground truth and most easily hallucinates without it.
project: the living state - decisions made, what's in flight, why something is the way it is.

The specific folders matter less than the fact that they're there from the start and consistent across projects. A predictable layout is something the model can navigate without being told each time. When the structure is the same everywhere, "read the relevant docs before you act" becomes a habit you can actually rely on instead of a hope.
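A minimal sketch of that day-one scaffolding, assuming the four buckets listed above and one index.md per bucket. The one-line summaries and the `scaffold_docs` name are placeholders I chose for the example, and this is not the actual setup skill.

```python
from pathlib import Path

# Bucket names follow the list above. The summaries are placeholder text.
BUCKETS = {
    "maintainer": "How the project is built, deployed, and released.",
    "setup": "Environment and platform guidance.",
    "sql": "Schema, migrations, and query conventions.",
    "project": "Decisions, work in flight, and rationale.",
}


def scaffold_docs(repo_root: Path) -> list:
    """Create docs/<bucket>/index.md for each bucket and return the files created.

    Existing files are never overwritten, so running this against a repo
    that already has docs is safe.
    """
    created = []
    for name, summary in BUCKETS.items():
        index = repo_root / "docs" / name / "index.md"
        index.parent.mkdir(parents=True, exist_ok=True)
        if not index.exists():
            index.write_text(f"# {name}\n\n{summary}\n", encoding="utf-8")
            created.append(index)
    return created
```

Running it twice creates nothing the second time, which matters more than it sounds: scaffolding that is safe to re-run is scaffolding you can apply to every project without thinking about it.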

The discipline that makes it work

None of this pays off if writing the docs is a cleanup task you do at the end, because you won't. It has to be part of the loop.

The way I keep it honest is to treat doc updates as part of the work, not after it. When the model implements something that changes how the system behaves, updating the relevant doc is part of that same change, the same way you'd update a test. The docs are in the repo, so they go through the same review I'd give any other diff. And because reading them is cheap and writing them is cheap, the cost of keeping them current is low enough that it actually happens.

I've wired that into bmad-lite too, so it isn't just a habit I have to remember. The first pass at this was too timid. Each skill updated a doc or two on its own, which meant the human-facing guides (how to stand the project up, how to operate it, what the database looks like) quietly rotted while the planning docs stayed current. So I pulled all the write-back logic into a single skill, docs-sync, that the other skills call. The caller supplies the trigger and the context, docs-sync owns the how. Single source of truth, applied to the documentation machinery itself, so the mechanics can't drift between callers.

The core operation watches the diff. After a change lands, it matches the changed files against a set of infrastructure signals (a new dependency manifest, an env or config change, a database migration, a new runnable script, a Dockerfile or CI workflow, a new long-running service) and routes each one to the right human guide: setup instructions, the maintainer runbook, or the SQL docs. If the topic is substantial enough, it writes a new focused page instead of cramming it into an existing one, and wires that page into the area's index.md so the index stays the hub a human actually reads from. Two design choices make it safe to leave on. It's gated and idempotent, so when a change touches none of those signals (most do not), it costs zero tokens and does nothing. And it's required to ground every edit in the actual diff. Where an operational step like a rollback is inferred rather than evidenced in the code, it still gets written, but tagged "inferred, verify" so nobody follows a hallucinated runbook at 2am.
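The gate and the routing are simple enough to sketch in plain code, which is also why they can run before any model is involved. The signal table below is a guess at what such patterns might look like (the real set is not spelled out here), and `route` and `render_step` are hypothetical names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical signal table: (glob matched against the path or file name, guide).
SIGNALS = [
    ("requirements*.txt", "setup"),
    ("package.json", "setup"),
    (".env*", "setup"),
    ("Dockerfile*", "maintainer"),
    (".github/workflows/*", "maintainer"),
    ("migrations/*", "sql"),
    ("*.sql", "sql"),
]


def route(changed_files):
    """Map changed files to the guide each one should update.

    An empty result means no signal fired, so the caller can skip the
    documentation pass entirely without spending any tokens.
    """
    routed = {}
    for path in changed_files:
        name = PurePosixPath(path).name
        for pattern, guide in SIGNALS:
            if fnmatchcase(path, pattern) or fnmatchcase(name, pattern):
                routed.setdefault(guide, []).append(path)
                break  # first matching signal wins
    return routed


def render_step(text, evidenced):
    """Mark any step that is not backed by the diff so a reader knows to check it."""
    return text if evidenced else f"{text} (inferred, verify)"
```

For example, `route(["src/app.py"])` returns an empty dict and nothing happens, while a change that touches `Dockerfile` and `migrations/001_init.sql` yields one entry for the maintainer runbook and one for the SQL docs.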

There's a cost angle here that matters more than it looks. This kind of writing is mechanical: read the diff, match a signal, update the right page in plain prose. It is not reasoning-heavy work, so it has no business running on an expensive reasoning model. That's why docs-sync runs as its own agent, pinned to Haiku, no matter what the agent doing the actual development is running on. The dev work can be on Sonnet or Opus where the reasoning earns its cost, and the documentation pass hands off to the cheap model that's perfectly good at grounded, diff-driven writing. Isolating it in a separate agent does double duty: the doc-writing never clutters the working agent's context, and it never gets billed at the working agent's rate. Right-sizing the model to the task is its own form of architecture.

The other directions are handled too. The code review step appends discoveries to the epic's context file as it goes, and at the epic boundary docs-sync promotes the durable ones into the architecture doc, so the next epic plans against what the code actually became instead of what the plan said it would be. And there's a deliberate line it won't cross: the externally-sourced coding guidance (the Swift and web best-practice references that a separate refresh skill owns) is never auto-written. If implementation reveals that guidance has gone stale, docs-sync only raises a flag and recommends a re-sync. Three audiences (the human operators, the planning model, and the external canon) kept deliberately separate, because the wrong write into the wrong one is how docs lose trust.
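The "line it won't cross" is easy to express as a write guard. This is a sketch under assumptions: the protected prefix is a made-up location for the externally sourced references, the in-memory `store` dict stands in for the file system, and `guarded_write` is a hypothetical name.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical location of the externally sourced best-practice references.
PROTECTED_PREFIXES = ("docs/setup/best-practices/",)


@dataclass
class WriteResult:
    written: bool
    flag: Optional[str] = None


def guarded_write(path: str, new_text: str, store: dict) -> WriteResult:
    """Write a doc unless it belongs to the external canon.

    Protected files are never modified. The caller gets a flag recommending
    a re-sync instead, so a human or the refresh step decides what happens.
    """
    normalized = PurePosixPath(path).as_posix()
    if normalized.startswith(PROTECTED_PREFIXES):
        return WriteResult(False, f"{normalized} may be stale; recommend a re-sync")
    store[normalized] = new_text
    return WriteResult(True)
```

Keeping the check in one function means every caller inherits the same boundary, rather than each one having to remember which files are off limits.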

That's the difference from the old Confluence workflow in one sentence: when keeping the knowledge current is expensive, it rots. When it's cheap, it stays alive.

Closing thought

Docs-as-code taught me to put documentation under the same discipline as code. Docs-in-code is what happens when your most frequent reader is an AI that needs that documentation loaded as context to do good work. The wiki isn't dead. It's just downstream now. The source of truth lives where the work happens, in the repo, where both the human and the model can read it without going looking.

Comments welcome!