Building software with an AI agent team: a Core-Adapter approach
When you start using AI coding agents seriously — not as autocomplete, but as actors that take a task and produce work — the question shifts from "how good is the model" to "how do you organize the work so the model can be useful." We have been working on a small framework for this, and the structure that has stayed stable through several rewrites is a Core-Adapter split. This note describes it.
The Core is small and rarely changes. It is the part that defines what a project is, what its blueprint looks like, what counts as "task" or "memory" or "skill," and how the agent reads and writes those artifacts. The Core is not a feature surface — it is a vocabulary and a set of conventions. If the Core is right, an agent that knows the Core can navigate any project that uses it.
Adapters are everything else, and Adapters are allowed to change. They are the integrations with the editor, the file viewer, the team chat, the deployment system. They are the visualizations and the dashboards. They are the optional protocols that connect the Core to whatever environment the team is actually working in. There can be many Adapters; there is exactly one Core.
Figure 01. Core and adapter: two repos, one team. The Core repo holds language- and framework-agnostic logic. Each Adapter clone overlays the platform-specific glue on top of the same branches.
Core repository (agnostic): agent specs, skill library, blueprint schema, memory layers.
Adapter clones: Obsidian adapter (vault paths, note IO); TeamUI adapter (desktop runtime, IPC); CLI adapter (shell glue, pipes).
Principle: A new platform means a new Adapter, not a rewrite of Core. Replace the editor, swap the chat tool, change deploy target; the Core does not move.
Why this split matters in practice: an Adapter's lifetime is the lifetime of the tool it adapts to. Editors come and go. File browsers get redesigned. Team chat tools get acquired. None of that should require the Core to change. The Core has to outlive the Adapters, and the only way to make that affordable is to keep the Core small and the surface between Core and Adapter clean.
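One way to picture the surface between Core and Adapter is as a narrow interface that Core code depends on and each Adapter implements. The sketch below is illustrative only: `Task`, `Adapter`, `CliAdapter`, `NoteAdapter`, and `announce` are hypothetical names chosen for this example, not the framework's actual API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Task:
    """A Core artifact. Hypothetical minimal shape, not the real schema."""
    task_id: str
    title: str
    status: str = "open"


class Adapter(Protocol):
    """The narrow surface the Core expects from any platform integration."""

    def publish(self, task: Task) -> str: ...


class CliAdapter:
    """Renders a task as one tab-separated line suitable for a shell pipe."""

    def publish(self, task: Task) -> str:
        return f"{task.task_id}\t{task.status}\t{task.title}"


class NoteAdapter:
    """Renders a task as a checklist item for a note-taking tool."""

    def publish(self, task: Task) -> str:
        box = "x" if task.status == "done" else " "
        return f"- [{box}] {task.title} ({task.task_id})"


def announce(task: Task, adapters: list[Adapter]) -> list[str]:
    """Core-side code: knows the Task vocabulary, never a specific platform."""
    return [adapter.publish(task) for adapter in adapters]
```

Under these assumptions, supporting a new platform means writing one more class with a `publish` method; `Task` and `announce` stay untouched.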
There is a second axis we use to discipline this work: a three-layer view of the framework itself. The first layer is meta-design — the design of the framework. The second layer is the framework as it lives on a runtime — the actual files and conventions a project carries when it is using the framework. The third layer is a real project — a specific application built with the framework, with its own blueprints and code. Mixing these layers in conversation is the single most reliable way to confuse a discussion. We label the layer of any given decision before we make it.
A small thing that turned out to matter a lot: artifacts have to be project-local files, not state inside an agent or a service. Blueprints, tasks, memory, skill definitions — they all live in the project repository. They are version-controlled, diffable, reviewable, and outlive any individual agent session. The agent's job is to read those files, do work, and write back changes that are themselves files. If the framework needs an external service to remember things, you have built a fragile system.
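As a rough illustration of the read-work-write-back loop, a task can be stored as a plain JSON file inside the project tree. The `tasks/` directory, the file naming, and the field names here are assumptions made for the sketch; the note does not specify an on-disk layout.

```python
import json
from pathlib import Path


def write_task(project_root: Path, task_id: str, fields: dict) -> Path:
    """Persist a task as a diff-friendly JSON file inside the project tree."""
    task_dir = project_root / "tasks"  # hypothetical directory name
    task_dir.mkdir(parents=True, exist_ok=True)
    path = task_dir / f"{task_id}.json"
    # Sorted keys and a trailing newline keep version-control diffs small.
    text = json.dumps(fields, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


def read_task(project_root: Path, task_id: str) -> dict:
    """Load a task from disk; a fresh agent session needs nothing else."""
    path = project_root / "tasks" / f"{task_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))


def update_status(project_root: Path, task_id: str, status: str) -> dict:
    """Read, modify, write back: the change exists only as a file change."""
    task = read_task(project_root, task_id)
    task["status"] = status
    write_task(project_root, task_id, task)
    return task
```

Because nothing is held in memory between calls, a different agent session (or a human with a text editor) can pick up exactly where the last one stopped.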
Figure 02. TEAM OS v7: nine roles, three planes. Nine specialized AI roles cooperate across a standard plan-build-review process. Each role has its own skills, its own memory layer, and a defined handoff contract.
PLAN (shape the work): PM (project manager); Product (product owner); Design (visual + interaction).
BUILD (do the work): Frontend (UI build); Backend (API + data); DevOps (ship + observe).
REVIEW (verify the work): QA (verify behavior); Reviewer (code + plan review); Quality lead (final acceptance).
Contracts: Each role reads from a layered memory (L1 task · L2 project · L3 framework). Output of one role is the input contract of the next.
A pattern that emerged: have one project-management role and several specialized roles, but keep them all reading from the same blueprints and writing to the same task tracker. We use one orchestrator and a small set of fixed and on-demand roles (designer, frontend, backend, reviewer, security, debugger, etc.). They are not separate models; they are the same underlying model running with different prompts and different tool permissions. What gives them coherence is the shared structure, not any individual prompt.
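The "same model, different prompt and permissions" idea can be sketched as plain configuration. Everything named here (`Role`, `ROLES`, `build_session`, `can_use`, the tool names, and the `MODEL` placeholder) is hypothetical and only meant to show the shape of such a setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """A role is configuration, not a separate model."""
    name: str
    system_prompt: str
    allowed_tools: frozenset[str]


MODEL = "same-underlying-model"  # placeholder, not a real model identifier

ROLES = {
    "backend": Role(
        "backend",
        "You implement API and data changes described in the blueprint.",
        frozenset({"read_file", "write_file", "run_tests"}),
    ),
    "reviewer": Role(
        "reviewer",
        "You review code and plans against the blueprint.",
        frozenset({"read_file"}),  # read-only: a reviewer cannot edit files
    ),
}


def build_session(role_name: str, task_id: str) -> dict:
    """Assemble a session: only the prompt and tool list vary by role."""
    role = ROLES[role_name]
    return {
        "model": MODEL,
        "system": role.system_prompt,
        "tools": sorted(role.allowed_tools),
        "task": task_id,
    }


def can_use(role_name: str, tool: str) -> bool:
    """Permission check applied before any tool call is executed."""
    return tool in ROLES[role_name].allowed_tools
```

Both sessions point at the same task identifier, which mirrors the point above: roles differ in what they may do, not in what they read.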
On skills: we treat each skill as a small contract. It triggers under specific conditions, it has narrowly scoped instructions, it can call other skills, and it produces auditable output. A skill is much closer to a function than to a personality. When we add a new capability to the framework, we add a skill — not a new agent, not a new branch in some giant prompt. This makes the framework easier to maintain and easier to reason about.
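A skill-as-contract might look roughly like the following: a trigger predicate, a narrowly scoped function, and an audit record for every run. `Skill`, `SkillRunner`, and the two example skills are invented for illustration and do not describe the framework's real skill format.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Skill:
    name: str
    trigger: Callable[[dict], bool]  # when the skill applies
    run: Callable[[dict], str]       # narrowly scoped work, returns its output


@dataclass
class SkillRunner:
    skills: list[Skill]
    audit_log: list[dict] = field(default_factory=list)

    def dispatch(self, context: dict) -> list[str]:
        """Run every skill whose trigger matches, recording each run."""
        outputs = []
        for skill in self.skills:
            if skill.trigger(context):
                result = skill.run(context)
                # The audit entry is what makes the output reviewable later.
                self.audit_log.append(
                    {"skill": skill.name, "input": dict(context), "output": result}
                )
                outputs.append(result)
        return outputs


# Two hypothetical skills, each testable on its own.
summarize_diff = Skill(
    name="summarize_diff",
    trigger=lambda ctx: "diff" in ctx,
    run=lambda ctx: f"{len(ctx['diff'].splitlines())} changed lines",
)
flag_todo = Skill(
    name="flag_todo",
    trigger=lambda ctx: "TODO" in ctx.get("diff", ""),
    run=lambda ctx: "TODO found in diff",
)
```

Adding a capability in this sketch means appending one more `Skill` to the list; nothing else in the runner changes.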
On verification: agent work is only useful if you can check it. We invest as much in the loop that catches mistakes as we do in the loop that makes progress. Every step the agent takes either produces an artifact a human or another agent can review, or it does not happen. We have learned not to trust silent state changes — they have a way of accumulating into problems that take days to undo.
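The "produces an artifact or does not happen" rule can be approximated by letting a step propose a new state on a copy and committing it only when a reviewable artifact comes with it. This is a minimal sketch under that assumption; `Ledger` and the two example steps are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Proposal = Callable[[dict], tuple[dict, Optional[str]]]


@dataclass
class Ledger:
    state: dict = field(default_factory=dict)
    artifacts: list[tuple[str, str]] = field(default_factory=list)

    def apply(self, step_name: str, propose: Proposal) -> bool:
        """Commit a step only if it returns a non-empty artifact."""
        # The step works on a shallow copy, so a rejected proposal
        # leaves the committed top-level state untouched.
        new_state, artifact = propose(dict(self.state))
        if not artifact:
            return False  # silent change: discarded, nothing recorded
        self.state = new_state
        self.artifacts.append((step_name, artifact))
        return True


def documented_bump(state: dict) -> tuple[dict, Optional[str]]:
    state["version"] = 2
    return state, "set version to 2"


def silent_bump(state: dict) -> tuple[dict, Optional[str]]:
    state["version"] = 3
    return state, None  # no artifact, so this change will be rejected
```

Applying `documented_bump` and then `silent_bump` to a fresh `Ledger` leaves `version` at 2, with exactly one artifact on record.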
Figure 03. D0 / D1 / D2: three layers of documentation. Different jobs read at different layers. D0 is principle (rarely written, often read). D1 is design (written when something is decided). D2 is current state (rewritten constantly).
D0, Principle (changes rarely): Why we do things this way at all. E.g. mission, values, architecture philosophy, naming rules. Readers: every agent on a new task.
D1, Design (changes per decision): How a specific thing is intended to work. E.g. feature specs, API contracts, schema decisions, blueprints. Readers: whoever builds or audits.
D2, Current state (rewritten constantly): What is actually true right now. E.g. TODO list, open bugs, agent inboxes, runtime logs. Readers: whoever picks up the next task.
How agents read across layers: D0 (once per task) → D1 (the relevant module) → D2 (latest snapshot, before acting)
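The reading order in Figure 03 could be implemented along these lines: D0 and D1 are read once when a task starts, while D2 is re-read immediately before each action so the agent never acts on a stale snapshot. The directory layout and file names (`d0/principles.md`, `d1/<module>.md`, `d2/state.md`) and the `ContextReader` class are assumptions made for this sketch.

```python
from pathlib import Path


class ContextReader:
    """Reads D0 and D1 once per task, then re-reads D2 before every action."""

    def __init__(self, docs_root: Path, module: str) -> None:
        # File names are hypothetical; only the D0/D1/D2 split is from the text.
        self.d0 = (docs_root / "d0" / "principles.md").read_text(encoding="utf-8")
        self.d1 = (docs_root / "d1" / f"{module}.md").read_text(encoding="utf-8")
        self._d2_path = docs_root / "d2" / "state.md"

    def before_acting(self) -> list[tuple[str, str]]:
        """Return context in reading order, with a fresh D2 snapshot."""
        d2 = self._d2_path.read_text(encoding="utf-8")
        return [("D0", self.d0), ("D1", self.d1), ("D2", d2)]
```

Calling `before_acting()` twice returns the same D0 and D1 text both times, but picks up any change written to the D2 file in between.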
What we get from this structure, concretely: an agent can join a project mid-stream and become useful within a small number of turns, because the Core defines where everything is. A team can replace an Adapter — switch editors, switch chat tools, change deployment platforms — without rewriting the Core or the project artifacts. A skill can be tested in isolation. A blueprint can be reviewed by a human in plain text. None of these properties show up if the framework is one big prompt.
What we are still figuring out: how to handle disagreement between agents in a way that does not turn every decision into a committee, how to keep memory honest as a project ages, when to let an agent self-organize and when to keep it on rails. None of these have clean answers yet. The framework is a working tool, not a finished product, and our notes are written in that spirit.