Building software with an AI agent team: a Core-Adapter approach

March 26, 2026

When you start using AI coding agents seriously — not as autocomplete, but as actors that take a task and produce work — the question shifts from "how good is the model" to "how do you organize the work so the model can be useful." We have been working on a small framework for this, and the structure that has stayed stable through several rewrites is a Core-Adapter split. This note describes it.

The Core is small and rarely changes. It defines what a project is, what its blueprint looks like, what counts as a "task," a "memory," or a "skill," and how the agent reads and writes those artifacts. The Core is not a feature surface; it is a vocabulary and a set of conventions. If the Core is right, an agent that knows the Core can navigate any project that uses it.

Adapters are everything else, and Adapters are allowed to change. They are the integrations with the editor, the file viewer, the team chat, the deployment system. They are the visualizations and the dashboards. They are the optional protocols that connect the Core to whatever environment the team is actually working in. There can be many Adapters; there is exactly one Core.

Figure 01. Core and adapter: two repos, one team

The Core repo holds language- and framework-agnostic logic. Each Adapter clone overlays the platform-specific glue on top of the same branches.

Core repository (agnostic): Agent specs, Skill library, Blueprint schema, Memory layers.

Adapter clones (overlay on the Core):
Obsidian adapter: vault paths, note IO
TeamUI adapter: desktop runtime, IPC
CLI adapter: shell glue, pipes

Principle: A new platform means a new Adapter, not a rewrite of Core. Replace the editor, swap the chat tool, change the deploy target, and the Core does not move.

Why this split matters in practice: an Adapter's lifetime is the lifetime of the tool it adapts to. Editors come and go. File browsers get redesigned. Team chat tools get acquired. None of that should require the Core to change. The Core has to outlive the Adapters, and the only way to make that affordable is to keep the Core small and the surface between Core and Adapter clean.
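To make the boundary concrete, here is a minimal Python sketch of one Core and several Adapters. It is an illustration under assumptions, not the framework's actual code: the `Adapter` protocol, its `publish` method, and the `Core.broadcast` helper are hypothetical names chosen for this example. Only the artifact vocabulary and the adapter names come from the text above.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Adapter(Protocol):
    """Hypothetical surface between the Core and one environment."""

    name: str

    def publish(self, artifact_path: str, text: str) -> str:
        """Present a Core artifact in the adapted tool and describe what was done."""
        ...


@dataclass
class Core:
    """Holds the vocabulary only; it knows nothing about any specific tool."""

    artifact_kinds: tuple[str, ...] = ("blueprint", "task", "memory", "skill")
    adapters: list[Adapter] = field(default_factory=list)

    def register(self, adapter: Adapter) -> None:
        self.adapters.append(adapter)

    def broadcast(self, kind: str, artifact_path: str, text: str) -> list[str]:
        # The Core validates against its own vocabulary, then delegates.
        if kind not in self.artifact_kinds:
            raise ValueError(f"unknown artifact kind: {kind}")
        return [a.publish(artifact_path, text) for a in self.adapters]


@dataclass
class CliAdapter:
    """Illustrative adapter: shell glue would go here."""

    name: str = "cli"

    def publish(self, artifact_path: str, text: str) -> str:
        return f"[{self.name}] {artifact_path}: {len(text)} chars"


@dataclass
class ObsidianAdapter:
    """Illustrative adapter: maps artifact paths into a vault root."""

    vault_root: str
    name: str = "obsidian"

    def publish(self, artifact_path: str, text: str) -> str:
        return f"[{self.name}] {self.vault_root}/{artifact_path}"
```

In this sketch, supporting a new tool means writing one more class with a `publish` method. `Core` itself is never edited, which is the property the split is meant to protect.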

There is a second axis we use to discipline this work: a three-layer view of the framework itself. The first layer is meta-design — the design of the framework. The second layer is the framework as it lives on a runtime — the actual files and conventions a project carries when it is using the framework. The third layer is a real project — a specific application built with the framework, with its own blueprints and code. Mixing these layers in conversation is the single most reliable way to confuse a discussion. We label the layer of any given decision before we make it.

A small thing that turned out to matter a lot: artifacts have to be project-local files, not state inside an agent or a service. Blueprints, tasks, memory, skill definitions — they all live in the project repository. They are version-controlled, diffable, reviewable, and outlive any individual agent session. The agent's job is to read those files, do work, and write back changes that are themselves files. If the framework needs an external service to remember things, you have built a fragile system.
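A small sketch of what "artifacts are files" can look like in practice. The `tasks/` directory, the JSON format, and the function names are assumptions made for illustration; the post does not specify a file layout or serialization.

```python
import json
from pathlib import Path


def write_task(project_root: Path, task_id: str, task: dict) -> Path:
    """Persist a task as a plain file inside the project repository."""
    task_dir = project_root / "tasks"  # hypothetical layout
    task_dir.mkdir(parents=True, exist_ok=True)
    path = task_dir / f"{task_id}.json"
    # Sorted keys and a trailing newline keep version-control diffs small.
    body = json.dumps(task, indent=2, sort_keys=True) + "\n"
    path.write_text(body, encoding="utf-8")
    return path


def read_task(project_root: Path, task_id: str) -> dict:
    """Load a task from the repository; no service or session state involved."""
    path = project_root / "tasks" / f"{task_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))


def update_status(project_root: Path, task_id: str, status: str) -> Path:
    """Read, change, and write back: the change itself is a file edit."""
    task = read_task(project_root, task_id)
    task["status"] = status
    return write_task(project_root, task_id, task)
```

Because every change lands as a file edit, an ordinary diff shows exactly what an agent session did, and a later session can pick up from the files alone.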

Figure 02. TEAM OS v7: nine roles, three planes

Nine specialized AI roles cooperate across a standard plan-build-review process. Each role has its own skills, its own memory layer, and a defined handoff contract.

PLAN (shape the work): project manager; Product (product owner); Design (visual + interaction)
BUILD (do the work): Frontend (UI build); Backend (API + data); DevOps (ship + observe)
REVIEW (verify the work): verify behavior; Reviewer (code + plan review); Quality lead (final acceptance)

Contracts: Each role reads from a layered memory (L1 task · L2 project · L3 framework). Output of one role is the input contract of the next.

A pattern that emerged: have one project-management role and several specialized roles, but keep them all reading from the same blueprints and writing to the same task tracker. We use one orchestrator and a small set of fixed and on-demand roles (designer, frontend, backend, reviewer, security, debugger, etc.). They are not separate models. They are the same underlying model running with different prompts and different tool permissions. The structure is what gives them coherence, not the prompt.
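The "same model, different prompt and permissions" idea can be sketched as plain configuration. Everything named here is hypothetical: the `Role` fields, the tool names, the prompts, and the `MODEL_ID` placeholder are illustrative and not taken from the framework.

```python
from dataclasses import dataclass

MODEL_ID = "shared-model"  # placeholder: every role runs on the same model


@dataclass(frozen=True)
class Role:
    name: str
    system_prompt: str
    allowed_tools: frozenset[str]

    def can_use(self, tool: str) -> bool:
        return tool in self.allowed_tools


# Illustrative roles; the prompts and tool names are assumptions.
ROLES = {
    "backend": Role(
        name="backend",
        system_prompt="You implement API and data changes.",
        allowed_tools=frozenset({"read_file", "write_file", "run_tests"}),
    ),
    "reviewer": Role(
        name="reviewer",
        system_prompt="You review code and plans.",
        allowed_tools=frozenset({"read_file"}),
    ),
}


def build_request(role_name: str, task_text: str) -> dict:
    """Assemble a model request; only the prompt and tools vary by role."""
    role = ROLES[role_name]
    return {
        "model": MODEL_ID,
        "system": role.system_prompt,
        "tools": sorted(role.allowed_tools),
        "input": task_text,
    }
```

In this sketch a reviewer cannot write files no matter what its prompt says, because the permission lives in the structure rather than in the wording.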

On skills: we treat each skill as a small contract. It triggers under specific conditions, it has narrowly scoped instructions, it can call other skills, and it produces auditable output. A skill is much closer to a function than to a personality. When we add a new capability to the framework, we add a skill — not a new agent, not a new branch in some giant prompt. This makes the framework easier to maintain and easier to reason about.
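Read as code, a skill along these lines might look like the sketch below: a trigger condition, a narrowly scoped body, the ability to call other skills, and an audit record for every invocation. The `Skill` and `SkillRegistry` classes and the two example skills are hypothetical and exist only to illustrate the contract.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Skill:
    name: str
    trigger: Callable[[dict], bool]             # when the skill applies
    run: Callable[[dict, "SkillRegistry"], str]  # narrowly scoped work


@dataclass
class SkillRegistry:
    skills: dict[str, Skill] = field(default_factory=dict)
    audit_log: list[dict] = field(default_factory=list)

    def add(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def invoke(self, name: str, task: dict) -> Optional[str]:
        skill = self.skills[name]
        if not skill.trigger(task):
            # Declined invocations are recorded too, so nothing is silent.
            self.audit_log.append({"skill": name, "ran": False})
            return None
        output = skill.run(task, self)
        self.audit_log.append({"skill": name, "ran": True, "output": output})
        return output


def _format_summary(task: dict, registry: SkillRegistry) -> str:
    return f"summary: {task['title']}"


def _write_changelog(task: dict, registry: SkillRegistry) -> str:
    # A skill may call another skill through the registry.
    summary = registry.invoke("format_summary", task)
    return f"changelog entry <- {summary}"


def build_registry() -> SkillRegistry:
    registry = SkillRegistry()
    registry.add(Skill("format_summary", lambda t: "title" in t, _format_summary))
    registry.add(
        Skill("write_changelog", lambda t: t.get("status") == "done", _write_changelog)
    )
    return registry
```

Adding a capability in this sketch is one more `registry.add(...)` call. No existing skill or prompt has to change, which mirrors the "add a skill, not an agent" rule.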

On verification: agent work is only useful if you can check it. We invest as much in the loop that catches mistakes as we do in the loop that makes progress. Every step the agent takes either produces an artifact a human or another agent can review, or it does not happen. We have learned not to trust silent state changes — they have a way of accumulating into problems that take days to undo.
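One way to express "produces a reviewable artifact or does not happen" is a gate around each step. This is a sketch under assumptions: the `review/` directory, the `run_step` function, and the `StepRejected` exception are hypothetical names, and the step's action is assumed to return a proposed change as text rather than apply it directly.

```python
from pathlib import Path
from typing import Callable


class StepRejected(Exception):
    """Raised when a step's output fails its check; nothing is recorded."""


def run_step(
    project_root: Path,
    step_name: str,
    action: Callable[[], str],
    check: Callable[[str], bool],
) -> Path:
    """Run one step and record it as a file, or reject it outright."""
    result = action()
    if not check(result):
        raise StepRejected(f"{step_name}: output failed verification")
    review_dir = project_root / "review"  # hypothetical layout
    review_dir.mkdir(parents=True, exist_ok=True)
    artifact = review_dir / f"{step_name}.md"
    # The artifact is the only trace of the step, so it can be diffed and reviewed.
    artifact.write_text(f"# {step_name}\n\n{result}\n", encoding="utf-8")
    return artifact
```

A rejected step leaves no file behind, so the contents of the review directory always match the set of steps that passed their checks.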

Figure 03. D0 / D1 / D2: three layers of documentation

Different jobs read at different layers. D0 is principle (rarely written, often read). D1 is design (written when something is decided). D2 is current state (rewritten constantly).

D0, Principle (changes rarely): Why we do things this way at all. E.g. mission, values, architecture philosophy, naming rules. Readers: every agent on a new task.

D1, Design (changes per decision): How a specific thing is intended to work. E.g. feature specs, API contracts, schema decisions, blueprints. Readers: whoever builds or audits.

D2, Current state (rewritten constantly): What is actually true right now. E.g. TODO list, open bugs, agent inboxes, runtime logs. Readers: whoever picks up the next task.

How agents read across layers: D0 (once per task) → D1 (the relevant module) → D2 (latest snapshot, before acting)
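The reading order in the figure can be sketched as a small context loader. The directory names and file names below (`D0/principles.md`, `D1/<module>.md`, `D2/state.md`) are assumptions for illustration; only the D0, then D1, then D2 order comes from the figure.

```python
from pathlib import Path


def assemble_context(docs_root: Path, module: str) -> list[tuple[str, str]]:
    """Read principle, then the module's design, then current state."""
    ordered = [
        ("D0", docs_root / "D0" / "principles.md"),  # hypothetical paths
        ("D1", docs_root / "D1" / f"{module}.md"),
        ("D2", docs_root / "D2" / "state.md"),
    ]
    context: list[tuple[str, str]] = []
    for layer, path in ordered:
        if not path.exists():
            # Fail loudly rather than let an agent act on partial context.
            raise FileNotFoundError(f"missing {layer} document: {path}")
        context.append((layer, path.read_text(encoding="utf-8")))
    return context
```

D2 is read last in this sketch so that the snapshot of current state is as fresh as possible when the agent starts acting.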

What we get from this structure, concretely: an agent can join a project mid-stream and become useful within a small number of turns, because the Core defines where everything is. A team can replace an Adapter — switch editors, switch chat tools, change deployment platforms — without rewriting the Core or the project artifacts. A skill can be tested in isolation. A blueprint can be reviewed by a human in plain text. None of these properties show up if the framework is one big prompt.

What we are still figuring out: how to handle disagreement between agents in a way that does not turn every decision into a committee, how to keep memory honest as a project ages, when to let an agent self-organize and when to keep it on rails. None of these have clean answers yet. The framework is a working tool, not a finished product, and our notes are written in that spirit.
