Your CLAUDE.md Is an Architecture Document

What I learned turning a FastAPI project’s agent instructions into a reusable playbook

If you’ve been pairing with Claude Code, Cursor, or any other coding agent on a real backend, you’ve probably noticed the agent is only as good as the context you give it. Point it at a sprawling codebase with no guidance and it will happily invent a third way of doing dependency injection, scatter print() calls through your services, and open a fresh database session for every repository it touches.

The fix isn’t a better model. It’s a better CLAUDE.md (or .cursorrules, or whatever your tool calls its project instructions file).

I recently went through a FastAPI service’s agent-instructions file and realized it had quietly become the de facto architecture document on the project. Not because I set out to write architecture docs — but because the act of explaining the codebase to an agent forced me to make the implicit rules explicit. Below are the patterns worth stealing, generalized so you can drop them into your own project regardless of domain.

Why write architecture rules for the agent at all?

Three reasons:

Agents have no taste, only instructions. A human engineer absorbs your conventions by osmosis over weeks. An agent starts cold every session. If “business logic never touches the database directly” lives only in a senior engineer’s head, the agent can violate it on its first commit.
Writing it down catches your own inconsistencies and latent assumptions. The moment you try to write “services receive config via constructor injection,” you’ll find the three places where they don’t. The doc becomes a refactoring checklist.
It documents the why, which humans need too. The best entries aren’t “do X.” They’re “do X, because the obvious alternative Y silently breaks Z.” That’s an architecture decision record wearing a different hat.

Let’s look at the patterns.

Pattern 1: Pin down the layering, then state what each layer cannot do

The most valuable layout rules are phrased as prohibitions:

app/routers/        - route handlers only, no business logic
app/services/       - business logic, no direct DB access
app/repositories/   - all DB queries via the ORM
tests/              - mirrors app/ structure

Notice that each line has two halves: where things go, and what they’re forbidden from doing. “Route handlers only, no business logic” is far more actionable to an agent than “this is the routers folder.” The negative space is where agents (and junior devs) drift.

A CLAUDE.md that only says “we use a layered architecture” gives the agent nothing. One that says “repositories are the only place DB queries live” gives it a rule it can actually enforce — and that you can point to in a code review.

Pattern 2: Quarantine your settings library behind plain types

Here’s a pattern I now reach for on every FastAPI project. The problem: pydantic-settings (or dynaconf, or os.environ soup) tends to metastasize. Once one service imports Settings directly, every service does, and now your entire codebase is coupled to how configuration happens to be loaded.

All that effort you put into modularising a codebase is for nought when the seemingly trivial task of loading settings couples the modules back together.

The discipline: the settings library is an infrastructure concern and must not leak. You split config into two files:

app/config/
├── schema.py    # frozen dataclasses — the config interface the app depends on
└── settings.py  # pydantic-settings loaders — env vars, .env files, factory

schema.py holds plain frozen dataclasses with zero knowledge of where values come from:

from dataclasses import dataclass

@dataclass(frozen=True)
class DatabaseConfig:
    url: str
    pool_size: int = 5

@dataclass(frozen=True)
class AppConfig:
    debug: bool
    secret_key: str
    database: DatabaseConfig
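
A short sketch of what frozen=True buys you, using the DatabaseConfig shape from above. The URLs are placeholder values chosen for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class DatabaseConfig:
    url: str
    pool_size: int = 5


config = DatabaseConfig(url="postgresql://localhost/app")

try:
    config.pool_size = 50  # attempted mutation of shared config
    mutated = True
except FrozenInstanceError:
    mutated = False  # frozen dataclasses reject attribute assignment

# Variants are built explicitly; the original instance is left untouched.
test_config = replace(config, url="sqlite:///:memory:", pool_size=1)
```

Because nothing can mutate the config after construction, a service that receives it can treat it as a constant for its whole lifetime.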

settings.py is the only file allowed to import pydantic-settings. It reads an APP_ENV variable (development / test / production), instantiates the right settings subclass, maps it onto the frozen dataclasses, and caches the result:

from functools import lru_cache
import os
from pydantic_settings import BaseSettings, SettingsConfigDict
from app.config.schema import AppConfig, DatabaseConfig

class _Settings(BaseSettings):
    # pydantic-settings v2 style; the nested `class Config` form is deprecated
    model_config = SettingsConfigDict(env_file=".env.development")

    debug: bool = False
    secret_key: str
    database_url: str
    database_pool_size: int = 5

class _TestSettings(_Settings):
    model_config = SettingsConfigDict(env_file=".env.test")

class _ProductionSettings(_Settings):
    # rely solely on real environment variables
    model_config = SettingsConfigDict(env_file=None)

def _build_database_config(s: _Settings) -> DatabaseConfig:
    return DatabaseConfig(url=s.database_url, pool_size=s.database_pool_size)

def _load() -> AppConfig:
    env = os.getenv("APP_ENV", "development").lower()
    match env:
        case "production":
            s = _ProductionSettings()
        case "test":
            s = _TestSettings()
        case _:
            s = _Settings()
    return AppConfig(
        debug=s.debug,
        secret_key=s.secret_key,
        database=_build_database_config(s),
    )

@lru_cache
def get_config() -> AppConfig:
    return _load()

def get_db_config() -> DatabaseConfig:
    return get_config().database

The payoff shows up everywhere downstream:

Services depend on a boring dataclass, not a settings framework. They become trivially testable. This is worth the price of admission alone.
Per-group accessors like get_db_config() are perfect FastAPI Depends targets and dependency_overrides keys in tests (more on that below).
One cache, in one place. The @lru_cache belongs only on get_config() — don’t sprinkle it on each _build_* helper.

Two rules worth writing into your own doc verbatim:

Never import from app.config.settings outside of app/config/ and router/lifespan wiring. Never hardcode environment-specific values anywhere outside app/config/.
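
A rule like this is easier to keep when a test enforces it. The following is a minimal sketch of such a check built on the standard-library ast module; the function name imports_module is hypothetical, and relative imports are deliberately not resolved here.

```python
import ast


def imports_module(source: str, banned: str) -> bool:
    """Return True if `source` imports `banned` or anything beneath it."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import app.config.settings [as x]
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # from app.config import settings / from app.config.settings import x
            module = node.module or ""
            names = [module] + [f"{module}.{alias.name}" for alias in node.names]
        else:
            continue
        if any(n == banned or n.startswith(banned + ".") for n in names):
            return True
    return False
```

A test could walk app/, skip app/config/ and the wiring modules, and fail on any file for which imports_module(text, "app.config.settings") is true.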

The .env discipline that prevents production incidents

File	Purpose	In git?
`.env.example`	Template with all keys, no real values	✅ Yes
`.env.development`	Local dev overrides	❌ No
`.env.test`	Test runner overrides	❌ No
`.env.production`	Must not exist — use real env vars	❌ No

Production must not rely on .env files. Secrets get injected by the platform (Docker, Kubernetes, CI/CD) as real environment variables, or mounted via a secrets directory. Telling the agent this explicitly stops it from “helpfully” creating a .env.production when it scaffolds something.
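
If you want the rule to hold even when someone ignores the doc, a startup guard is one option. This is a sketch under assumptions: check_env_files is a hypothetical helper, and where you call it (for example at the top of your config factory) is up to you.

```python
from pathlib import Path


def check_env_files(project_root: Path, app_env: str) -> None:
    """Refuse to start in production if a .env.production file is present."""
    if app_env.lower() != "production":
        return
    stray = project_root / ".env.production"
    if stray.exists():
        raise RuntimeError(
            f"{stray} must not exist; inject real environment variables instead"
        )
```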

Pattern 3: The dependency-injection trap that silently breaks transactions

This is the single highest-value entry in the whole document, and it’s the kind of thing you only learn after it bites you in production.

FastAPI’s Depends deduplicates dependencies within a request — but each distinct dependency callable gets resolved independently. So if a service needs two repositories and you wire it like this:

# WRONG — opens two separate DB sessions per request
def get_sensor_service(
    repo: SensorRepository = Depends(get_sensor_repository),
    other_repo: OtherRepository = Depends(get_other_repository),
) -> SensorService:
    return SensorService(repo, other_repo)

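…this wiring is only safe if both get_*_repository functions resolve the same cached get_db dependency. If each one creates its own session instead (by calling the session factory directly, or by using Depends(get_db, use_cache=False)), your two repositories are on two different sessions, and any “transaction” spanning both is a fiction. Writes can interleave, rollbacks won’t cover both, and you’ll spend a miserable afternoon staring at data that shouldn’t exist.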

The rule: when a service needs more than one repository, inject the session once and build the repositories from it.

# CORRECT — single session, multiple repositories, real transaction
def get_sensor_service(db: AsyncSession = Depends(get_db)) -> SensorService:
    return SensorService(
        SQLAlchemySensorRepository(db),
        SQLAlchemyOtherRepository(db),
    )

Services that need only one repository can use the repository dependency directly: with one repository there is only one session, so nothing needs sharing. The point of writing this into CLAUDE.md is that the wrong version looks cleaner and more “DI-idiomatic,” so both agents and humans gravitate toward it. You have to explicitly warn against the attractive-but-broken option.
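
The failure mode is easy to reproduce without FastAPI or SQLAlchemy. The sketch below uses standard-library sqlite3 connections as stand-ins for sessions; the table names and helper functions are hypothetical, and the "failure" is simulated with an explicit rollback.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def create_schema(path: Path) -> None:
    with closing(sqlite3.connect(path)) as conn:
        conn.execute("CREATE TABLE sensors (name TEXT)")
        conn.execute("CREATE TABLE audit (event TEXT)")
        conn.commit()


def total_rows(path: Path) -> int:
    with closing(sqlite3.connect(path)) as conn:
        sensors = conn.execute("SELECT COUNT(*) FROM sensors").fetchone()[0]
        audit = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
    return sensors + audit


def shared_connection(path: Path) -> int:
    """Both writes go through one connection, so one rollback undoes both."""
    with closing(sqlite3.connect(path)) as conn:
        conn.execute("INSERT INTO sensors VALUES ('s1')")
        conn.execute("INSERT INTO audit VALUES ('created s1')")
        conn.rollback()  # simulated failure after both writes
    return total_rows(path)


def separate_connections(path: Path) -> int:
    """Each 'repository' has its own connection and its own transaction."""
    with (
        closing(sqlite3.connect(path)) as sensors_conn,
        closing(sqlite3.connect(path)) as audit_conn,
    ):
        sensors_conn.execute("INSERT INTO sensors VALUES ('s1')")
        sensors_conn.commit()  # the first session finishes on its own
        audit_conn.execute("INSERT INTO audit VALUES ('created s1')")
        audit_conn.rollback()  # simulated failure only affects this connection
    return total_rows(path)


def run_demo() -> tuple[int, int]:
    with tempfile.TemporaryDirectory() as tmp:
        shared_db, split_db = Path(tmp) / "shared.db", Path(tmp) / "split.db"
        create_schema(shared_db)
        create_schema(split_db)
        return shared_connection(shared_db), separate_connections(split_db)
```

run_demo() returns (0, 1): with a shared connection the rollback leaves no rows behind, while with separate connections the sensor row survives without its audit entry.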

Pattern 4: Singletons with an explicit `init_` / `get_` contract

Not everything is request-scoped. Background engines, message-broker publishers, long-lived runtime components — these live for the process, not the request. The pattern that keeps them sane is a module-level singleton with a paired initializer and accessor:

import logging

logger = logging.getLogger(__name__)

_publisher: Publisher | None = None

def get_publisher() -> Publisher | None:
    return _publisher

def init_publisher(config: MessagingConfig) -> Publisher | None:
    global _publisher
    try:
        _publisher = Publisher(config)
    except ValueError as e:
        # optional integration: log and carry on without a publisher
        logger.error(f"Publisher initialization failed: {e}")
        _publisher = None
    return _publisher

Two refinements worth codifying:

Guard required singletons with RuntimeError, never assert. Assertions are stripped when Python runs under -O, so an assert _publisher is not None becomes a no-op in exactly the production build where you most want the check:

def get_ble_manager() -> BLEManager:
    if _ble_manager is None:
        raise RuntimeError("BLEManager not initialized")
    return _ble_manager

Distinguish required from optional. Optional integrations (a broker that may not be configured in this environment) return None, and callers handle absence. Required ones raise. Making the agent state which is which prevents it from defensively None-checking things that should be hard failures.

This is another case of code that looks correct but can mask a hard dependency: a defensive None check turns a missing required component into a silent no-op.
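
The claim about assert is easy to verify. This sketch runs the same three-line snippet in a child interpreter with and without -O; the snippet contents are made up for the demonstration, and it assumes PYTHONOPTIMIZE is not set in your environment.

```python
import subprocess
import sys

SNIPPET = """
_publisher = None
assert _publisher is not None, "publisher missing"
print("assert did not fire")
"""


def run(flags: list[str]) -> subprocess.CompletedProcess[str]:
    """Run SNIPPET in a fresh interpreter with the given command-line flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )


normal = run([])        # assert raises AssertionError, exit code 1
optimized = run(["-O"])  # assert is compiled away, the print is reached
```

Without -O the child exits with an AssertionError; with -O it prints the message and exits cleanly, which is exactly the silent pass-through the RuntimeError guard avoids.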

Pattern 5: A lifespan that reads like an orchestration script

FastAPI’s lifespan context manager is where startup and shutdown live, and it rots fast — it tends to become a 200-line wall of inline setup. Two rules keep it readable.

First, build the lifespan from a factory that closes over config, so it stays typed and testable:

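from collections.abc import AsyncGenerator, Callable
from contextlib import AbstractAsyncContextManager, asynccontextmanager

from fastapi import FastAPI

from app.config.schema import AppConfig

def get_lifespan(config: AppConfig) -> Callable[[FastAPI], AbstractAsyncContextManager[None]]:
    @asynccontextmanager
    async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
        # startup: runs before the app starts serving requests
        ...
        yield
        # shutdown: runs after the app stops serving requests
        ...
    return lifespan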

Second, extract each conditional startup block into a named async helper so the lifespan body reads as a sequence of intentions, not implementation:

async def _start_messaging(config: MessagingConfig) -> tuple[Subscriber | None, Publisher | None]:
    if config.broker is None or config.port is None:
        return None, None
    subscriber = Subscriber(config)
    await subscriber.start()
    publisher = init_publisher(config)
    if publisher is not None:
        await publisher.start()
    return subscriber, publisher

A couple of supporting rules that pay off:

Every long-lived service must be registered via init_*/get_* — never held as a bare local variable inside the lifespan closure, or it’s invisible to the rest of the app.
A service’s start() must leave it fully operational. No “call start() then also call these three setup methods.” If callers have to remember a sequence, the sequence belongs inside start().

Pattern 6: Shutdown ordering is a real algorithm — spell it out

Startup order is forgiving. Shutdown order is not. Stop things in the wrong sequence and you get events written to an already-closed sink, or a publisher torn down while other services still need to emit their final messages.

The doc lays out an explicit, commented phase order:

Stop event-generating services first (hubs, SSE emitters).
Stop the consuming ones next.
Pause briefly with asyncio.sleep(0.1) to drain in-flight work. This is a pragmatic workaround, not a guarantee; skip it in latency-sensitive paths.
Stop independent external services in parallel via asyncio.gather.
Stop messaging infrastructure only after the services that publish through it, so they can still publish during their own shutdown. Within this phase, stop the publisher before the subscriber.
Close write-only sinks (time-series DBs, log shippers).
Dispose the database engine.

The general principle: stop event-generating services before the infrastructure they write to.

And wrap every stop call in a helper that logs and swallows, so one stubborn service can’t block the rest of the shutdown:

import asyncio
import logging
from collections.abc import Coroutine
from typing import Any

logger = logging.getLogger(__name__)

async def safe_stop(name: str, coro: Coroutine[Any, Any, None], timeout: float = 5.0) -> None:
    """Stop a service with timeout protection."""
    try:
        await asyncio.wait_for(coro, timeout=timeout)
        logger.debug(f"{name} stopped successfully")
    except asyncio.TimeoutError:
        logger.warning(f"{name} stop timed out after {timeout}s")
    except Exception as e:
        # log with traceback and carry on with the rest of the shutdown
        logger.exception(f"Error stopping {name}: {e}")
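
Here is a sketch of how the phase order might be wired through a helper like safe_stop. The service names are invented, stop() is a fake that records what finished, and one service fails while another hangs to show that neither blocks the remaining phases.

```python
import asyncio
import logging
from collections.abc import Coroutine
from typing import Any

logger = logging.getLogger(__name__)
stopped: list[str] = []  # records which fake services finished, in order


async def safe_stop(name: str, coro: Coroutine[Any, Any, None], timeout: float = 5.0) -> None:
    try:
        await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        logger.warning("%s stop timed out after %ss", name, timeout)
    except Exception:
        logger.exception("Error stopping %s", name)


async def stop(name: str, delay: float = 0.0, fail: bool = False) -> None:
    """Fake service stop: optionally slow or failing (hypothetical)."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} refused to stop")
    stopped.append(name)


async def shutdown() -> None:
    await safe_stop("event hub", stop("event hub"))            # producers first
    await safe_stop("consumer", stop("consumer", fail=True))   # consumers; this one fails
    await asyncio.sleep(0.1)                                   # brief drain
    await asyncio.gather(                                      # independent services in parallel
        safe_stop("weather api", stop("weather api")),
        safe_stop("hung client", stop("hung client", delay=10), timeout=0.05),
    )
    await safe_stop("publisher", stop("publisher"))            # messaging: publisher, then subscriber
    await safe_stop("subscriber", stop("subscriber"))
    await safe_stop("timeseries sink", stop("timeseries sink"))  # write-only sinks
    await safe_stop("database engine", stop("database engine"))  # engine last


asyncio.run(shutdown())
```

After the run, stopped lists every service except the failing consumer and the hung client, in phase order, and the database engine is still disposed last.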

Pattern 7: If your tests patch env vars, your DI is wrong

This is the entry that ties the whole architecture together, and it’s a litmus test you can apply to any FastAPI codebase today.

Because services depend only on plain frozen dataclasses, you construct config directly in unit tests — no environment variables, no cache-clearing, no monkeypatching:

def test_connects_with_correct_pool_size():
    config = DatabaseConfig(url="sqlite:///:memory:", pool_size=2)
    service = DatabaseService(config)
    ...

For full-request integration tests, you override the per-group accessor rather than get_config itself (overriding get_config would force you to build a complete AppConfig every time):

def test_handler(client):
    app.dependency_overrides[get_db_config] = lambda: DatabaseConfig(
        url="sqlite:///:memory:"
    )
    try:
        response = client.get("/")
    finally:
        # always restore, even if the request raises
        app.dependency_overrides.clear()

And the rule that makes it all hold together:

Do not patch env vars or clear the config cache in tests. If you feel the need to, the config dependency isn’t being injected correctly. Modules must also never call get_config() at import time — an import-time call populates the cache before any dependency_overrides are in place and silently wins for the rest of the process.

That last part bears repeating. The “config called at import time” bug is brutal to diagnose: your override is correctly registered, your test still sees production config, and nothing errors. Writing the rule down, for the agent and for yourself, is cheaper than debugging it.
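
The mechanism can be shown with a toy model. In the sketch below, the overrides dict and resolve() are simplified stand-ins for FastAPI's dependency_overrides and request-time resolution, not the real machinery, and the URLs are placeholders.

```python
from collections.abc import Callable
from functools import lru_cache

# Toy stand-in for app.dependency_overrides.
overrides: dict[Callable[[], str], Callable[[], str]] = {}


def resolve(dependency: Callable[[], str]) -> str:
    """Mimic request-time resolution: use the override if one is registered."""
    return overrides.get(dependency, dependency)()


@lru_cache
def get_db_url() -> str:
    return "postgresql://prod-host/app"


# The bug: this line runs when the module is imported, before any test setup.
EAGER_DB_URL = get_db_url()


def handler_using_import_time_value() -> str:
    return EAGER_DB_URL  # frozen at import; overrides can never reach it


def handler_using_injection() -> str:
    return resolve(get_db_url)  # looked up per call, so overrides apply


# A test registers its override after import, as tests always do.
overrides[get_db_url] = lambda: "sqlite:///:memory:"
```

The injected handler sees the in-memory URL while the import-time handler still returns the production one, and nothing raises to tell you why.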

How to actually use this

You don’t need to adopt every pattern above. The meta-lesson is about the shape of a good agent-instructions file:

State prohibitions, not just locations. “No business logic in routers” beats “this is the routers folder.”
Pair every non-obvious rule with its failure mode. “Inject the session once, because repositories that each open their own session break transactions.” The why is what makes an agent (and a reviewer) trust the rule instead of second-guessing it.
Call out the attractive-but-wrong option explicitly. Agents gravitate toward the clean-looking version. If the clean version is broken, say so, with both snippets side by side.
Keep infrastructure concerns quarantined behind plain types, so the rules about “never import X outside Y” are enforceable and your tests stay boring.

The funny thing is that none of this is really about AI. Every rule here would improve a codebase with no agent anywhere near it. But writing instructions for an agent gives you a forcing function you didn’t have before: a reader who follows your conventions literally, has no tribal knowledge to fall back on, and will expose every gap between what your architecture says and what it actually does.

Treat your CLAUDE.md as an architecture document. Your future self — and your agent — will thank you.

Have a pattern you’ve baked into your own agent-instructions file? I’d love to hear what’s working for you.
