Your CLAUDE.md Is an Architecture Document — Write It Like One
What I learned turning a FastAPI project’s agent instructions into a reusable playbook
If you’ve been pairing with Claude Code, Cursor, or any other coding agent on a real
backend, you’ve probably noticed the agent is only as good as the context
you give it. Point it at a sprawling codebase with no guidance and it will
happily invent a third way of doing dependency injection, scatter print() calls
through your services, and open a fresh database session for every repository it
touches.
The fix isn’t a better model. It’s a better CLAUDE.md (or .cursorrules, or whatever
your tool calls its project instructions file).
I recently went through a FastAPI service’s agent-instructions file and realized it had quietly become the de facto architecture document on the project. Not because I set out to write architecture docs — but because the act of explaining the codebase to an agent forced me to make the implicit rules explicit. Below are the patterns worth stealing, generalized so you can drop them into your own project regardless of domain.
Why write architecture rules for the agent at all?
Three reasons:
- Agents have no taste, only instructions. A human engineer absorbs your conventions by osmosis over weeks. An agent starts cold every session. If “business logic never touches the database directly” lives only in a senior engineer’s head, the agent can violate it on its first commit.
- Writing it down catches your own inconsistencies and latent assumptions. The moment you try to write “services receive config via constructor injection,” you’ll find the three places where they don’t. The doc becomes a refactoring checklist.
- It documents the why, which humans need too. The best entries aren’t “do X.” They’re “do X, because the obvious alternative Y silently breaks Z.” That’s an architecture decision record wearing a different hat.
Let’s look at the patterns.
Pattern 1: Pin down the layering, then state what each layer cannot do
The most valuable layout rules are phrased as prohibitions:

    app/routers/       - route handlers only, no business logic
    app/services/      - business logic, no direct DB access
    app/repositories/  - all DB queries via the ORM
    tests/             - mirrors app/ structure

Notice that each line has two halves: where things go, and what they’re forbidden from doing. “Route handlers only, no business logic” is far more actionable to an agent than “this is the routers folder.” The negative space is where agents (and junior devs) drift.
A CLAUDE.md that only says “we use a layered architecture” gives the agent nothing.
One that says “repositories are the only place DB queries live” gives it a rule it can
actually enforce — and that you can point to in a code review.
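A prohibition phrased this way can also be checked mechanically. The following is a minimal sketch, not something from the original project: the rule table, the function names, and the choice to detect "direct DB access" by looking for sqlalchemy imports are all assumptions made for illustration.

```python
import ast

# Assumed rule table (hypothetical): path prefix -> import roots that layer must not use.
FORBIDDEN_IMPORTS = {
    "app/services/": ("sqlalchemy",),
}


def imported_modules(source: str) -> set[str]:
    """Return every module name imported by the given source text."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


def layering_violations(path: str, source: str) -> list[str]:
    """List forbidden imports found in one file, based on its path prefix."""
    violations = []
    for prefix, banned_roots in FORBIDDEN_IMPORTS.items():
        if not path.startswith(prefix):
            continue
        for module in sorted(imported_modules(source)):
            # Match the root package itself or any of its submodules.
            if any(module == root or module.startswith(root + ".") for root in banned_roots):
                violations.append(f"{path}: imports {module}")
    return violations
```

Run over every file under app/ in a test or pre-commit hook, a check like this turns "no direct DB access in services" from a convention into a failing build.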
Pattern 2: Quarantine your settings library behind plain types
Here’s a pattern I now reach for on every FastAPI project. The problem: pydantic-settings
(or dynaconf, or os.environ soup) tends to metastasize. Once one service imports
Settings directly, every service does, and now your entire codebase is coupled to how
configuration happens to be loaded.
All the effort you put into modularising a codebase comes to nought when the seemingly trivial task of loading settings couples the modules back together.
The discipline: the settings library is an infrastructure concern and must not leak. You split config into two files:

    app/config/
    ├── schema.py    # frozen dataclasses: the config interface the app depends on
    └── settings.py  # pydantic-settings loaders: env vars, .env files, factory

schema.py holds plain frozen dataclasses with zero knowledge of where values come from:

    from dataclasses import dataclass


    @dataclass(frozen=True)
    class DatabaseConfig:
        url: str
        pool_size: int = 5


    @dataclass(frozen=True)
    class AppConfig:
        debug: bool
        secret_key: str
        database: DatabaseConfig

settings.py is the only file allowed to import pydantic-settings. It reads an APP_ENV variable (development / test / production), instantiates the right settings subclass, maps it onto the frozen dataclasses, and caches the result:

    import os
    from functools import lru_cache

    from pydantic_settings import BaseSettings, SettingsConfigDict

    from app.config.schema import AppConfig, DatabaseConfig


    class _Settings(BaseSettings):
        # model_config replaces the nested `class Config`, which Pydantic v2 deprecates
        model_config = SettingsConfigDict(env_file=".env.development")

        debug: bool = False
        secret_key: str
        database_url: str
        database_pool_size: int = 5


    class _TestSettings(_Settings):
        model_config = SettingsConfigDict(env_file=".env.test")


    class _ProductionSettings(_Settings):
        # rely solely on real environment variables
        model_config = SettingsConfigDict(env_file=None)


    def _build_database_config(s: _Settings) -> DatabaseConfig:
        return DatabaseConfig(url=s.database_url, pool_size=s.database_pool_size)


    def _load() -> AppConfig:
        env = os.getenv("APP_ENV", "development").lower()
        match env:
            case "production":
                s = _ProductionSettings()
            case "test":
                s = _TestSettings()
            case _:
                s = _Settings()
        return AppConfig(
            debug=s.debug,
            secret_key=s.secret_key,
            database=_build_database_config(s),
        )


    @lru_cache
    def get_config() -> AppConfig:
        return _load()


    def get_db_config() -> DatabaseConfig:
        return get_config().database

The payoff shows up everywhere downstream:
- Services depend on a boring dataclass, not a settings framework. They become trivially testable. This is worth the price of admission alone.
- Per-group accessors like get_db_config() are perfect FastAPI Depends targets and dependency_overrides keys in tests (more on that below).
- One cache, in one place. The @lru_cache belongs only on get_config(); don’t sprinkle it on each _build_* helper.
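To make the "boring dataclass" point concrete, here is a small sketch of what frozen config buys you. It reuses the DatabaseConfig shape from schema.py; the with_test_database helper and the example URLs are hypothetical and exist only for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class DatabaseConfig:
    url: str
    pool_size: int = 5


def with_test_database(config: DatabaseConfig) -> DatabaseConfig:
    """Hypothetical helper: derive a test variant without mutating the original."""
    return replace(config, url="sqlite:///:memory:", pool_size=1)


base = DatabaseConfig(url="postgresql://localhost/app")  # placeholder URL
test_config = with_test_database(base)

try:
    base.pool_size = 50  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

Because nothing can mutate a config object after construction, a service that receives one in a test sees the values the test passed in, with no loader, cache, or environment involved.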
Two rules worth writing into your own doc verbatim:
- Never import from app.config.settings outside of app/config/ and router/lifespan wiring.
- Never hardcode environment-specific values anywhere outside app/config/.
The .env discipline that prevents production incidents
| File | Purpose | In git? |
|---|---|---|
| .env.example | Template with all keys, no real values | ✅ Yes |
| .env.development | Local dev overrides | ❌ No |
| .env.test | Test runner overrides | ❌ No |
| .env.production | Must not exist; use real env vars | ❌ No |
Production must not rely on .env files. Secrets get injected by the platform (Docker,
Kubernetes, CI/CD) as real environment variables, or mounted via a secrets directory.
Telling the agent this explicitly stops it from “helpfully” creating a .env.production
when it scaffolds something.
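If you want the rule backed by more than prose, one option is a guard that runs at startup or in CI. This is a sketch under assumptions: the function names are hypothetical, and where you call the check from is up to you.

```python
from pathlib import Path


def find_forbidden_env_files(project_root: Path) -> list[Path]:
    """Return env files that should never exist on disk (hypothetical check)."""
    forbidden_names = (".env.production",)
    return [
        project_root / name
        for name in forbidden_names
        if (project_root / name).exists()
    ]


def ensure_no_production_env_file(project_root: Path) -> None:
    """Fail fast if someone (or some agent) created a production .env file."""
    found = find_forbidden_env_files(project_root)
    if found:
        names = ", ".join(path.name for path in found)
        raise RuntimeError(f"Forbidden env file(s) present: {names}")
```

Raising RuntimeError rather than logging a warning keeps the failure loud: a scaffolded .env.production stops the process instead of quietly shadowing the platform's real environment variables.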
Pattern 3: The dependency-injection trap that silently breaks transactions
This is the single highest-value entry in the whole document, and it’s the kind of thing you only learn after it bites you in production.
FastAPI’s Depends deduplicates dependencies within a request — but each distinct
dependency callable gets resolved independently. So if a service needs two repositories
and you wire it like this:

    # WRONG: opens two separate DB sessions per request
    def get_sensor_service(
        repo: SensorRepository = Depends(get_sensor_repository),
        other_repo: OtherRepository = Depends(get_other_repository),
    ) -> SensorService:
        return SensorService(repo, other_repo)

…each get_*_repository opens its own session. Your two repositories are now on two different sessions, and any “transaction” spanning both is a fiction. Writes can interleave, rollbacks won’t cover both, and you’ll spend a miserable afternoon staring at data that shouldn’t exist. (The trap assumes each repository provider creates its session itself; providers that all resolve the same Depends(get_db) share FastAPI’s per-request cached result.)
The rule: when a service needs more than one repository, inject the session once and build the repositories from it.

    # CORRECT: single session, multiple repositories, real transaction
    def get_sensor_service(db: AsyncSession = Depends(get_db)) -> SensorService:
        return SensorService(
            SQLAlchemySensorRepository(db),
            SQLAlchemyOtherRepository(db),
        )

Services that need only one repository can use the repository dependency directly, since one repository means one session and nothing to share. The point of writing this into CLAUDE.md is that the wrong version looks cleaner and more “DI-idiomatic,” so both agents and humans gravitate toward it. You have to explicitly warn against the attractive-but-broken option.
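The failure mode is easy to reproduce at the database level. The sketch below swaps in the standard-library sqlite3 module for SQLAlchemy sessions so it runs anywhere; the table names and helper functions are made up for the demonstration, and a sqlite3 connection stands in for a session.

```python
import os
import sqlite3
import tempfile


def create_schema(path: str) -> None:
    """Create two tables, one per hypothetical repository."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE sensors (name TEXT NOT NULL)")
    conn.execute("CREATE TABLE readings (value REAL NOT NULL)")
    conn.commit()
    conn.close()


def save_sensor_with_reading(sensor_conn, reading_conn) -> None:
    """Insert a sensor, then a reading that violates NOT NULL."""
    sensor_conn.execute("INSERT INTO sensors (name) VALUES ('s1')")
    if sensor_conn is not reading_conn:
        # Two sessions cannot share one transaction, so the first
        # write is committed on its own before the second one starts.
        sensor_conn.commit()
    try:
        reading_conn.execute("INSERT INTO readings (value) VALUES (NULL)")
        reading_conn.commit()
    except sqlite3.IntegrityError:
        reading_conn.rollback()  # only undoes work done on reading_conn


def surviving_sensor_rows(shared_session: bool) -> int:
    """Return how many sensor rows remain after the failed operation."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        create_schema(path)
        first = sqlite3.connect(path)
        second = first if shared_session else sqlite3.connect(path)
        save_sensor_with_reading(first, second)
        count = first.execute("SELECT COUNT(*) FROM sensors").fetchone()[0]
        first.close()
        if second is not first:
            second.close()
        return count
```

With a shared connection the rollback removes the sensor row too, so surviving_sensor_rows(shared_session=True) returns 0. With two connections the sensor row is already committed when the reading fails, so surviving_sensor_rows(shared_session=False) returns 1: an orphaned row, which is the "data that shouldn’t exist."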
Pattern 4: Singletons with an explicit init_* / get_* contract
Not everything is request-scoped. Background engines, message-broker publishers, long-lived runtime components — these live for the process, not the request. The pattern that keeps them sane is a module-level singleton with a paired initializer and accessor:

    import logging

    logger = logging.getLogger(__name__)

    _publisher: Publisher | None = None


    def get_publisher() -> Publisher | None:
        return _publisher


    def init_publisher() -> Publisher | None:
        global _publisher
        try:
            _publisher = Publisher()
        except ValueError as e:
            # Optional integration: log the failure and leave the singleton unset
            logger.error(f"Publisher initialization failed: {e}")
            _publisher = None
        return _publisher

Two refinements worth codifying:
Guard required singletons with RuntimeError, never assert. Assertions are
stripped when Python runs under -O, so an assert _publisher is not None becomes a no-op in
exactly the production build where you most want the check:

    def get_ble_manager() -> BLEManager:
        if _ble_manager is None:
            raise RuntimeError("BLEManager not initialized")
        return _ble_manager

Distinguish required from optional. Optional integrations (a broker that may not be configured in this environment) return None, and callers handle absence. Required ones raise. Making the agent state which is which prevents it from defensively None-checking things that should be hard failures.
Again, defensive None-checking is an example of code that looks correct but can mask a hard dependency.
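The claim about -O is easy to verify for yourself. This sketch runs two tiny guard snippets in a child interpreter with the flag set; the snippet contents and the run_optimized helper are illustrative names, not part of the original project.

```python
import subprocess
import sys

GUARD_WITH_ASSERT = """
_manager = None
assert _manager is not None, "not initialized"
print("assert guard passed")
"""

GUARD_WITH_RAISE = """
_manager = None
if _manager is None:
    raise RuntimeError("not initialized")
print("raise guard passed")
"""


def run_optimized(source: str) -> subprocess.CompletedProcess:
    """Run a snippet in a child interpreter with -O, which strips asserts."""
    return subprocess.run(
        [sys.executable, "-O", "-c", source],
        capture_output=True,
        text=True,
    )
```

Under -O the assert version exits cleanly and prints its message even though the singleton is None, while the explicit raise still fails with a RuntimeError traceback.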
Pattern 5: A lifespan that reads like an orchestration script
FastAPI’s lifespan context manager is where startup and shutdown live, and it rots
fast — it tends to become a 200-line wall of inline setup. Two rules keep it readable.
First, build the lifespan from a factory that closes over config, so it stays typed and testable:

    from contextlib import asynccontextmanager
    from typing import AsyncContextManager, AsyncGenerator, Callable

    from fastapi import FastAPI


    def get_lifespan(config: AppConfig) -> Callable[[FastAPI], AsyncContextManager[None]]:
        @asynccontextmanager
        async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
            # startup
            ...
            yield
            # shutdown
            ...

        return lifespan

Second, extract each conditional startup block into a named async helper so the lifespan body reads as a sequence of intentions, not implementation:

    async def _start_messaging(config: MessagingConfig) -> tuple[Subscriber | None, Publisher | None]:
        if config.broker is None or config.port is None:
            return None, None
        subscriber = Subscriber(config)
        await subscriber.start()
        publisher = init_publisher(config)
        if publisher is not None:
            await publisher.start()
        return subscriber, publisher

A couple of supporting rules that pay off:
- Every long-lived service must be registered via init_*/get_*, never held as a bare local variable inside the lifespan closure, or it’s invisible to the rest of the app.
- A service’s start() must leave it fully operational. No “call start() then also call these three setup methods.” If callers have to remember a sequence, the sequence belongs inside start().
Pattern 6: Shutdown ordering is a real algorithm — spell it out
Startup order is forgiving. Shutdown order is not. Stop things in the wrong sequence and you get events written to an already-closed sink, or a publisher torn down while other services still need to emit their final messages.
The doc lays out an explicit, commented phase order:
- Stop event-generating services first (hubs, SSE emitters).
- Stop the consuming ones next.
- A brief asyncio.sleep(0.1) to drain in-flight work. This is a pragmatic workaround, not a guarantee; skip it in latency-sensitive paths.
- Stop independent external services in parallel via asyncio.gather.
- Stop messaging infrastructure only after the services above, so they can still publish during their own shutdown. Stop the publisher before the subscriber.
- Close write-only sinks (time-series DBs, log shippers).
- Dispose the database engine.
The general principle: stop event-generating services before the infrastructure they write to.
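The phase order above can be sketched as a plain sequence of awaits. The Service class and every service name here are hypothetical stand-ins, and error handling and timeouts are left out so that the ordering stays visible.

```python
import asyncio


class Service:
    """Hypothetical stand-in for any component with an async stop()."""

    def __init__(self, name: str, log: list[str]) -> None:
        self.name = name
        self._log = log

    async def stop(self) -> None:
        await asyncio.sleep(0)  # stand-in for real teardown work
        self._log.append(self.name)


async def shutdown(log: list[str]) -> None:
    hub = Service("event_hub", log)
    consumer = Service("consumer", log)
    weather = Service("weather_api", log)
    geocoder = Service("geocoder", log)
    publisher = Service("publisher", log)
    subscriber = Service("subscriber", log)
    sink = Service("timeseries_sink", log)
    engine = Service("db_engine", log)

    await hub.stop()         # 1. event generators first
    await consumer.stop()    # 2. then their consumers
    await asyncio.sleep(0.1) # 3. brief drain, a workaround rather than a guarantee
    await asyncio.gather(weather.stop(), geocoder.stop())  # 4. independents in parallel
    await publisher.stop()   # 5. messaging after the services that publish
    await subscriber.stop()
    await sink.stop()        # 6. write-only sinks
    await engine.stop()      # 7. database engine last
```

Passing a list into each Service makes the order observable, which is also a cheap way to unit-test a real shutdown routine: assert on the recorded sequence rather than on side effects.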
And wrap every stop call in a helper that logs and swallows, so one stubborn service can’t block the rest of the shutdown:

    import asyncio
    import logging
    from typing import Any, Coroutine

    logger = logging.getLogger(__name__)


    async def safe_stop(name: str, coro: Coroutine[Any, Any, None], timeout: float = 5.0) -> None:
        """Stop a service with timeout protection."""
        try:
            await asyncio.wait_for(coro, timeout=timeout)
            logger.debug(f"{name} stopped successfully")
        except asyncio.TimeoutError:
            logger.warning(f"{name} stop timed out after {timeout}s")
        except Exception as e:
            # Log and swallow so one failing service cannot block the rest
            logger.exception(f"Error stopping {name}: {e}")

Pattern 7: If your tests patch env vars, your DI is wrong
This is the entry that ties the whole architecture together, and it’s a litmus test you can apply to any FastAPI codebase today.
Because services depend only on plain frozen dataclasses, you construct config directly in unit tests — no environment variables, no cache-clearing, no monkeypatching:

    def test_connects_with_correct_pool_size():
        config = DatabaseConfig(url="sqlite:///:memory:", pool_size=2)
        service = DatabaseService(config)
        ...

For full-request integration tests, you override the per-group accessor rather than get_config itself (overriding get_config would force you to build a complete AppConfig every time):

    def test_handler(client):
        app.dependency_overrides[get_db_config] = lambda: DatabaseConfig(
            url="sqlite:///:memory:"
        )
        response = client.get("/")
        app.dependency_overrides.clear()

And the rule that makes it all hold together:
Do not patch env vars or clear the config cache in tests. If you feel the need to, the config dependency isn’t being injected correctly. Modules must also never call get_config() at import time: an import-time call populates the cache and captures the real config before any dependency_overrides are in place, and code holding that value silently bypasses the override for the rest of the process.
That last part bears repeating. The “config called at import time” bug is brutal to diagnose: your override is correctly registered, your test still sees production config, and nothing errors. Writing the rule down, for the agent and for yourself, is cheaper than debugging it.
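Here is a stripped-down sketch of that bug, with FastAPI removed so only the mechanism remains. The placeholder URL, the module-level constant, and both describe_* functions are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class DatabaseConfig:
    url: str


@lru_cache
def get_db_config() -> DatabaseConfig:
    # Stand-in for the real loader; the URL is a made-up placeholder.
    return DatabaseConfig(url="postgresql://prod-db/app")


# Anti-pattern: resolved once, when the module is first imported.
_IMPORT_TIME_CONFIG = get_db_config()


def describe_connection_bad() -> str:
    # Reads the captured value, so no later override can reach it.
    return f"connecting to {_IMPORT_TIME_CONFIG.url}"


def describe_connection_good(config: DatabaseConfig) -> str:
    # The caller (or FastAPI's Depends) supplies the config per call.
    return f"connecting to {config.url}"


test_config = DatabaseConfig(url="sqlite:///:memory:")
```

Calling describe_connection_good(test_config) uses the test database, while describe_connection_bad() keeps reporting the placeholder production URL no matter what the test injects, and nothing raises to tell you so.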
How to actually use this
You don’t need to adopt every pattern above. The meta-lesson is about the shape of a good agent-instructions file:
- State prohibitions, not just locations. “No business logic in routers” beats “this is the routers folder.”
- Pair every non-obvious rule with its failure mode. “Inject the session once, because two Depends calls open two sessions and break transactions.” The why is what makes an agent (and a reviewer) trust the rule instead of second-guessing it.
- Call out the attractive-but-wrong option explicitly. Agents gravitate toward the clean-looking version. If the clean version is broken, say so, with both snippets side by side.
- Keep infrastructure concerns quarantined behind plain types, so the rules about “never import X outside Y” are enforceable and your tests stay boring.
The funny thing is that none of this is really about AI. Every rule here would improve a codebase with no agent anywhere near it. But writing instructions for an agent gives you a forcing function you didn’t have before: a reader who follows your conventions literally, has no tribal knowledge to fall back on, and will expose every gap between what your architecture says and what it actually does.
Treat your CLAUDE.md as an architecture document. Your future self — and your agent —
will thank you.
Have a pattern you’ve baked into your own agent-instructions file? I’d love to hear what’s working for you.