Intro

When Scaffolding Breaks

Context files aren't permanent. They were written at a specific point in time with specific knowledge. The codebase evolves. The team changes. The architecture shifts. The CLAUDE.md stays the same. What was accurate becomes misleading. What was precise becomes ambiguous. What was explicit becomes stale.

Scaffolding breaks in three distinct ways, and each requires a different response. Context conflicts: two files say contradictory things and the AI must choose. Context drift: the files were correct once but no longer match the codebase. Context gaps: the files are accurate and consistent but don't cover a situation the AI is now in.

This module examines all three failure modes in depth. You'll diagnose each one, identify the governance violation behind it, and argue for a remediation protocol: not an ad hoc fix, but a systematic response that prevents recurrence. The MANAGE function of the NIST AI Risk Management Framework (AI RMF) applies directly here: a scaffolding failure is a managed risk, not an emergency.

Portfolio artifact

Debate

A scaffolding failure analysis — three failure scenarios diagnosed with root cause, governance violation identified, remediation protocol argued and defended, and a position on which failure mode is most dangerous and why.

By the end of this module, you will:

Distinguish between context conflicts, drift, and gaps as distinct failure modes
Identify the NIST MANAGE violation that corresponds to each failure mode
Argue a remediation protocol for each failure type with governance grounding
Defend a position on which failure mode is most dangerous to a project
Design a detection mechanism for each failure mode before it causes damage

Scenario

Three Broken Projects

Three teams. Each has a scaffolding system. Each has a problem the scaffolding caused.

Project Alpha — The Conflict. Alpha has a root CLAUDE.md that says "all API responses must be wrapped in a standard envelope with status, data, and error fields." The API package CLAUDE.md, written six months later by a different developer, says "follow REST conventions strictly — return raw resource objects, use HTTP status codes for error communication." Both files are in use. An AI session working on a new endpoint gets both files. It generates an endpoint that partially honors both: HTTP 422 status codes with an envelope-wrapped body. Neither convention is correctly followed. The code passes review from a developer who didn't realize both conventions were active.

Project Beta — The Drift. Beta's CLAUDE.md says "authentication is handled by the AuthMiddleware class in lib/auth/middleware.ts — always use this class, never implement auth logic directly." This was accurate when it was written. A major security audit eight months ago resulted in a complete rewrite of the authentication system. AuthMiddleware was deprecated and replaced with a token validation pipeline in lib/security/tokens.ts. The CLAUDE.md was not updated. An AI session starts, reads the CLAUDE.md, and implements a new protected endpoint using the deprecated AuthMiddleware. The code compiles. The tests pass. It reaches code review before anyone notices.

Project Gamma — The Gap. Gamma has a thorough, up-to-date CLAUDE.md. Every architectural decision is documented. Every banned pattern is listed. The conventions are consistent and accurate. A developer asks the AI to help implement rate limiting for the first time. The CLAUDE.md is silent on rate limiting — it was never needed before. The AI makes reasonable choices based on training data: it installs a rate limiting library, implements per-IP limits, and caches limits in Redis. The implementation is technically correct. But it's inconsistent with how Gamma handles all other infrastructure: they use PostgreSQL for everything and have a strict rule against adding new infrastructure dependencies. That rule is in the CLAUDE.md under "Infrastructure" — but it doesn't mention rate limiting, so the AI doesn't apply it.

Lesson

Three Failure Modes, Three Responses

Each failure mode has a different root cause, a different detection signal, and a different remediation path. Treating them as variations of the same problem leads to remediation that fixes the symptom without addressing the cause.

Failure Mode 1 — Context Conflict

Definition: Two or more context files make contradictory claims about how the AI should behave.

Root cause: The files were written independently without a coordination protocol. The AI has no way to resolve the contradiction; it will honor both partially or arbitrarily pick one.

Detection: A consistency audit comparing all context files against each other.

Remediation: Establish an explicit hierarchy (the root always wins, or the package overrides with explicit declaration) and audit immediately when any file is updated.
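One way to make the hierarchy explicit is to resolve rules in code rather than leaving the choice to the AI. The sketch below is a minimal illustration, assuming each rule has been tagged by hand with a topic key and that a package file may override the root only by setting an explicit flag. The Rule type, the topic keys, and the overrides_root flag are all hypothetical; they are not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    topic: str                    # hypothetical topic key, e.g. "api-response-shape"
    text: str                     # the instruction as written in the context file
    source: str                   # which context file the rule came from
    overrides_root: bool = False  # assumed convention: explicit override declaration


def resolve(root_rules, package_rules):
    """Return (effective rule per topic, list of undeclared conflicts)."""
    effective = {rule.topic: rule for rule in root_rules}
    conflicts = []
    for rule in package_rules:
        root_rule = effective.get(rule.topic)
        if root_rule is None:
            effective[rule.topic] = rule          # new topic: nothing to conflict with
        elif root_rule.text == rule.text:
            continue                              # same rule restated: no conflict
        elif rule.overrides_root:
            effective[rule.topic] = rule          # declared override: package wins
        else:
            conflicts.append((root_rule, rule))   # undeclared: root wins, flag for audit
    return effective, conflicts


# Project Alpha's two rules, tagged with the same (hypothetical) topic key.
root = [Rule("api-response-shape", "wrap responses in the standard envelope", "root CLAUDE.md")]
package = [Rule("api-response-shape", "return raw resource objects", "API package CLAUDE.md")]
effective, conflicts = resolve(root, package)
```

With Alpha's rules, the root rule stays in effect and the pair lands in conflicts, which is the signal to run the audit. The tagging step is the real work: this only helps if someone has assigned topic keys consistently.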

Failure Mode 2 — Context Drift

Definition: The context files were accurate when written but no longer match the codebase.

Root cause: The scaffolding maintenance process doesn't require context file updates when the code changes.

Detection: Last-verified dates trigger review; a grep for deprecated symbols referenced in context files.

Remediation: Any commit that deprecates or replaces a pattern named in a context file must include a context file update. This is enforced by a commit convention, not by hope.
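The last-verified trigger can be sketched as a small check. This assumes each context file carries a line of the form "Last verified: YYYY-MM-DD"; that header format and the 90-day threshold are illustrative assumptions, not an established convention.

```python
import re
from datetime import date

# Assumed header convention: a line reading "Last verified: 2024-01-31".
LAST_VERIFIED = re.compile(r"^Last verified:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)


def needs_review(text, today, max_age_days=90):
    """Return True if the context file is missing a date or the date is too old."""
    match = LAST_VERIFIED.search(text)
    if match is None:
        return True  # no date at all: treat the file as unverified
    verified = date.fromisoformat(match.group(1))
    return (today - verified).days > max_age_days
```

A file verified eight months ago, as in Project Beta, would be flagged long before the deprecated class reached a new endpoint. The check only says a review is due; it cannot tell whether the content is still correct.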

Failure Mode 3 — Context Gap

Definition: The context files are accurate and consistent but don't cover a new situation.

Root cause: No scaffolding system can be fully forward-looking.

Detection: When an AI session makes an architectural choice not covered by the context files, that's a gap signal: a new situation has appeared.

Remediation: After any session where the AI made an unguided architectural choice, update the context files before closing the session. Gaps close one by one if you maintain the discipline.

Governance — NIST AI RMF: MANAGE Function

NIST AI RMF — MANAGE: Risk Treatment

The MANAGE function requires that identified risks have treatment plans — not just detection, but response protocols. Each scaffolding failure mode is a risk category that must have a defined treatment: conflict → hierarchy protocol, drift → co-update requirement, gap → post-session update discipline. Treating these as emergencies rather than managed risks means the same failures recur indefinitely.

NIST AI RMF — MANAGE: Monitoring

MANAGE also requires ongoing monitoring for risk materialization. Scaffolding failures don't announce themselves — they show up as AI behavior that "seemed reasonable." A monitoring practice is an explicit review step: after any AI session, verify that the output is consistent with the context files. When it's not, determine whether the inconsistency is a conflict, drift, or gap — and apply the appropriate treatment.

Context

Detection Before Damage

The most dangerous scaffolding failure is the one that produces code that compiles, passes tests, and reaches code review before anyone realizes something is wrong. All three projects in the scenario had that problem. Detection must happen before damage, not after.

Detect Conflicts: Consistency Audit

After any context file is updated, run a consistency check: read all context files together and look for rules that contradict each other. This can be done manually for small systems or with a simple script that flags shared terms used differently across files. Add this to the commit convention: "Any CLAUDE.md or rules file change requires a consistency audit before merge."
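A simple script of this kind might look like the sketch below. It cannot judge whether two rules actually contradict each other; it only surfaces watched terms that appear in more than one file so a person can read the lines side by side. The function name and the list of watched terms are assumptions chosen for illustration.

```python
from collections import defaultdict


def shared_term_report(context_files, terms):
    """Map each watched term to the lines mentioning it, for terms found in 2+ files.

    context_files: mapping of file label -> file text.
    terms: terms to watch for (case-insensitive substring match).
    """
    hits = defaultdict(dict)
    for path, text in context_files.items():
        for line in text.splitlines():
            lowered = line.lower()
            for term in terms:
                if term.lower() in lowered:
                    hits[term].setdefault(path, []).append(line.strip())
    # Only terms that appear in more than one file are candidates for conflict.
    return {term: by_file for term, by_file in hits.items() if len(by_file) > 1}
```

Run against Project Alpha's two files with "error" as a watched term, the report would place the envelope rule and the HTTP status code rule next to each other, which is the comparison the reviewer never got to make.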

Detect Drift: Symbol Grep

When a class, function, or pattern is deprecated or renamed, grep for that term in all context files before closing the PR. If it appears in a context file without a DEPRECATED marker, update the context file in the same PR. This ties context changes to code changes, so drift can only enter through a deliberate choice to skip the check.
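The grep can be written as a short function. This is a sketch under one assumption that the text above does not specify: the DEPRECATED marker sits on the same line as the symbol it refers to. The function name is hypothetical.

```python
def undocumented_deprecations(context_files, deprecated_symbols, marker="DEPRECATED"):
    """Return (file, line number, line) for each mention that lacks the marker.

    context_files: mapping of file label -> file text.
    Assumes the marker appears on the same line as the symbol it annotates.
    """
    findings = []
    for path, text in context_files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                continue  # already annotated as deprecated
            if any(symbol in line for symbol in deprecated_symbols):
                findings.append((path, number, line.strip()))
    return findings
```

For Project Beta, calling it with AuthMiddleware as the deprecated symbol would report the line that still tells the AI to always use that class. An empty result means the PR can close; anything else means the context file belongs in the same PR.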

Detect Gaps: Decision Review

After any AI session, read the session's commits. For every architectural decision the AI made, verify it's covered by a context file rule. If it made a decision not covered by any rule, that's a gap. Add the rule before starting the next session. Gaps compound: an uncovered decision in one session becomes a precedent the AI follows in the next.
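The review can be partly mechanized once the reviewer has listed the session's decisions. The sketch below assumes the reviewer records each decision as a (topic, summary) pair and treats a topic as covered if any context file mentions it. That keyword match is a crude stand-in for judgment: a hit shows the topic is mentioned, not that a rule actually governs the decision. The function name and input shape are hypothetical.

```python
def find_gaps(decisions, context_files):
    """Return the decisions whose topic no context file mentions.

    decisions: list of (topic, summary) pairs recorded by the reviewer.
    context_files: mapping of file label -> file text.
    """
    corpus = "\n".join(context_files.values()).lower()
    return [
        (topic, summary)
        for topic, summary in decisions
        if topic.lower() not in corpus
    ]
```

In Project Gamma, a decision recorded under "rate limiting" would come back as a gap because the CLAUDE.md never mentions it, prompting the reviewer to check it against the Infrastructure rules and add an explicit rule before the next session.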

⚔ Debate Lab

Scaffolding Failure Analysis

~25 minutes · 3 failures

What you're doing

Diagnose each of the three broken projects, identify the governance violation, and argue a remediation protocol. Then defend your answer to the hardest question: which failure mode is most dangerous?

Roles

🔬 You (Diagnostician): Identify failure type, root cause, governance violation, and remediation protocol for each project.

⚖ AI (Challenger): I'll test your diagnosis against edge cases, push back on oversimplified remediations, and force a ranking argument at the end.

Three failure modes

Conflict (Alpha) · Drift (Beta) · Gap (Gamma)

For each failure

Failure mode — which type?

Root cause — why did it happen?

NIST MANAGE violation — which requirement?

Remediation — systematic, not ad hoc

Detection — before damage, not after

✓ Module Complete

You've completed Module 7 of 8. Your failure analysis is your seventh portfolio artifact.
