Intro

When Memory Contradicts Itself

2 min read

Memory stores degrade. Not because entries are deleted — because they accumulate. The same fact gets written multiple times in slightly different forms. Outdated versions coexist with current ones. Entries that should have been merged into a single canonical record exist as four separate fragments, each partially correct.

Consolidation synthesis is the process that fixes this. It merges overlapping entries, resolves contradictions, and produces a non-redundant store where each fact has one canonical form. It is to memory what normalization is to a database — not glamorous, but the difference between a system that works reliably and one that doesn't.

Building a consolidation algorithm requires making real design decisions: when do two entries say the same thing? When do they contradict, and which version wins? When should a group of related entries become a single synthesized summary? This module teaches you to make those decisions explicitly.

Portfolio artifact

Build

A consolidation algorithm spec — a documented merge strategy with similarity thresholds, deduplication rules, conflict resolution policy, and archival criteria for a described memory store.

By the end of this module, you will:

Identify the four types of overlap that require consolidation: duplicates, near-duplicates, contradictions, and fragments
Define similarity thresholds for triggering a merge operation
Design a conflict resolution policy that specifies which version of a fact wins
Specify archival criteria for entries that are consolidated but should not be deleted
Apply NIST MANAGE principles to consolidation as a lifecycle operation

Scenario

The Contradiction Problem

3 min read

A developer runs a query against their AI memory store: "What database does this project use?" Three entries come back. The first says PostgreSQL. The second says MySQL, from eight months ago. The third says "switching from MySQL to PostgreSQL — in progress," from five months ago.

The AI sees all three. Depending on which it weights most heavily, it might give the right answer, or it might hedge, or it might give the wrong answer confidently. The memory system has not resolved the contradiction — it has stored the contradiction and handed it to the AI to figure out.

The same pattern plays out across hundreds of entries. "The team agreed on camelCase" exists alongside "naming conventions: snake_case preferred." "API rate limit is 100 req/min" coexists with "updated rate limits: 500 req/min after March upgrade." Architectural decisions that evolved through three iterations all coexist as three separate entries, none marked as superseded.

The problem is that no consolidation has ever run. Every write was an append. No process has ever looked at the store and said: these two entries say the same thing, merge them. These three entries are a contradiction, resolve it. These five entries are fragments of a single fact, synthesize them.

Without consolidation, a memory store is a log, not a knowledge base. This module is about turning it into a knowledge base.

Lesson

Four Types of Overlap, Three Operations

3 min read

Consolidation handles four types of memory overlap, each requiring a different response. Identify the type first, then apply the right operation.

The four overlap types

Exact duplicates: two entries with identical content, or content that differs only in trivial ways such as whitespace or casing. Operation: delete one, keep the highest-scoring version.

Near-duplicates: two entries that say the same thing in different words or at different levels of detail. Operation: merge into a single canonical entry that preserves the more complete version.

Contradictions: two entries that make incompatible claims about the same subject. Operation: resolve using the conflict resolution policy (most recent wins by default, with exceptions the policy must spell out). Archive the superseded entry rather than deleting it, because you may need to reconstruct history.

Fragments: multiple entries that each capture part of a larger fact. Operation: synthesize into a single comprehensive entry. Archive the source fragments.
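The "identify the type first, then apply the right operation" rule can be sketched as a small lookup table. This is an illustrative sketch only: the names `Overlap`, `OPERATIONS`, and `plan` are hypothetical, and the archive flag for duplicates and near-duplicates is an assumption, since the lesson only states archival for contradictions and fragments.

```python
from enum import Enum


class Overlap(Enum):
    DUPLICATE = "duplicate"
    NEAR_DUPLICATE = "near-duplicate"
    CONTRADICTION = "contradiction"
    FRAGMENT = "fragment"


# Maps each overlap type to (operation, archive the superseded/source entries?).
# The False values for the two duplicate types are an assumption of this sketch;
# a stricter archive-before-delete policy would set them to True as well.
OPERATIONS = {
    Overlap.DUPLICATE: ("delete_one", False),
    Overlap.NEAR_DUPLICATE: ("merge", False),
    Overlap.CONTRADICTION: ("resolve", True),
    Overlap.FRAGMENT: ("synthesize", True),
}


def plan(overlap: Overlap) -> dict:
    """Return the operation to run for an already-classified overlap."""
    operation, archive_sources = OPERATIONS[overlap]
    return {"operation": operation, "archive_sources": archive_sources}


contradiction_plan = plan(Overlap.CONTRADICTION)
```

Keeping the mapping in one table makes the spec auditable: a reviewer can see every overlap type and its operation at a glance.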

Defining similarity thresholds

Before consolidation can run, you need a similarity threshold: how similar must two entries be before they're candidates for merging? Semantic similarity (cosine similarity between embeddings) is the most robust approach. A threshold around 0.85–0.90 catches near-duplicates without merging entries that are merely related. The right threshold for your store depends on how granular your entries are: finer-grained stores need higher thresholds.
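As a minimal sketch of threshold-based candidate detection, the code below computes cosine similarity between embedding vectors and returns every pair at or above the threshold. The function names are hypothetical, the three-dimensional vectors are toy stand-ins for real embeddings, and the 0.85 default is simply the lower end of the range mentioned above.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_candidates(embeddings, threshold=0.85):
    """Return (id_a, id_b, similarity) for every pair at or above the threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        similarity = cosine_similarity(vec_a, vec_b)
        if similarity >= threshold:
            pairs.append((id_a, id_b, similarity))
    return pairs


# Toy vectors: e1 and e2 point almost the same way, e3 is orthogonal to e1.
embeddings = {
    "e1": [1.0, 0.0, 0.0],
    "e2": [0.9, 0.1, 0.0],
    "e3": [0.0, 1.0, 0.0],
}
candidates = merge_candidates(embeddings)  # only the (e1, e2) pair qualifies
```

Comparing all pairs is quadratic in the number of entries, so a large store would likely need an approximate nearest-neighbor index instead of this brute-force loop.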

Conflict resolution policy

When two entries contradict, you need a policy that determines which wins. "Recency wins" is the default: the newer fact supersedes the older one. But recency alone is not enough. If a carefully reasoned architectural decision from six months ago contradicts a brief mention in last week's session, recency gives the wrong answer. A complete policy specifies when recency wins and when it doesn't. For example: recency wins for factual updates (rate limits, versions, names), but explicit decisions marked with a flag win over session mentions regardless of age.
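The example policy above could be sketched as follows. The `Entry` fields and the `is_decision` flag are hypothetical names, the dates are invented for illustration, and falling back to recency when both entries are decisions is an assumption of this sketch rather than something the lesson prescribes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Entry:
    text: str
    written: date
    is_decision: bool = False  # hypothetical flag set by an explicit decision


def resolve(a: Entry, b: Entry) -> tuple[Entry, Entry]:
    """Return (winner, superseded) for two contradicting entries."""
    if a.is_decision != b.is_decision:
        # A flagged decision beats an unflagged mention regardless of age.
        winner = a if a.is_decision else b
    else:
        # Same kind of entry: the more recent one wins (ties go to `a`).
        winner = a if a.written >= b.written else b
    superseded = b if winner is a else a
    return winner, superseded


# Factual update: the newer rate limit wins.
old_limit = Entry("API rate limit is 100 req/min", date(2024, 1, 10))
new_limit = Entry("updated rate limits: 500 req/min", date(2024, 3, 20))
winner, superseded = resolve(old_limit, new_limit)

# Decision versus casual mention: the flagged decision wins despite its age.
decision = Entry("naming conventions: snake_case preferred", date(2024, 1, 5), True)
mention = Entry("the team agreed on camelCase", date(2024, 6, 28))
kept, archived = resolve(decision, mention)
```

The superseded entry is returned rather than dropped so the caller can archive it.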

Governance — NIST AI RMF

NIST MANAGE — Consolidation as Memory Lifecycle Management

NIST's MANAGE function treats lifecycle management as a core AI risk practice. A memory store without consolidation accumulates contradictions over time — which means the AI is operating on an increasingly unreliable knowledge base. Consolidation is the maintenance operation that keeps the store trustworthy. NIST MANAGE asks: is there a defined schedule for consolidation runs? Who triggers them? Who reviews the results?

NIST GOVERN — Audit Trail for Deletions

When consolidation archives or deletes entries, those operations should be logged. If a consolidation run incorrectly merged two entries that were actually about different topics, you need to be able to reverse it. NIST GOVERN asks: is there a record of what was consolidated, when, and by what rule? Archive-before-delete is the safest policy.
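One way to make a consolidation run reversible is to archive the sources and log what was done, when, and by which rule. The sketch below assumes a store and an archive that are plain dictionaries keyed by entry id; all function and field names are hypothetical.

```python
from datetime import datetime, timezone


def merge(store, archive, log, source_ids, result_id, text, rule):
    """Archive the source entries, write the merged entry, and log the operation."""
    for entry_id in source_ids:
        archive[entry_id] = store.pop(entry_id)  # archive before delete
    store[result_id] = text
    record = {
        "at": datetime.now(timezone.utc).isoformat(),  # when
        "rule": rule,                                  # by what rule
        "source_ids": list(source_ids),                # what was consolidated
        "result_id": result_id,
    }
    log.append(record)
    return record


def undo(store, archive, record):
    """Reverse one logged merge: drop the merged entry, restore the sources."""
    store.pop(record["result_id"], None)
    for entry_id in record["source_ids"]:
        store[entry_id] = archive.pop(entry_id)


store = {"a": "project uses PostgreSQL", "b": "db: Postgres"}
archive, log = {}, []
record = merge(store, archive, log, ["a", "b"], "m1",
               "The project database is PostgreSQL.", rule="near-duplicate")
undo(store, archive, record)  # the store is back to its original two entries
```

Because the log record names the source ids, a bad merge can be reversed without guessing which archived entries belonged together.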

Context

Three Design Decisions Before You Build

2 min read

A consolidation algorithm is a set of explicit decisions, not a magic process. Before you write the spec, you need to answer three questions precisely.

1. What is your similarity threshold and how did you arrive at it?

Don't guess. A threshold set too low will merge entries that should be separate. A threshold set too high will miss duplicates that are obvious to a human reader. Validate the threshold empirically: sample 50 pairs of entries above the threshold and confirm they should have been merged, then sample pairs just below it and check for duplicates it missed. Adjust based on what you find.
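The sampling check can be sketched in a few lines. This assumes the pairs have already been scored as (id_a, id_b, similarity) tuples and that a human reviewer supplies one True/False label per sampled pair; the function names and the fixed seed are hypothetical choices.

```python
import random


def sample_pairs(scored_pairs, threshold, k=50, seed=0):
    """Randomly sample up to k pairs whose similarity is at or above the threshold."""
    above = [pair for pair in scored_pairs if pair[2] >= threshold]
    rng = random.Random(seed)  # fixed seed so the review sample is reproducible
    return rng.sample(above, min(k, len(above)))


def merge_precision(labels):
    """Fraction of reviewed pairs a human confirmed should have been merged."""
    if not labels:
        return 0.0
    return sum(labels) / len(labels)


# Synthetic scores for illustration: 200 pairs with similarity from 0.800 to 0.999.
scored_pairs = [(f"a{i}", f"b{i}", 0.80 + i / 1000) for i in range(200)]
review_batch = sample_pairs(scored_pairs, threshold=0.85)

# Suppose the reviewer confirms 45 of the 50 sampled pairs.
precision = merge_precision([True] * 45 + [False] * 5)
```

If the confirmed fraction is low, the threshold is probably too permissive and should be raised before the first real consolidation pass.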

2. When does recency override a deliberate decision?

The hardest conflict resolution case is a contradiction between a well-documented past decision and a recent casual mention. Your policy needs to handle this explicitly. One approach: entries written by a system process (session summaries, automated captures) yield to entries written by an explicit "decide" operation. Another: entries tagged as decisions are never overridden by recency alone — they require an explicit supersede operation.
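The second approach, where decisions require an explicit supersede operation, might look like the sketch below. It assumes entries are dictionaries with a hypothetical `type` field and that `incoming` is always the newer entry.

```python
def apply_update(current, incoming, explicit_supersede=False):
    """Return the entry that should stay active after a contradicting write.

    Assumes `incoming` is newer than `current`.
    """
    if current["type"] == "decision" and not explicit_supersede:
        # Recency alone never overrides a decision.
        return current
    return incoming


decision = {"type": "decision", "text": "naming conventions: snake_case preferred"}
mention = {"type": "session", "text": "the team agreed on camelCase"}
new_decision = {"type": "decision", "text": "naming conventions: camelCase"}

still_active = apply_update(decision, mention)
replaced = apply_update(decision, new_decision, explicit_supersede=True)
```

Making the supersede an explicit argument forces the caller to state intent, which also gives the audit log something concrete to record.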

3. What gets archived versus deleted — and for how long?

Archive means: keep the entry, mark it as superseded, and stop injecting it into the AI's context. Delete means: remove it entirely. The safest policy is archive-before-delete with a retention window. Define that window: 90 days? 1 year? Permanent for decision-type entries? Specify this before you run your first consolidation pass.
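A retention window can be expressed as a small rule table. The values below (90 days for facts, permanent for decisions) are just two of the options raised above, picked for illustration, and the entry type names are hypothetical.

```python
from datetime import date, timedelta

# Retention per entry type; None means the archived entry is never deleted.
RETENTION = {
    "fact": timedelta(days=90),
    "decision": None,
}


def can_delete(entry_type, archived_on, today):
    """True once an archived entry has outlived its retention window."""
    window = RETENTION[entry_type]
    if window is None:
        return False
    return today - archived_on >= window


expired = can_delete("fact", date(2024, 1, 1), date(2024, 3, 31))       # 90 days elapsed
too_early = can_delete("fact", date(2024, 1, 1), date(2024, 3, 30))     # 89 days elapsed
permanent = can_delete("decision", date(2020, 1, 1), date(2024, 3, 31))
```

Passing `today` in as an argument, rather than reading the clock inside the function, keeps the rule easy to test.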

You'll apply all three in the lab — building a complete consolidation algorithm spec and defending your design decisions under technical review.

🔨 Build Lab

Consolidation Algorithm Spec

~25 minutes · 4 exchanges

What you're building

A complete consolidation algorithm spec: similarity thresholds, operation rules for all four overlap types, a conflict resolution policy, and archival criteria. I'll review each decision and push for justification.

Roles

🏗

You: Algorithm Designer
You specify the consolidation rules. Every decision needs a reason.

🔍

AI: Senior Engineer Review
I'll challenge your thresholds, test your conflict policy against hard cases, and flag gaps in the spec.

Framework — apply to your spec

Four overlap types: duplicate, near-duplicate, contradiction, fragment

Similarity threshold: set empirically, not by intuition

Conflict resolution: when does recency win? When doesn't it?

Archive before delete — define the retention window

NIST MANAGE: consolidation as a scheduled, logged lifecycle process

Deliverable

A spec document covering: similarity threshold, rules for each overlap type, conflict resolution policy, archive/delete criteria, and consolidation schedule.


✓ Module Complete

You've completed Module 3 of 8. Your consolidation algorithm spec is in your portfolio.
