Memory stores degrade. Not because entries are deleted — because they accumulate. The same fact gets written multiple times in slightly different forms. Outdated versions coexist with current ones. Entries that should have been merged into a single canonical record exist as four separate fragments, each partially correct.
Consolidation synthesis is the process that fixes this. It merges overlapping entries, resolves contradictions, and produces a non-redundant store where each fact has one canonical form. It is to memory what normalization is to a database — not glamorous, but the difference between a system that works reliably and one that doesn't.
Building a consolidation algorithm requires making real design decisions: when do two entries say the same thing? When do they contradict, and which version wins? When should a group of related entries become a single synthesized summary? This module teaches you to make those decisions explicitly.
A developer runs a query against their AI memory store: "What database does this project use?" Three entries come back. The first says PostgreSQL. The second says MySQL, from eight months ago. The third says "switching from MySQL to PostgreSQL — in progress," from five months ago.
The AI sees all three. Depending on which it weights most heavily, it might give the right answer, or it might hedge, or it might give the wrong answer confidently. The memory system has not resolved the contradiction — it has stored the contradiction and handed it to the AI to figure out.
The same pattern plays out across hundreds of entries. "The team agreed on camelCase" exists alongside "naming conventions: snake_case preferred." "API rate limit is 100 req/min" coexists with "updated rate limits: 500 req/min after March upgrade." Architectural decisions that evolved through three iterations all coexist as three separate entries, none marked as superseded.
The problem is that no consolidation has ever run. Every write was an append. No process has ever looked at the store and said: these two entries say the same thing, merge them. These three entries are a contradiction, resolve it. These five entries are fragments of a single fact, synthesize them.
Without consolidation, a memory store is a log, not a knowledge base. This module is about turning it into a knowledge base.
Consolidation handles four types of memory overlap, each requiring a different response. Identify the type first, then apply the right operation.
Exact duplicates — two entries with identical or near-identical content. Operation: delete one, keep the highest-scoring version.
Near-duplicates — two entries that say the same thing in different words or at different levels of detail. Operation: merge into a single canonical entry that preserves the more complete version.
Contradictions — two entries that make incompatible claims about the same subject. Operation: resolve using the conflict resolution policy (usually: most recent wins, but not always). Archive the superseded entry rather than deleting it — you may need to reconstruct history.
Fragments — multiple entries that each capture part of a larger fact. Operation: synthesize into a single comprehensive entry. Archive the source fragments.
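As a rough sketch of "identify the type first, then apply the right operation," the four cases above can be written as a lookup table. The enum and the operation names here are hypothetical labels chosen for illustration, not part of any particular memory system.

```python
from enum import Enum


class Overlap(Enum):
    EXACT_DUPLICATE = "exact_duplicate"
    NEAR_DUPLICATE = "near_duplicate"
    CONTRADICTION = "contradiction"
    FRAGMENT = "fragment"


# One operation per overlap type, mirroring the four cases in the text.
# The operation names are placeholders (assumption), not a real API.
OPERATIONS = {
    Overlap.EXACT_DUPLICATE: "delete_lower_scoring",
    Overlap.NEAR_DUPLICATE: "merge_keep_more_complete",
    Overlap.CONTRADICTION: "resolve_by_policy_then_archive_loser",
    Overlap.FRAGMENT: "synthesize_then_archive_sources",
}


def plan_operation(overlap: Overlap) -> str:
    """Return the name of the operation to apply for a classified overlap."""
    return OPERATIONS[overlap]
```

Keeping the mapping in one table makes the spec reviewable: anyone can see at a glance which operation each overlap type triggers.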
Before consolidation can run, you need a similarity threshold: how similar must two entries be before they're candidates for merging? Semantic similarity (cosine similarity between embeddings) is the most robust approach. As a starting point, a threshold around 0.85–0.90 catches near-duplicates without merging entries that are merely related. The right threshold for your store depends on how granular your entries are: finer-grained stores need higher thresholds.
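A minimal sketch of threshold-based candidate detection follows. It assumes each entry already has an embedding vector; the tiny three-dimensional vectors and the entry IDs are made up for illustration, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_candidates(embeddings, threshold=0.85):
    """Return (id_a, id_b, similarity) for every pair at or above the threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        sim = cosine_similarity(vec_a, vec_b)
        if sim >= threshold:
            pairs.append((id_a, id_b, sim))
    return pairs


# Toy vectors standing in for real embeddings (assumption).
embeddings = {
    "e1": [1.0, 0.0, 0.0],
    "e2": [0.9, 0.1, 0.0],
    "e3": [0.0, 1.0, 0.0],
}
candidates = merge_candidates(embeddings, threshold=0.85)
```

Here only the pair ("e1", "e2") clears the threshold. Comparing every pair is quadratic in the number of entries, so a large store would likely need an approximate nearest-neighbor index instead of this brute-force loop.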
When two entries contradict, you need a policy that determines which wins. "Recency wins" is the default: the newer fact supersedes the older one. But recency alone is not enough. If a carefully reasoned architectural decision from six months ago contradicts a brief mention in last week's session, recency gives the wrong answer. A complete policy specifies when recency wins and when it doesn't. For example: recency wins for factual updates (rate limits, versions, names), but explicit decisions marked with a flag win over session mentions regardless of age.
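The example policy above can be sketched as a small function. This is one possible reading, not a definitive implementation: the `Entry` type, the `is_decision` flag, and the dates are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass
class Entry:
    text: str
    written_at: datetime
    is_decision: bool = False  # hypothetical flag set by an explicit decision write


def resolve(a: Entry, b: Entry) -> Tuple[Entry, Entry]:
    """Return (winner, superseded) for two contradicting entries."""
    if a.is_decision != b.is_decision:
        # A flagged decision beats an unflagged mention regardless of age.
        winner = a if a.is_decision else b
    else:
        # Same kind of entry on both sides: the more recent one wins.
        winner = a if a.written_at >= b.written_at else b
    superseded = b if winner is a else a
    return winner, superseded


# Illustrative entries; timestamps and flags are assumptions.
old_limit = Entry("API rate limit is 100 req/min", datetime(2024, 1, 10))
new_limit = Entry("updated rate limits: 500 req/min", datetime(2024, 3, 20))
decision = Entry("naming conventions: snake_case preferred", datetime(2024, 1, 5), is_decision=True)
mention = Entry("the team agreed on camelCase", datetime(2024, 7, 1))

limit_winner, _ = resolve(old_limit, new_limit)   # recency decides
naming_winner, _ = resolve(mention, decision)     # the decision flag decides
```

The superseded entry is returned rather than discarded so the caller can archive it.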
The MANAGE function of the NIST AI Risk Management Framework treats lifecycle management as a core AI risk practice. A memory store without consolidation accumulates contradictions over time, which means the AI is operating on an increasingly unreliable knowledge base. Consolidation is the maintenance operation that keeps the store trustworthy. Applied to a memory store, MANAGE prompts three questions: is there a defined schedule for consolidation runs? Who triggers them? Who reviews the results?
When consolidation archives or deletes entries, those operations should be logged. If a consolidation run incorrectly merged two entries that were actually about different topics, you need to be able to reverse it. The GOVERN function of the NIST AI Risk Management Framework points to the accountability question: is there a record of what was consolidated, when, and by what rule? Archive-before-delete is the safest policy.
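One way to make a merge reversible is to record what was archived, when, and under which rule, and to archive sources rather than delete them. The sketch below assumes a store that is a plain dictionary keyed by entry ID; the field names and functions are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConsolidationLog:
    records: list = field(default_factory=list)

    def record(self, rule, archived_ids, result_id):
        """Append one log record and return its index."""
        self.records.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "rule": rule,
            "archived_ids": list(archived_ids),
            "result_id": result_id,
            "reverted": False,
        })
        return len(self.records) - 1


def merge(store, log, ids, merged_id, merged_text, rule):
    """Archive the source entries, add the merged entry, and log the operation."""
    for entry_id in ids:
        store[entry_id]["archived"] = True
    store[merged_id] = {"text": merged_text, "archived": False}
    return log.record(rule, ids, merged_id)


def undo(store, log, record_index):
    """Reverse a logged merge: restore the sources and drop the merged entry."""
    rec = log.records[record_index]
    for entry_id in rec["archived_ids"]:
        store[entry_id]["archived"] = False
    del store[rec["result_id"]]
    rec["reverted"] = True
```

Because the sources are archived instead of removed, `undo` has everything it needs to restore the store to its state before the merge.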
A consolidation algorithm is a set of explicit decisions, not a magic process. Before you write the spec, you need to answer three questions precisely.
Don't guess the similarity threshold. A threshold set too low will merge entries that should be separate. A threshold set too high will miss duplicates that are obvious to a human reader. Validate the threshold empirically: sample 50 pairs of entries that score above it and confirm each pair should have been merged, then sample pairs that score just below it and check for duplicates it missed. Adjust based on what you find.
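The sampling step can be sketched as follows, assuming candidate pairs are available as (id_a, id_b, similarity) tuples and a human reviewer labels each sampled pair as a correct merge or not. The function names and the fixed random seed are assumptions made for the example.

```python
import random


def sample_for_review(scored_pairs, threshold, sample_size=50, seed=0):
    """Randomly pick up to sample_size pairs scoring at or above the threshold."""
    above = [pair for pair in scored_pairs if pair[2] >= threshold]
    rng = random.Random(seed)  # fixed seed so a review batch is reproducible
    return rng.sample(above, min(sample_size, len(above)))


def merge_precision(labels):
    """Share of reviewed pairs that the reviewer confirmed as true merges."""
    if not labels:
        return 0.0
    return sum(labels) / len(labels)
```

If the reviewer rejects a meaningful share of the sample, the threshold is probably too low; what counts as an acceptable share is a judgment call for your store.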
The hardest conflict resolution case is a contradiction between a well-documented past decision and a recent casual mention. Your policy needs to handle this explicitly. One approach: entries written by a system process (session summaries, automated captures) yield to entries written by an explicit "decide" operation. Another: entries tagged as decisions are never overridden by recency alone — they require an explicit supersede operation.
Archive means: keep the entry, stop injecting it into the AI's context, and mark it as superseded. Delete means: remove it entirely. The safest policy is archive-before-delete with a retention window. Define that window: 90 days? 1 year? Permanent for decision-type entries? Specify this before you run your first consolidation pass.
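A retention window can be expressed as a small table plus a check that runs before any delete. The values below (90 days for facts, permanent for decisions) are just two of the options floated above, picked for illustration; the entry types and function name are hypothetical.

```python
from datetime import datetime, timedelta

# Retention window per entry type. None means "never delete" (assumption).
RETENTION = {
    "fact": timedelta(days=90),
    "decision": None,
}


def can_delete(entry_type, archived_at, now):
    """True only if the entry was archived at least a full retention window ago."""
    window = RETENTION[entry_type]
    if window is None:
        return False
    return now - archived_at >= window


now = datetime(2024, 6, 1)
old_fact_deletable = can_delete("fact", datetime(2024, 1, 1), now)         # 152 days archived
recent_fact_deletable = can_delete("fact", datetime(2024, 5, 1), now)      # 31 days archived
decision_deletable = can_delete("decision", datetime(2020, 1, 1), now)     # permanent
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.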
You'll apply all three in the lab — building a complete consolidation algorithm spec and defending your design decisions under technical review.