Building Agentic Pipelines
Module 8 of 8
Intro

Backlog-Driven Pipelines

2 min read · Capstone

Everything in this course has been about a single pipeline run — one input, one output, one journey through stages and gates. Backlog-driven pipelines change the model. Instead of a developer triggering a pipeline manually with a single input, a pipeline pulls from a queue of prioritized work items, processes them systematically, and writes results back to the backlog as a record of what was done, what succeeded, and what failed.

Backlog-driven execution is the difference between a pipeline you run and a pipeline that runs. It requires new design work: the backlog entry must contain everything the pipeline needs; the priority and ordering logic must be explicit; the trigger conditions must be defined; and the write-back must create an accountability trail that a human can audit.

This is the capstone module. You'll apply everything from the course — stages, gates, handoffs, spec documents, recovery paths, escalation policies — to design the backlog integration layer for a real pipeline. The artifact is not the pipeline. It's the system that feeds the pipeline and records what it does.

Your artifact — Build Lab
A backlog integration design — the backlog entry schema, priority and ordering rules, pipeline trigger logic, and completion write-back defined for a real pipeline, with the accountability trail documented
  • Design a backlog entry schema that contains everything a pipeline needs to execute without human assistance
  • Write explicit priority and ordering rules for a backlog-driven pipeline
  • Define the trigger logic — what conditions cause the pipeline to pull from the backlog, and what prevents double-processing
  • Design the completion write-back — what the pipeline writes to the backlog when it succeeds, fails, or escalates
  • Apply all four NIST AI RMF functions (GOVERN, MAP, MEASURE, MANAGE) to the backlog integration design
  • Identify the accountability trail — what information allows a human to audit any pipeline run from backlog entry to completion
Scenario

The Queue That Nobody Managed

3 min read

A team has been running their specification generation pipeline for six months. The pipeline takes a GitHub issue and produces a technical specification. Developers trigger it manually — they open the pipeline tool, paste the issue URL, and click run.

After six months, the team has a problem: nobody knows what's in the queue, what's been processed, or what failed. There are 340 GitHub issues in their backlog. Approximately 60 of them have had specifications generated. Nobody is sure which 60. The pipeline tool logs individual runs, but there's no view of backlog coverage. Three specifications have been generated twice because developers didn't know the pipeline had already run on their issue.

The team's solution: switch to backlog-driven execution. The pipeline will pull from a prioritized queue of GitHub issues, process them in order, and write the result — success, failure, or escalation — back to the backlog as a comment on the issue. No manual triggering. No duplicate processing. Full coverage tracking.

But switching to backlog-driven execution requires design work the team hasn't done.

What does a backlog entry contain? The pipeline needs more than an issue URL — it needs priority, context, and any constraints on when it can run.

How does the pipeline decide what to process next? By age? By priority label? By team assignment?

What prevents two pipeline instances from processing the same item simultaneously? If two instances race for the same entry, the result is a duplicate spec and a confused team.

What does the pipeline write back when it succeeds? When it fails? When it escalates to human review? Without a defined write-back format, the audit trail is missing from the start.

How does a human audit any specific pipeline run three months later? The accountability trail must be in the backlog entry itself — not only in pipeline logs that may not be retained.

Lesson

The Pipeline That Runs

5 min read

A backlog-driven pipeline is not just a scheduled trigger. It's a governance system. The backlog entry is a specification of work. The write-back is an accountability record. The priority rules are a policy. All of these must be designed explicitly — they cannot be inferred at runtime.

A backlog entry must be self-contained. The pipeline should be able to execute any entry without asking a human for additional context. This means the entry must include: the input to the pipeline (issue URL, task description, or structured input), any constraints on execution (run only during business hours, require human sign-off before stage N), priority metadata (team, label, age, explicit priority score), and status tracking (not started, in progress, complete, failed, escalated).

The entry is also the accountability record's foundation. If a human audits a pipeline run six months later, the entry is the starting point — what was the input, what were the constraints, what was the priority at the time of processing?
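One way to make the four groups of fields concrete is a small typed record. The following is a minimal sketch, not a prescribed schema: the field names (`pipeline_input`, `business_hours_only`, `signoff_before_stage`, and so on) and the helper `blocking_gaps` are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Status(Enum):
    NOT_STARTED = "not_started"
    IN_PROGRESS = "in_progress"
    COMPLETE = "complete"
    FAILED = "failed"
    ESCALATED = "escalated"


@dataclass
class BacklogEntry:
    # Input to the pipeline (issue URL, task description, or structured input)
    entry_id: str
    pipeline_input: str
    # Constraints on execution (hypothetical examples)
    business_hours_only: bool = False
    signoff_before_stage: Optional[int] = None
    # Priority metadata
    team: str = ""
    priority_label: str = "low"
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    # Status tracking
    status: Status = Status.NOT_STARTED


def blocking_gaps(entry: BacklogEntry) -> List[str]:
    """Return the fields whose absence would force a human to step in."""
    gaps = []
    if not entry.entry_id.strip():
        gaps.append("entry_id")
    if not entry.pipeline_input.strip():
        gaps.append("pipeline_input")
    if not entry.team.strip():
        gaps.append("team")
    return gaps
```

Running `blocking_gaps` before an entry is admitted to the queue is one way to enforce "self-contained" mechanically instead of by convention.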

Explicit ordering rules prevent the pipeline from becoming a FIFO queue that processes low-priority items while high-priority items wait. Design the rule as a formula: priority score = (explicit priority label weight) + (age in days × age weight) + any additional signals. The formula must be specific enough to compute from the entry's metadata, and it must be deterministic: the same metadata, evaluated at the same time, always produces the same score. Because age changes from day to day, compute the score when the pipeline pulls from the backlog rather than storing it once.
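The formula above can be sketched as a pure function. The weights here (`LABEL_WEIGHTS`, `AGE_WEIGHT`) are illustrative assumptions, not recommended values, and the tie-break in `pick_next` (older entry first, then entry ID) is also an assumption added so that ordering stays deterministic when scores are equal.

```python
from datetime import datetime, timedelta, timezone

# Assumed weights, for illustration only.
LABEL_WEIGHTS = {"high": 3.0, "medium": 2.0, "low": 1.0}
AGE_WEIGHT = 0.25  # score added per full day of age


def priority_score(label, created_at, now, extra_signals=0.0):
    """label weight + (age in days x age weight) + additional signals."""
    age_days = (now - created_at).days
    return LABEL_WEIGHTS[label] + age_days * AGE_WEIGHT + extra_signals


def pick_next(entries, now):
    """Return the entry to process next, or None if the list is empty.

    Each entry is a dict with "id", "label", and "created_at".
    Ties fall back to the older entry, then to the lower id.
    """
    if not entries:
        return None
    return min(
        entries,
        key=lambda e: (
            -priority_score(e["label"], e["created_at"], now),
            e["created_at"],
            e["id"],
        ),
    )
```

Passing `now` in as an argument, instead of reading the clock inside the function, keeps the score reproducible: an auditor can recompute exactly what the pipeline saw at the time it pulled.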

Two pipeline instances must not process the same entry simultaneously. The standard solution is a claimed state. When the pipeline picks up an entry, it marks it as in-progress with a timestamp and an instance ID. The check and the mark must happen as one atomic operation; otherwise two instances can both read the entry as unclaimed and both claim it. Any other instance that pulls the same entry sees the claimed state and skips it. Entries that have been in-progress for longer than a configurable timeout are returned to the queue as unclaimed, which handles the case where a pipeline instance crashes mid-execution.
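The claimed-state mechanism can be sketched in memory as follows. `ClaimStore` and its method names are hypothetical, and the lock only protects threads inside one process; instances running on separate machines would need the backlog's own datastore to provide the atomic check-and-set.

```python
import threading
from datetime import datetime, timedelta, timezone


class ClaimStore:
    """In-memory claimed state with a timeout for crashed instances."""

    def __init__(self, timeout):
        self._lock = threading.Lock()
        self._claims = {}  # entry_id -> (instance_id, claimed_at)
        self._timeout = timeout

    def try_claim(self, entry_id, instance_id, now):
        """Claim the entry if it is unclaimed or its claim has gone stale."""
        with self._lock:  # check and mark happen as one step
            current = self._claims.get(entry_id)
            if current is not None:
                _, claimed_at = current
                if now - claimed_at < self._timeout:
                    return False  # live claim held by some instance: skip
            self._claims[entry_id] = (instance_id, now)
            return True

    def release(self, entry_id, instance_id):
        """Drop the claim, but only if this instance still holds it."""
        with self._lock:
            current = self._claims.get(entry_id)
            if current is not None and current[0] == instance_id:
                del self._claims[entry_id]
                return True
            return False
```

The ownership check in `release` matters after a timeout: an instance that was presumed crashed but finishes late must not clear the claim of the instance that took over.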

The write-back creates the accountability trail. At minimum, it must include: the pipeline run ID, the timestamp of each stage's completion, the output artifact location, the gate results for each stage (pass/fail, with validation rule IDs), and the final status (complete, failed at stage N, escalated). For failures and escalations, the write-back must also include the failure reason and the recovery action taken (retried N times, escalated to [human role]).
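A write-back with the minimum fields listed above might look like the sketch below. The class and field names (`StageRecord`, `WriteBack`, `failed_at_stage`) are assumptions, as is the choice of JSON as the serialized form posted to the backlog.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class StageRecord:
    stage: str
    completed_at: str  # ISO 8601 timestamp of stage completion
    gate_passed: bool
    validation_rule_ids: List[str] = field(default_factory=list)


@dataclass
class WriteBack:
    run_id: str
    final_status: str  # "complete", "failed", or "escalated"
    stages: List[StageRecord]
    artifact_location: Optional[str] = None
    # Only populated for failures and escalations
    failed_at_stage: Optional[str] = None
    failure_reason: Optional[str] = None
    recovery_action: Optional[str] = None


def to_backlog_text(record: WriteBack) -> str:
    """Serialize the record so it can be stored on the backlog entry."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

A failed run would then carry its own explanation, for example `WriteBack(run_id="run-001", final_status="failed", stages=[...], failed_at_stage="review", failure_reason="gate rejected output", recovery_action="retried 2 times")`, where every value shown is made up for illustration.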

GOVERN

Who owns the pipeline, who owns the backlog, and what policies govern which items are eligible for automated processing? The GOVERN function requires that these roles and policies are explicitly assigned — not assumed. An item that the team has not explicitly declared eligible for automated processing should not enter the queue.

MAP

What are the failure modes of the backlog integration layer itself? What happens if the priority formula produces wrong scores for a class of items? What if the claimed state mechanism fails and two instances process the same entry? MAP the integration layer, not just the pipeline stages — the integration layer has its own failure modes that are separate from what happens inside any single run.

MEASURE

What metrics track whether the backlog-driven pipeline is performing correctly? Track at least four: coverage rate (the percentage of eligible items that have been processed), processing latency (how long a high-priority item waits before the pipeline processes it), escalation rate (the percentage of items that require human review), and failure rate by failure type. These metrics are not optional: they are the signal that the integration layer is working.
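These four metrics can be computed directly from backlog entries once the write-back exists. The sketch below assumes each entry is a dict with hypothetical keys `eligible`, `status`, `priority_label`, `wait_hours`, and `failure_type`. It also assumes that escalation and failure rates use processed items as the denominator and that latency is summarized as a median; the text does not fix either choice.

```python
from collections import Counter
from statistics import median

PROCESSED = {"complete", "failed", "escalated"}


def backlog_metrics(entries):
    """Compute coverage, latency, escalation, and failure metrics."""
    eligible = [e for e in entries if e["eligible"]]
    processed = [e for e in eligible if e["status"] in PROCESSED]
    n = len(processed)
    high_waits = [
        e["wait_hours"] for e in processed if e["priority_label"] == "high"
    ]
    escalated = sum(1 for e in processed if e["status"] == "escalated")
    failures = Counter(
        e["failure_type"] for e in processed if e["status"] == "failed"
    )
    return {
        "coverage_rate": n / len(eligible) if eligible else 0.0,
        "median_high_priority_wait_hours": (
            median(high_waits) if high_waits else None
        ),
        "escalation_rate": escalated / n if n else 0.0,
        "failure_rate_by_type": {k: c / n for k, c in failures.items()},
    }
```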

MANAGE

How does the backlog integration design evolve as the pipeline evolves? New pipeline stages may require new fields in the backlog entry. New team structures may require updated priority rules. The write-back format must be versioned so that historical records remain readable after the pipeline changes. MANAGE means designing for evolution, not just for the current state.
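One common way to keep historical records readable is to stamp each write-back with a schema version and upgrade old records on read. The sketch below is an assumption about how that could work: the `schema_version` field, the two versions, and the rename from `status` to `final_status` are all invented for illustration.

```python
CURRENT_VERSION = 2


def upgrade_v1_to_v2(record):
    """Hypothetical change: v2 renamed "status" and added "gate_results"."""
    upgraded = dict(record)  # copy so the stored record is left untouched
    upgraded["final_status"] = upgraded.pop("status")
    upgraded.setdefault("gate_results", [])
    upgraded["schema_version"] = 2
    return upgraded


# Maps a version to the function that lifts it one version higher.
UPGRADES = {1: upgrade_v1_to_v2}


def read_write_back(record):
    """Return the record in the current format, whatever version wrote it."""
    version = record.get("schema_version", 1)
    if not 1 <= version <= CURRENT_VERSION:
        raise ValueError(f"unsupported write-back version: {version}")
    while version < CURRENT_VERSION:
        record = UPGRADES[version](record)
        version = record["schema_version"]
    return record
```

Upgrading on read leaves the original record as written, which matters for an audit trail: the stored history is never rewritten, only interpreted.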

Context

The Four Layers to Design

3 min read

This is a capstone module. All four layers of the backlog integration design must be present in your artifact. Apply the course's full toolkit to each.

1. Entry schema — what the pipeline needs before it starts

Every field must be justified. If a field isn't used by at least one stage, gate, or priority rule, it doesn't belong in the entry. If a field is needed by a stage but not in the entry, the stage will fail silently or require human intervention. Write the entry schema as a flat list of fields with name, type, and purpose.
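Both rules in this layer can be checked mechanically once the schema and its consumers are written down. In this sketch the schema, the consumer names, and the deliberately flawed fields (`reporter_mood`, a missing `team`) are made up to show each check firing.

```python
# Flat list of fields with name, type, and purpose (illustrative values).
SCHEMA = [
    {"name": "issue_url", "type": "str", "purpose": "input to the first stage"},
    {"name": "priority_label", "type": "str", "purpose": "priority formula term"},
    {"name": "created_at", "type": "datetime", "purpose": "age term in the formula"},
    {"name": "reporter_mood", "type": "str", "purpose": "none stated"},
]

# Which fields each stage, gate, or priority rule reads (hypothetical names).
CONSUMERS = {
    "stage:fetch_issue": ["issue_url"],
    "stage:draft_spec": ["issue_url", "team"],
    "rule:priority": ["priority_label", "created_at"],
}


def unjustified_fields(schema, consumers):
    """Fields no stage, gate, or rule uses: candidates for removal."""
    used = {name for needs in consumers.values() for name in needs}
    return [f["name"] for f in schema if f["name"] not in used]


def unmet_needs(schema, consumers):
    """Fields a consumer needs that the entry schema does not define."""
    defined = {f["name"] for f in schema}
    gaps = {}
    for consumer, needs in consumers.items():
        missing = [name for name in needs if name not in defined]
        if missing:
            gaps[consumer] = missing
    return gaps
```

With the sample data, `unjustified_fields` flags `reporter_mood` and `unmet_needs` reports that `stage:draft_spec` needs `team`.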

2. Priority and ordering formula — the policy for what runs first

Express it as a computable formula. A priority label called "high" is not a formula. "Priority score = label weight (high: 3, medium: 2, low: 1) + age bonus (age in days / 7, rounded down, max 5)" is a formula. Test it: given these three entries, which one runs first?
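Running the example formula against three entries makes the test concrete. This sketch assumes the "max 5" cap applies to the age term only, not to the total score, and the three entries and the tie-break rule (older first, then ID) are invented for illustration.

```python
LABEL_POINTS = {"high": 3, "medium": 2, "low": 1}


def score(label, age_days):
    """Label points plus one point per full week of age, capped at 5."""
    return LABEL_POINTS[label] + min(age_days // 7, 5)


# (id, label, age in days): made-up entries for the test
entries = [
    ("A", "high", 2),     # 3 + 0 = 3
    ("B", "medium", 21),  # 2 + 3 = 5
    ("C", "low", 70),     # 1 + min(10, 5) = 6
]

# Highest score first; ties go to the older entry, then to the lower id.
ranked = sorted(entries, key=lambda e: (-score(e[1], e[2]), -e[2], e[0]))
run_order = [e[0] for e in ranked]
```

Here `run_order` is `["C", "B", "A"]`: a ten-week-old low-priority item outranks a two-day-old high-priority one. Whether that is the intended policy is exactly the kind of question the test is meant to surface.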

3. Trigger and deduplication logic — what starts the pipeline and prevents double-processing

Trigger conditions: when does the pipeline pull from the backlog? Continuous polling? Scheduled intervals? Event-driven? Deduplication: how does the pipeline mark an entry as claimed, and how does it handle crashed instances? The timeout value for reclaiming stale in-progress entries must be explicit.
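When instances run as separate processes, one option is to let the backlog's datastore perform the claim as a single conditional update. The sketch below uses SQLite from the Python standard library; the table layout, column names, and the 30-minute `STALE_AFTER_SECONDS` value are assumptions for illustration.

```python
import sqlite3

STALE_AFTER_SECONDS = 1800  # assumed timeout; make this explicit in the design


def make_db():
    """Create an in-memory backlog table for the demonstration."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE backlog ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, "
        "claimed_by TEXT, claimed_at REAL)"
    )
    return db


def try_claim(db, entry_id, instance_id, now):
    """Claim an entry that is unclaimed or whose claim has gone stale.

    The WHERE clause makes the check and the update a single statement,
    so only one caller can see rowcount == 1 for a given entry.
    """
    cur = db.execute(
        "UPDATE backlog SET status = 'in_progress', "
        "claimed_by = ?, claimed_at = ? "
        "WHERE id = ? AND (status = 'not_started' "
        "OR (status = 'in_progress' AND claimed_at < ?))",
        (instance_id, now, entry_id, now - STALE_AFTER_SECONDS),
    )
    db.commit()
    return cur.rowcount == 1
```

The same pattern applies to any trigger style: whether the pull is polled, scheduled, or event-driven, the instance only proceeds when the conditional update reports that it changed a row.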

4. Completion write-back — the accountability trail

Define the exact fields written back on success, failure, and escalation. Include: run ID, stage-by-stage timestamps, gate results, output artifact reference, and final status with reason. This write-back is what a human reads three months from now when they need to know what the pipeline did. If it doesn't contain enough to reconstruct the run, the accountability trail is incomplete.
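A completeness check can reject a write-back that would leave the trail unreadable. In this sketch the field names and the per-outcome requirements (`artifact_ref`, `failed_stage`, `escalated_to`) are assumptions; substitute the fields your own design defines.

```python
# Fields every write-back needs, regardless of outcome (assumed names).
REQUIRED_ALWAYS = ["run_id", "stage_timestamps", "gate_results", "final_status"]

# Extra fields required for each outcome (assumed names).
REQUIRED_BY_STATUS = {
    "complete": ["artifact_ref"],
    "failed": ["reason", "failed_stage"],
    "escalated": ["reason", "escalated_to"],
}

EMPTY_VALUES = (None, "", [], {})


def missing_write_back_fields(record):
    """List the fields that are absent or empty for this outcome."""
    status = record.get("final_status")
    required = REQUIRED_ALWAYS + REQUIRED_BY_STATUS.get(status, [])
    missing = [name for name in required if record.get(name) in EMPTY_VALUES]
    # An unrecognized status is itself a gap in the trail.
    if status not in REQUIRED_BY_STATUS and "final_status" not in missing:
        missing.append("final_status")
    return missing
```

Calling this before the pipeline posts its result means an incomplete record is caught at write time, not three months later during an audit.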

In the lab, you'll design all four layers for a real pipeline — yours or the specification generation pipeline from the scenario. The AI will push you to complete every layer with enough specificity to implement.

◆ Build Lab
Backlog Integration Design
~30 minutes · 4 layers
What you're doing
Design the backlog integration layer for a real pipeline. You'll cover all four layers: entry schema, priority formula, trigger and deduplication, completion write-back. The output is a design specific enough to implement and audit.
Roles
🏗
You: Integration Architect. Design all four layers for a real pipeline. Be specific enough that another developer could implement from your design.
🔍
AI: Senior Architect. I'll review each layer against the course's full toolkit. I won't let you skip a layer or stay vague. All four NIST functions apply.
Framework — all four layers required
Entry schema: self-contained, every field justified by the stage that needs it
Priority formula: computable, deterministic, testable with examples
Deduplication: claimed state + timeout handling for crashed instances
Write-back: run ID + stage timestamps + gate results + output reference + final status
GOVERN: roles and policies explicitly assigned
MAP: failure modes of the integration layer itself
Success criteria
All four layers complete with enough specificity to implement. Accountability trail sufficient for a human to reconstruct any pipeline run from backlog entry to completion.