Building Agentic Pipelines
Module 3 of 8
Intro

Orchestration vs. Chaining

2 min read

There are two fundamentally different ways to connect agents. In a chain, output from one step becomes input to the next — the path is fixed, the sequence is predetermined, and every stage executes in order regardless of what happened upstream. In an orchestrated pipeline, a coordinator receives results, decides what happens next, and can route, retry, branch, or escalate based on the quality and content of each output.

Both approaches work. Neither is universally correct. The mistake most developers make is choosing one and applying it everywhere — building chains when they need routing, or adding orchestration overhead to workflows that run the same way every time.

This module forces the decision. You'll be handed three architectural approaches to the same pipeline problem and asked to defend one of them. A skeptical technical panel will pressure-test your reasoning against failure scenarios, coordination costs, and maintenance burden. The goal isn't to win — it's to understand exactly where your chosen architecture breaks down, and whether that's a tradeoff you can defend.

Your artifact — Debate Lab
A written architectural defense — your chosen control model (chain, orchestrated, or hybrid) argued against the specific failure modes it doesn't handle well, with explicit tradeoffs stated
By the end of this module, you will be able to:
  • Distinguish sequential chaining from coordinator-based orchestration and name the tradeoffs of each
  • Apply the routing test to determine when a pipeline genuinely needs a coordinator
  • Calculate the coordination tax for an orchestrated pipeline and decide if it's justified
  • Defend an architectural choice under adversarial questioning about its failure modes
  • Apply NIST MAP to architectural decisions — mapping how each design fails, not just how it succeeds
Scenario

Three Proposals for the Same Pipeline

4 min read

A development team is building an automated code review pipeline. The input is a GitHub pull request. The output is a structured review report — security findings, logic issues, style violations, and a ship/revise recommendation. The pipeline must handle PRs ranging from a one-line typo fix to a 2,000-line feature addition. Three engineers have each proposed a different architecture.

Architecture A: Sequential chain

The PR diff enters a linear sequence: diff normalization → security scan → logic review → style check → report assembly. Each stage passes its output directly to the next. If any stage fails, the chain terminates. There is no routing logic. The security scan runs on a one-liner the same way it runs on a 2,000-line feature. Report assembly receives all three upstream outputs and produces the final document.

Estimated cost: lowest. No coordinator, no state management. Easy to debug — each step is a function with a clear input and output.
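A minimal sketch of the sequential chain, assuming each stage is a plain function that takes and returns a state dictionary. The stage bodies, the `password=` heuristic, and every function name here are hypothetical placeholders, not part of the proposal; only the fixed stage order and the terminate-on-failure behavior come from the description above.

```python
from typing import Callable

# Hypothetical stage functions: each receives the running state and returns it
# with its own output added. Real stages would call a model or a scanner.
def normalize_diff(state: dict) -> dict:
    return {**state, "diff": state["raw_diff"].strip()}

def security_scan(state: dict) -> dict:
    # Placeholder heuristic, for illustration only.
    findings = ["possible hardcoded secret"] if "password=" in state["diff"] else []
    return {**state, "security": findings}

def logic_review(state: dict) -> dict:
    return {**state, "logic": []}  # placeholder: reports no logic issues

def style_check(state: dict) -> dict:
    return {**state, "style": []}  # placeholder: reports no style violations

def assemble_report(state: dict) -> dict:
    blocking = state["security"] or state["logic"]
    return {**state, "recommendation": "revise" if blocking else "ship"}

CHAIN: list[Callable[[dict], dict]] = [
    normalize_diff, security_scan, logic_review, style_check, assemble_report,
]

def run_chain(raw_diff: str) -> dict:
    state = {"raw_diff": raw_diff}
    for stage in CHAIN:
        # Fixed order, no routing. Any exception propagates and ends the chain.
        state = stage(state)
    return state
```

Every input takes the same path: `run_chain` has no branch that looks at a stage's output to decide what runs next, which is what makes each step easy to test in isolation.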

Architecture B: Orchestrated pipeline

A coordinator agent receives the PR and makes routing decisions. Small PRs go to a fast-path reviewer. Large PRs are split and dispatched to parallel specialized reviewers (one each for security, logic, and style), each running independently. The coordinator collects results, detects conflicts between reviewer outputs, and routes unresolved conflicts to a final arbitration step before report assembly.

Estimated cost: 40–60% higher token usage. The coordinator adds two round trips. Parallel execution reduces wall-clock time for large PRs. The arbitration step adds latency when reviewers disagree.
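A rough sketch of the coordinator's two routing decisions, under several assumptions: the 20-line threshold, the reviewer functions, the `eval(` heuristic, and the "any revise wins" arbitration policy are all invented for illustration. The sketch also sends the whole diff to every reviewer rather than splitting it, which the proposal leaves unspecified.

```python
from concurrent.futures import ThreadPoolExecutor

SMALL_PR_MAX_LINES = 20  # assumed threshold; the proposal does not specify one

# Hypothetical reviewers. Each returns a verdict and a list of findings.
def fast_path_review(diff: str) -> dict:
    return {"verdict": "ship", "findings": []}

def security_review(diff: str) -> dict:
    risky = "eval(" in diff  # placeholder heuristic
    return {"verdict": "revise" if risky else "ship",
            "findings": ["use of eval"] if risky else []}

def logic_review(diff: str) -> dict:
    return {"verdict": "ship", "findings": []}

def style_review(diff: str) -> dict:
    return {"verdict": "ship", "findings": []}

def arbitrate(results: dict) -> str:
    # Placeholder policy: a single "revise" verdict blocks shipping.
    blocked = any(r["verdict"] == "revise" for r in results.values())
    return "revise" if blocked else "ship"

def coordinate(diff: str) -> dict:
    # Routing decision 1: small PRs take the fast path.
    if len(diff.splitlines()) <= SMALL_PR_MAX_LINES:
        result = fast_path_review(diff)
        return {"path": "fast", "recommendation": result["verdict"],
                "results": {"fast": result}}

    # Large PRs: run the specialized reviewers in parallel.
    reviewers = {"security": security_review, "logic": logic_review,
                 "style": style_review}
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        futures = {name: pool.submit(fn, diff) for name, fn in reviewers.items()}
        results = {name: future.result() for name, future in futures.items()}

    # Routing decision 2: disagreement between reviewers goes to arbitration.
    verdicts = {r["verdict"] for r in results.values()}
    if len(verdicts) > 1:
        return {"path": "parallel+arbitration",
                "recommendation": arbitrate(results), "results": results}
    return {"path": "parallel", "recommendation": verdicts.pop(),
            "results": results}
```

Both `if` branches inspect something about the input or the reviewers' outputs before choosing the next step. That is the routing logic a chain does not have, and it is also where a misrouting bug would live.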

Architecture C: Hybrid

A lightweight triage step classifies the PR as simple or complex. Simple PRs enter a sequential chain (Architecture A). Complex PRs enter an orchestrated flow with parallel reviewers but no arbitration step: conflicts in reviewer outputs are surfaced in the report rather than resolved automatically.

Estimated cost: slightly above A for simple PRs, slightly below B for complex ones. Adds a classification step with its own failure mode: misclassified PRs get the wrong architecture.
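One way the hybrid's triage step might look, as a sketch only. The line-count heuristic, the 20-line default, and the `run_chain` / `run_parallel` callables are assumptions; the proposal does not say how PRs are classified.

```python
from typing import Callable

def triage(diff: str, max_simple_lines: int = 20) -> str:
    """Classify a PR as 'simple' or 'complex' by line count (assumed heuristic)."""
    return "simple" if len(diff.splitlines()) <= max_simple_lines else "complex"

def review_hybrid(
    diff: str,
    run_chain: Callable[[str], dict],     # hypothetical: the sequential chain
    run_parallel: Callable[[str], dict],  # hypothetical: reviewer name -> result
) -> dict:
    label = triage(diff)
    if label == "simple":
        return {"triage": label, "report": run_chain(diff), "conflicts": {}}

    results = run_parallel(diff)
    verdicts = {name: r["verdict"] for name, r in results.items()}
    # No arbitration: disagreements are surfaced in the report, not resolved.
    conflicts = verdicts if len(set(verdicts.values())) > 1 else {}
    return {"triage": label, "report": results, "conflicts": conflicts}
```

With a size-based classifier like this one, a one-line change to an authentication check would be labeled simple and sent down the chain. That is the misclassification failure mode in concrete form.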

The team needs to ship one of these architectures. You will defend one of them — including its failure modes — to the rest of the engineering team.

Lesson

The Coordination Tax

5 min read

Orchestration is not an upgrade over chaining. It's a different tradeoff. A coordinator adds real costs — extra round trips, more failure modes, more state to manage — and it earns those costs only when you genuinely need routing logic at the control level. The question is never "should I orchestrate?" It's "does the routing flexibility I gain justify the coordination overhead I'm paying?"
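To put a number on the overhead side of that question, the sketch below applies the 40–60% token overhead estimated for the orchestrated proposal in the Scenario. The baseline of 10,000 tokens per review and the volume of 500 reviews per month are made-up inputs, not figures from the scenario.

```python
def coordination_tax(
    chain_tokens_per_review: int,
    reviews_per_month: int,
    overhead_pct_low: int = 40,   # low end of the Scenario's estimate
    overhead_pct_high: int = 60,  # high end of the Scenario's estimate
) -> dict:
    """Extra tokens an orchestrated pipeline spends compared with a chain."""
    low = chain_tokens_per_review * overhead_pct_low // 100
    high = chain_tokens_per_review * overhead_pct_high // 100
    return {
        "extra_tokens_per_review": (low, high),
        "extra_tokens_per_month": (low * reviews_per_month,
                                   high * reviews_per_month),
    }

# Hypothetical inputs: 10,000 tokens per chained review, 500 reviews a month.
tax = coordination_tax(10_000, 500)
print(tax["extra_tokens_per_review"])  # (4000, 6000)
print(tax["extra_tokens_per_month"])   # (2000000, 3000000)
```

The result is only the cost side. Whether 2 to 3 million extra tokens a month is justified depends on what the routing buys, for example shorter wall-clock time on large PRs.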

Chains are the right architecture when the execution path is fixed and the same for every input. When stage 2 always follows stage 1, regardless of what stage 1 produced. When failures are handled locally — retry the step, or terminate. When the pipeline runs the same way for a one-line input and a thousand-line input, and that's intentional.

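Chains are often the wrong choice when developers treat them as a default. "I'll start with a chain and add routing later" almost never works, because adding routing to a chain means rearchitecting it as an orchestrated pipeline. The coupling runs in the wrong direction: in a chain each stage hands off directly to the next, so control flow lives inside the stages, whereas routing needs that control pulled out into a coordinator.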

Orchestration earns its cost when any of the following are true:

Routing based on output

The output of one stage determines which stage runs next — or whether multiple stages run in parallel. This is impossible in a chain without retrofitting a coordinator.

Isolated failure handling

A failure in one parallel stream should not cascade to the others. A coordinator can isolate failures, route retries to different models, and aggregate partial results. A chain cannot.

State that spans stages

The pipeline needs to maintain context across branches and merges — decisions made in stage 1 affect arbitration in stage 4. A coordinator manages that state. A chain doesn't have a coordination layer to hold it.
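The isolated failure handling condition above can be sketched as follows. This is a minimal illustration, assuming reviewers are plain callables run in threads; `run_isolated`, the 30-second default timeout, and the reviewer names in the usage note are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_isolated(reviewers: dict, diff: str, timeout_s: float = 30.0) -> tuple[dict, dict]:
    """Run reviewers in parallel; one failure does not discard the others."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        futures = {name: pool.submit(fn, diff) for name, fn in reviewers.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout_s)
            except Exception as exc:
                # Record the failure and keep the partial results from the
                # other streams. A coordinator could retry this reviewer here.
                failures[name] = repr(exc)
    return results, failures
```

If a hypothetical `logic` reviewer raises while `security` and `style` succeed, the call returns the two successful results plus a `failures` entry for `logic`. In a chain, the same exception would have terminated the run before the style check executed.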

NIST AI RMF — MAP Function

MAP requires identifying how an AI system can fail before building it. For architectural decisions, this means mapping the failure modes of each proposed design — not just the happy path. An orchestrated pipeline fails differently than a chain. A chain's failures cascade linearly; an orchestrator's failures are concentrated in the coordinator. NIST MAP applied here means drawing both failure maps before committing to either architecture.

O*NET — Complex Problem Solving (2.B.3.g)

Architectural choices compound. Choosing the wrong control model doesn't produce a one-time rework — it propagates into every stage the pipeline adds later. O*NET's Complex Problem Solving framework requires identifying the long-term constraints of a decision, not just its immediate fit. A chain that needs routing in month three requires more rework than a chain that never needs routing at all.

O*NET — Systems Evaluation (6.A.1.b)

Systems evaluation means measuring whether the system is doing what it should — and identifying the indicators that would tell you it isn't. For a pipeline architecture decision, that means naming: what metric would tell you that Architecture A is underperforming versus Architecture B for your actual inputs? Cost per review? Latency on large PRs? Review accuracy on security findings?

Context

Three Tests Before You Commit

3 min read

Before defending an architecture, you need three things: a routing test to determine whether coordination is required, a failure map to understand how your chosen design breaks, and a measurement criterion to know whether the architecture is actually performing. Apply all three in the lab.

Routing test: Does stage output determine what runs next?

Ask this question about every stage in your proposed architecture. If any stage's output changes which stage runs next — or whether multiple stages run in parallel — you need a coordinator. If every stage always calls the next stage regardless of output, a chain is sufficient and adding orchestration is waste. For the code review pipeline: does a small PR genuinely need the same review path as a 2,000-line feature? If yes, chain. If not, you need routing — which requires a coordinator.
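The routing test can be made mechanical with a small sketch. It assumes you can write down, for each stage, the possible sets of stages that run next. The stage names and the `Pipeline` representation below are illustrative, not taken from the proposals.

```python
# Each stage maps to its possible "next step" options. An option is a tuple of
# stages that run next; more than one stage in a tuple means parallel fan-out.
Pipeline = dict[str, list[tuple[str, ...]]]

def routing_stages(pipeline: Pipeline) -> list[str]:
    """Stages whose output selects among different next steps."""
    return [stage for stage, options in pipeline.items() if len(set(options)) > 1]

def needs_coordinator(pipeline: Pipeline) -> bool:
    return bool(routing_stages(pipeline))

# Hypothetical encoding of the sequential chain: one option per stage.
chain: Pipeline = {
    "normalize": [("security",)],
    "security": [("logic",)],
    "logic": [("style",)],
    "style": [("report",)],
    "report": [()],
}

# Hypothetical encoding of the orchestrated flow: two stages make a choice.
orchestrated: Pipeline = {
    "intake": [("fast_path",), ("security", "logic", "style")],
    "fast_path": [("report",)],
    "security": [("collect",)],
    "logic": [("collect",)],
    "style": [("collect",)],
    "collect": [("arbitration",), ("report",)],
    "arbitration": [("report",)],
    "report": [()],
}

print(needs_coordinator(chain))      # False
print(routing_stages(orchestrated))  # ['intake', 'collect']
```

A fan-out that always happens counts as a single option, so it does not trip the test. Only a stage with more than one possible next step does.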

Failure concentration: Where does your architecture concentrate its failure risk?

Every architecture has a failure concentration point. In a chain, failures cascade — a bad output in stage 2 corrupts every stage downstream. In an orchestrated pipeline, failures concentrate in the coordinator — if it misroutes, every parallel stream is wrong. In a hybrid, failures concentrate in the triage classifier — a misclassified PR gets the wrong architecture. NIST MAP requires naming this concentration point before committing. You're not choosing the architecture that never fails. You're choosing which failure mode you can monitor and recover from.

Measurement criterion: What observable metric tells you the architecture is wrong for your inputs?

Name one metric — before shipping — that would indicate your chosen architecture is underperforming. For cost-sensitive teams: cost per review on PRs of different sizes. For latency-sensitive teams: wall-clock time on large PRs. For accuracy-sensitive teams: rate at which security findings are missed on small PRs that go through the fast path. O*NET Systems Evaluation requires that you know what you're measuring before you ship, not after you observe a problem.
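Two of those metrics could be computed from review logs roughly as follows. The record fields (`lines_changed`, `tokens`, `path`, `missed_security_finding`) and the 20-line size boundary are assumptions about what a team might log, not a prescribed schema.

```python
from collections import defaultdict
from statistics import mean

def cost_per_review_by_size(reviews: list[dict], small_max_lines: int = 20) -> dict:
    """Mean token cost per review, split into small and large PRs."""
    buckets = defaultdict(list)
    for review in reviews:
        size = "small" if review["lines_changed"] <= small_max_lines else "large"
        buckets[size].append(review["tokens"])
    return {size: mean(tokens) for size, tokens in buckets.items()}

def fast_path_miss_rate(audited: list[dict]) -> float:
    """Share of audited fast-path reviews that missed a security finding."""
    fast = [a for a in audited if a["path"] == "fast"]
    if not fast:
        return 0.0
    return sum(a["missed_security_finding"] for a in fast) / len(fast)
```

The miss rate needs an independent audit of a sample of fast-path PRs to supply `missed_security_finding`. Deciding how that sample is drawn is part of stating the measurement criterion before shipping.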

In the lab, you'll commit to one of the three architectures and defend it to a panel that has read the failure maps for all three. You'll apply the routing test, name your failure concentration point, and state your measurement criterion. All three, in your opening argument.

⚡ Debate Lab
Architectural Defense
~25 minutes · 6 exchanges to complete
Your role
🧑‍💻
Engineer: Pick one architecture (A, B, or C) and defend it. State your reasoning using the three tests from Context. The panel will challenge your failure modes, so defend them with specifics, not principles.
AI role
🔬
Technical Panel: A skeptical engineering review panel that has read all three proposals and knows the failure modes of each. It will probe coordination overhead, cascade failures, triage accuracy, and measurement plans. It does not accept vague tradeoffs.
Framework reminders
Routing test: Does stage output determine what runs next? If yes, you need a coordinator.
Failure concentration: Where does your architecture concentrate its failure risk?
Measurement criterion: What metric tells you the architecture is wrong for your inputs?
How to complete
State which architecture you're defending and open with all three tests applied. The panel will push on your weakest points — stay specific. Lab completes after 6 substantive exchanges.