Intro

Should AI Do This?

2 min read

Every AI system that makes decisions on behalf of people needs an answer to one question: Should this exist?

Not "Can we build it?" — everyone can build it. Not "Does it work?" — most AI works at something. The question is: Should we deploy it?

The difference between a thoughtful builder and someone who just ships is the ability to step back and ask: What are the stakes? What goes wrong if the AI is right? What goes wrong if the AI is wrong? Who gets hurt? Can they opt out? Is there a human reviewing this?

This module gives you a framework for making that judgment. Not abstractly — on real proposals, under pressure, where your thinking has to hold up in a room full of people with different incentives.

Your artifact

A governance-aligned recommendation memo evaluating an AI deployment against EU AI Act risk categories, NIST risk mapping criteria, and UNESCO human rights principles

By the end of this module, you will:

Apply a framework to evaluate the appropriateness of any AI use case
Identify red lines and gray areas in AI deployment decisions
Argue a position on an AI deployment and defend it under pressure
Evaluate the stakes, reversibility, and human alternatives for any use case
Write a clear recommendation memo with reasoning that holds up

Scenario

The VC Room

3 min read

Five startups pitch in a single afternoon. Five uses of AI that could be funded.

Pitch 1: An AI grief counselor. An app that talks to people who've just lost a loved one. The AI is trained on interviews with real grief therapists. The founders believe this can reach people in rural areas with no access to actual therapists. They've talked to 40 people in their target demographic. Twelve said they'd use it.

Pitch 2: AI parole recommendations for county courts. A model trained on 5 years of parole decisions from a specific county. The AI scores each parole petition as "likely to succeed" or "likely to fail." The county uses this to prioritize cases — parole officers handle "likely to succeed" cases first, as they're lower risk. The vendor claims this could process 30% more cases per year without adding headcount.

Pitch 3: An emotional AI tutor. Software for schools that adapts to a child's emotional state while learning. The AI watches for frustration, boredom, or confidence and adjusts lesson difficulty and pacing. The founders have tested it with 200 students. Math scores improved by an average of 8%.

Pitch 4: AI performance prediction from behavioral signals. A tool for HR departments that predicts which employees are most likely to become high performers. The model trains on hiring, reviews, and performance data. The company's pitch: Use this for targeted development spending — invest in people the model says will have high impact.

Pitch 5: An AI legal brief writer. A tool that takes a case description and outputs a legal brief draft. Targets solo practitioners and small firms that can't afford expensive associate lawyers. The AI is trained on thousands of real briefs. 80% of outputs require zero editing; 15% need minor tweaks; 5% need rewrites.

The room is split. One partner wants to fund all five. Another wants to fund none. A third says "it's complicated." They have one hour to decide which to fund.

This module teaches you to do that analysis.

Lesson

The Four-Question Framework

3 min read

One framework runs through every hard deployment decision. Ask these four questions and answer each one specifically. Your recommendation should follow from those answers.

Question 1: What are the stakes?

What happens if the AI is wrong? "Wrong" can mean inconvenient, or it can mean someone gets hurt. A low-stakes error that is easy to fix is a different problem from a high-stakes error you can't undo.

Question 2: Can humans override this?

In practice, not just in theory. Does a human review every decision? Can they say no? Would they? If the AI recommends X and a human has the authority, the time, and the willingness to do Y instead, humans still have control. If the AI recommends X and it just happens, or the reviewer approves everything by default, humans have lost control.

Question 3: Is there a human alternative?

What would happen if you didn't deploy the AI? What would a human do instead? What would be lost, and what would be gained? The AI doesn't replace nothing — it replaces something. Know what you're replacing.

Question 4: Do people know AI is involved?

Do the affected parties understand that an AI made or influenced the decision about them? Can they ask for a human review? Can they opt out? Transparency alone doesn't make a deployment appropriate, but without it people can't contest a decision they don't know was automated.

Governance Standards — The Regulatory Layer

EU AI Act — Article 9: Risk Management

High-risk AI systems must have a documented risk management process covering: identification of reasonably foreseeable risks, estimation of those risks, risk mitigation, and residual risk evaluation. Before deploying any AI in high-stakes domains (justice, employment, education, healthcare), ask: Does a documented risk management process exist? Who owns it?

NIST AI RMF — MAP Function

The MAP function calls for identifying AI risks in context: who is affected, what harms could occur, and how likely and severe those harms are. The framework is voluntary, but the discipline is useful. Before approving a deployment, you should be able to map the risk landscape concretely, with named populations, named failure modes, and named harms. If you can't map it, you can't manage it.

UNESCO AI Recommendation — Human Rights & Fairness

UNESCO's 2021 Recommendation on the Ethics of Artificial Intelligence calls for AI systems to respect human rights and avoid discrimination. Concretely: Is there evidence that the training data reflects biased decisions from the past? Does the AI encode historical inequities? Are the affected populations, often those with the least power, able to challenge decisions about them?

These four questions won't give you clean yes-or-no answers. They'll give you clarity about what you're actually deciding.

Context

Where the Lines Are

2 min read

Three zones. Red, gray, and green. Understanding them helps you see where deployment proposals actually fall.

Red Line — Do not deploy

AI making irreversible decisions that significantly affect people's lives without human review. Parole recommendations that judges don't review. Hiring systems that automatically reject people. Medical diagnoses that patients never see explained. High stakes + no override + no transparency = red line. This isn't about the AI being bad — it's about the structure concentrating authority in a system that can't reverse mistakes.

Gray Area — Deploy with conditions

AI assisting decisions where humans still have final authority but may defer to the AI. Customer service bots that humans can review. Recommendations that experts consider but aren't bound by. Here you're setting conditions: humans review X%, thresholds trigger escalation, explanations are provided, people can request human review.

Green Light — Deploy

AI automating tasks where errors are low-stakes and easily corrected. Sorting emails. Generating drafts. Organizing data. If the worst case is "the output is bad and we need to redo it," the stakes are manageable. Humans will catch it. It's reversible.
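The three zones can be sketched as a small decision rule. This is a minimal illustration, not an official test: the `Proposal` fields, the `classify` function, and the `open_conditions` helper are hypothetical names, and the rule reads "red line" as high stakes with no human review, treating missing disclosure as a condition to fix rather than a separate trigger.

```python
from dataclasses import dataclass
from enum import Enum


class Zone(Enum):
    RED = "do not deploy"
    GRAY = "deploy with conditions"
    GREEN = "deploy"


@dataclass
class Proposal:
    # Hypothetical yes/no simplifications of the framework questions.
    name: str
    high_stakes: bool    # could an error significantly affect someone's life?
    reversible: bool     # can a bad outcome be corrected afterwards?
    human_review: bool   # does a human with real authority review each decision?
    disclosed: bool      # do affected people know AI is involved?


def classify(p: Proposal) -> Zone:
    # Assumption: high stakes with no human review is a red line on its own.
    if p.high_stakes and not p.human_review:
        return Zone.RED
    # Low-stakes, easily corrected work is green.
    if not p.high_stakes and p.reversible:
        return Zone.GREEN
    # Everything else needs conditions before it ships.
    return Zone.GRAY


def open_conditions(p: Proposal) -> list[str]:
    # Conditions a reviewer might attach; the wording is illustrative.
    conditions = []
    if not p.human_review:
        conditions.append("add human review with authority to overrule")
    if not p.disclosed:
        conditions.append("disclose AI involvement and offer human review")
    return conditions
```

Real proposals rarely reduce to four booleans. The value of the sketch is that it forces you to state which answer moved a proposal from one zone to another.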

Apply regulatory standards to each zone

High-risk deployments require Art.9 compliance

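A system that falls in the Red or Gray zone may qualify as high-risk under EU AI Act Annex III, which covers areas such as employment, education, and the administration of justice. Check the proposal against the Annex. If it qualifies, that means mandatory risk management documentation, conformity assessment, and post-market monitoring, all in place before deployment, not after.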

NIST MAP before you approve

For Gray and Red zone proposals, run a NIST MAP analysis: Name the affected population. Name the failure modes. Estimate probability and severity. If the deployer cannot provide this, your approval should be conditional on producing it.
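One way to hold a deployer to this is a simple risk register. The sketch below is an assumption-laden illustration: `FailureMode`, `approval_status`, and `ranked` are hypothetical names, the 1 to 5 severity scale is invented for the example, and multiplying probability by severity is a common risk-matrix heuristic, not something NIST prescribes.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    population: str     # who is affected, by name
    failure: str        # what the system gets wrong
    harm: str           # what happens to that population as a result
    probability: float  # rough estimate between 0 and 1
    severity: int       # assumed scale: 1 (minor) to 5 (severe, irreversible)

    def is_named(self) -> bool:
        # "Named" means no field was left blank.
        return all(s.strip() for s in (self.population, self.failure, self.harm))

    def score(self) -> float:
        # Heuristic ranking only; not a NIST-defined formula.
        return self.probability * self.severity


def approval_status(modes: list[FailureMode]) -> str:
    # An empty or partly blank register means the risk has not been mapped.
    if not modes or not all(m.is_named() for m in modes):
        return "conditional: MAP analysis incomplete"
    return "ready for review"


def ranked(modes: list[FailureMode]) -> list[FailureMode]:
    # Highest estimated risk first, so the discussion starts there.
    return sorted(modes, key=lambda m: m.score(), reverse=True)
```

If the deployer cannot fill in a single row, `approval_status` returns the conditional result, which mirrors the rule above: no map, no unconditional approval.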

UNESCO fairness check for all three zones

Even Green-light deployments can encode unfairness. Ask: Is the training data representative of all affected groups? Can individuals contest AI-influenced decisions? Are marginalized communities disproportionately exposed to error?

Not every deployment is obvious. Gray area is where the conversation happens.

⚔ Debate Lab

Appropriateness Debate

~20 minutes · 3 proposals

What you're doing

You'll evaluate three AI deployment proposals. For each, take a position: approve, approve with conditions, or reject. Defend your position. I'll challenge it.

Roles

🤔

You (Decision-Maker): You evaluate deployments and make a judgment call. Your reasoning should hold up under pressure.

🎯

AI (Skeptical Questioner): I'll push back on every position, not to change your mind but to make sure you've thought it through.

Three proposals

Public defender case prioritization · Housing intervention prediction · Resume screening

Framework — apply to each

What are the stakes?

Can humans override?

Is there a human alternative?

Do people know AI is involved?

EU AI Act Art.9 — Is there a risk management process?

NIST MAP — Can you name the affected population and failure modes?

UNESCO — Does the deployment respect human rights and avoid encoding historical bias?

Success criteria

Take a clear position on each proposal. Defend it with specific reasoning tied to the framework. You can change your position if your reasoning evolves — that's fine.
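If it helps to structure your position before the debate, the seven checks can be turned into a memo skeleton. This is a rough sketch, not a required format: `draft_memo`, `POSITIONS`, and `CHECKS` are hypothetical names, and the validation rules (every check answered, conditions required for a conditional approval) are assumptions drawn from the success criteria above.

```python
POSITIONS = ("approve", "approve with conditions", "reject")

CHECKS = (
    "What are the stakes?",
    "Can humans override?",
    "Is there a human alternative?",
    "Do people know AI is involved?",
    "EU AI Act Art.9: Is there a risk management process?",
    "NIST MAP: Can you name the affected population and failure modes?",
    "UNESCO: Does the deployment respect human rights and avoid encoding historical bias?",
)


def draft_memo(proposal: str, position: str, answers: dict[str, str],
               conditions: tuple[str, ...] = ()) -> str:
    # A clear position is one of the three allowed outcomes.
    if position not in POSITIONS:
        raise ValueError(f"position must be one of {POSITIONS}")
    # Every check needs a specific, non-empty answer.
    unanswered = [q for q in CHECKS if not answers.get(q, "").strip()]
    if unanswered:
        raise ValueError(f"unanswered checks: {unanswered}")
    # A conditional approval without conditions is not a position.
    if position == "approve with conditions" and not conditions:
        raise ValueError("list at least one condition")

    lines = [f"Proposal: {proposal}", f"Recommendation: {position}", ""]
    for question in CHECKS:
        lines.append(question)
        lines.append(f"  {answers[question]}")
    if conditions:
        lines.append("")
        lines.append("Conditions:")
        lines.extend(f"  {i}. {c}" for i, c in enumerate(conditions, 1))
    return "\n".join(lines)
```

The function refuses to produce a memo with a blank answer, which is the point: a gap in the memo is a gap in the reasoning.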

You've completed Module 3 of 8.
