Your Agent Portfolio

Design, document, and defend an AI agent you'd be proud to build

BUILD LAB ~30 min

Capstone: Build Your Own Agent

In this final module, you'll design an AI agent from the ground up, not for a textbook scenario but for a real problem you care about. You'll specify what the agent does, how it makes decisions, where it could go wrong, and how you'd govern it. This is your portfolio piece: evidence that you understand agent design.

Learning Outcomes

Design a complete agent architecture for a problem of your choice
Specify tools, decision logic, and failure modes
Document your governance and accountability approach
Articulate the agent's limitations and when humans should override it
Present your design clearly enough that others can understand and critique it

Portfolio Artifacts

By the end of this lab, you'll have:

Agent Design Document: A written specification describing what your agent does, its architecture, tools, decision-making process, and failure modes.

Governance Framework: A clear statement of where the agent acts autonomously, where humans approve, and how you'd detect and fix failures.

Risk & Mitigation Plan: A list of what could go wrong, why it matters, and how you'd prevent or recover from each failure.

A Defense: A clear argument for why this agent would be net-positive for the world, with responses prepared for the criticism you expect.

The Creative Brief

Pick Your Problem

You can design an agent for any domain you find interesting. But make it real. Not "an agent that solves everything." Something specific.

Example Domains

Education: An agent that tutors students, adapting to their learning pace and explaining concepts in multiple ways. Challenge: How do you know if the student actually learned, or just got the right answer?

Environmental Monitoring: An agent that monitors satellite imagery to detect deforestation or illegal mining. Challenge: False positives send researchers on wild goose chases. False negatives let destruction go undetected.

Accessibility: An agent that transcribes video in real time with high accuracy, adapting to the speaker's accent and subject-matter vocabulary. Challenge: No single model works for all languages and contexts.

Supply Chain: An agent that forecasts demand, optimizes inventory, and coordinates with suppliers. Challenge: Small mistakes compound across a global network.

Moderation: An agent that flags harmful content on a social platform, balancing free speech with safety. Challenge: Context matters. The same post is fine in one context and dangerous in another.

Science: An agent that reads research papers and generates literature reviews. Challenge: Must understand nuance, synthesize across conflicting studies, and not hallucinate citations.

Constraints

Keep your agent ambitious but bounded. Avoid:

Too vague: "An agent that helps people" isn't specific enough. Help them do what?
Too complex: "An agent that understands everything and predicts the future" isn't feasible. Scope down.
Too simple: "An agent that sends emails" doesn't show depth. Pick something that requires real decision-making.

Pick something you could defend to a skeptical peer. Something you'd be willing to stand behind professionally.

How to Design an Agent

Step 1: Define the Problem

The Question to Answer

What problem does your agent solve?
Who benefits? (Be specific: students, farmers, doctors, artists?)
What would they do without this agent? (Manually? Use something worse? Give up?)
What constraints do they face? (Time? Cost? Expertise? Risk tolerance?)
Why is this hard? (Data scarcity? Complexity? Ambiguity? Scale?)

Step 2: Design the Architecture

Five Core Components

Input/Sensing: What data does the agent receive? Where does it come from? Is it real-time or batch? How fresh is it?
Reasoning/Decision: How does the agent decide what to do? What's the logic? Is it deterministic or probabilistic?
Tools/Actions: What can the agent actually do? What tools does it have? What does it NOT have access to?
Learning/Feedback: How does the agent improve over time? What signals drive learning? Who decides if it's improving?
Safeguards/Guardrails: What prevents the agent from going wrong? Where are the hard stops?
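
One way to see how the five components fit together is a small Python skeleton. This is a minimal sketch, not a reference implementation: the Agent class, its method names, the tool names, and the keyword rule in decide are all hypothetical stand-ins. A real agent would replace the rule with a model or richer logic.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    """Hypothetical skeleton with one piece per core component."""

    # Tools/Actions: an explicit allowlist; anything absent is unavailable.
    tools: dict[str, Callable[[str], str]]
    # Safeguards/Guardrails: actions that must always go to a human.
    hard_stops: set[str] = field(default_factory=set)
    # Learning/Feedback: outcomes recorded for later human review.
    feedback_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def sense(self, raw: str) -> str:
        # Input/Sensing: normalize the incoming data.
        return raw.strip().lower()

    def decide(self, observation: str) -> str:
        # Reasoning/Decision: a deterministic rule standing in for a model.
        return "flag" if "hazard" in observation else "ignore"

    def act(self, action: str, observation: str) -> str:
        # Safeguards: refuse hard-stopped or unknown actions.
        if action in self.hard_stops or action not in self.tools:
            return "escalated to human"
        return self.tools[action](observation)

    def record_feedback(self, observation: str, action: str, correct: bool) -> None:
        # Learning/Feedback: store the outcome; a human decides what changes.
        self.feedback_log.append((observation, action, correct))

    def step(self, raw: str) -> str:
        observation = self.sense(raw)
        return self.act(self.decide(observation), observation)


agent = Agent(tools={
    "flag": lambda obs: f"flagged: {obs}",
    "ignore": lambda obs: "no action",
})
print(agent.step("  Exposed wiring HAZARD in basement "))
# prints "flagged: exposed wiring hazard in basement"
```

Note that the guardrail lives in act, outside the decision logic, so a faulty decision still cannot trigger an action that is hard-stopped or missing from the allowlist.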

Step 3: Identify Failure Modes

Ask Yourself

What's the worst thing this agent could do?
What's the most likely failure? (Not the worst, but the most probable.)
What input data would cause the agent to be badly wrong?
How would you know if the agent failed? (What metric would break?)
How long before you'd catch the failure?
What's the cost of being wrong?
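
The questions above can be turned into a simple risk register. The sketch below is one possible way to do it, under stated assumptions: the 1-to-5 scales, the likelihood-times-impact score, and the three example entries are all made up for illustration, not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str
    likelihood: int        # 1 (rare) to 5 (frequent); hypothetical scale
    impact: int            # 1 (minor) to 5 (severe); hypothetical scale
    days_to_detect: float  # rough guess at how long before you'd notice

    @property
    def risk_score(self) -> int:
        # A common heuristic, assumed here: likelihood times impact.
        return self.likelihood * self.impact


def worst_case(modes):
    """The failure with the highest impact, however unlikely."""
    return max(modes, key=lambda m: m.impact)


def most_likely(modes):
    """The failure with the highest likelihood, however mild."""
    return max(modes, key=lambda m: m.likelihood)


def ranked(modes):
    """Highest risk first; slower-to-detect failures break ties."""
    return sorted(modes, key=lambda m: (m.risk_score, m.days_to_detect), reverse=True)


# Made-up entries for a literature-review agent.
modes = [
    FailureMode("Cites a paper that does not exist", likelihood=4, impact=3, days_to_detect=7.0),
    FailureMode("Misstates a study's conclusion", likelihood=2, impact=5, days_to_detect=30.0),
    FailureMode("Misses a relevant recent paper", likelihood=5, impact=1, days_to_detect=14.0),
]

for mode in ranked(modes):
    print(mode.risk_score, mode.description)
```

In this toy register the worst case and the most likely failure are different entries, which is why the two questions are asked separately.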

Step 4: Design Governance

The Oversight Model

Autonomy: What does the agent decide alone?
Escalation: What decisions go to humans first?
Audit: How do you monitor the agent's performance?
Override: How do you stop or correct the agent if needed?
Accountability: Who's responsible if something goes wrong?
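
The five parts of the oversight model can be sketched as a small routing policy. This is an illustrative sketch only: the OversightPolicy class, the action names, the owner name, and the 0.9 confidence threshold are hypothetical, and a real system would need persistent audit storage and access control around the halt switch.

```python
from dataclasses import dataclass, field
from enum import Enum


class Route(Enum):
    AUTONOMOUS = "autonomous"
    ESCALATE = "escalate"
    HALTED = "halted"


@dataclass
class OversightPolicy:
    owner: str                          # Accountability: who answers for failures
    autonomous_actions: frozenset[str]  # Autonomy: what the agent may do alone
    min_confidence: float = 0.9         # Escalation: illustrative threshold
    halted: bool = False                # Override: a human-controlled stop switch
    audit_log: list[dict] = field(default_factory=list)  # Audit: every decision

    def route(self, action: str, confidence: float) -> Route:
        if self.halted:
            decision = Route.HALTED
        elif action in self.autonomous_actions and confidence >= self.min_confidence:
            decision = Route.AUTONOMOUS
        else:
            # Unknown actions and low-confidence decisions go to a human.
            decision = Route.ESCALATE
        self.audit_log.append(
            {"action": action, "confidence": confidence, "route": decision.value}
        )
        return decision


policy = OversightPolicy(
    owner="support-team-lead",
    autonomous_actions=frozenset({"send_reminder"}),
)
print(policy.route("send_reminder", 0.95))  # Route.AUTONOMOUS
print(policy.route("send_reminder", 0.50))  # Route.ESCALATE
print(policy.route("issue_refund", 0.99))   # Route.ESCALATE
policy.halted = True
print(policy.route("send_reminder", 0.95))  # Route.HALTED
```

The default is escalation: the agent acts alone only when an action is explicitly listed and confidence is high, so anything the designer did not anticipate reaches a human.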

Step 5: Articulate Limitations

Be Honest About

What your agent can't do (be specific, not vague)
What it might do worse than humans (even if it's faster)
What assumptions it relies on (if those assumptions break, what happens?)
What contexts it's not suitable for
What you'd need to know before deploying it at scale

Putting It Together: The Design Document

Your portfolio piece should include:

Agent Design Checklist

One-sentence summary: "This agent [does what] for [whom] to [solve what problem]."
Problem statement: Why does this problem matter? Who feels the pain?
Architecture diagram or description: Inputs, reasoning, tools, feedback, safeguards.
Tool inventory: What can the agent access or do?
Decision logic: How does it choose? (Simple if-then rules? ML model? LLM? Hybrid?)
Failure modes: What can go wrong? How likely is each? What's the impact?
Governance model: Where does it act autonomously? Where do humans approve? How do you audit?
Limitations & honest assessment: What doesn't it do well? When should humans override?
Success metrics: How do you know if the agent is actually helping?
Open questions: What would you need to know before deployment? What research is needed?
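
If you keep your design document as structured data, a short script can tell you which checklist items are still empty. The sketch below assumes a plain dictionary keyed by section name; the key names and the draft content are hypothetical, chosen only to mirror the checklist above.

```python
# Section keys mirroring the checklist; the names are an assumption.
CHECKLIST = (
    "summary", "problem_statement", "architecture", "tool_inventory",
    "decision_logic", "failure_modes", "governance_model", "limitations",
    "success_metrics", "open_questions",
)


def missing_sections(doc: dict[str, str]) -> list[str]:
    """Return checklist keys that are absent or blank, in checklist order."""
    return [key for key in CHECKLIST if not doc.get(key, "").strip()]


# A made-up, partially written draft.
draft = {
    "summary": "This agent flags safety hazards in inspection reports for home buyers.",
    "problem_statement": "Buyers often miss hazards buried in long reports.",
    "tool_inventory": "   ",  # blank, so it still counts as missing
}

print(missing_sections(draft))
```

A check like this only confirms that every section exists. Whether each section is honest and specific still needs a human reader.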

Design Principles to Keep in Mind

Start narrow: An agent that does one thing well is better than an agent that does everything poorly.
Assume the worst case: How will this agent be misused? What happens then?
Plan for uncertainty: Your agent will be wrong sometimes. How will you detect and fix it?
Humans come first: The agent is a tool. People's welfare is the goal.
Be transparent: You should be able to explain your agent to affected users and stakeholders.

Reference & Examples

What a Good Agent Design Looks Like

Specific: "An agent that reads property inspection reports and flags safety hazards" is better than "an agent that helps with real estate."
Scoped: It knows what it's not supposed to do. ("This agent does X. It doesn't do Y.")
Honest about limitations: "This agent is trained on urban properties. It may not work well for rural or historic buildings."
Designed for failure: It includes checks, escalations, and human oversight points.
Accountable: You've thought about who's responsible if something goes wrong.

Real-World Agent Designs (Examples to Study)

GitHub Copilot: Autocompletes code based on context. Scope: programming assistance. Limitation: sometimes generates insecure or incorrect code. Oversight: developer reviews before accepting.
Autonomous vehicles: Navigate roads with little or no human intervention. Scope: highways and urban streets, depending on the system. Limitation: edge cases (construction, weather, unpredictable pedestrians). Governance: fail-safe design, redundant sensors, human takeover available.
Content recommendation (Netflix, Spotify): Suggests shows, songs, or movies. Scope: personalized entertainment discovery. Limitation: can reinforce existing preferences. Oversight: users can rate recommendations and adjust their preferences.
Fraud detection (banks): Flags suspicious transactions autonomously. Scope: pattern matching on transaction data. Limitation: false positives frustrate customers; false negatives let fraud through. Governance: transactions flagged with high confidence are blocked immediately; lower-confidence flags are reviewed by humans.
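
The false-positive versus false-negative trade-off in the fraud detection example can be made concrete with a few lines of Python. This is a toy sketch: the scores, labels, and thresholds are made up, and error_counts is a hypothetical helper, not part of any real fraud system.

```python
def error_counts(scores, is_fraud, threshold):
    """Return (false_positives, false_negatives) when flagging scores >= threshold."""
    false_positives = sum(
        1 for score, fraud in zip(scores, is_fraud) if score >= threshold and not fraud
    )
    false_negatives = sum(
        1 for score, fraud in zip(scores, is_fraud) if score < threshold and fraud
    )
    return false_positives, false_negatives


# Made-up model scores and ground-truth labels for six transactions.
scores = [0.20, 0.40, 0.55, 0.70, 0.90, 0.97]
is_fraud = [False, False, True, False, True, True]

print(error_counts(scores, is_fraud, 0.5))  # (1, 0): one honest customer blocked
print(error_counts(scores, is_fraud, 0.8))  # (0, 1): one fraud slips through
```

Raising the threshold trades one kind of error for the other. Choosing where to set it is a governance decision about whose cost matters more, not just a tuning detail.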

Questions to Guide Your Thinking

Would you use this agent if you were the end user?
What could make you regret building this agent?
Who disagrees with your design? What's their concern?
How would you explain this agent to a skeptical journalist? A concerned parent? A regulator?
If this agent fails in the worst way, what happens to you professionally?

Build Your Agent Design

Agent Portfolio Lab

Design an AI agent you'd be proud to build. The AI coach will ask clarifying questions and pressure-test your design.

Your task: Design a complete AI agent. Pick a real problem you care about. Specify what the agent does, how it makes decisions, where it could fail, and how you'd govern it. The coach will challenge your thinking and help you refine the design.

Start by describing your agent: the problem it solves, who it serves, and your initial architecture.

What to Cover in Your Design

The problem: Be specific. "Help people" isn't enough: help them do what?
The users: Who benefits? What's their current workflow? What pain do they feel?
The architecture: Inputs, reasoning, tools, feedback, safeguards.
The limitations: What doesn't it do? When might it fail? What assumptions might break?
The governance: Where does it act autonomously? Where do humans approve? How do you audit?
The defense: Why is this agent net-positive? What would you say to a critic?