Capstone: Build Your Own Agent
In this final module, you'll design an AI agent from the ground up. Not for a textbook scenario, but for a real problem you care about. You'll specify what the agent does, how it makes decisions, where it could go wrong, and how you'd govern it. This is your portfolio piece: evidence that you understand agent design.
Learning Outcomes
- Design a complete agent architecture for a problem of your choice
- Specify tools, decision logic, and failure modes
- Document your governance and accountability approach
- Articulate the agent's limitations and when humans should override it
- Present your design clearly enough that others can understand and critique it
Portfolio Artifacts
By the end of this lab, you'll have:
Agent Design Document: A written specification describing what your agent does, its architecture, tools, decision-making process, and failure modes.
Governance Framework: A clear statement of where the agent acts autonomously, where humans approve, and how you'd detect and fix failures.
Risk & Mitigation Plan: A list of what could go wrong, why it matters, and how you'd prevent or recover from each failure.
A Defense: A reasoned argument for why this agent would be net-positive for the world, prepared well enough to stand up to criticism.
The Creative Brief
Pick Your Problem
You can design an agent for any domain you find interesting. But make it real. Not "an agent that solves everything." Something specific.
Example Domains
Education: An agent that tutors students, adapting to their learning pace and explaining concepts in multiple ways. Challenge: How do you know if the student actually learned, or just got the right answer?
Environmental Monitoring: An agent that monitors satellite imagery to detect deforestation or illegal mining. Challenge: False positives send researchers on wild goose chases. False negatives let destruction go undetected.
Accessibility: An agent that transcribes video in real time with high accuracy, tailored to the speaker's accent and domain vocabulary. Challenge: No single model works for all languages and contexts.
Supply Chain: An agent that forecasts demand, optimizes inventory, and coordinates with suppliers. Challenge: Small mistakes compound across a global network.
Moderation: An agent that flags harmful content on a social platform, balancing free speech with safety. Challenge: Context matters. The same post is fine in one context and dangerous in another.
Science: An agent that reads research papers and generates literature reviews. Challenge: Must understand nuance, synthesize across conflicting studies, and not hallucinate citations.
Constraints
Keep your agent ambitious but bounded. Avoid:
- Too vague: "An agent that helps people" isn't specific enough. Help them do what?
- Too complex: "An agent that understands everything and predicts the future" isn't feasible. Scope down.
- Too simple: "An agent that sends emails" doesn't show depth. Pick something that requires real decision-making.
Pick something you could defend to a skeptical peer. Something you'd be willing to stand behind professionally.
How to Design an Agent
Step 1: Define the Problem
The Question to Answer
- What problem does your agent solve?
- Who benefits? (Be specific: students, farmers, doctors, artists?)
- What would they do without this agent? (Manually? Use something worse? Give up?)
- What constraints do they face? (Time? Cost? Expertise? Risk tolerance?)
- Why is this hard? (Data scarcity? Complexity? Ambiguity? Scale?)
Step 2: Design the Architecture
Five Core Components
- Input/Sensing: What data does the agent receive? Where does it come from? Is it real-time or batch? How fresh is it?
- Reasoning/Decision: How does the agent decide what to do? What's the logic? Is it deterministic or probabilistic?
- Tools/Actions: What can the agent actually do? What tools does it have? What does it NOT have access to?
- Learning/Feedback: How does the agent improve over time? What signals drive learning? Who decides if it's improving?
- Safeguards/Guardrails: What prevents the agent from going wrong? Where are the hard stops?
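To make the five components concrete, here is a minimal Python sketch of how they might fit together in one loop. It is an illustration, not a prescribed architecture: the class names (Observation, Agent), the staleness limit, and the length-based decision rule are all hypothetical placeholders you would replace with your own design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Observation:
    """Input/Sensing: one piece of incoming data plus how old it is."""
    payload: str
    age_seconds: float


@dataclass
class Agent:
    # Tools/Actions: an explicit allowlist. Anything not listed is off limits.
    tools: Dict[str, Callable[[str], str]]
    # Safeguards/Guardrails: hypothetical freshness limit for inputs.
    max_input_age_seconds: float = 3600.0
    # Learning/Feedback: a record a human (or a later process) can review.
    feedback_log: List[dict] = field(default_factory=list)

    def decide(self, obs: Observation) -> str:
        """Reasoning/Decision: a deterministic placeholder rule."""
        return "summarize" if len(obs.payload) > 40 else "echo"

    def step(self, obs: Observation) -> str:
        # Hard stops run before any tool is called.
        if obs.age_seconds > self.max_input_age_seconds:
            return "REFUSED: input too stale"
        tool_name = self.decide(obs)
        if tool_name not in self.tools:
            return f"REFUSED: tool '{tool_name}' is not in the allowlist"
        result = self.tools[tool_name](obs.payload)
        self.feedback_log.append(
            {"tool": tool_name, "input": obs.payload, "output": result}
        )
        return result


agent = Agent(tools={"echo": lambda text: text})
print(agent.step(Observation("hello", age_seconds=5)))
print(agent.step(Observation("hello", age_seconds=99999)))
print(agent.step(Observation("x" * 50, age_seconds=5)))
```

Notice that the guardrails sit in step(), outside decide(). Keeping the hard stops separate from the reasoning means a flawed decision rule still cannot reach a tool the agent was never given.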
Step 3: Identify Failure Modes
Ask Yourself
- What's the worst thing this agent could do?
- What's the most likely failure? (Not the worst, but the most probable.)
- What input data would cause the agent to be badly wrong?
- How would you know if the agent failed? (What metric would break?)
- How long before you'd catch the failure?
- What's the cost of being wrong?
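One way to organize your answers is a small risk register that scores each failure mode and ranks them. The sketch below uses the environmental monitoring example. The 1-to-5 scales, the likelihood-times-impact score, and every number in the sample entries are assumptions made up for illustration; your own plan should justify its own scales and values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    description: str
    likelihood: int         # assumed scale: 1 (rare) to 5 (frequent)
    impact: int             # assumed scale: 1 (minor) to 5 (severe)
    detection_signal: str   # the metric that would break
    days_to_detect: float   # how long before you'd catch it

    @property
    def risk_score(self) -> int:
        # Simple likelihood x impact product; one common convention, not the only one.
        return self.likelihood * self.impact


def rank(failure_modes: List[FailureMode]) -> List[FailureMode]:
    """Return failure modes ordered from highest to lowest risk score."""
    return sorted(failure_modes, key=lambda fm: fm.risk_score, reverse=True)


# Illustrative entries only; the values are invented for this example.
register = [
    FailureMode("False positive: cloud shadow flagged as clearing",
                4, 2, "field-check rejection rate", 7),
    FailureMode("False negative: small-scale mining missed",
                2, 5, "manual audit of unflagged tiles", 90),
    FailureMode("Stale imagery processed as current",
                3, 3, "image timestamp check", 1),
]

for fm in rank(register):
    print(f"{fm.risk_score:>2}  {fm.description}  (caught in ~{fm.days_to_detect} days)")
```

The ranking separates "most likely" from "worst": here the rare but severe false negative outranks the frequent but cheap false positive, and it is also the slowest to detect.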
Step 4: Design Governance
The Oversight Model
- Autonomy: What does the agent decide alone?
- Escalation: What decisions go to humans first?
- Audit: How do you monitor the agent's performance?
- Override: How do you stop or correct the agent if needed?
- Accountability: Who's responsible if something goes wrong?
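The oversight model above can be expressed as a small routing layer that sits between the agent and its actions. This is a sketch under stated assumptions: the Governor class, the 0.9 confidence threshold, and the action names are hypothetical, chosen to fit the content moderation example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Set


@dataclass
class Governor:
    autonomous_actions: Set[str]   # Autonomy: what the agent may do alone
    accountable_owner: str         # Accountability: a named person or team
    halted: bool = False           # Override: a kill switch
    audit_log: List[dict] = field(default_factory=list)  # Audit

    def route(self, action: str, confidence: float, threshold: float = 0.9) -> str:
        """Return 'execute', 'escalate', or 'blocked' and record the decision."""
        if self.halted:
            decision = "blocked"
        elif action in self.autonomous_actions and confidence >= threshold:
            decision = "execute"
        else:
            # Escalation: anything unlisted or low-confidence goes to a human.
            decision = "escalate"
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "confidence": confidence,
            "decision": decision,
            "owner": self.accountable_owner,
        })
        return decision

    def halt(self) -> None:
        """Stop all autonomous action until a human resets the flag."""
        self.halted = True


gov = Governor(autonomous_actions={"flag_for_review"},
               accountable_owner="trust-and-safety lead")
print(gov.route("flag_for_review", confidence=0.95))  # listed and confident
print(gov.route("flag_for_review", confidence=0.60))  # listed but uncertain
print(gov.route("remove_post", confidence=0.99))      # never autonomous
gov.halt()
print(gov.route("flag_for_review", confidence=0.95))  # kill switch engaged
```

The default path is escalation. An action runs without a human only if it is explicitly listed and the agent is confident, so a forgotten case fails toward human review and not toward autonomous action.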
Step 5: Articulate Limitations
Be Honest About
- What your agent can't do (be specific, not vague)
- What it might do worse than humans (even if it's faster)
- What assumptions it relies on (if those assumptions break, what happens?)
- What contexts it's not suitable for
- What you'd need to know before deploying it at scale
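If you want your stated assumptions to be more than prose, you can pair each one with a check and a fallback. The sketch below does this for the real-time transcription example. The Assumption class, the supported-language set, and the noise threshold are hypothetical values invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Assumption:
    statement: str                   # the assumption, in plain language
    holds: Callable[[Dict], bool]    # a check against the current context
    if_broken: str                   # what happens when the check fails


def broken_assumptions(assumptions: List[Assumption], context: Dict) -> List[Assumption]:
    """Return every assumption whose check fails for this context."""
    return [a for a in assumptions if not a.holds(context)]


# Hypothetical assumptions for a transcription agent.
ASSUMPTIONS = [
    Assumption(
        "The speaker uses a supported language",
        lambda ctx: ctx.get("language") in {"en", "es"},
        "hand off to a human transcriber",
    ),
    Assumption(
        "The audio is clean enough to transcribe",
        lambda ctx: ctx.get("noise_level", 1.0) < 0.5,
        "show a low-confidence warning to the viewer",
    ),
]

context = {"language": "fr", "noise_level": 0.2}
for a in broken_assumptions(ASSUMPTIONS, context):
    print(f"Assumption broken: {a.statement} -> {a.if_broken}")
```

Writing the fallback next to the assumption forces you to answer "if this breaks, what happens?" for every item on your list.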
Putting It Together: The Design Document
Your portfolio piece should include:
Agent Design Checklist
- 1-sentence summary: "This agent [does what] for [who] to [solve what problem]."
- Problem statement: Why does this problem matter? Who feels the pain?
- Architecture diagram or description: Inputs, reasoning, tools, feedback, safeguards.
- Tool inventory: What can the agent access or do?
- Decision logic: How does it choose? (Simple if-then rules? ML model? LLM? Hybrid?)
- Failure modes: What can go wrong? How likely is each? What's the impact?
- Governance model: Where does it act autonomously? Where do humans approve? How do you audit?
- Limitations & honest assessment: What doesn't it do well? When should humans override?
- Success metrics: How do you know if the agent is actually helping?
- Open questions: What would you need to know before deployment? What research is needed?
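If you keep your design document in a structured form, a short script can tell you which checklist items are still empty. The sketch below is one possible template, with field names that are assumptions mapped from the checklist above. It only checks that each section has some text, not that the text is any good.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DesignDocument:
    # One field per checklist item; all start empty.
    summary: str = ""
    problem_statement: str = ""
    architecture: str = ""
    tool_inventory: str = ""
    decision_logic: str = ""
    failure_modes: str = ""
    governance_model: str = ""
    limitations: str = ""
    success_metrics: str = ""
    open_questions: str = ""

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


doc = DesignDocument(
    summary="This agent flags likely deforestation for field researchers "
            "to shorten the time between clearing and response.",
    problem_statement="Illegal clearing often goes unnoticed until it is too late to act.",
)
print("Still to write:", ", ".join(doc.missing_sections()))
```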
Design Principles to Keep in Mind
- Start narrow: An agent that does one thing well is better than an agent that does everything poorly.
- Assume worst case: How will this agent be misused? What happens then?
- Plan for uncertainty: Your agent will be wrong sometimes. How will you detect and fix it?
- Humans come first: The agent is a tool. People's welfare is the goal.
- Be transparent: You should be able to explain your agent to affected users and stakeholders.