Multi-Agent Orchestration
Multi-agent systems represent the next evolution of AI applications, enabling complex reasoning by decomposing problems into specialized sub-tasks handled by purpose-built agents. In this challenge, you will design an orchestration platform where multiple AI agents — each with distinct capabilities, tools, and knowledge bases — collaborate to handle complex business workflows.

The architecture uses Amazon Bedrock Agents as the foundation, with a supervisor agent that receives user requests and delegates to specialist agents: a research agent with web search tools, a data analysis agent with SQL query capabilities, a document generation agent with template access, and a review agent that validates outputs. The orchestration layer uses Step Functions for deterministic workflow routing, combined with Bedrock's native agent-to-agent invocation for dynamic delegation. Each agent has a scoped IAM role limiting its tool access — the research agent cannot write to databases, and the data agent cannot access external APIs.

Memory management uses DynamoDB for conversation history with a sliding-window context strategy, plus a shared scratchpad pattern where agents post intermediate results for other agents to consume.

The architecture handles agent failure gracefully: if a specialist agent fails or returns low-confidence results, the supervisor retries with an alternative agent or escalates to human review via an SQS queue and SNS notification. Tool use is implemented via Lambda functions registered as Bedrock Agent action groups, with input/output schemas enforcing type safety.

Observability includes per-agent latency tracking, per-agent token usage monitoring, tool call success rates, and end-to-end workflow completion metrics. Cost control uses a budget allocation model: each workflow receives a token budget, and agents must operate within their allocation.
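The sliding-window memory and shared-scratchpad pattern can be sketched as follows. This is a minimal in-memory version for illustration only: in the challenge architecture both stores would be backed by DynamoDB tables, and all class and method names here are hypothetical rather than part of any AWS SDK.

```python
from collections import deque


class ConversationMemory:
    """Sliding-window conversation history plus a shared scratchpad.

    Illustrative in-memory stand-in for the DynamoDB-backed stores
    described above; names are hypothetical.
    """

    def __init__(self, window_size: int = 10):
        # Only the most recent `window_size` turns are kept in context.
        self.history = deque(maxlen=window_size)
        # Agents post intermediate results here for other agents to consume.
        self.scratchpad: dict[str, object] = {}

    def add_turn(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})

    def context(self) -> list[dict]:
        # The context sent to an agent is just the current window;
        # older turns have already been evicted by the deque.
        return list(self.history)

    def post(self, agent: str, key: str, value: object) -> None:
        # Namespace scratchpad entries by the agent that produced them.
        self.scratchpad[f"{agent}:{key}"] = value

    def read(self, agent: str, key: str):
        return self.scratchpad.get(f"{agent}:{key}")


memory = ConversationMemory(window_size=3)
for i in range(5):
    memory.add_turn("user", f"message {i}")
# Only the last 3 turns remain in the window.
memory.post("research_agent", "findings", ["source A", "source B"])
```

The deque's `maxlen` gives the eviction behavior for free; a DynamoDB version would instead query the N most recent items by a timestamp sort key.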
This challenge teaches multi-agent orchestration patterns, agent capability scoping, and the operational challenges of running AI agent systems in production.
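The supervisor's retry-and-escalate behavior can also be sketched in plain Python. This is a simplified example under stated assumptions: each specialist is modeled as a callable returning a `(result, confidence)` pair, the confidence threshold is illustrative, and escalation is a local callback standing in for the SQS/SNS path described above.

```python
from typing import Callable, Optional

# Illustrative threshold; the real cutoff would be tuned per workflow.
CONFIDENCE_FLOOR = 0.7


def delegate(task: str,
             agents: list[Callable[[str], tuple[str, float]]],
             escalate: Callable[[str], None]) -> Optional[str]:
    """Try each specialist agent in turn; escalate if none succeeds.

    On agent failure or low confidence, fall through to an alternative
    agent; if every agent fails, hand the task to human review (here a
    plain callback stands in for an SQS queue + SNS notification).
    """
    for agent in agents:
        try:
            result, confidence = agent(task)
        except Exception:
            continue  # agent raised: try the next specialist
        if confidence >= CONFIDENCE_FLOOR:
            return result
        # Low confidence: also fall through to the next agent.
    escalate(task)
    return None


# Stand-in specialists for demonstration.
def flaky_agent(task):
    raise RuntimeError("tool error")


def unsure_agent(task):
    return ("tentative answer", 0.4)


def solid_agent(task):
    return ("final answer", 0.9)


review_queue: list[str] = []
delegate("summarize findings", [flaky_agent, unsure_agent, solid_agent],
         review_queue.append)
```

In the Step Functions version, the loop becomes a Choice/Retry chain and `escalate` becomes an `sqs:SendMessage` task state, but the control flow is the same.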
Challenge Details
- Path: AI/ML Infrastructure
- Difficulty: Advanced
- Duration: 75 min
- Plan: Pro
Why This Challenge?
Unlike whiteboard exercises or multiple-choice quizzes, this challenge requires you to design a real architecture with actual AWS services, evaluate trade-offs, and defend your decisions. Our automated validators check your design against production-grade criteria. Complete it, and it appears in your verified portfolio with your architecture diagram and design rationale.
More from AI/ML Infrastructure
RAG Pipeline Architecture
Design a Retrieval-Augmented Generation pipeline that grounds LLM responses in enterprise knowledge bases.
Advanced · 70 min

ML Model Serving Platform
Design a model serving platform that delivers low-latency predictions with A/B testing and canary deployment.
Advanced · 70 min

AI Gateway Security Layer
Design a security gateway that enforces responsible AI policies, rate limits, and content filtering for LLM APIs.
Advanced · 65 min

Ready to design this for real?
Get the full scenario, design your architecture using real AWS services, and validate against production-grade criteria. Your completed challenge shows up in your verified portfolio.
Start Challenge