AI/ML Infrastructure
The ability to design AI/ML infrastructure is the most in-demand architecture skill of 2026, yet most engineers struggle to move beyond proof-of-concept Jupyter notebooks to production-grade systems. This path bridges that gap with five challenges covering the full AI/ML infrastructure stack: RAG pipelines that ground LLMs in enterprise data, multi-agent orchestration for complex reasoning workflows, model serving platforms for low-latency inference, an AI gateway security layer for responsible AI deployment, and feature stores for consistent ML feature management. Each challenge pairs Amazon Bedrock and SageMaker with supporting AWS services, teaching you to design systems that are not just technically sound but also cost-effective, observable, and compliant with enterprise governance requirements.
5 Challenges in This Path
RAG Pipeline Architecture
Design a Retrieval-Augmented Generation pipeline that grounds LLM responses in enterprise knowledge bases.
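The core loop of a RAG pipeline — retrieve the most relevant documents, then assemble a prompt that grounds the LLM in them — can be sketched in plain Python. This is a toy illustration, not the challenge solution: it uses bag-of-words cosine similarity where a production design would use an embedding model and a vector index (e.g. a Bedrock knowledge base), and all function names here are invented for the example.

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by similarity to the query and keep the top k."""
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The grounding step is the key design point: the model is asked to answer from the retrieved context rather than from its parametric memory, which is what keeps responses tied to enterprise data.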
Multi-Agent Orchestration
Design a multi-agent system where specialized AI agents collaborate to solve complex tasks.
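At its simplest, multi-agent orchestration is a router: decompose a task into subtasks, dispatch each to a specialist, and merge the results. A minimal sketch under stated assumptions — the specialist "agents" here are stub functions standing in for separate LLM calls, and the role names are hypothetical:

```python
from typing import Callable

def research_agent(task: str) -> str:
    """Stub specialist: in a real system this would be an LLM call with a research prompt."""
    return f"[research] notes on: {task}"

def writer_agent(task: str) -> str:
    """Stub specialist: would be an LLM call with a drafting prompt."""
    return f"[draft] text for: {task}"

AGENTS: dict[str, Callable[[str], str]] = {
    "research": research_agent,
    "write": writer_agent,
}

def orchestrate(subtasks: list[tuple[str, str]]) -> str:
    """Route each (role, task) pair to its specialist and merge the outputs."""
    results = [AGENTS[role](task) for role, task in subtasks]
    return "\n".join(results)
```

In the full challenge the decomposition itself is also model-driven (a planner agent produces the subtask list), but the routing-and-aggregation skeleton stays the same.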
ML Model Serving Platform
Design a model serving platform that delivers low-latency predictions with A/B testing and canary deployment.
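One building block behind canary deployments is deterministic traffic splitting: send a fixed fraction of requests to the new model variant, keyed on a stable request attribute so routing is sticky. A small sketch (the function name and weights are illustrative; managed platforms such as SageMaker endpoints expose variant weights as configuration instead):

```python
import hashlib

def pick_variant(request_id: str, canary_weight: float = 0.1) -> str:
    """Deterministically route ~canary_weight of traffic to the canary variant.

    Hashing the request id (rather than random sampling) keeps routing
    sticky: the same id always lands on the same variant, which makes
    per-request metrics and rollback comparisons clean.
    """
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 10_000
    return "canary" if bucket < canary_weight * 10_000 else "stable"
```

Ramping the canary is then just raising `canary_weight` in steps while watching latency and error metrics, and rolling back is setting it to zero.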
AI Gateway Security Layer
Design a security gateway that enforces responsible AI policies, rate limits, and content filtering for LLM APIs.
ML Feature Store
Design a feature store that serves consistent ML features for both training pipelines and real-time inference.
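The consistency requirement above has a simple structural answer: run one shared transformation code path and materialize its output to both an offline store (for training) and an online store (for inference), so the two can never drift apart. A toy in-memory sketch — the feature names and classes here are invented for illustration, and a real design would back these with SageMaker Feature Store's offline and online stores:

```python
def transform(raw: dict) -> dict:
    """The single feature-transformation function shared by training and serving."""
    return {
        # hypothetical features: magnitude bucket of a payment, weekend flag
        "amount_digits": len(str(int(raw["amount"]))),
        "is_weekend": int(raw["day_of_week"] in (5, 6)),  # 0=Mon .. 6=Sun
    }

class FeatureStore:
    """Toy store: offline rows for batch training, online dict for low-latency lookup."""

    def __init__(self):
        self.offline: list[dict] = []
        self.online: dict[str, dict] = {}

    def ingest(self, entity_id: str, raw: dict) -> None:
        feats = transform(raw)  # one code path feeds BOTH stores
        self.offline.append({"entity_id": entity_id, **feats})
        self.online[entity_id] = feats

    def get_online(self, entity_id: str) -> dict:
        """What the inference path reads at request time."""
        return self.online[entity_id]
```

Training reads `offline` as a dataset; inference calls `get_online` per request. Because both were written by the same `transform`, train/serve skew from divergent feature code is eliminated by construction.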
Ready to start AI/ML Infrastructure?
Each challenge gives you a real scenario, real AWS services, and automated validation. Complete the path and add verified system design experience to your portfolio.
Start This Path