ML Model Serving Platform
Serving machine learning models in production with consistently low latency, high availability, and the ability to safely roll out new model versions is a core MLOps challenge that most tutorials gloss over. In this challenge, you will design a model serving platform on AWS SageMaker that handles real-time inference, batch predictions, and near-real-time streaming inference for multiple ML models across different teams.

The real-time inference tier uses SageMaker endpoints with auto-scaling driven by invocations per instance and P99 model latency. You will design a multi-model endpoint architecture where a single endpoint hosts multiple models (reducing cost), with intelligent routing based on the request payload.

For safe model deployment, the architecture implements SageMaker's deployment guardrails: canary deployments shift 10% of traffic to the new model version, monitor key metrics (latency, error rate, data drift) for 30 minutes, then automatically promote the new version or roll it back. A/B testing uses SageMaker production variants to split traffic between model versions with statistical-significance tracking.

The batch prediction tier uses SageMaker Batch Transform for overnight scoring jobs, with S3 input/output and Step Functions orchestrating the ETL-predict-load workflow.

The platform includes a model registry in SageMaker Model Registry with approval workflows: data scientists register model artifacts, automated quality gates check accuracy thresholds, and engineering leads approve production deployments. Feature consistency is ensured by sharing a SageMaker Feature Store between the training and serving pipelines, eliminating training-serving skew.

Observability covers model performance monitoring with SageMaker Model Monitor for data-drift detection, custom CloudWatch metrics for business KPIs, and automated retraining triggers that fire when model accuracy degrades below a threshold.
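As a taste of what the canary rollout above looks like in practice, here is a minimal sketch of the `DeploymentConfig` structure that SageMaker's `update_endpoint` API accepts for deployment guardrails: a 10% canary, a 30-minute bake window, and automatic rollback on CloudWatch alarms. The endpoint, config, and alarm names are hypothetical placeholders, not part of the challenge.

```python
def canary_deployment_config(alarm_names, canary_percent=10, bake_minutes=30):
    """Build a DeploymentConfig dict for sagemaker.update_endpoint()."""
    return {
        "BlueGreenUpdatePolicy": {
            "TrafficRoutingConfiguration": {
                "Type": "CANARY",
                # Send this share of capacity to the new variant first.
                "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": canary_percent},
                # Bake period before shifting the remaining traffic.
                "WaitIntervalInSeconds": bake_minutes * 60,
            },
            # Keep the old fleet around briefly so rollback is instant.
            "TerminationWaitInSeconds": 300,
        },
        # Roll back automatically if any of these CloudWatch alarms fire.
        "AutoRollbackConfiguration": {
            "Alarms": [{"AlarmName": name} for name in alarm_names]
        },
    }

config = canary_deployment_config(["p99-latency-high", "error-rate-high"])

# With boto3 this would be applied roughly as (not executed here;
# names are illustrative):
# import boto3
# boto3.client("sagemaker").update_endpoint(
#     EndpointName="fraud-model-prod",
#     EndpointConfigName="fraud-model-v2-config",
#     DeploymentConfig=config,
# )
```

Wiring the rollback alarms to the same P99-latency and error-rate metrics that drive auto-scaling is what turns the canary into a self-healing deployment rather than a manual checkpoint.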
This challenge teaches model serving architecture, safe deployment strategies for ML, and the operational patterns required for reliable ML in production.
AWS Services You'll Use
Challenge Details
- Path: AI/ML Infrastructure
- Difficulty: Advanced
- Duration: 70 min
- Plan: Pro
Architecture Patterns You'll Learn
Why This Challenge?
Unlike whiteboard exercises or multiple-choice quizzes, this challenge requires you to design a real architecture with actual AWS services, evaluate trade-offs, and defend your decisions. Our automated validators check your design against production-grade criteria. Complete it, and it appears in your verified portfolio with your architecture diagram and design rationale.
More from AI/ML Infrastructure
RAG Pipeline Architecture
Design a Retrieval-Augmented Generation pipeline that grounds LLM responses in enterprise knowledge bases.
Advanced · 70 min

Multi-Agent Orchestration
Design a multi-agent system where specialized AI agents collaborate to solve complex tasks.
Advanced · 75 min

AI Gateway Security Layer
Design a security gateway that enforces responsible AI policies, rate limits, and content filtering for LLM APIs.
Advanced · 65 min

Ready to design this for real?
Get the full scenario, design your architecture using real AWS services, and validate against production-grade criteria. Your completed challenge shows up in your verified portfolio.
Start Challenge