AWS AI Service Comparison
Compare Textract, Comprehend, Polly, Rekognition, and Translate side by side. See capabilities, pricing, and integration patterns for each AWS AI service.
| Service | What It Does | Input Types | Output | Pricing | Best For | Integration |
|---|---|---|---|---|---|---|
| Textract | Extracts text, tables, and form data from scanned documents using OCR and ML. | PDF, JPEG, PNG, TIFF | Structured JSON with text, key-value pairs, tables, and bounding boxes | $1.50 per 1,000 pages (Detect Text) / $50 per 1,000 pages (Analyze Document) | Invoice processing, form extraction, document digitization, ID verification | Lambda trigger on S3 upload, direct API call, async with SNS notification |
| Comprehend | Natural language processing: sentiment analysis, entity recognition, topic modeling, language detection. | Plain text (UTF-8), up to 100 KB per document | JSON with sentiment scores, entities, key phrases, language codes, PII labels | $0.0001 per unit (1 unit = 100 characters) for real-time analysis | Customer feedback analysis, content moderation, support ticket routing, PII detection | Lambda, direct API call, batch jobs with S3 input/output, Comprehend Medical for HIPAA |
| Polly | Converts text to lifelike speech in dozens of languages with neural and standard voices. | Plain text or SSML markup, up to 3,000 characters per request | Audio stream (MP3, OGG Vorbis, PCM) or stored to S3 for long-form synthesis | $4.00 per 1M characters (Standard) / $16.00 per 1M characters (Neural) | Accessibility features, IVR systems, e-learning narration, audio versions of written content | Direct API, Lambda, MediaConvert pipeline, real-time streaming via WebSocket |
| Rekognition | Image and video analysis: object detection, face comparison, content moderation, text in images. | JPEG, PNG (images), MP4, MOV (video via Kinesis Video Streams or S3) | JSON with labels, bounding boxes, confidence scores, face match results | $1.00 per 1,000 images (first 1M/mo) / $0.10 per minute of video | Content moderation, facial verification, PPE detection, celebrity recognition, visual search | Lambda trigger on S3, Kinesis Video Streams for real-time, direct API for on-demand |
| Translate | Neural machine translation supporting 75+ languages with custom terminology support. | Plain text (UTF-8), HTML, DOCX via batch translation | Translated text with source language detection, terminology customization | $15.00 per 1M characters (real-time) / $7.50 per 1M characters (batch) | Website localization, multilingual chat, document translation, subtitle generation | Direct API, Lambda, batch jobs with S3, custom terminology dictionaries |
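To make the Pricing column easier to compare, here is a minimal sketch that turns the listed rates into a cost estimate. The rates are copied from the table and should be treated as illustrative, since actual AWS pricing varies by region, volume tier, and date. The names `RATES`, `estimate_cost`, and `comprehend_cost` are hypothetical helpers, and rounding each Comprehend document up to whole 100-character units is an assumption (any minimum charge per request is ignored).

```python
from decimal import Decimal

# (price in USD, quantity that price covers), copied from the table above.
# Illustrative only: check current AWS pricing before budgeting.
RATES = {
    "textract_detect_text": (Decimal("1.50"), 1_000),        # pages
    "textract_analyze_document": (Decimal("50.00"), 1_000),  # pages
    "polly_standard": (Decimal("4.00"), 1_000_000),          # characters
    "polly_neural": (Decimal("16.00"), 1_000_000),           # characters
    "rekognition_image": (Decimal("1.00"), 1_000),           # images, first 1M/mo
    "rekognition_video_minute": (Decimal("0.10"), 1),        # minutes
    "translate_realtime": (Decimal("15.00"), 1_000_000),     # characters
    "translate_batch": (Decimal("7.50"), 1_000_000),         # characters
}


def estimate_cost(rate_key, quantity):
    """Return the estimated cost in USD for `quantity` pages, characters, images, or minutes."""
    price, per = RATES[rate_key]
    return price * Decimal(quantity) / Decimal(per)


def comprehend_cost(char_counts):
    """Estimate real-time Comprehend cost for documents of the given character counts.

    Assumption: each document is rounded up to whole 100-character units.
    """
    units = sum(-(-n // 100) for n in char_counts)  # ceiling division per document
    return Decimal("0.0001") * units


if __name__ == "__main__":
    print(estimate_cost("textract_detect_text", 10_000))  # 10,000 scanned pages
    print(estimate_cost("polly_neural", 2_500_000))       # 2.5M characters of narration
    print(comprehend_cost([250, 100, 1]))                 # three short documents
```

`Decimal` is used instead of floats so that small per-unit rates such as $0.0001 do not pick up rounding noise when summed.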
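The "Lambda trigger on S3 upload" pattern listed for Textract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a production handler: it uses the boto3 `detect_document_text` call (the synchronous Detect Text operation), assumes the Lambda role has permission to read the bucket and call Textract, and introduces the hypothetical helper `extract_lines` plus an optional `textract_client` parameter so the handler can be exercised without AWS access. Multi-page documents would typically go through the asynchronous API with an SNS notification instead.

```python
from urllib.parse import unquote_plus


def extract_lines(textract_client, bucket, key):
    """Run synchronous text detection on one S3 object and return its LINE blocks as strings."""
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]


def handler(event, context, textract_client=None):
    """Lambda entry point for S3 ObjectCreated events.

    `textract_client` is a hypothetical injection point for testing;
    Lambda itself only passes `event` and `context`.
    """
    if textract_client is None:
        import boto3  # imported lazily so the module loads without boto3 installed

        textract_client = boto3.client("textract")

    results = {}
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        results[key] = extract_lines(textract_client, bucket, key)
    return results
```

Passing the client in keeps the AWS dependency at the edge, so the parsing logic can be checked locally with a stub that returns a canned `Blocks` list.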
Quick Decision Guide
- Need to extract text from documents? → Textract
- Need sentiment analysis or NLP? → Comprehend
- Need text-to-speech? → Polly
- Need image or video analysis? → Rekognition
- Need language translation? → Translate
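The guide above amounts to a lookup from need to service. As a small sketch, it can be written as a table keyed by need, with each value being the name boto3 uses for that service's client. The dictionary keys and the `pick_service` function are hypothetical names chosen for illustration.

```python
# Maps each need from the decision guide to the boto3 client name for that service.
SERVICE_FOR_NEED = {
    "extract text from documents": "textract",
    "sentiment analysis or NLP": "comprehend",
    "text-to-speech": "polly",
    "image or video analysis": "rekognition",
    "language translation": "translate",
}


def pick_service(need):
    """Return the boto3 client name for a need listed in the decision guide."""
    try:
        return SERVICE_FOR_NEED[need]
    except KeyError:
        options = ", ".join(sorted(SERVICE_FOR_NEED))
        raise ValueError(f"Unknown need {need!r}; expected one of: {options}") from None


if __name__ == "__main__":
    print(pick_service("text-to-speech"))
```

With boto3 installed and credentials configured, the result can be passed straight to `boto3.client(...)`, for example `boto3.client(pick_service("text-to-speech"))`.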
Build AI applications hands-on
Integrate AWS AI services in a live AWS playground. Follow guided missions that build real infrastructure — no simulations.