AI model selection

LLM Comparison Tool

Not all LLMs are created equal. Pick a task and see how Claude 3.5 Sonnet, GPT-4o, Llama 3.1, Gemini 1.5 Pro, and Mistral Large rank in performance, pricing, and strengths.

1

Claude 3.5 Sonnet

Context: 200K

9.2 / 10

Pricing

$3 / $15 per 1M tokens (input / output)

Context Window

200K tokens

Strengths for Coding

Best code refactoring
Large codebase understanding
Agentic coding via Claude Code

2

GPT-4o

Context: 128K

8.8 / 10

Pricing

$5 / $15 per 1M tokens (input / output)

Context Window

128K tokens

Strengths for Coding

Strong debugging
Broad language support
Code Interpreter integration

3

Gemini 1.5 Pro

Context: 1M

8.3 / 10

Pricing

$3.50 / $10.50 per 1M tokens (input / output)

Context Window

1M tokens

Strengths for Coding

Massive context window
Good at multi-file analysis
Google ecosystem integration

4

Llama 3.1 70B

Context: 128K

7.8 / 10

Pricing

Self-hosted, or ~$2.65 / $3.50 per 1M tokens (input / output)

Context Window

128K tokens

Strengths for Coding

Open source
Self-hostable
Strong for its size

5

Mistral Large

Context: 128K

7.5 / 10

Pricing

$4 / $12 per 1M tokens (input / output)

Context Window

128K tokens

Strengths for Coding

Multilingual code
Function calling
EU data compliance
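
To make the pricing figures above easier to compare, here is a minimal Python sketch that estimates what a workload might cost on each model. It assumes each price pair means USD per 1M input tokens followed by USD per 1M output tokens, and it uses the approximate hosted price for Llama 3.1 70B. The `ModelCard` class, the function names, and the example workload (2M input tokens, 500K output tokens) are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    name: str
    score: float         # coding score out of 10, as listed above
    input_price: float   # USD per 1M input tokens (assumed meaning of the first figure)
    output_price: float  # USD per 1M output tokens (assumed meaning of the second figure)


# Figures copied from the cards above; Llama uses the approximate hosted price.
MODELS = [
    ModelCard("Claude 3.5 Sonnet", 9.2, 3.00, 15.00),
    ModelCard("GPT-4o", 8.8, 5.00, 15.00),
    ModelCard("Gemini 1.5 Pro", 8.3, 3.50, 10.50),
    ModelCard("Llama 3.1 70B", 7.8, 2.65, 3.50),
    ModelCard("Mistral Large", 7.5, 4.00, 12.00),
]


def estimate_cost(model: ModelCard, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one workload on one model."""
    total = input_tokens * model.input_price + output_tokens * model.output_price
    return total / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model name, estimated cost) pairs, cheapest first."""
    costs = [(m.name, estimate_cost(m, input_tokens, output_tokens)) for m in MODELS]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for name, cost in rank_by_cost(2_000_000, 500_000):
        print(f"{name:<18} ${cost:.2f}")
```

Because output tokens are priced higher than input tokens on every model listed, the cost ordering can shift with the input/output mix, so it is worth running the estimate with numbers close to your own workload.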

Build AI apps hands-on

Go beyond benchmarks. Build production AI applications with these models on real AWS infrastructure using Amazon Bedrock.
