Llama 3 70B Instruct
Meta · 70B params · 8k context
About
Meta's flagship open-weight chat model, with strong reasoning and instruction following.
Type: Chat
Parameters: 70B
Context: 8k tokens
License: Llama 3 Community
Recommended deployment
Best quality: H100 SXM x8 · 12k tok/s · $26.4/hour
Balanced: A100 SXM x4 · 6k tok/s · $6.6/hour
Budget: RTX 4090 x2 · 2k tok/s · $1.1/hour
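To compare the three tiers on a common basis, the hourly price can be converted into an effective cost per 1M tokens. The sketch below is illustrative only: it assumes the listed tok/s figure is the aggregate sustained throughput of the whole deployment, and the `utilization` parameter and function name are hypothetical, not part of any platform API.

```python
# Hourly prices and throughput figures copied from the tiers listed above.
TIERS = {
    "Best quality": {"usd_per_hour": 26.4, "tokens_per_second": 12_000},
    "Balanced": {"usd_per_hour": 6.6, "tokens_per_second": 6_000},
    "Budget": {"usd_per_hour": 1.1, "tokens_per_second": 2_000},
}


def usd_per_million_tokens(usd_per_hour, tokens_per_second, utilization=1.0):
    """Effective cost per 1M tokens for a dedicated deployment.

    Assumption: tokens_per_second is sustained aggregate throughput, and
    utilization is the fraction of each hour the deployment is actually busy.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return usd_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    for name, tier in TIERS.items():
        full = usd_per_million_tokens(**tier)
        half = usd_per_million_tokens(**tier, utilization=0.5)
        print(f"{name}: ${full:.3f} per 1M tokens at full load, "
              f"${half:.3f} at 50% utilization")
```

Under these assumptions a dedicated deployment only approaches its headline cost per token when it is kept busy; at 50% utilization the effective cost doubles.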
Pricing
Input tokens: $0.60 per 1M tokens
Output tokens: $1.80 per 1M tokens
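As a rough illustration of how the per-token rates add up, the following sketch estimates the cost of a single request. The function name and the example token counts are hypothetical; only the two rates come from the pricing above.

```python
# Rates copied from the pricing above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.60
OUTPUT_USD_PER_MILLION = 1.80


def request_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request from its input and output token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: a 1,500-token prompt with a 500-token reply.
    print(f"${request_cost_usd(1_500, 500):.4f}")  # about $0.0018
```

Because output tokens cost three times as much as input tokens, long replies tend to dominate the bill even when prompts are larger.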
Quick stats
Deploys: 2,410
Recommended GPU: H100 SXM
Inference latency: ~280 ms P50
Throughput: ~12k tok/s
Use cases
- Customer support automation
- Knowledge base search
- Agentic workflows
- Content generation