Code Llama 34B
Meta · 34B params · 16k context
About
Specialized for code generation, completion, and reasoning.
Type
Code
Parameters
34B params
Context
16k
License
Llama Community
Recommended deployment
Best quality
H100 SXM x8 · 12k tok/s
$26.4/hour
Balanced
A100 SXM x4 · 6k tok/s
$6.6/hour
Budget
RTX 4090 x2 · 2k tok/s
$1.1/hour
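One way to compare the three tiers is to convert each hourly price into a cost per 1M tokens. The sketch below does that arithmetic with the figures listed above. It assumes the listed throughput is sustained for the full hour (100% utilization) and that "tok/s" is the aggregate rate for the whole configuration; the page states neither, so treat the results as a lower bound on real cost. The function name `usd_per_million_tokens` is illustrative only.

```python
# Hourly price and listed throughput for each tier, copied from the listing above.
TIERS = {
    "Best quality": {"usd_per_hour": 26.4, "tokens_per_second": 12_000},
    "Balanced": {"usd_per_hour": 6.6, "tokens_per_second": 6_000},
    "Budget": {"usd_per_hour": 1.1, "tokens_per_second": 2_000},
}


def usd_per_million_tokens(usd_per_hour: float, tokens_per_second: float) -> float:
    """Cost of 1M tokens, ASSUMING the listed throughput is sustained all hour."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


for name, tier in TIERS.items():
    print(f"{name}: ${usd_per_million_tokens(**tier):.3f} per 1M tokens")
```

Under these assumptions the cheaper tiers also come out cheaper per token, so the larger configurations buy throughput rather than a lower per-token cost.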
Pricing
Input tokens
$0.40
per 1M tokens
Output tokens
$1.20
per 1M tokens
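As a rough illustration of how the per-token rates add up, the sketch below estimates the cost of a single request from its input and output token counts. The rates are the ones listed above; the function name `request_cost_usd` and the example token counts are hypothetical, and real billing may round or meter differently.

```python
# Per-token rates from the pricing listed above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.40
OUTPUT_USD_PER_MILLION = 1.20


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request; ignores any rounding or minimum charges."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: a 3,000-token prompt that yields a 500-token completion.
print(f"${request_cost_usd(3_000, 500):.4f}")
```

Because output tokens cost three times as much as input tokens, long completions drive the bill more than long prompts do.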
Quick stats
Deploys
980
Recommended GPU
H100 SXM
Inference latency
~280ms P50
Throughput
~12k tok/s
Use cases
- Customer support automation
- Knowledge base search
- Agentic workflows
- Content generation