BGE Large EN v1.5
BAAI · 335M params · 512 context
About
State-of-the-art English embedding model for retrieval and search.
Type
Embedding
Parameters
335M params
Context
512 tokens
License
MIT
Recommended deployment
Best quality: RTX 4090 x8 · 12k tok/s · $26.4/hour
Balanced: A100 SXM x4 · 6k tok/s · $6.6/hour
Budget: RTX 4090 x2 · 2k tok/s · $1.1/hour
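To compare the tiers above on a per-token basis, the hourly price can be divided by the hourly token volume. The sketch below is only a rough estimate: it assumes "k" means 1,000 tokens, that the quoted throughput is sustained for the whole hour, and that nothing is billed beyond the hourly rate. The names `TIERS` and `usd_per_million_tokens` are illustrative and not part of any platform API.

```python
# Figures copied from the deployment tiers listed above.
# Assumption: "12k tok/s" means 12,000 tokens per second, sustained.
TIERS = {
    "Best quality": {"usd_per_hour": 26.4, "tokens_per_second": 12_000},
    "Balanced": {"usd_per_hour": 6.6, "tokens_per_second": 6_000},
    "Budget": {"usd_per_hour": 1.1, "tokens_per_second": 2_000},
}


def usd_per_million_tokens(usd_per_hour: float, tokens_per_second: float) -> float:
    """Cost of processing 1M tokens at a constant throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    for name, tier in TIERS.items():
        print(f"{name}: ${usd_per_million_tokens(**tier):.3f} per 1M tokens")
```

Under these assumptions the tiers come out at roughly $0.611, $0.306, and $0.153 per 1M tokens respectively, so the Budget tier is the cheapest per token and the Best quality tier buys throughput rather than unit cost.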
Pricing
Input tokens
$0.02
per 1M tokens
Output tokens
$0.00
per 1M tokens
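As a worked example of the per-token pricing above, the sketch below estimates the bill for an embedding job. The function name `estimate_cost_usd` and the workload (10,000 chunks of 512 tokens each) are hypothetical and chosen only for illustration; the rates are the ones listed in this section.

```python
# Rates copied from the pricing listed above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.02
OUTPUT_USD_PER_MILLION = 0.00


def estimate_cost_usd(input_tokens: int, output_tokens: int = 0) -> float:
    """Estimated charge in USD for a given number of input and output tokens."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 chunks, each filling the 512-token context.
    tokens = 10_000 * 512
    print(f"{tokens:,} input tokens cost about ${estimate_cost_usd(tokens):.4f}")
```

For that hypothetical workload of 5,120,000 input tokens, the estimate is about $0.1024. Since output tokens are priced at $0.00, only the input side contributes to the total.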
Quick stats
Deploys
4,210
Recommended GPU
RTX 4090
Inference latency
~280ms P50
Throughput
~12k tok/s
Use cases
- Customer support automation
- Knowledge base search
- Agentic workflows
- Content generation