Llama 3 70B Inference

Llama 3 70B Instruct · prod-us-east-4 · A100 SXM x8


Status

Running

Region

US-EAST

Uptime

326h

Req / hr

4,820

Cost / hr

$13.20
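
The headline figures above are enough to estimate a per-request cost. The following is a minimal sketch of that arithmetic, assuming the request rate and hourly cost stay constant over the hour and that the GPU count is 8 as listed in the header; the variable names are illustrative and not part of the dashboard.

```python
# Headline figures copied from the dashboard (assumed constant over the hour).
requests_per_hour = 4820
cost_per_hour_usd = 13.20
gpu_count = 8  # "A100 SXM x8" from the header line

# Average cost of one request, and of one thousand requests.
cost_per_request_usd = cost_per_hour_usd / requests_per_hour
cost_per_1k_requests_usd = cost_per_request_usd * 1000

# Hourly cost attributed evenly to each GPU.
cost_per_gpu_hour_usd = cost_per_hour_usd / gpu_count

print(f"Cost per request:        ${cost_per_request_usd:.5f}")
print(f"Cost per 1,000 requests: ${cost_per_1k_requests_usd:.2f}")
print(f"Cost per GPU-hour:       ${cost_per_gpu_hour_usd:.2f}")
```

These are averages only; they say nothing about how cost varies with prompt or response length.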

Deployment: Llama 3 70B Inference

Running

Data Ingestion

Completed

Preprocessing

Completed

Loading Model

Completed

Warming Up

Completed

Serving

Serving Progress: 100%

Job Details

Job ID: train_dep_01c9d
Model: Llama 3 70B Instruct
GPUs: 8 × A100 SXM
Start Time: May 24, 2026 10:30 AM
Elapsed Time: 02:14:32
Est. Completion: 05:18:45 PM
Endpoint: https://api.spazenode.io/v1/dep_01

Real-time Monitoring

Active GPUs

8

±0% vs last 24h

GPU Utilization

78%

+3% vs last 24h

Memory Utilization

64%

+2% vs last 24h

Request Rate

4,820/hr

+7% vs last 24h

GPU Utilization Over Time
Last 24h

78% average

GPU Memory Allocation

88%

VRAM

Model Weights: 60%
KV Cache: 28%
System Overhead: 12%
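
The three components sum to 100%, which suggests they are shares of the allocated VRAM rather than of total capacity. A minimal sketch under that assumption (the dashboard does not state it) converts each share into a fraction of total VRAM; the names below are illustrative.

```python
# Assumption: the breakdown percentages are shares of *allocated* VRAM (88%),
# not of total VRAM. The dashboard does not confirm this interpretation.
allocated_fraction = 0.88  # "GPU Memory Allocation" gauge

breakdown = {
    "Model Weights": 0.60,
    "KV Cache": 0.28,
    "System Overhead": 0.12,
}

# Sanity check: the components should account for all allocated memory.
assert abs(sum(breakdown.values()) - 1.0) < 1e-9

# Scale each share by the allocated fraction to get a share of total VRAM.
share_of_total = {
    name: allocated_fraction * share for name, share in breakdown.items()
}

for name, fraction in share_of_total.items():
    print(f"{name}: {fraction:.1%} of total VRAM")
print(f"Unallocated: {1 - allocated_fraction:.1%} of total VRAM")
```

Converting these fractions to gigabytes would additionally require the per-GPU memory size, which is not shown here.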