Llama 3 70B Inference
Llama 3 70B Instruct · prod-us-east-4 · A100 SXM x8
Status: Running
Region: US-EAST
Uptime: 326h
Req / hr: 4,820
Cost / hr: $13.20
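The hourly cost and request rate shown above can be combined into per-request and per-GPU figures. The following is a minimal sketch of that arithmetic, not something the dashboard itself reports. The function names are hypothetical, and the cost-to-date figure assumes the $13.20 hourly rate held constant for the full 326h of uptime, which the page does not confirm.

```python
# Figures copied from the status panel above.
COST_PER_HOUR_USD = 13.20
REQUESTS_PER_HOUR = 4820
GPU_COUNT = 8        # "A100 SXM x8" in the page header
UPTIME_HOURS = 326


def cost_per_request(cost_per_hour: float, requests_per_hour: float) -> float:
    """Average cost of one request at the current rate, in USD."""
    return cost_per_hour / requests_per_hour


def cost_per_gpu_hour(cost_per_hour: float, gpu_count: int) -> float:
    """Hourly cost attributed to a single GPU, in USD."""
    return cost_per_hour / gpu_count


def cost_to_date(cost_per_hour: float, uptime_hours: float) -> float:
    """Total spend so far, ASSUMING the hourly rate never changed."""
    return cost_per_hour * uptime_hours


if __name__ == "__main__":
    print(f"per request:  ${cost_per_request(COST_PER_HOUR_USD, REQUESTS_PER_HOUR):.5f}")
    print(f"per GPU-hour: ${cost_per_gpu_hour(COST_PER_HOUR_USD, GPU_COUNT):.2f}")
    print(f"to date:      ${cost_to_date(COST_PER_HOUR_USD, UPTIME_HOURS):,.2f}")
```

At the displayed rates this works out to roughly $0.0027 per request and $1.65 per GPU-hour.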
Deployment: Llama 3 70B Inference
Status: Running
Data Ingestion: Completed
Preprocessing: Completed
Loading Model: Completed
Warming Up: Completed
Serving: 100%
Serving Progress: 100%
Job Details
Job ID: train_dep_01c9d
Model: Llama 3 70B Instruct
GPUs: 8 × A100 SXM
Start Time: May 24, 2026 10:30 AM
Elapsed Time: 02:14:32
Est. Completion: 05:18:45 PM
Endpoint: https://api.spazenode.io/v1/dep_01
Real-time Monitoring
Active GPUs: 8 (±0% vs last 24h)
GPU Utilization: 78% (+3% vs last 24h)
Memory Utilization: 64% (+2% vs last 24h)
Request Rate: 4,820/hr (+7% vs last 24h)
GPU Utilization Over Time
Last 24h: 78% average
GPU Memory Allocation
VRAM: 88%
Model Weights: 60%
KV Cache: 28%
System Overhead: 12%
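The three components sum to 100%, so one plausible reading is that they are shares of the allocated VRAM (the 88% figure), not of total capacity. The sketch below converts those shares to gigabytes under that reading. The page does not state per-GPU memory, so the 80 GB value is an assumption (the A100 SXM also ships in a 40 GB variant), and the function name is hypothetical.

```python
# Shares copied from the GPU Memory Allocation panel.
VRAM_ALLOCATED_SHARE = 0.88
BREAKDOWN = {
    "Model Weights": 0.60,
    "KV Cache": 0.28,
    "System Overhead": 0.12,
}
GPU_COUNT = 8

# ASSUMPTION: per-GPU capacity is not shown on the page; 80 GB is illustrative.
ASSUMED_GB_PER_GPU = 80


def memory_breakdown_gb(gpu_count, gb_per_gpu, allocated_share, breakdown):
    """Convert percentage shares of allocated VRAM into gigabytes.

    Treats each breakdown value as a fraction of the allocated memory,
    which is itself a fraction of total capacity across all GPUs.
    """
    allocated_gb = gpu_count * gb_per_gpu * allocated_share
    return {name: share * allocated_gb for name, share in breakdown.items()}


if __name__ == "__main__":
    result = memory_breakdown_gb(
        GPU_COUNT, ASSUMED_GB_PER_GPU, VRAM_ALLOCATED_SHARE, BREAKDOWN
    )
    for name, gb in result.items():
        print(f"{name}: {gb:.1f} GB")
```

If the percentages are instead shares of total capacity, pass `allocated_share=1.0`; the panel does not say which interpretation applies.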