⚡ Network Architecture

Democratizing AI Compute

SpazeNode aggregates idle GPUs around the globe, creating a secure, ultra-low-latency, and highly scalable inference engine. Here is how the magic happens.

storefront

Stage 01Service Layer

Developer Console & API

The entry point for developers. Spin up serverless inference API endpoints, deploy models via our command-line tool, or use the Python/TS SDKs to integrate compute nodes directly into your codebase.

check
OpenAI-compatible chat & completions endpoint
check
Deploy custom Docker weights with one command
check
Managed API key access & granular team roles
check
Live request monitoring & token usage statistics

developer-terminal (~/spazenode)

$ # Install SpazeNode CLI

npm install -g @spazenode/cli

$ # Log in with your API Key

spazenode auth login --key=sp_live_9e28fa0d

✔ Success! Authenticated as organization 'Alpha Labs'

$ # Deploy a Llama-3-8B endpoint on 2x RTX 4090s

spazenode deploy --model llama-3-8b --gpu RTX4090 --count 2

➜ Spin-up requested... allocating nodes...

✔ Endpoint live: https://api.spazenode.com/v1/chat/completions

settings_suggest

Stage 02Virtualization Layer

Decentralized OS

The brains of the network. This proprietary orchestration layer aggregates thousands of individual, heterogeneous GPUs into a single virtual data center. It performs routing, security checks, and hardware attestation.

check
Latency-aware smart routing to the nearest active GPU
check
Dynamic batching and KV-caching for up to 4x throughput
check
Continuous secure hardware attestation & reputation checks
check
Auto-failover: requests transparently re-route if a host disconnects

orchestrator-logs (region: us-east-1)

[INFO] Inbound request received for endpoint /llama-3-8b

[DISPATCH] Performing zero-trust hardware attestation...

➜ Attestation token verified: Host #4902-RTX4090

[ROUTING] Evaluating nodes latency & queue depth...

➜ Host #4902-RTX4090 (us-east-4a) - Latency: 14ms (Selected)

➜ Host #3812-RTX4090 (us-east-1b) - Latency: 28ms (Backup)

[BATCH] Applying dynamic request queue collation...

➜ Queue depth: 3 reqs -> combined into token batch (vram efficiency)

[DISPATCH] Forwarding payload to Host #4902 sandboxed docker container

hub

Stage 03Supply Layer

Distributed GPU Network

The physical compute infrastructure. Hosts from all over the world (small data centers, mining facilities, private dev rigs) install our sandboxed agent. The agent securely executes inference queries inside a firewalled container.

check
Secure firewalled Docker runtime for complete host isolation
check
Automatic payouts in stablecoins based on processed tokens
check
Support for H100s, A100s, L40s, RTX 4090s, and RTX 3090s
check
Automatic hardware benchmark and rating upon registration

host-agent-daemon (~/spaze-agent)

$ spaze-agent-daemon --status

● Daemon: Active (Listening for workloads)

Hardware: NVIDIA RTX 4090 (24GB VRAM) | Drivers: 535.11

Connection: Fiber 1Gbps Symmetric (Checked)

[DAEMON] Inbound workload received! Spawn sandbox container...

➜ Model weights: Llama-3-8b (Cached in local storage)

➜ Compute active. GPU Load: 94% | VRAM: 18.2 GB | Temp: 68°C

[DAEMON] Workload complete. Tokens output: 1,420 | Processing time: 0.8s

[EARNINGS] Earned $0.00071 for computation. Lifetime: $148.20

code

For AI Developers

Access the scale of hyper-scalers without the premium cost. Spin up OpenAI-compatible API servers, fine-tune models, or lease dedicated nodes with a credit pool that scales with your team.

Deploy your first endpoint arrow_forward

dns

For Compute Providers

Put your idle GPU hardware to work. Register as a host, run our fully secure sandboxed docker agent, set your availability times and target pricing, and receive automated payouts in USDC.

Monetize your GPU cluster arrow_forward

Ready to experience decentralized compute?

Start deploying serverless endpoints in seconds, or monetize your GPU infrastructure on the global SpazeNode network.

Deploy Your First Model Monetize Your GPU