Sell your idle NVIDIA GPU.
Buy inference at half the price.
DarkGPU is a decentralized inference marketplace. Gamers, homelabbers, and data centers sell idle compute. Developers buy OpenAI-compatible inference for up to 60% less. Operators keep 95% of revenue. No crypto. Real USD via Stripe.
Consumer
     │
     │ HTTPS (OpenAI-compat)
     ▼
Coordinator ─────────┐
     │ WS            │ WS
     ▼               ▼
 Your GPU        Another GPU
(RTX 4090)         (A100)
Turn idle silicon into cash flow
Install the agent, pick a model, link your payout account. Your GPU serves inference when idle and pauses the moment you touch the machine. Keep 95% of every token served.
curl -fsSL https://darkgpu.ai/install.sh | bash
darkgpu auth
darkgpu serve
Drop-in OpenAI replacement at half the price
Change the base URL. Keep your existing code. Every request is end-to-end encrypted — the GPU owner never sees your prompts or your data.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.darkgpu.ai/v1",
    api_key=DGPU_API_KEY,
)
Estimate your earnings
We show three scenarios. The default is conservative because at launch, demand is lower than supply. If we showed you only the 80% utilization number, we'd be lying. Darkbloom's Hacker News critics were right, so we're not doing that.
Estimates net of electricity. Actual earnings depend on network demand, model popularity, and your provider reputation. We do not guarantee any specific earnings. At launch, expect the conservative tier.
Consumer pricing
Per-token pricing. No subscriptions. No minimums. 5% platform fee, the rest goes to the operator.
| Model | Input / 1M | Output / 1M | OpenRouter | Savings |
|---|---|---|---|---|
| Qwen3.5 7B (4-bit, fast general purpose) | $0.015 | $0.06 | $0.10 | 40% |
| Llama 3.3 8B (popular open baseline) | $0.02 | $0.08 | $0.15 | 47% |
| Gemma 4 12B (Google's latest) | $0.035 | $0.14 | $0.30 | 53% |
| Qwen3.5 27B (frontier reasoning) | $0.06 | $0.30 | $0.80 | 62% |
| Llama 3.3 70B (high quality) | $0.10 | $0.50 | $1.20 | 58% |
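The per-token math behind the table is straightforward to sanity-check. A minimal sketch: the rates are copied from the table above, and `estimate_cost` is a hypothetical helper for illustration, not part of any DarkGPU SDK.

```python
# Per-1M-token rates (USD), copied from the pricing table above.
RATES = {
    "qwen3.5-7b":    {"input": 0.015, "output": 0.06},
    "llama-3.3-8b":  {"input": 0.02,  "output": 0.08},
    "llama-3.3-70b": {"input": 0.10,  "output": 0.50},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed per-1M-token rates."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# A 2,000-token prompt with a 500-token reply on the 70B model:
cost = estimate_cost("llama-3.3-70b", 2_000, 500)  # $0.00045 per request
```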
OpenAI-compatible. Change one line.
Streaming, function calling, embeddings. Every SDK that talks to OpenAI talks to DarkGPU.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.darkgpu.ai/v1",
    api_key="dgpu-...",
)

response = client.chat.completions.create(
    model="qwen3.5-7b",
    messages=[{"role": "user", "content": "Hello from an idle 4090"}],
    stream=True,
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
One command. Five minutes to serving.
Linux with NVIDIA driver 535+ and CUDA 12+. Detects your GPU, picks a compatible model, connects to the coordinator.
$ curl -fsSL https://darkgpu.ai/install.sh | bash
$ darkgpu auth
$ darkgpu serve
The operator runs your inference. They can't read it.
Four layers. Each independently verifiable.
End-to-end encryption
Requests are encrypted with the provider's X25519 public key before leaving your device. The coordinator routes ciphertext. Only the provider process can decrypt.
GPU attestation
Each node registers a hardware fingerprint (GPU UUID, driver version, CUDA capabilities). Data-center nodes with H100/H200 GPUs use NVIDIA Confidential Computing. Consumer nodes use software attestation, with trust tiers exposed to the consumer.
Verified binary
Coordinator verifies the provider agent binary hash on every connect. Tampered binaries fail attestation and don't receive routing.
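The binary check amounts to hashing the agent executable and comparing against an expected digest. A minimal stdlib sketch; where the expected hash comes from, and the function names, are assumptions for illustration.

```python
import hashlib


def binary_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of the agent binary, streamed so large files stay cheap."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def attest_binary(path: str, expected: str) -> bool:
    """Connect-time check: a tampered binary fails and gets no routing."""
    return binary_digest(path) == expected
```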
Public audit log
Provider attestation is queryable at GET /v1/providers/:id/attestation. Trust tier (basic vs verified) is returned on every inference response so consumers can filter.
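Acting on that trust tier is ordinary JSON handling on the consumer side. A sketch under assumptions: the response shape below (a list of providers carrying a `trust_tier` field) is illustrative, not the documented schema.

```python
import json


def verified_only(attestations_json: str) -> list[str]:
    """Return provider IDs whose trust tier is 'verified'.

    Assumes entries shaped like {"id": ..., "trust_tier": "basic"|"verified"}.
    """
    providers = json.loads(attestations_json)
    return [p["id"] for p in providers if p.get("trust_tier") == "verified"]


sample = json.dumps([
    {"id": "prov-a", "trust_tier": "verified"},
    {"id": "prov-b", "trust_tier": "basic"},
])
```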
The questions HN asked about Darkbloom
"Earnings numbers are too good to be true."
They often are on platforms like this. We show three scenarios: 10% utilization (conservative, what you should expect at launch), 30% (moderate, realistic once the network has demand), and 80% (optimistic, only when demand exceeds supply for your model). The default view is conservative.
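The three scenarios reduce to utilization multipliers over a per-busy-hour rate, minus power. A hedged sketch: the $/hour and electricity figures below are made up for illustration, not quoted rates.

```python
# Utilization tiers from the scenarios above.
SCENARIOS = {"conservative": 0.10, "moderate": 0.30, "optimistic": 0.80}


def monthly_earnings(rate_per_busy_hour: float, electricity_per_month: float,
                     idle_hours_per_day: float = 20.0) -> dict[str, float]:
    """Net monthly USD per scenario: utilization x idle hours x rate, minus power."""
    return {
        name: round(util * idle_hours_per_day * 30 * rate_per_busy_hour
                    - electricity_per_month, 2)
        for name, util in SCENARIOS.items()
    }


# e.g. a card earning $0.25 per fully busy hour, $15/month electricity:
est = monthly_earnings(0.25, 15.0)
```

Note how the conservative tier is roughly break-even at these illustrative numbers, which is exactly why it is the default view.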
"Will this fry my GPU or kill my SSD?"
NVIDIA GPUs are designed for sustained compute — data center cards run at 100% for years. We auto-throttle at 85°C. Model weights are loaded once per model into VRAM; there's minimal sustained SSD write traffic. The agent logs disk I/O per model so you can see exactly what's happening.
"Why not buy GPUs yourself if the ROI is so good?"
Capital efficiency. We build and own the platform, not the hardware. This is the Airbnb/Uber play — the hardware already exists, idle, in millions of homes and small data centers. Deploying that capacity against demand is where the value is. We may add our own capacity later once demand exceeds supply.
"Is this a crypto thing?"
No. No tokens. No blockchain. No wallet. Payments are USD via Stripe. Providers onboard via Stripe Connect and receive weekly bank transfers.
"What if there's no demand?"
At launch, there won't be much. We're seeding it ourselves with $10 free credits for new consumers, routing our own internal workloads through the network, and listing on OpenRouter for instant external demand. Early providers get priority routing via reputation score.
"What about latency through the coordinator?"
Target is <100ms overhead. The coordinator is on Cloudflare's edge (300+ cities). Phase 5 adds geographic routing so consumers in Europe get EU providers. Long-term roadmap includes P2P mode for latency-critical workloads.
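Geographic routing can be sketched as preferring a same-region provider and falling back to the lowest measured round-trip time. The provider fields (`region`, `rtt_ms`) are hypothetical shapes, not the coordinator's actual data model.

```python
def pick_provider(consumer_region: str, providers: list[dict]) -> dict:
    """Prefer same-region providers; otherwise take the lowest-latency one."""
    local = [p for p in providers if p["region"] == consumer_region]
    pool = local or providers            # fall back to the whole fleet
    return min(pool, key=lambda p: p["rtt_ms"])


fleet = [
    {"id": "eu-1", "region": "eu", "rtt_ms": 18},
    {"id": "us-1", "region": "us", "rtt_ms": 9},
]
```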
"Supply will eventually exceed demand."
Probably, for cheap models. Reputation-weighted routing means reliable, fast providers get priority. New providers can bring specialty models (e.g. fine-tunes, larger context) where supply is thinner. We also give providers the ability to set a price floor so the market doesn't race to zero.
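Reputation-weighted routing with a price floor can be sketched as a filtered, weighted draw. All names and fields here are hypothetical illustrations of the idea, not the production scheduler.

```python
import random


def route(providers: list[dict], price_floor: float,
          rng: random.Random) -> dict:
    """Drop bids below the floor, then sample weighted by reputation score."""
    eligible = [p for p in providers if p["price"] >= price_floor]
    weights = [p["reputation"] for p in eligible]
    return rng.choices(eligible, weights=weights, k=1)[0]
```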
"Why should I trust software you wrote on my machine?"
The provider agent is open source (Rust, auditable). You can build it from source. It only opens an outbound WebSocket — no inbound ports. It uses nvidia-smi and a local vLLM subprocess; nothing hidden.