Sell your idle NVIDIA GPU.
Buy inference at half the price.
DarkGPU is a decentralized inference marketplace. Gamers, homelabbers, and data centers sell idle compute. Developers buy OpenAI-compatible inference for up to 60% less. Operators keep 95% of revenue. No crypto. Real USD via Stripe.
Consumer
     │
     │ HTTPS (OpenAI-compat)
     ▼
Coordinator ─────────┐
     │ WS            │ WS
     ▼               ▼
 Your GPU        Another GPU
(RTX 4090)         (A100)
Turn idle silicon into cash flow
Install the agent, pick a model, link your payout account. Your GPU serves inference when idle and pauses the moment you touch the machine. Keep 95% of every token served.
curl -fsSL https://darkgpu.ai/install.sh | bash
darkgpu auth
darkgpu serve
Drop-in OpenAI replacement at half the price
Change the base URL. Keep your existing code. Every request is end-to-end encrypted — the GPU owner never sees your prompts or your data.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.darkgpu.ai/v1",
    api_key=DGPU_API_KEY,
)
Estimate your earnings
We show three scenarios. The default is conservative because at launch, demand is lower than supply. If we showed you only the 80% utilization number, we'd be lying. Darkbloom's Hacker News critics were right, so we're not doing that.
Estimates net of electricity. Actual earnings depend on network demand, model popularity, and your provider reputation. We do not guarantee any specific earnings. At launch, expect the conservative tier.
Consumer pricing
Per-token pricing. No subscriptions. No minimums. 5% platform fee, the rest goes to the operator.
| Model | Input / 1M | Output / 1M | OpenRouter | Savings |
|---|---|---|---|---|
| Qwen3.5 7B (4-bit, fast general purpose) | $0.015 | $0.06 | $0.10 | 40% |
| Llama 3.3 8B (popular open baseline) | $0.02 | $0.08 | $0.15 | 47% |
| Gemma 4 12B (Google's latest) | $0.035 | $0.14 | $0.30 | 53% |
| Qwen3.5 27B (frontier reasoning) | $0.06 | $0.30 | $0.80 | 62% |
| Llama 3.3 70B (high quality) | $0.10 | $0.50 | $1.20 | 58% |
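The per-token math behind the table is straightforward to sanity-check. A minimal sketch: the rates are copied from the table above, and `estimate_cost` is a hypothetical helper for illustration, not part of any DarkGPU SDK.

```python
# Per-1M-token rates (USD), copied from the pricing table above.
RATES = {
    "qwen3.5-7b":    {"input": 0.015, "output": 0.06},
    "llama-3.3-8b":  {"input": 0.02,  "output": 0.08},
    "llama-3.3-70b": {"input": 0.10,  "output": 0.50},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed per-1M-token rates."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# A 2,000-token prompt with a 500-token reply on the 70B model:
cost = estimate_cost("llama-3.3-70b", 2_000, 500)  # $0.00045 per request
```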
OpenAI-compatible. Change one line.
Streaming, function calling, embeddings. Every SDK that talks to OpenAI talks to DarkGPU.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.darkgpu.ai/v1",
    api_key="dgpu-...",
)

response = client.chat.completions.create(
    model="qwen3.5-7b",
    messages=[{"role": "user", "content": "Hello from an idle 4090"}],
    stream=True,
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
One command. Five minutes to serving.
Linux with NVIDIA driver 535+ and CUDA 12+. Detects your GPU, picks a compatible model, connects to the coordinator.
$ curl -fsSL https://darkgpu.ai/install.sh | bash
$ darkgpu auth
$ darkgpu serve
The operator runs your inference. They can't read it.
Four layers. Each independently verifiable.
End-to-end encryption
Requests are encrypted with the provider's X25519 public key before leaving your device. The coordinator routes ciphertext. Only the provider process can decrypt.
GPU attestation
Each node registers a hardware fingerprint (GPU UUID, driver version, CUDA capabilities). Data-center nodes with H100/H200 GPUs use NVIDIA Confidential Computing. Consumer nodes use software attestation, with trust tiers exposed to the consumer.
Verified binary
Coordinator verifies the provider agent binary hash on every connect. Tampered binaries fail attestation and don't receive routing.
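The binary check amounts to hashing the agent executable and comparing against an expected digest. A minimal stdlib sketch; where the expected hash comes from, and the function names, are assumptions for illustration.

```python
import hashlib


def binary_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of the agent binary, streamed so large files stay cheap."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def attest_binary(path: str, expected: str) -> bool:
    """Connect-time check: a tampered binary fails and gets no routing."""
    return binary_digest(path) == expected
```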
Public audit log
Provider attestation is queryable at GET /v1/providers/:id/attestation. Trust tier (basic vs verified) is returned on every inference response so consumers can filter.
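Acting on that trust tier is ordinary JSON handling on the consumer side. A sketch under assumptions: the response shape below (a list of providers carrying a `trust_tier` field) is illustrative, not the documented schema.

```python
import json


def verified_only(attestations_json: str) -> list[str]:
    """Return provider IDs whose trust tier is 'verified'.

    Assumes entries shaped like {"id": ..., "trust_tier": "basic"|"verified"}.
    """
    providers = json.loads(attestations_json)
    return [p["id"] for p in providers if p.get("trust_tier") == "verified"]


sample = json.dumps([
    {"id": "prov-a", "trust_tier": "verified"},
    {"id": "prov-b", "trust_tier": "basic"},
])
```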
The questions HN asked about Darkbloom
"Earnings numbers are too good to be true."
They often are on platforms like this. We show three scenarios: 10% utilization (conservative, what you should expect at launch), 30% (moderate, realistic once the network has demand), and 80% (optimistic, only when demand exceeds supply for your model). The default view is conservative.
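The three scenarios reduce to utilization multipliers over a per-busy-hour rate, minus power. A hedged sketch: the $/hour and electricity figures below are made up for illustration, not quoted rates.

```python
# Utilization tiers from the scenarios above.
SCENARIOS = {"conservative": 0.10, "moderate": 0.30, "optimistic": 0.80}


def monthly_earnings(rate_per_busy_hour: float, electricity_per_month: float,
                     idle_hours_per_day: float = 20.0) -> dict[str, float]:
    """Net monthly USD per scenario: utilization x idle hours x rate, minus power."""
    return {
        name: round(util * idle_hours_per_day * 30 * rate_per_busy_hour
                    - electricity_per_month, 2)
        for name, util in SCENARIOS.items()
    }


# e.g. a card earning $0.25 per fully busy hour, $15/month electricity:
est = monthly_earnings(0.25, 15.0)
```

Note how the conservative tier is roughly break-even at these illustrative numbers, which is exactly why it is the default view.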
"Will this fry my GPU or kill my SSD?"
NVIDIA GPUs are designed for sustained compute — data center cards run at 100% for years. We auto-throttle at 85°C. Model weights are loaded once per model into VRAM; there's minimal sustained SSD write traffic. The agent logs disk I/O per model so you can see exactly what's happening.
"Why not buy GPUs yourself if the ROI is so good?"
Capital efficiency. We build and own the platform, not the hardware. This is the Airbnb/Uber play — the hardware already exists, idle, in millions of homes and small data centers. Deploying that capacity against demand is where the value is. We may add our own capacity later once demand exceeds supply.
"Is this a crypto thing?"
No. No tokens. No blockchain. No wallet. Payments are USD via Stripe. Providers onboard via Stripe Connect and receive weekly bank transfers.
"What if there's no demand?"
At launch, there won't be much. We're seeding it ourselves with $10 free credits for new consumers, routing our own internal workloads through the network, and listing on OpenRouter for instant external demand. Early providers get priority routing via reputation score.
"What about latency through the coordinator?"
Target is <100ms overhead. The coordinator is on Cloudflare's edge (300+ cities). Phase 5 adds geographic routing so consumers in Europe get EU providers. Long-term roadmap includes P2P mode for latency-critical workloads.
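Geographic routing can be sketched as preferring a same-region provider and falling back to the lowest measured round-trip time. The provider fields (`region`, `rtt_ms`) are hypothetical shapes, not the coordinator's actual data model.

```python
def pick_provider(consumer_region: str, providers: list[dict]) -> dict:
    """Prefer same-region providers; otherwise take the lowest-latency one."""
    local = [p for p in providers if p["region"] == consumer_region]
    pool = local or providers            # fall back to the whole fleet
    return min(pool, key=lambda p: p["rtt_ms"])


fleet = [
    {"id": "eu-1", "region": "eu", "rtt_ms": 18},
    {"id": "us-1", "region": "us", "rtt_ms": 9},
]
```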
"Supply will eventually exceed demand."
Probably, for cheap models. Reputation-weighted routing means reliable, fast providers get priority. New providers can bring specialty models (e.g. fine-tunes, larger context) where supply is thinner. We also give providers the ability to set a price floor so the market doesn't race to zero.
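Reputation-weighted routing with a price floor can be sketched as a filtered, weighted draw. All names and fields here are hypothetical illustrations of the idea, not the production scheduler.

```python
import random


def route(providers: list[dict], price_floor: float,
          rng: random.Random) -> dict:
    """Drop bids below the floor, then sample weighted by reputation score."""
    eligible = [p for p in providers if p["price"] >= price_floor]
    weights = [p["reputation"] for p in eligible]
    return rng.choices(eligible, weights=weights, k=1)[0]
```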
"Why should I trust software you wrote on my machine?"
The provider agent is open source (Rust, auditable). You can build it from source. It only opens an outbound WebSocket — no inbound ports. It uses nvidia-smi and a local vLLM subprocess; nothing hidden.