Earnings

Understand how node operators earn INFER tokens and manage withdrawals.

How Earnings Work

  1. A developer sends an inference request to the INFER API
  2. The network routes the request to an available node based on model, latency, and load
  3. Your node processes the request and returns the response
  4. The fee is calculated from the total number of prompt and completion tokens processed
  5. You receive 90% of the fee; 10% goes to the INFER network
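The split in steps 4 and 5 can be sketched as a small calculation. This is a minimal illustration, not the network's actual billing code: the function name `split_fee` and the example rate of $0.50 per 1M tokens are assumptions chosen for the example, and amounts are kept in USD because the source does not state how fees convert to INFER tokens.

```python
from decimal import Decimal

# 90% operator share, as described in step 5 above.
OPERATOR_SHARE = Decimal("0.90")


def split_fee(prompt_tokens: int, completion_tokens: int, rate_per_million: Decimal):
    """Return (total_fee, operator_part, network_part) in USD for one request.

    Hypothetical helper: the fee is assumed to be linear in the total
    number of prompt + completion tokens.
    """
    total_tokens = prompt_tokens + completion_tokens
    fee = rate_per_million * total_tokens / Decimal(1_000_000)
    operator_part = fee * OPERATOR_SHARE
    network_part = fee - operator_part  # the remaining 10%
    return fee, operator_part, network_part


# Example request: 1,200 prompt tokens and 800 completion tokens
# at an illustrative rate of $0.50 per 1M tokens.
fee, operator_part, network_part = split_fee(1_200, 800, Decimal("0.50"))
print(f"fee={fee} operator={operator_part} network={network_part}")
```

`Decimal` is used instead of `float` so that very small per-request amounts add up without binary rounding drift.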

Revenue Share

| Party | Share |
| --- | --- |
| Node Operator | 90% |
| INFER Network | 10% |

Pricing

Earnings depend on which model you’re serving:

| Model | Rate (per 1M tokens) | Your Share (90%) |
| --- | --- | --- |
| Llama 3.1 8B | $0.10 | $0.09 |
| Llama 3.1 70B | $0.50 | $0.45 |
| Mixtral 8x22B | $0.60 | $0.54 |
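To see how these rates translate into earnings for a mixed workload, the sketch below applies the 90% share to each model's rate. The rates are copied from the pricing table; the helper `operator_earnings` and the token volumes in `workload` are hypothetical values chosen only for illustration.

```python
from decimal import Decimal

# Rates in USD per 1M tokens, copied from the pricing table.
RATES = {
    "Llama 3.1 8B": Decimal("0.10"),
    "Llama 3.1 70B": Decimal("0.50"),
    "Mixtral 8x22B": Decimal("0.60"),
}
OPERATOR_SHARE = Decimal("0.90")


def operator_earnings(tokens_by_model: dict[str, int]) -> dict[str, Decimal]:
    """Map each model name to the operator's USD earnings for the tokens served."""
    return {
        model: RATES[model] * OPERATOR_SHARE * tokens / Decimal(1_000_000)
        for model, tokens in tokens_by_model.items()
    }


# Hypothetical workload: total tokens served per model over some period.
workload = {"Llama 3.1 8B": 250_000_000, "Mixtral 8x22B": 40_000_000}
earnings = operator_earnings(workload)
total = sum(earnings.values())

for model, usd in earnings.items():
    print(f"{model}: ${usd:.2f}")
print(f"Total: ${total:.2f}")
```

In this example the smaller model earns about as much as the larger one only because it serves several times more tokens, which is why the per-model rate matters when choosing what to serve.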

Estimated Monthly Earnings

| Hardware | Est. Monthly Earnings |
| --- | --- |
| 1x RTX 4090 | $500 – $800 |
| 4x A100 40GB | $3,000 – $5,000 |
| 8x H100 80GB | $10,000 – $15,000 |

These estimates are based on current network demand. Actual earnings vary with utilization.
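Because earnings scale with tokens served, a monthly target can be turned into a required token volume. The sketch below does that arithmetic for one case. The source does not say which models or utilization levels the estimates assume, so the choice of the Llama 3.1 70B share ($0.45 per 1M tokens), the $500 target, the 30-day month, and both helper names are assumptions for illustration.

```python
from decimal import Decimal


def millions_of_tokens_needed(target_usd: Decimal, share_per_million: Decimal) -> Decimal:
    """Tokens (in millions) that must be served to earn target_usd."""
    return target_usd / share_per_million


def average_tokens_per_second(millions_of_tokens: Decimal, days: int = 30) -> Decimal:
    """Sustained throughput needed to serve that volume over the given number of days."""
    seconds = Decimal(days * 24 * 60 * 60)
    return millions_of_tokens * Decimal(1_000_000) / seconds


# Assumed case: $500 per month at the Llama 3.1 70B operator share.
millions = millions_of_tokens_needed(Decimal("500"), Decimal("0.45"))
tps = average_tokens_per_second(millions)
print(f"{millions:.0f}M tokens per month, about {tps:.0f} tokens/s on average")
```

The throughput figure is an average over the whole month, so any downtime or idle period raises the rate needed while the node is actually serving requests.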

Viewing Earnings

Dashboard

Navigate to Operator → Earnings to see:

  • Total lifetime earnings
  • Earnings breakdown by day/week/month
  • Per-model earnings breakdown
  • Request count and average latency

Desktop App

The INFER desktop app shows:

  • Real-time earnings ticker
  • Native notifications for new earnings
  • System tray tooltip with today’s earnings

Live View

For active monitoring, use Operator → Live Earnings to see:

  • Real-time request stream
  • Token counts per request
  • Earnings per request
  • Live latency metrics

Withdrawals

Navigate to Operator → Earnings → Withdraw to request a withdrawal of accumulated tokens.

Minimum withdrawal: 100 INFER tokens.

Maximizing Earnings

  • Uptime: Keep your node online 24/7 for maximum utilization
  • Multiple models: Serve multiple models to handle diverse traffic
  • Low latency: Optimize your setup for fast inference
  • Good hardware: Faster GPUs serve more requests per hour