Earnings
Understand how node operators earn INFER tokens and manage withdrawals.
How Earnings Work
1. A developer sends an inference request to the INFER API
2. The network routes the request to an available node based on model, latency, and load
3. Your node processes the request and returns the response
4. The fee is calculated from the number of prompt + completion tokens processed
5. You receive 90% of the fee; 10% goes to the INFER network
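The steps above boil down to a simple per-request fee calculation. Here is a minimal sketch: the rates mirror the pricing table below, and the function and constant names are illustrative, not part of the INFER API.

```python
# Illustrative earnings calculation for a single inference request.
# Rates are USD per 1M tokens (from the pricing table below); the 90/10
# split is the network revenue share. All names are hypothetical.

RATES_PER_1M = {
    "llama-3.1-8b": 0.10,
    "llama-3.1-70b": 0.50,
    "mixtral-8x22b": 0.60,
}

OPERATOR_SHARE = 0.90  # node operator keeps 90%; 10% goes to the network


def request_earnings(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the operator's earnings in USD for one request."""
    total_tokens = prompt_tokens + completion_tokens
    fee = total_tokens / 1_000_000 * RATES_PER_1M[model]
    return fee * OPERATOR_SHARE


# Example: 2,000 prompt + 500 completion tokens on Llama 3.1 70B
print(round(request_earnings("llama-3.1-70b", 2_000, 500), 6))  # 0.001125
```

Individual requests earn fractions of a cent; the totals come from sustained request volume.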
Revenue Share
| Party | Share |
|---|---|
| Node Operator | 90% |
| INFER Network | 10% |
Pricing
Earnings depend on which model you’re serving:
| Model | Rate (per 1M tokens) | Your Share (90%) |
|---|---|---|
| Llama 3.1 8B | $0.10 | $0.09 |
| Llama 3.1 70B | $0.50 | $0.45 |
| Mixtral 8x22B | $0.60 | $0.54 |
Estimated Monthly Earnings
| Hardware | Est. Monthly |
|---|---|
| 1x RTX 4090 | $500 – $800 |
| 4x A100 40GB | $3,000 – $5,000 |
| 8x H100 80GB | $10,000 – $15,000 |
Estimates based on current network demand. Actual earnings vary with utilization.
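You can sanity-check these figures against the pricing table by working backwards from a target. The sketch below assumes a single model at a fixed operator rate ($0.09 per 1M tokens for Llama 3.1 8B after the 90% split); real traffic mixes models and fluctuates.

```python
# Back-of-envelope: tokens per month needed to hit an earnings target,
# assuming one model at a fixed operator rate. Illustrative only.

def tokens_for_target(monthly_usd: float, operator_rate_per_1m: float) -> float:
    """Tokens/month needed to earn `monthly_usd` at the given per-1M rate."""
    return monthly_usd / operator_rate_per_1m * 1_000_000


# $500/month serving Llama 3.1 8B at $0.09 per 1M tokens to the operator
print(f"{tokens_for_target(500, 0.09):.2e}")  # roughly 5.6 billion tokens
```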
Viewing Earnings
Dashboard
Navigate to Operator → Earnings to see:
- Total lifetime earnings
- Earnings breakdown by day/week/month
- Per-model earnings breakdown
- Request count and average latency
Desktop App
The INFER desktop app shows:
- Real-time earnings ticker
- Native notifications for new earnings
- System tray tooltip with today’s earnings
Live View
For active monitoring, use Operator → Live Earnings to see:
- Real-time request stream
- Token counts per request
- Earnings per request
- Live latency metrics
Withdrawals
Navigate to Operator → Earnings → Withdraw to request a withdrawal of accumulated tokens.
Minimum withdrawal: 100 INFER tokens.
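A client-side guard can catch requests that would be rejected. In this minimal sketch, the 100 INFER minimum comes from the docs; the function name and parameters are hypothetical, not an INFER API.

```python
# Hypothetical pre-flight check before submitting a withdrawal request.
# The 100 INFER minimum is from the docs; everything else is illustrative.

MIN_WITHDRAWAL_INFER = 100


def can_withdraw(balance_infer: float, amount_infer: float) -> bool:
    """True if `amount_infer` meets the minimum and does not exceed the balance."""
    return MIN_WITHDRAWAL_INFER <= amount_infer <= balance_infer


print(can_withdraw(balance_infer=250, amount_infer=100))  # True
print(can_withdraw(balance_infer=250, amount_infer=50))   # False: below minimum
```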
Maximizing Earnings
- Uptime: Keep your node online 24/7 for maximum utilization
- Multiple models: Serve multiple models to handle diverse traffic
- Low latency: Optimize your setup for fast inference
- Good hardware: Faster GPUs serve more requests per hour