Error Codes
All error responses follow a consistent format with an HTTP status code and a JSON body.
Error Response Format
{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description",
    "details": {}
  }
}

Error Codes
| HTTP Status | Code | Description |
|---|---|---|
| 400 | VALIDATION_ERROR | Request body failed validation. Check the details field for specific field errors. |
| 401 | UNAUTHORIZED | Missing or invalid API key. |
| 403 | FORBIDDEN | Your account does not have permission for this action. |
| 404 | NOT_FOUND | The requested resource does not exist. |
| 409 | CONFLICT | The request conflicts with current state (e.g., duplicate resource). |
| 413 | PAYLOAD_TOO_LARGE | Request body exceeds the 1 MB size limit. |
| 422 | INVALID_JSON | Request body is not valid JSON. |
| 429 | RATE_LIMITED | Too many requests. See Rate Limits. |
| 500 | INTERNAL_ERROR | Unexpected server error. |
| 503 | SERVICE_UNAVAILABLE | No nodes available to serve the requested model. |
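If you call the API without an SDK, the error body can be parsed into a small typed object. The following is a minimal sketch, not part of the API: `ApiError`, `parse_error`, and `RETRYABLE_CODES` are hypothetical names, and treating `RATE_LIMITED`, `INTERNAL_ERROR`, and `SERVICE_UNAVAILABLE` as retryable is an assumption rather than documented behavior.

```python
import json
from dataclasses import dataclass, field

# Assumption: these codes from the table are transient and worth retrying.
RETRYABLE_CODES = {"RATE_LIMITED", "INTERNAL_ERROR", "SERVICE_UNAVAILABLE"}


@dataclass
class ApiError:
    """Hypothetical container for the documented error envelope."""
    status: int
    code: str
    message: str
    details: dict = field(default_factory=dict)

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(status: int, body: str) -> ApiError:
    """Parse {"error": {"code", "message", "details"}} from a response body."""
    try:
        payload = json.loads(body)
        err = payload.get("error") if isinstance(payload, dict) else None
    except json.JSONDecodeError:
        err = None

    # Fall back to a generic error if the body is not the documented shape
    # (for example, an HTML page returned by an intermediate proxy).
    if not isinstance(err, dict):
        return ApiError(status, "UNKNOWN", body[:200])

    return ApiError(
        status=status,
        code=str(err.get("code", "UNKNOWN")),
        message=str(err.get("message", "")),
        details=err.get("details") or {},
    )
```

Branching on `code` rather than on the HTTP status alone keeps the handling stable when several codes share a status class.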
Handling Errors
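from openai import OpenAI, APIConnectionError, APIStatusError

client = OpenAI(
    api_key="sk-infer-YOUR_KEY",
    base_url="https://inferexchange.com/api/v1",
)

try:
    response = client.chat.completions.create(
        model="llama-3.1-8b",
        messages=[{"role": "user", "content": "Hello"}],
    )
except APIStatusError as e:
    # The API returned a non-2xx response; status_code exists only on this subclass.
    print(f"Error {e.status_code}: {e.message}")
except APIConnectionError as e:
    # The request never reached the server, so there is no status code.
    print(f"Connection error: {e}")

Rate Limit Headers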
When a request is rate limited (429), the response includes headers that indicate when you can retry:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1708300060
Retry-After: 30
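These headers can be turned into a wait time before retrying. The sketch below is illustrative only: `retry_delay` is a hypothetical helper, and it assumes `Retry-After` is a number of seconds and `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example values suggest but this page does not state.

```python
import time
from typing import Mapping, Optional


def retry_delay(
    headers: Mapping[str, str],
    now: Optional[float] = None,
    default: float = 1.0,
) -> float:
    """Return how many seconds to wait before retrying a 429 response."""
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalize them first.
    lowered = {k.lower(): v for k, v in headers.items()}

    # Prefer Retry-After (assumed to be a delay in seconds).
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Not numeric (e.g. an HTTP date); try the reset header.

    # Otherwise use X-RateLimit-Reset (assumed to be a Unix timestamp).
    reset = lowered.get("x-ratelimit-reset")
    if reset is not None:
        try:
            return max(0.0, float(reset) - now)
        except ValueError:
            pass

    # Neither header was usable; fall back to a caller-supplied default.
    return default
```

A caller would typically sleep for the returned value, then retry the request a bounded number of times.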