Error Codes

All error responses follow a consistent format with an HTTP status code and a JSON body.

Error Response Format


{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description",
    "details": {}
  }
}
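
A client can turn this envelope into a typed object before deciding what to do with it. The following is a minimal sketch, not part of any official SDK: the `ApiError` class and the `parse_error_body` helper are hypothetical names, and the fallback values for missing fields are assumptions.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ApiError:
    """Parsed form of the error envelope (hypothetical helper type)."""
    code: str
    message: str
    details: dict[str, Any] = field(default_factory=dict)


def parse_error_body(body: str) -> ApiError:
    """Parse a JSON error body into an ApiError.

    Raises ValueError if the body is not valid JSON or has no "error" object.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError("error body is not valid JSON") from exc

    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        raise ValueError('error body has no "error" object')

    return ApiError(
        # Assumption: fall back to placeholder values if a field is absent.
        code=error.get("code", "UNKNOWN"),
        message=error.get("message", ""),
        details=error.get("details") or {},
    )
```

For example, `parse_error_body(response_text).code` gives the machine-readable code, which is usually a more stable thing to branch on than the message text.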

Error Codes

HTTP Status	Code	Description
400	`VALIDATION_ERROR`	Request body failed validation. Check the `details` field for specific field errors.
401	`UNAUTHORIZED`	Missing or invalid API key.
403	`FORBIDDEN`	Your account does not have permission for this action.
404	`NOT_FOUND`	The requested resource does not exist.
409	`CONFLICT`	The request conflicts with current state (e.g., duplicate resource).
413	`PAYLOAD_TOO_LARGE`	Request body exceeds the 1 MB size limit.
422	`INVALID_JSON`	Request body is not valid JSON.
429	`RATE_LIMITED`	Too many requests. See Rate Limits.
500	`INTERNAL_ERROR`	Unexpected server error.
503	`SERVICE_UNAVAILABLE`	No nodes available to serve the requested model.
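
One way to act on this table is to separate errors that might succeed on a later attempt from errors that require a changed request. The split below is an assumption based on the descriptions in the table, not documented API behavior, and `RETRYABLE_CODES` and `is_retryable` are hypothetical names.

```python
# Assumption: these codes describe transient conditions (rate limiting,
# server faults, no available nodes). The API does not document this split.
RETRYABLE_CODES = frozenset({
    "RATE_LIMITED",         # 429
    "INTERNAL_ERROR",       # 500
    "SERVICE_UNAVAILABLE",  # 503
})


def is_retryable(code: str) -> bool:
    """Return True if an error code is likely worth retrying unchanged."""
    return code in RETRYABLE_CODES
```

Codes outside this set, such as `VALIDATION_ERROR` or `UNAUTHORIZED`, generally will not change on retry until the request or credentials are fixed.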

Handling Errors


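from openai import OpenAI, APIConnectionError, APIStatusError

client = OpenAI(
    api_key="sk-infer-YOUR_KEY",
    base_url="https://inferexchange.com/api/v1"
)

try:
    response = client.chat.completions.create(
        model="llama-3.1-8b",
        messages=[{"role": "user", "content": "Hello"}]
    )
except APIStatusError as e:
    # The API returned a non-2xx response. Only status errors carry status_code.
    print(f"Error {e.status_code}: {e.message}")
except APIConnectionError as e:
    # The request never got a response, so there is no status code to report.
    print(f"Connection error: {e}")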

Rate Limit Headers

When a request is rate limited (HTTP 429), the response includes headers that indicate when you can retry:


X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1708300060
Retry-After: 30
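
A client can derive a wait time from these headers before retrying. This is a rough sketch under stated assumptions: `Retry-After` is read as a number of seconds, `X-RateLimit-Reset` is treated as a Unix timestamp in seconds (the sample value suggests this, but it is not confirmed here), and the 1-second fallback and the `retry_delay_seconds` name are hypothetical.

```python
import time
from typing import Mapping, Optional


def retry_delay_seconds(
    headers: Mapping[str, str], now: Optional[float] = None
) -> float:
    """Return how many seconds to wait before retrying a 429 response.

    Header names are looked up exactly as written above; pass a
    case-insensitive mapping if your HTTP client lowercases them.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Not a plain number of seconds; try the reset header.

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            current = time.time() if now is None else now
            # Assumption: the reset value is a Unix timestamp in seconds.
            return max(0.0, float(reset) - current)
        except ValueError:
            pass

    # Assumption: a hypothetical default when neither header is usable.
    return 1.0
```

With the sample headers above, `retry_delay_seconds({"Retry-After": "30"})` returns `30.0`. Preferring `Retry-After` avoids depending on the local clock agreeing with the server's.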