HaioAI Gateway

A unified AI gateway supporting OpenRouter, OpenAI, Gemini, DeepSeek, Z.ai and custom providers — with API key management, usage tracking, and plan-based access control.

Overview

The gateway exposes an OpenAI-compatible API. Any client that works with OpenAI will work here — just change the base_url.

| Feature | Details |
| --- | --- |
| base_url | https://ai.haiocloud.com/v1 |
| Protocol | HTTPS only (HTTP redirects to HTTPS) |
| Format | JSON request & response, SSE for streaming |
| Auth | Bearer token or X-API-Key header |

Authentication

Include your API key in every request using one of these headers:

Authorization: Bearer sk-gw-xxxxxxxxxxxxxxxxxxxx

or

X-API-Key: sk-gw-xxxxxxxxxxxxxxxxxxxx
API keys are issued by your administrator via the Django admin panel at /admin/. Keys start with sk-gw-.
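
As a minimal sketch of the two header styles described above, the helper below builds either form. The function name `auth_headers` and the client-side prefix check are illustrative assumptions, not part of the gateway.

```python
def auth_headers(api_key: str, use_x_api_key: bool = False) -> dict[str, str]:
    """Return request headers for one of the two documented auth styles."""
    # Assumption: rejecting keys without the documented prefix early is helpful;
    # the gateway itself remains the authority on whether a key is valid.
    if not api_key.startswith("sk-gw-"):
        raise ValueError("gateway keys are expected to start with 'sk-gw-'")
    if use_x_api_key:
        return {"X-API-Key": api_key}
    return {"Authorization": f"Bearer {api_key}"}


print(auth_headers("sk-gw-xxxxxxxxxxxxxxxxxxxx"))
print(auth_headers("sk-gw-xxxxxxxxxxxxxxxxxxxx", use_x_api_key=True))
```

Either dict can be passed as the `headers` argument of whichever HTTP client you use.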

Base URL

https://ai.haiocloud.com/v1

OpenAI Compatibility

The gateway is a drop-in replacement for the OpenAI API. Set your client's base URL and use your gateway API key:

base_url  = "https://ai.haiocloud.com/v1"
api_key   = "sk-gw-your-key"
Model routing is automatic and based on the model name prefix: gpt-* → OpenAI, gemini-* → Gemini, deepseek-* → DeepSeek, and any other org/model name → OpenRouter. See Model Routing for the full prefix table.

POST /v1/chat/completions

Send a chat message and receive a completion. Compatible with the OpenAI Chat Completions API.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID. See /v1/models for available models. |
| messages | array | Yes | Array of message objects: [{"role":"user","content":"..."}] |
| stream | boolean | No | Set true to receive SSE token stream. Default: false |
| temperature | number | No | Sampling temperature (0–2). Default: model default |
| max_tokens | integer | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling probability |

All other OpenAI-compatible fields are forwarded as-is to the upstream provider.
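
The field table above could be mirrored on the client side roughly as follows. This is a sketch under stated assumptions: `build_chat_request` is a hypothetical helper, and the local checks (non-empty `messages`, temperature range) are a convenience rather than a description of the gateway's own validation.

```python
from typing import Any


def build_chat_request(
    model: str, messages: list[dict[str, str]], **options: Any
) -> dict[str, Any]:
    """Assemble a /v1/chat/completions body from required and optional fields."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        # Assumption: an empty conversation is treated as a client-side mistake.
        raise ValueError("messages must be a non-empty list")
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    body: dict[str, Any] = {"model": model, "messages": messages}
    # Omit options left as None so the model defaults apply upstream.
    body.update({key: value for key, value in options.items() if value is not None})
    return body


request_body = build_chat_request(
    "nex-agi/nex-n2-pro:free",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=None,
)
print(request_body)
```

Extra keyword arguments pass through untouched, which matches the statement that other OpenAI-compatible fields are forwarded as-is.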

Example Request

{
  "model": "nex-agi/nex-n2-pro:free",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7
}

Example Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "nex-agi/nex-n2-pro:free",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello! How can I help?"},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 9,
    "total_tokens": 29
  }
}

Streaming

Add "stream": true to receive tokens as Server-Sent Events (SSE) as they are generated.

POST /v1/chat/completions
Content-Type: application/json

{
  "model": "nex-agi/nex-n2-pro:free",
  "messages": [{"role": "user", "content": "Count to 5"}],
  "stream": true
}

Stream Response Format

data: {"id":"gen-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"1"},...}]}

data: {"id":"gen-...","object":"chat.completion.chunk","choices":[{"delta":{"content":","},...}]}

data: [DONE]
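
If you read the stream without an SDK, the format above can be parsed line by line. The sketch below assumes each event arrives as a single `data:` line and uses a hypothetical helper name, `iter_stream_text`; the sample chunks are simplified stand-ins for real gateway output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by SSE `data:` lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and SSE comments carry no payload
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Simplified sample events (the real chunks carry more fields).
sample = [
    'data: {"id":"gen-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"1"}}]}',
    "",
    'data: {"id":"gen-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":","}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints: 1,
```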

GET /v1/models

Returns the list of models available on your plan.

{
  "object": "list",
  "data": [
    {"id": "nex-agi/nex-n2-pro:free", "object": "model", "owned_by": "gateway"},
    {"id": "gpt-4o", "object": "model", "owned_by": "gateway"}
  ]
}
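
One way to use this response is to check that a model is available before sending a request. The helper name `available_model_ids` is an assumption for illustration; the input is the parsed JSON body shown above.

```python
def available_model_ids(models_response: dict) -> set[str]:
    """Collect the model IDs from a parsed /v1/models response."""
    return {
        entry["id"]
        for entry in models_response.get("data", [])
        if entry.get("object") == "model"
    }


models_response = {
    "object": "list",
    "data": [
        {"id": "nex-agi/nex-n2-pro:free", "object": "model", "owned_by": "gateway"},
        {"id": "gpt-4o", "object": "model", "owned_by": "gateway"},
    ],
}
ids = available_model_ids(models_response)
print("gpt-4o" in ids)
```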

GET /v1/usage

Returns your token usage for the current calendar month, broken down by model.

{
  "object": "usage",
  "period": {"start": "2026-06-01T00:00:00Z", "end": "2026-06-11T12:00:00Z"},
  "totals": {"input_tokens": 1500, "output_tokens": 800, "total_tokens": 2300},
  "plan_limits": {
    "monthly_input_tokens": 1000000,
    "monthly_output_tokens": 1000000,
    "allowed_models": ["*"]
  },
  "by_model": [
    {"model": "nex-agi/nex-n2-pro:free", "provider": "openrouter", "inp": 1500, "out": 800, "total": 2300}
  ]
}
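
A usage response like the one above can be turned into a remaining-quota figure. This sketch assumes that input and output limits are tracked separately and compared against `totals`; the exact enforcement rule is not stated here, so treat the result as an estimate. `remaining_quota` is a hypothetical helper name.

```python
def remaining_quota(usage: dict) -> dict[str, int]:
    """Estimate tokens left this month from a parsed /v1/usage response."""
    totals = usage["totals"]
    limits = usage["plan_limits"]
    return {
        # Clamp at zero so an exceeded quota never reports a negative remainder.
        "input_tokens": max(limits["monthly_input_tokens"] - totals["input_tokens"], 0),
        "output_tokens": max(limits["monthly_output_tokens"] - totals["output_tokens"], 0),
    }


usage = {
    "totals": {"input_tokens": 1500, "output_tokens": 800, "total_tokens": 2300},
    "plan_limits": {
        "monthly_input_tokens": 1000000,
        "monthly_output_tokens": 1000000,
        "allowed_models": ["*"],
    },
}
print(remaining_quota(usage))  # {'input_tokens': 998500, 'output_tokens': 999200}
```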

Python — OpenAI SDK

Install the OpenAI SDK: pip install openai

from openai import OpenAI

client = OpenAI(
    base_url="https://ai.haiocloud.com/v1",
    api_key="sk-gw-your-key-here",
)

# Non-streaming
response = client.chat.completions.create(
    model="nex-agi/nex-n2-pro:free",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="nex-agi/nex-n2-pro:free",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True,
)
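for chunk in stream:
    # Some upstream providers may send chunks with an empty choices list, so guard the access.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)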

JavaScript — OpenAI SDK

Install: npm install openai

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://ai.haiocloud.com/v1",
  apiKey: "sk-gw-your-key-here",
});

// Non-streaming
const response = await client.chat.completions.create({
  model: "nex-agi/nex-n2-pro:free",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: "nex-agi/nex-n2-pro:free",
  messages: [{ role: "user", content: "Count to 5" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

cURL Examples

Non-streaming

curl https://ai.haiocloud.com/v1/chat/completions \
  -H "Authorization: Bearer sk-gw-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nex-agi/nex-n2-pro:free",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Streaming

curl -N https://ai.haiocloud.com/v1/chat/completions \
  -H "Authorization: Bearer sk-gw-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nex-agi/nex-n2-pro:free",
    "messages": [{"role": "user", "content": "Count to 5"}],
    "stream": true
  }'

List Models

curl https://ai.haiocloud.com/v1/models \
  -H "Authorization: Bearer sk-gw-your-key-here"

Check Usage

curl https://ai.haiocloud.com/v1/usage \
  -H "Authorization: Bearer sk-gw-your-key-here"

Errors

All errors follow the OpenAI error shape:

{"error": {"message": "...", "type": "...", "code": "..."}}

| HTTP Status | code | Meaning |
| --- | --- | --- |
| 401 | invalid_api_key | Missing or invalid API key |
| 400 | invalid_json | Request body is not valid JSON |
| 400 | missing_model | The model field is required |
| 403 | model_not_allowed | Model not permitted on your plan |
| 429 | token_limit_exceeded | Monthly token quota reached |
| 502 | provider_error | Upstream provider returned an error |
| 500 | internal_error | Unexpected gateway error |
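
A client might wrap the error shape and codes above as sketched below. `GatewayError`, `parse_error`, and the choice of which codes count as retryable are all assumptions for illustration; the gateway does not prescribe a retry policy here.

```python
# Assumption: only upstream and gateway-side failures are worth retrying.
RETRYABLE_CODES = {"provider_error", "internal_error"}


class GatewayError(Exception):
    """Hypothetical exception type wrapping the documented error shape."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"HTTP {status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(status: int, body: dict) -> GatewayError:
    """Build a GatewayError from an HTTP status and a parsed JSON error body."""
    error = body.get("error") or {}
    return GatewayError(status, error.get("code", "unknown"), error.get("message", ""))


err = parse_error(
    429,
    {"error": {"message": "Monthly token quota reached", "type": "...", "code": "token_limit_exceeded"}},
)
print(err, err.retryable)
```

Quota and permission errors (429, 403) are deliberately left non-retryable in this sketch, since repeating the same request would not change the outcome until the plan or the month changes.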

Model Routing

| Model prefix | Provider | Example |
| --- | --- | --- |
| gpt-, o1-, o3-, o4-, chatgpt- | OpenAI | gpt-4o |
| gemini-, google/ | Google Gemini | gemini-1.5-pro |
| deepseek-, deepseek/ | DeepSeek | deepseek-chat |
| openrouter/ | OpenRouter (strips prefix) | openrouter/auto |
| org/model (any slash) | OpenRouter (as-is) | nex-agi/nex-n2-pro:free |
| zai/ | Z.ai | zai/model |
| anything else | OpenAI (fallback) | |
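
The routing table can be read as an ordered list of prefix checks. The sketch below is one possible interpretation, not the gateway's actual code: the evaluation order (specific prefixes before the generic "any slash" rule), the provider labels other than "openrouter", and the choice to pass google/, deepseek/ and zai/ names through unchanged are all assumptions.

```python
def route_model(model: str) -> tuple[str, str]:
    """Return (provider label, model name sent upstream) for a model ID."""
    if model.startswith("openrouter/"):
        # The table says this prefix is stripped before forwarding.
        return "openrouter", model[len("openrouter/"):]
    if model.startswith(("gpt-", "o1-", "o3-", "o4-", "chatgpt-")):
        return "openai", model
    if model.startswith(("gemini-", "google/")):
        return "gemini", model
    if model.startswith(("deepseek-", "deepseek/")):
        return "deepseek", model
    if model.startswith("zai/"):
        return "zai", model
    if "/" in model:
        # Any remaining org/model name goes to OpenRouter as-is.
        return "openrouter", model
    return "openai", model  # fallback for anything else


for name in ["gpt-4o", "openrouter/auto", "nex-agi/nex-n2-pro:free", "zai/model"]:
    print(name, "->", route_model(name))
```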