HaioAI Gateway
A unified AI gateway supporting OpenRouter, OpenAI, Gemini, DeepSeek, Z.ai and custom providers — with API key management, usage tracking, and plan-based access control.
Overview
The gateway exposes an OpenAI-compatible API. A client written for the OpenAI API should work unchanged once its base_url points at the gateway.
| Feature | Details |
|---|---|
| base_url | https://ai.haiocloud.com/v1 |
| Protocol | HTTPS only (HTTP redirects to HTTPS) |
| Format | JSON request & response, SSE for streaming |
| Auth | Bearer token or X-API-Key header |
Authentication
Include your API key in every request using one of these headers:
Authorization: Bearer sk-gw-xxxxxxxxxxxxxxxxxxxx
or
X-API-Key: sk-gw-xxxxxxxxxxxxxxxxxxxx
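The two header options above can be sketched as a small helper. This is a minimal illustration, not part of the gateway itself; the function name `auth_headers` and its `use_x_api_key` flag are hypothetical.

```python
def auth_headers(api_key: str, use_x_api_key: bool = False) -> dict[str, str]:
    """Return the HTTP header that carries the gateway API key.

    Hypothetical helper: by default it builds the Bearer form, and with
    use_x_api_key=True it builds the X-API-Key form instead.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if use_x_api_key:
        return {"X-API-Key": api_key}
    return {"Authorization": f"Bearer {api_key}"}


if __name__ == "__main__":
    print(auth_headers("sk-gw-your-key"))
    print(auth_headers("sk-gw-your-key", use_x_api_key=True))
```

Either dictionary can be merged into the headers of any HTTP client call.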
Gateway API keys begin with sk-gw-.
Base URL
OpenAI Compatibility
The gateway is a drop-in replacement for the OpenAI API. Set your client's base URL and use your gateway API key:
base_url = "https://ai.haiocloud.com/v1"
api_key = "sk-gw-your-key"
Requests are routed by model name: gpt-* → OpenAI, gemini-* → Gemini, deepseek-* → DeepSeek, org/model → OpenRouter.
POST /v1/chat/completions
Send a chat message and receive a completion. Compatible with the OpenAI Chat Completions API.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID. See /v1/models for available models. |
| messages | array | Yes | Array of message objects: [{"role":"user","content":"..."}] |
| stream | boolean | No | Set true to receive SSE token stream. Default: false |
| temperature | number | No | Sampling temperature (0–2). Default: model default |
| max_tokens | integer | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling probability |
All other OpenAI-compatible fields are forwarded as-is to the upstream provider.
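As a rough sketch of how a client might assemble a body from the fields in the table, the helper below checks the two required fields and the documented temperature range before building the dictionary. The name `build_chat_request` is hypothetical, and rejecting an empty messages list is a client-side choice made here, not a documented gateway rule.

```python
from typing import Any, Optional


def build_chat_request(
    model: str,
    messages: list[dict[str, Any]],
    *,
    stream: bool = False,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
    **extra: Any,
) -> dict[str, Any]:
    """Build a JSON-serializable body for POST /v1/chat/completions."""
    if not model:
        raise ValueError("model is required")
    if not isinstance(messages, list) or not messages:
        # Assumption: an empty list is treated as a client-side error.
        raise ValueError("messages must be a non-empty list")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    body: dict[str, Any] = {"model": model, "messages": messages}
    if stream:
        body["stream"] = True
    # Optional fields are omitted when unset so the model default applies.
    for name, value in (
        ("temperature", temperature),
        ("max_tokens", max_tokens),
        ("top_p", top_p),
    ):
        if value is not None:
            body[name] = value
    # Any other OpenAI-compatible field is passed through untouched.
    body.update(extra)
    return body
```

Leaving unset options out of the body, rather than sending null, keeps the request closest to what the table describes as the default behavior.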
Example Request
{
  "model": "nex-agi/nex-n2-pro:free",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7
}
Example Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "nex-agi/nex-n2-pro:free",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello! How can I help?"},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 9,
    "total_tokens": 29
  }
}
Streaming
Add "stream": true to receive tokens as Server-Sent Events (SSE) as they are generated.
POST /v1/chat/completions
Content-Type: application/json

{
  "model": "nex-agi/nex-n2-pro:free",
  "messages": [{"role": "user", "content": "Count to 5"}],
  "stream": true
}
Stream Response Format
data: {"id":"gen-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"1"},...}]}
data: {"id":"gen-...","object":"chat.completion.chunk","choices":[{"delta":{"content":","},...}]}
data: [DONE]
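For clients that read the stream without an SDK, the format above can be decoded roughly as follows. This is a sketch that assumes each event arrives as a single `data:` line, as in the example; the function name `iter_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the content fragments from a sequence of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # A delta may be missing, null, or carry no content.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    sample = [
        'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"1"}}]}',
        "",
        'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":","}}]}',
        "",
        "data: [DONE]",
    ]
    print("".join(iter_stream_text(sample)))
```

The function takes any iterable of decoded text lines, so it can be fed from a file, a test fixture, or the line iterator of an HTTP client.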
GET /v1/models
Returns the list of models available on your plan.
{
  "object": "list",
  "data": [
    {"id": "nex-agi/nex-n2-pro:free", "object": "model", "owned_by": "gateway"},
    {"id": "gpt-4o", "object": "model", "owned_by": "gateway"}
  ]
}
GET /v1/usage
Returns your token usage for the current calendar month, broken down by model.
{
  "object": "usage",
  "period": {"start": "2026-06-01T00:00:00Z", "end": "2026-06-11T12:00:00Z"},
  "totals": {"input_tokens": 1500, "output_tokens": 800, "total_tokens": 2300},
  "plan_limits": {
    "monthly_input_tokens": 1000000,
    "monthly_output_tokens": 1000000,
    "allowed_models": ["*"]
  },
  "by_model": [
    {"model": "nex-agi/nex-n2-pro:free", "provider": "openrouter", "inp": 1500, "out": 800, "total": 2300}
  ]
}
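One possible use of this response is to estimate how much of the monthly quota is left. The sketch below assumes that the totals are counted against the matching plan_limits values, which the response shape suggests but the text does not state outright; `remaining_tokens` is a hypothetical name.

```python
def remaining_tokens(usage: dict) -> dict[str, int]:
    """Return the input and output tokens left in the current month.

    Assumption: totals.input_tokens counts against
    plan_limits.monthly_input_tokens, and likewise for output tokens.
    """
    totals = usage["totals"]
    limits = usage["plan_limits"]
    return {
        # Clamp at zero so an overrun never shows as a negative balance.
        "input": max(limits["monthly_input_tokens"] - totals["input_tokens"], 0),
        "output": max(limits["monthly_output_tokens"] - totals["output_tokens"], 0),
    }


if __name__ == "__main__":
    example = {
        "totals": {"input_tokens": 1500, "output_tokens": 800, "total_tokens": 2300},
        "plan_limits": {
            "monthly_input_tokens": 1000000,
            "monthly_output_tokens": 1000000,
            "allowed_models": ["*"],
        },
    }
    print(remaining_tokens(example))  # {'input': 998500, 'output': 999200}
```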
Python — OpenAI SDK
Install the OpenAI SDK: pip install openai
from openai import OpenAI

# Point the standard OpenAI client at the gateway.
client = OpenAI(
    base_url="https://ai.haiocloud.com/v1",
    api_key="sk-gw-your-key-here",
)

# Non-streaming
response = client.chat.completions.create(
    model="nex-agi/nex-n2-pro:free",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="nex-agi/nex-n2-pro:free",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True,
)
for chunk in stream:
    # A chunk may arrive with an empty choices list; skip it.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
print()
JavaScript — OpenAI SDK
Install: npm install openai
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://ai.haiocloud.com/v1",
  apiKey: "sk-gw-your-key-here",
});

// Non-streaming
const response = await client.chat.completions.create({
  model: "nex-agi/nex-n2-pro:free",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: "nex-agi/nex-n2-pro:free",
  messages: [{ role: "user", content: "Count to 5" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
cURL Examples
Non-streaming
curl https://ai.haiocloud.com/v1/chat/completions \
  -H "Authorization: Bearer sk-gw-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nex-agi/nex-n2-pro:free",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Streaming
curl -N https://ai.haiocloud.com/v1/chat/completions \
  -H "Authorization: Bearer sk-gw-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nex-agi/nex-n2-pro:free",
    "messages": [{"role": "user", "content": "Count to 5"}],
    "stream": true
  }'
List Models
curl https://ai.haiocloud.com/v1/models \
-H "Authorization: Bearer sk-gw-your-key-here"
Check Usage
curl https://ai.haiocloud.com/v1/usage \
-H "Authorization: Bearer sk-gw-your-key-here"
Errors
All errors follow the OpenAI error shape:
{"error": {"message": "...", "type": "...", "code": "..."}}
| HTTP Status | code | Meaning |
|---|---|---|
| 401 | invalid_api_key | Missing or invalid API key |
| 400 | invalid_json | Request body is not valid JSON |
| 400 | missing_model | The model field is required |
| 403 | model_not_allowed | Model not permitted on your plan |
| 429 | token_limit_exceeded | Monthly token quota reached |
| 502 | provider_error | Upstream provider returned an error |
| 500 | internal_error | Unexpected gateway error |
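A client can turn the error shape above into an exception and decide whether a retry makes sense. The sketch below is illustrative only: `GatewayError`, `parse_error`, and the choice of which codes count as retryable are assumptions, not documented gateway behavior.

```python
import json
from typing import Optional

# Assumption: only upstream and gateway failures are treated as transient.
# token_limit_exceeded is a monthly quota, so retrying soon is unlikely to help.
RETRYABLE_CODES = {"provider_error", "internal_error"}


class GatewayError(Exception):
    """Hypothetical exception wrapping a gateway error response."""

    def __init__(self, status: int, code: Optional[str],
                 message: str, error_type: Optional[str]) -> None:
        super().__init__(f"HTTP {status} [{code}]: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.error_type = error_type

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(status: int, body: str) -> GatewayError:
    """Build a GatewayError from an HTTP status and a raw response body."""
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None  # body was not JSON at all
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}
    # Fall back to the raw body when no message field is present.
    return GatewayError(status, err.get("code"),
                        err.get("message", body), err.get("type"))
```

Keying the retry decision on the code field rather than the HTTP status keeps the two 400 cases and the 429 quota case from being retried by accident.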
Model Routing
| Model prefix | Provider | Example |
|---|---|---|
| gpt-, o1-, o3-, o4-, chatgpt- | OpenAI | gpt-4o |
| gemini-, google/ | Google Gemini | gemini-1.5-pro |
| deepseek-, deepseek/ | DeepSeek | deepseek-chat |
| openrouter/ | OpenRouter (strips prefix) | openrouter/auto |
| org/model (any slash) | OpenRouter (as-is) | nex-agi/nex-n2-pro:free |
| zai/ | Z.ai | zai/model |
| anything else | OpenAI (fallback) | — |
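The routing table can be read as a prefix lookup, sketched below. The table does not state precedence, so this version assumes the named prefixes (including zai/) are checked before the generic "any slash" rule. It also assumes that only openrouter/ is stripped, since that is the only row that mentions stripping. The function name and the lowercase provider labels other than "openrouter" are hypothetical.

```python
# (prefixes, provider) pairs taken from the table; order is an assumption.
PREFIX_RULES: list[tuple[tuple[str, ...], str]] = [
    (("gpt-", "o1-", "o3-", "o4-", "chatgpt-"), "openai"),
    (("gemini-", "google/"), "gemini"),
    (("deepseek-", "deepseek/"), "deepseek"),
    (("zai/",), "zai"),
]


def resolve_provider(model: str) -> tuple[str, str]:
    """Return (provider, upstream_model) for a requested model ID."""
    if model.startswith("openrouter/"):
        # The table says this prefix is stripped before forwarding.
        return "openrouter", model[len("openrouter/"):]
    for prefixes, provider in PREFIX_RULES:
        if model.startswith(prefixes):
            return provider, model
    if "/" in model:
        return "openrouter", model  # org/model form, forwarded as-is
    return "openai", model  # fallback row


if __name__ == "__main__":
    for name in ("gpt-4o", "openrouter/auto", "nex-agi/nex-n2-pro:free", "zai/model"):
        print(name, "->", resolve_provider(name))
```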