HaioAI Gateway

A unified AI gateway for OpenRouter, OpenAI, Gemini, DeepSeek, Z.ai, and custom providers, with API key management, usage tracking, and plan-based access control.

Overview

The gateway exposes an OpenAI-compatible API. Any client that works with OpenAI should work here: point its base_url at the gateway and authenticate with your gateway API key.

Feature	Details
base_url	https://ai.haiocloud.com/v1
Protocol	HTTPS only (HTTP redirects to HTTPS)
Format	JSON request & response, SSE for streaming
Auth	Bearer token or X-API-Key header

Authentication

Include your API key in every request using one of these headers:

Authorization: Bearer sk-gw-xxxxxxxxxxxxxxxxxxxx

X-API-Key: sk-gw-xxxxxxxxxxxxxxxxxxxx

API keys are issued by your administrator via the Django admin panel at /admin/. Keys start with sk-gw-.
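
The two header styles above can be sketched in Python using only the standard library. This is a minimal illustration, not the gateway's own client: `auth_headers` and `build_request` are hypothetical helper names, and the check on the `sk-gw-` prefix is just a local sanity check based on the key format described above. The request is built but never sent.

```python
import urllib.request

BASE_URL = "https://ai.haiocloud.com/v1"


def auth_headers(api_key: str, scheme: str = "bearer") -> dict[str, str]:
    """Return the header dict for one of the two documented auth styles."""
    if not api_key.startswith("sk-gw-"):
        # Local sanity check only; the gateway remains the authority on validity.
        raise ValueError("gateway keys are expected to start with 'sk-gw-'")
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "x-api-key":
        return {"X-API-Key": api_key}
    raise ValueError(f"unknown scheme: {scheme!r}")


def build_request(path: str, api_key: str, scheme: str = "bearer") -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for `path`."""
    return urllib.request.Request(BASE_URL + path, headers=auth_headers(api_key, scheme))


req = build_request("/models", "sk-gw-your-key-here")
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen(req)` would perform the call. Either header style is enough on its own; there is no need to send both.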

Base URL

https://ai.haiocloud.com/v1

OpenAI Compatibility

The gateway is a drop-in replacement for the OpenAI API. Set your client's base URL and use your gateway API key:

base_url  = "https://ai.haiocloud.com/v1"
api_key   = "sk-gw-your-key"

Model routing is automatic, based on the model name prefix: gpt-* → OpenAI, gemini-* → Gemini, deepseek-* → DeepSeek, and org/model → OpenRouter. The Model Routing section lists every recognized prefix.

POST /v1/chat/completions

Send a chat message and receive a completion. Compatible with the OpenAI Chat Completions API.

Request Body

Field	Type	Required	Description
model	string	Yes	Model ID. See /v1/models for available models.
messages	array	Yes	Array of message objects: `[{"role":"user","content":"..."}]`
stream	boolean	No	Set `true` to receive SSE token stream. Default: `false`
temperature	number	No	Sampling temperature (0–2). Defaults to the upstream model's default
max_tokens	integer	No	Maximum tokens to generate
top_p	number	No	Nucleus sampling probability

All other OpenAI-compatible fields are forwarded as-is to the upstream provider.
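
As a rough sketch of how a client might assemble this body, the helper below checks the two required fields and the documented temperature range locally, then passes any extra fields through untouched. `build_chat_body` is a hypothetical name, and the local validation is an assumption about what is convenient on the client side; the gateway and the upstream provider still perform their own checks.

```python
import json


def build_chat_body(model, messages, *, stream=False, temperature=None,
                    max_tokens=None, top_p=None, **extra):
    """Assemble a /v1/chat/completions request body as a plain dict."""
    if not model:
        raise ValueError("'model' is required")
    if not messages:
        raise ValueError("'messages' must contain at least one message")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("'temperature' must be between 0 and 2")

    body = {"model": model, "messages": messages}
    if stream:
        body["stream"] = True
    # Only include optional fields the caller actually set, so the
    # upstream model's defaults apply to everything else.
    optional = {"temperature": temperature, "max_tokens": max_tokens, "top_p": top_p}
    body.update({key: value for key, value in optional.items() if value is not None})
    # Any other OpenAI-compatible field is passed through unchanged.
    body.update(extra)
    return body


body = build_chat_body(
    "nex-agi/nex-n2-pro:free",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
)
print(json.dumps(body, indent=2))
```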

Example Request

{
  "model": "nex-agi/nex-n2-pro:free",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7
}

Example Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "nex-agi/nex-n2-pro:free",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello! How can I help?"},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 9,
    "total_tokens": 29
  }
}
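
To show how the fields in this response fit together, here is a small sketch that pulls out the assistant text, the finish reason, and the token count from a parsed response. `summarize` is a hypothetical helper; the `EXAMPLE` dict simply mirrors the example response above. In the OpenAI response format, a `finish_reason` of `"length"` indicates the output was cut off by the token limit.

```python
EXAMPLE = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "nex-agi/nex-n2-pro:free",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 20, "completion_tokens": 9, "total_tokens": 29},
}


def summarize(completion: dict) -> dict:
    """Extract the commonly needed fields from a chat completion response."""
    choice = completion["choices"][0]
    usage = completion.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # "length" means the reply hit the token limit before finishing.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize(EXAMPLE)
print(summary["text"])
print(summary["total_tokens"])
```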

Streaming

Add "stream": true to receive tokens as Server-Sent Events (SSE) as they are generated.

POST /v1/chat/completions
Content-Type: application/json

{
  "model": "nex-agi/nex-n2-pro:free",
  "messages": [{"role": "user", "content": "Count to 5"}],
  "stream": true
}

Stream Response Format

data: {"id":"gen-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"1"},...}]}

data: {"id":"gen-...","object":"chat.completion.chunk","choices":[{"delta":{"content":","},...}]}

data: [DONE]
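
For clients that read the stream without an SDK, the format above can be parsed roughly as follows. This is a minimal sketch that assumes each event arrives as a single `data:` line, as in the sample; `iter_deltas` is a hypothetical name, and the `gen-1` IDs in the sample input are placeholders rather than real gateway output.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"id":"gen-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"1"}}]}',
    "",
    'data: {"id":"gen-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":","}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # prints "1,"
```

In practice `lines` would be the decoded lines of the HTTP response body, read as they arrive.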

GET /v1/models

Returns the list of models available on your plan.

{
  "object": "list",
  "data": [
    {"id": "nex-agi/nex-n2-pro:free", "object": "model", "owned_by": "gateway"},
    {"id": "gpt-4o", "object": "model", "owned_by": "gateway"}
  ]
}
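
One use of this listing is to confirm a model is on your plan before sending a chat request. The sketch below works on an already parsed response; `available_model_ids` and `require_model` are hypothetical helper names, and `LISTING` mirrors the example above.

```python
LISTING = {
    "object": "list",
    "data": [
        {"id": "nex-agi/nex-n2-pro:free", "object": "model", "owned_by": "gateway"},
        {"id": "gpt-4o", "object": "model", "owned_by": "gateway"},
    ],
}


def available_model_ids(listing: dict) -> list[str]:
    """Return the model IDs from a parsed /v1/models response."""
    if listing.get("object") != "list":
        raise ValueError("unexpected response shape: 'object' is not 'list'")
    return [entry["id"] for entry in listing.get("data", [])]


def require_model(listing: dict, model: str) -> str:
    """Return `model` if it is listed, otherwise raise with the alternatives."""
    ids = available_model_ids(listing)
    if model not in ids:
        raise LookupError(f"{model!r} is not available; choose one of {ids}")
    return model


print(available_model_ids(LISTING))
print(require_model(LISTING, "gpt-4o"))
```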

GET /v1/usage

Returns your token usage for the current calendar month, broken down by model.

{
  "object": "usage",
  "period": {"start": "2026-06-01T00:00:00Z", "end": "2026-06-11T12:00:00Z"},
  "totals": {"input_tokens": 1500, "output_tokens": 800, "total_tokens": 2300},
  "plan_limits": {
    "monthly_input_tokens": 1000000,
    "monthly_output_tokens": 1000000,
    "allowed_models": ["*"]
  },
  "by_model": [
    {"model": "nex-agi/nex-n2-pro:free", "provider": "openrouter", "inp": 1500, "out": 800, "total": 2300}
  ]
}
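
A client can combine `totals` and `plan_limits` to estimate how much quota is left. The sketch below is illustrative only: `remaining_tokens` and `model_allowed` are hypothetical helpers, and treating `"*"` in `allowed_models` as "every model" is an assumption drawn from the example above, not a documented rule.

```python
USAGE = {
    "object": "usage",
    "totals": {"input_tokens": 1500, "output_tokens": 800, "total_tokens": 2300},
    "plan_limits": {
        "monthly_input_tokens": 1000000,
        "monthly_output_tokens": 1000000,
        "allowed_models": ["*"],
    },
}


def remaining_tokens(usage: dict) -> dict[str, int]:
    """Return the input and output tokens left this month, floored at zero."""
    totals, limits = usage["totals"], usage["plan_limits"]
    return {
        "input": max(limits["monthly_input_tokens"] - totals["input_tokens"], 0),
        "output": max(limits["monthly_output_tokens"] - totals["output_tokens"], 0),
    }


def model_allowed(usage: dict, model: str) -> bool:
    """Check a model against allowed_models (assumes "*" means all models)."""
    allowed = usage["plan_limits"]["allowed_models"]
    return "*" in allowed or model in allowed


print(remaining_tokens(USAGE))  # {'input': 998500, 'output': 999200}
print(model_allowed(USAGE, "gpt-4o"))  # True
```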

Python — OpenAI SDK

Install the OpenAI SDK: pip install openai

from openai import OpenAI

client = OpenAI(
    base_url="https://ai.haiocloud.com/v1",
    api_key="sk-gw-your-key-here",
)

# Non-streaming
response = client.chat.completions.create(
    model="nex-agi/nex-n2-pro:free",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="nex-agi/nex-n2-pro:free",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True,
)
for chunk in stream:
    # Guard against chunks with an empty choices list (e.g. usage-only chunks).
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

JavaScript — OpenAI SDK

Install: npm install openai

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://ai.haiocloud.com/v1",
  apiKey: "sk-gw-your-key-here",
});

// Non-streaming
const response = await client.chat.completions.create({
  model: "nex-agi/nex-n2-pro:free",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: "nex-agi/nex-n2-pro:free",
  messages: [{ role: "user", content: "Count to 5" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

cURL Examples

Non-streaming

curl https://ai.haiocloud.com/v1/chat/completions \
  -H "Authorization: Bearer sk-gw-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nex-agi/nex-n2-pro:free",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Streaming

curl -N https://ai.haiocloud.com/v1/chat/completions \
  -H "Authorization: Bearer sk-gw-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nex-agi/nex-n2-pro:free",
    "messages": [{"role": "user", "content": "Count to 5"}],
    "stream": true
  }'

List Models

curl https://ai.haiocloud.com/v1/models \
  -H "Authorization: Bearer sk-gw-your-key-here"

Check Usage

curl https://ai.haiocloud.com/v1/usage \
  -H "Authorization: Bearer sk-gw-your-key-here"

Errors

All errors follow the OpenAI error shape:

{"error": {"message": "...", "type": "...", "code": "..."}}

HTTP Status	code	Meaning
401	invalid_api_key	Missing or invalid API key
400	invalid_json	Request body is not valid JSON
400	missing_model	The `model` field is required
403	model_not_allowed	Model not permitted on your plan
429	token_limit_exceeded	Monthly token quota reached
502	provider_error	Upstream provider returned an error
500	internal_error	Unexpected gateway error
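
The error shape and codes above lend themselves to a small client-side wrapper. This is a sketch under stated assumptions: `GatewayError`, `parse_error`, and `should_retry` are hypothetical names, and the choice to retry only `provider_error` and `internal_error` is a suggested policy, not gateway behavior. A `token_limit_exceeded` response is treated as non-retryable because it reflects the monthly quota.

```python
import json


class GatewayError(Exception):
    """An error response from the gateway, carrying its status and code."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


# Assumed policy: only upstream and internal failures are worth retrying.
RETRYABLE_CODES = {"provider_error", "internal_error"}


def parse_error(status: int, body: str) -> GatewayError:
    """Turn an error response body into a GatewayError, tolerating bad JSON."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        data = None
    err = data.get("error") if isinstance(data, dict) else None
    if not isinstance(err, dict):
        return GatewayError(status, "unknown", body)
    return GatewayError(status, err.get("code") or "unknown", err.get("message") or "")


def should_retry(error: GatewayError) -> bool:
    return error.code in RETRYABLE_CODES


err = parse_error(429, '{"error": {"message": "Monthly token quota reached", '
                       '"type": "rate_limit", "code": "token_limit_exceeded"}}')
print(err, should_retry(err))
```

The `type` value in the sample body is a placeholder; the table above only documents `code`.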

Model Routing

Model prefix	Provider	Example
`gpt-`, `o1-`, `o3-`, `o4-`, `chatgpt-`	OpenAI	`gpt-4o`
`gemini-`, `google/`	Google Gemini	`gemini-1.5-pro`
`deepseek-`, `deepseek/`	DeepSeek	`deepseek-chat`
`openrouter/`	OpenRouter (strips prefix)	`openrouter/auto`
`zai/`	Z.ai	`zai/model`
`org/model` (any other slash)	OpenRouter (as-is)	`nex-agi/nex-n2-pro:free`
anything else	OpenAI (fallback)	—
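
To make the table concrete, the routing rules can be approximated on the client side as below. This is a sketch, not the gateway's actual implementation. It assumes the specific prefixes (`openrouter/`, `zai/`, `google/`, `deepseek/`) are checked before the generic "contains a slash" rule, and that only the `openrouter/` prefix is stripped. The provider labels other than `openrouter` are illustrative names.

```python
# (prefixes, provider) pairs, checked in order. The ordering is an assumption.
RULES = [
    (("openrouter/",), "openrouter"),
    (("zai/",), "zai"),
    (("gemini-", "google/"), "gemini"),
    (("deepseek-", "deepseek/"), "deepseek"),
    (("gpt-", "o1-", "o3-", "o4-", "chatgpt-"), "openai"),
]


def resolve(model: str) -> tuple[str, str]:
    """Return (provider, upstream_model_name) for a requested model ID."""
    for prefixes, provider in RULES:
        if model.startswith(prefixes):
            if provider == "openrouter":
                # Only the explicit openrouter/ prefix is documented as stripped.
                return provider, model[len("openrouter/"):]
            return provider, model
    if "/" in model:
        return "openrouter", model  # org/model IDs are forwarded as-is
    return "openai", model  # fallback for anything unrecognized


for name in ("gpt-4o", "openrouter/auto", "nex-agi/nex-n2-pro:free", "zai/model"):
    print(name, "->", resolve(name))
```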