LLMWate API Documentation

Introduction

LLMWate provides a unified API gateway for accessing multiple AI models — including GPT-4o, Claude 3.5 Sonnet, Gemini, DeepSeek, Qwen, Meta Llama 3 and more — through a single, OpenAI-compatible interface.

One API key. 45+ models across 10 providers. One unified endpoint.

Base URL

https://api.llmwate.com/v1

Quick Start

Get up and running in 3 steps:

1. Get Your API Key: Sign up and create an API key from your dashboard.
2. Choose a Model: Browse models at AI Playground or via GET /v1/models.
3. Make Your First Request: Send a chat completions request with your API key.

cURL

curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

Authentication

All API requests require authentication via Bearer token:

Request Header

Authorization: Bearer YOUR_API_KEY

Manage your API keys on the API Keys Management page.

Chat Completions

Send a conversation and receive an AI-generated response. Fully compatible with the OpenAI Chat Completions API format.

POST /v1/chat/completions

Request Body

Parameter	Type	Required	Description
model	string	Yes	Model ID (e.g., gpt-4o, claude-3.5-sonnet, deepseek-chat)
messages	array	Yes	Array of message objects with role and content
temperature	float	No	Sampling temperature (0-2), default 1.0
max_tokens	integer	No	Maximum tokens to generate
stream	boolean	No	Enable server-sent events streaming, default false
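
As a rough illustration of the table above, the sketch below assembles a request body and checks the documented constraints on the client before sending. `build_chat_payload` is a hypothetical helper, not part of any SDK, and the non-empty `messages` check is an assumption; the server remains the authority on validation.

```python
def build_chat_payload(model, messages, temperature=None, max_tokens=None, stream=False):
    """Assemble a /v1/chat/completions body from the documented parameters."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        # Assumption: an empty conversation is treated as invalid.
        raise ValueError("messages must be a non-empty list")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")

    payload = {"model": model, "messages": list(messages)}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = int(max_tokens)
    if stream:
        payload["stream"] = True  # omitted otherwise, since the default is false
    return payload


payload = build_chat_payload(
    "gpt-4o",
    [{"role": "user", "content": "What is 2+2?"}],
    temperature=0.7,
    max_tokens=256,
)
```

Optional fields are left out of the body when unset, so the server-side defaults listed in the table apply.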

cURL

curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "What is 2+2?"}], "temperature": 0.7, "max_tokens": 256}'

Python

from openai import OpenAI
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.llmwate.com/v1")
response = client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "What is 2+2?"}])
print(response.choices[0].message.content)

Node.js

import OpenAI from "openai";
const client = new OpenAI({apiKey:"YOUR_API_KEY", baseURL:"https://api.llmwate.com/v1"});
const resp = await client.chat.completions.create({model:"gpt-4o",messages:[{"role":"user","content":"What is 2+2?"}]});
console.log(resp.choices[0].message.content);

Response

{"id":"chatcmpl-abc123","object":"chat.completion","created":1715623456,"model":"gpt-4o","choices":[{"index":0,"message":{"role":"assistant","content":"2+2 equals 4."},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":8,"total_tokens":20}}

List Available Models

Get all available AI models. Supports optional category filtering.

GET /v1/models

Parameter	Type	Required	Description
category	string	No	Filter by category: general, coding, reasoning, vision, fast, cheap, chinese

cURL - All Models

curl https://api.llmwate.com/v1/models -H "Authorization: Bearer YOUR_API_KEY"

cURL - Coding Models

curl "https://api.llmwate.com/v1/models?category=coding" -H "Authorization: Bearer YOUR_API_KEY"

Response (partial)

{"models":[{"id":"gpt-4o","name":"GPT-4o","provider":"OpenAI","category":"general","context_length":128000,"pricing":{"prompt":0.0025,"completion":0.01}},{"id":"claude-3.5-sonnet","name":"Claude 3.5 Sonnet","provider":"Anthropic","category":"coding","context_length":200000,"pricing":{"prompt":0.003,"completion":0.015}}],"total":45}

45 models across 10 providers. Browse in the AI Playground.
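
One possible use of the listing is picking the lowest-priced model in a category. The sketch below works on the partial sample response shown above; `cheapest_model` is a hypothetical helper. The pricing unit is not stated in this document, so the values are only compared with each other.

```python
import json

# Sample copied from the partial response above.
listing = json.loads("""
{"models":[
  {"id":"gpt-4o","name":"GPT-4o","provider":"OpenAI","category":"general",
   "context_length":128000,"pricing":{"prompt":0.0025,"completion":0.01}},
  {"id":"claude-3.5-sonnet","name":"Claude 3.5 Sonnet","provider":"Anthropic","category":"coding",
   "context_length":200000,"pricing":{"prompt":0.003,"completion":0.015}}
],"total":45}
""")


def cheapest_model(listing, category=None):
    """Return the model with the lowest prompt price, optionally within one category."""
    candidates = [
        model for model in listing["models"]
        if category is None or model["category"] == category
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda model: model["pricing"]["prompt"])


print(cheapest_model(listing)["id"])
print(cheapest_model(listing, category="coding")["id"])
```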

Auto Router

Automatically select the best model for your task type. Returns recommended models ordered by preference.

GET /v1/models/auto?task=<type>

Parameter	Type	Required	Description
task	string	Yes	Task type: general, coding, reasoning, vision, fast, cheap, chinese

cURL

curl "https://api.llmwate.com/v1/models/auto?task=coding" -H "Authorization: Bearer YOUR_API_KEY"

Response

{"task":"coding","primary":{"id":"claude-3.5-sonnet","name":"Claude 3.5 Sonnet","provider":"Anthropic","reason":"Best overall coding performance"},"alternatives":[{"id":"gpt-4o","name":"GPT-4o","provider":"OpenAI"},{"id":"deepseek-chat","name":"DeepSeek Chat","provider":"DeepSeek"}]}

Task	Best For	Recommended
coding	Code generation, debugging, review	Claude 3.5 Sonnet, GPT-4o
reasoning	Complex reasoning, analysis, math	Claude 3.5 Sonnet, DeepSeek R1
vision	Image understanding, OCR	GPT-4o, Claude 3.5 Sonnet
fast	Quick responses, low latency	GPT-4o-mini, Claude 3 Haiku
cheap	Cost-effective inference	DeepSeek Chat, Qwen Turbo
chinese	Chinese language tasks	Qwen 2.5, DeepSeek Chat
general	General conversation	GPT-4o, Claude 3.5 Sonnet
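
Because the router returns a primary model plus alternatives, a client can turn the response into an ordered list of model IDs to try. This is a minimal sketch against the sample response above; `candidate_model_ids` is a hypothetical name.

```python
import json

# Sample copied from the Auto Router response above.
route = json.loads("""
{"task":"coding",
 "primary":{"id":"claude-3.5-sonnet","name":"Claude 3.5 Sonnet","provider":"Anthropic",
            "reason":"Best overall coding performance"},
 "alternatives":[{"id":"gpt-4o","name":"GPT-4o","provider":"OpenAI"},
                 {"id":"deepseek-chat","name":"DeepSeek Chat","provider":"DeepSeek"}]}
""")


def candidate_model_ids(route):
    """Primary model first, then alternatives in the order returned, without duplicates."""
    ordered = [route["primary"]["id"]]
    for alternative in route.get("alternatives", []):
        if alternative["id"] not in ordered:
            ordered.append(alternative["id"])
    return ordered


print(candidate_model_ids(route))
```

Each ID in the list can be passed as the `model` value of a chat completions request, moving to the next one if a call fails.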

Account Balance

Check your current account balance, usage, and quota.

GET /v1/balance

cURL

curl https://api.llmwate.com/v1/balance -H "Authorization: Bearer YOUR_API_KEY"

Response

{"balance":87.50,"plan":"enterprise","used_this_month":1250000,"quota_this_month":6000000,"quota_reset_at":"2026-06-01T00:00:00Z"}
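
The usage and quota fields can be combined into a quick "how much is left" check. The sketch below uses the sample response above; `quota_summary` is a hypothetical helper, and the unit of the usage counters is not stated here, so it is treated as an opaque count.

```python
import json

# Sample copied from the balance response above.
info = json.loads(
    '{"balance":87.50,"plan":"enterprise","used_this_month":1250000,'
    '"quota_this_month":6000000,"quota_reset_at":"2026-06-01T00:00:00Z"}'
)


def quota_summary(info):
    """Return the remaining quota and the share already used, as a percentage."""
    used = info["used_this_month"]
    quota = info["quota_this_month"]
    remaining = max(quota - used, 0)
    percent_used = round(100 * used / quota, 1) if quota else 0.0
    return {"remaining": remaining, "percent_used": percent_used}


print(quota_summary(info))
```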

Smart Routing

Per-API-key routing preferences let you prioritize specific providers and configure automatic failover chains. Set via the API Keys page.

Routing Modes

Mode	Behavior
auto	System defaults: use the best available healthy provider based on task type
preferred	Always try your preferred provider first; fall back to system chain on failure
failover	Only use the preferred provider; return error if it fails (no fallback)
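
To make the three modes concrete, here is a small model of the order in which providers would be attempted. It is an illustration of the table only, not the gateway's implementation; `providers_to_try`, the example chain, and the health set are all hypothetical.

```python
def providers_to_try(mode, preferred, system_chain, healthy):
    """Return the ordered list of providers a request would attempt under each mode."""
    # The system chain only contains providers currently marked healthy.
    chain = [provider for provider in system_chain if provider in healthy]
    if mode == "auto":
        return chain
    if mode == "preferred":
        # Preferred provider first, then the rest of the system chain as fallback.
        return [preferred] + [provider for provider in chain if provider != preferred]
    if mode == "failover":
        # Preferred provider only; an error is returned if it fails.
        return [preferred]
    raise ValueError(f"unknown routing mode: {mode}")


system_chain = ["openai", "anthropic", "deepseek"]
healthy = {"openai", "anthropic"}

print(providers_to_try("auto", None, system_chain, healthy))
print(providers_to_try("preferred", "deepseek", system_chain, healthy))
print(providers_to_try("failover", "deepseek", system_chain, healthy))
```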

Provider Health Circuit Breaker

Each provider has a circuit breaker: after 3 consecutive failures, it is marked unhealthy and skipped for 10 minutes before retry.
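
The rule above (3 consecutive failures, then a 10-minute skip) can be sketched as a small state machine. This is an illustrative model under those two numbers, not the gateway's code; the class name and the injectable clock are assumptions made so the behavior can be tested without waiting.

```python
import time


class CircuitBreaker:
    """Tracks consecutive failures for one provider and skips it during a cooldown."""

    def __init__(self, failure_threshold=3, cooldown_seconds=600, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.consecutive_failures = 0
        self.opened_at = None  # time at which the provider was marked unhealthy

    def record_success(self):
        # Any success clears the failure streak and closes the breaker.
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = self.clock()

    def is_available(self):
        """True if the provider is healthy or its cooldown has elapsed (retry allowed)."""
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.cooldown_seconds


# Demonstration with a fake clock measured in seconds.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(3):
    breaker.record_failure()
print(breaker.is_available())  # skipped during the cooldown
now[0] = 600.0
print(breaker.is_available())  # eligible for a retry after 10 minutes
```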

GET /v1/models/health

Response

{"providers":{"openai":{"status":"healthy","consecutive_failures":0,"last_failure":null},"siliconflow":{"status":"healthy","consecutive_failures":0,"last_failure":null},"deepseek":{"status":"unhealthy","consecutive_failures":3,"last_failure":"2026-05-28T10:23:00Z"}},"timestamp":"2026-05-28T10:30:00Z"}
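
A monitoring script might reduce this response to the providers that are currently being skipped. The sketch below runs on the sample above; `unhealthy_providers` is a hypothetical helper.

```python
import json

# Sample copied from the health response above.
report = json.loads("""
{"providers":{
  "openai":{"status":"healthy","consecutive_failures":0,"last_failure":null},
  "siliconflow":{"status":"healthy","consecutive_failures":0,"last_failure":null},
  "deepseek":{"status":"unhealthy","consecutive_failures":3,"last_failure":"2026-05-28T10:23:00Z"}},
 "timestamp":"2026-05-28T10:30:00Z"}
""")


def unhealthy_providers(report):
    """Return the names of providers whose status is anything other than 'healthy'."""
    return sorted(
        name for name, state in report["providers"].items()
        if state["status"] != "healthy"
    )


print(unhealthy_providers(report))
```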

POST /v1/models/health/reset

Reset provider circuit breakers (admin). All providers are reset unless a specific provider is named in the request body.

Request Body (optional)

{"provider": "deepseek"}

Chat Status and Provider Health

Check which provider APIs are configured and their current status.

GET /v1/chat/status

cURL

curl https://api.llmwate.com/v1/chat/status -H "Authorization: Bearer YOUR_API_KEY"

Response

{"providers":{"openai":{"status":"configured"},"anthropic":{"status":"unconfigured"},"siliconflow":{"status":"configured"},"google":{"status":"unconfigured"},"deepseek":{"status":"unconfigured"},"qwen":{"status":"unconfigured"},"meta":{"status":"unconfigured"},"mistral":{"status":"unconfigured"},"cohere":{"status":"unconfigured"},"xai":{"status":"unconfigured"},"perplexity":{"status":"unconfigured"}}}

Configure additional provider API keys to enable more models.

Streaming Responses

Enable server-sent events (SSE) streaming for real-time token-by-token output. Set stream: true in the request body.

cURL - Streaming

curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a story"}],
    "stream": true
  }'

Server-Sent Events Format

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: [DONE]
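
If you read the stream without an SDK, each `data:` line carries one JSON chunk, and the text lives in `choices[].delta.content`. The sketch below reassembles the sample events above; `collect_stream_text` is a hypothetical helper that takes lines already read from the response body.

```python
import json


def collect_stream_text(lines):
    """Concatenate delta.content from SSE 'data:' lines until the [DONE] sentinel."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and any non-data fields are ignored
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Sample events copied from the format shown above.
sample_lines = [
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1234567890,'
    '"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1234567890,'
    '"model":"gpt-4o","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]

text = collect_stream_text(sample_lines)
print(text)
```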

Error Codes

Code	Meaning	Resolution
401	Invalid or missing API key	Check your API key in the dashboard
403	Model not available for your plan	Upgrade your subscription plan
422	Invalid request parameters	Check request body format and types
429	Rate limit exceeded	Wait and retry, or upgrade your plan
500	Internal server error	Retry or contact support
503	Service temporarily unavailable	Check provider status and retry
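
The Resolution column suggests that only 429, 500, and 503 are worth retrying. A client-side policy along those lines might look like the sketch below; the exponential backoff schedule and both function names are assumptions, not documented behavior.

```python
# Status codes whose resolution in the table above includes retrying.
RETRYABLE_STATUS_CODES = {429, 500, 503}


def is_retryable(status_code):
    """True for errors the table suggests retrying; 401, 403, and 422 need a fix first."""
    return status_code in RETRYABLE_STATUS_CODES


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Exponential delays in seconds (base, 2*base, 4*base, ...), capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(attempts)]


print(is_retryable(429), is_retryable(401))
print(backoff_delays(4))
```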

Rate Limits

Rate limits are enforced per API key and vary by plan. Limits apply to requests per minute and tokens per day.

Plan	Requests/min	Tokens/day	Notes
Basic	60	500,000	-
Pro	120	2,000,000	-
Enterprise	300	6,000,000	-
Unlimited	Unlimited	Unlimited	-

Rate limit headers are returned on every response:

Response Headers

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1716891660

When exceeded, the API returns 429 Too Many Requests. Retry after the timestamp in X-RateLimit-Reset.
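
A client can compute how long to wait from the reset header. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which matches the sample value above but is not stated explicitly; `seconds_until_reset` is a hypothetical helper that takes the headers as a plain dict.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds to wait before retrying, based on X-RateLimit-Reset (assumed Unix seconds)."""
    current = time.time() if now is None else now
    reset_at = int(headers["X-RateLimit-Reset"])
    return max(0.0, reset_at - current)


# Header values copied from the example above.
headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "45",
    "X-RateLimit-Reset": "1716891660",
}

print(seconds_until_reset(headers, now=1716891600))
```

HTTP header names are case-insensitive, so with a real HTTP client prefer its own header mapping over a plain dict lookup.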

SDK Integration Guide

LLMWate uses an OpenAI-compatible API. Use the official OpenAI SDK with your LLMWate API key and base URL.

Python

Install: pip install openai

Basic Request

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing in 2 sentences."}],
    temperature=0.7,
    max_tokens=512
)
print(response.choices[0].message.content)

Streaming Response

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a Python function to reverse a string."}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Async Request

import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(
        api_key="YOUR_LLMWATE_API_KEY",
        base_url="https://api.llmwate.com/v1"
    )
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is 2+2?"}]
    )
    print(response.choices[0].message.content)

asyncio.run(main())

With System Prompt

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful Python expert."},
        {"role": "user", "content": "How do I sort a list in Python?"}
    ],
    temperature=0.5
)
print(response.choices[0].message.content)

Node.js / TypeScript

Install: npm install openai

Basic Request

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.LLMWATE_API_KEY,
  baseURL: "https://api.llmwate.com/v1"
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain quantum computing in 2 sentences." }],
  temperature: 0.7,
  max_tokens: 512
});

console.log(response.choices[0].message.content);

Streaming Response

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.LLMWATE_API_KEY,
  baseURL: "https://api.llmwate.com/v1"
});

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Write a Python function to reverse a string." }],
  stream: true
});

for await (const chunk of stream) {
  if (chunk.choices[0].delta.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}

cURL

Basic Request

curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

Streaming Response

curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Count to 5"}], "stream": true}'

With Parameters

curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of Japan?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100
  }'

Go

Install: go get github.com/sashabaranov/go-openai

Basic Request

package main

import (
    "context"
    "fmt"

    openai "github.com/sashabaranov/go-openai"
)

func main() {
    // Point the client at LLMWate by overriding the base URL in the config.
    config := openai.DefaultConfig("YOUR_LLMWATE_API_KEY")
    config.BaseURL = "https://api.llmwate.com/v1"
    client := openai.NewClientWithConfig(config)

    resp, err := client.CreateChatCompletion(
        context.Background(),
        openai.ChatCompletionRequest{
            Model: "gpt-4o",
            Messages: []openai.ChatCompletionMessage{
                {Role: "user", Content: "Explain quantum computing in 2 sentences."},
            },
        },
    )
    if err != nil {
        panic(err)
    }
    fmt.Println(resp.Choices[0].Message.Content)
}

Java

Using Spring Boot RestTemplate:

Maven Dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

Java Example

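import java.util.List;
import java.util.Map;

import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class LLMController {

    private static final String URL = "https://api.llmwate.com/v1/chat/completions";

    private final RestTemplate restTemplate;

    public LLMController() {
        this.restTemplate = new RestTemplate();
        // Attach the auth and content-type headers to every outgoing request.
        this.restTemplate.getInterceptors().add((request, body, execution) -> {
            request.getHeaders().setBearerAuth("YOUR_LLMWATE_API_KEY");
            request.getHeaders().setContentType(MediaType.APPLICATION_JSON);
            return execution.execute(request, body);
        });
    }

    @PostMapping("/chat")
    @SuppressWarnings("unchecked")
    public String chat(@RequestBody Map<String, Object> request) {
        // Forward the caller's JSON body unchanged and return the first choice's text.
        ResponseEntity<Map> response = restTemplate.postForEntity(URL, request, Map.class);
        Map<String, Object> body = response.getBody();
        List<Map<String, Object>> choices = (List<Map<String, Object>>) body.get("choices");
        Map<String, Object> message = (Map<String, Object>) choices.get(0).get("message");
        return (String) message.get("content");
    }
}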

Ruby

Install: gem install ruby-openai

Basic Request

require "openai"

client = OpenAI::Client.new(
  access_token: ENV["LLMWATE_API_KEY"],
  uri_base: "https://api.llmwate.com/v1"
)

response = client.chat(
  parameters: {
    model: "gpt-4o",
    messages: [
      { role: "user", content: "Explain quantum computing in 2 sentences." }
    ],
    temperature: 0.7
  }
)

puts response.dig("choices", 0, "message", "content")

PHP

Basic Request (cURL)

<?php
$api_key = "YOUR_LLMWATE_API_KEY";
$url = "https://api.llmwate.com/v1/chat/completions";

$data = [
    "model" => "gpt-4o",
    "messages" => [
        ["role" => "user", "content" => "Explain quantum computing in 2 sentences."]
    ],
    "temperature" => 0.7
];

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    "Authorization: Bearer " . $api_key,
    "Content-Type: application/json"
]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
curl_close($ch);

$result = json_decode($response, true);
echo $result["choices"][0]["message"]["content"];
?>

SDK Configuration Reference

Setting	Value	Description
API Key	string	Your LLMWate API key from the dashboard
Base URL	string	https://api.llmwate.com/v1
Default Model	string	gpt-4o (or any available model)
Timeout	integer	Request timeout in seconds (default varies by SDK)

Environment Variables

Recommended .env setup

# .env file
LLMWATE_API_KEY=lmw_your_api_key_here
LLMWATE_BASE_URL=https://api.llmwate.com/v1

Python with python-dotenv

from dotenv import load_dotenv
from openai import OpenAI
import os

load_dotenv()

client = OpenAI(
    api_key=os.getenv("LLMWATE_API_KEY"),
    base_url=os.getenv("LLMWATE_BASE_URL", "https://api.llmwate.com/v1")
)

Common Patterns

Structured Output (JSON Schema)

from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Return JSON with name and age fields."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"}
                },
                "required": ["name", "age"]
            }
        }
    }
)
# Access: response.choices[0].message.content (JSON string)
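
Since the structured result arrives as a JSON string, it still needs to be parsed and checked on the client. The sketch below validates against the `person` schema from the example; `parse_person` is a hypothetical helper, and the sample string is made up for illustration rather than taken from a real response.

```python
import json


def parse_person(content):
    """Parse message.content and check the two required fields from the schema above."""
    data = json.loads(content)
    missing = [key for key in ("name", "age") if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not isinstance(data["name"], str):
        raise ValueError("name must be a string")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not isinstance(data["age"], int) or isinstance(data["age"], bool):
        raise ValueError("age must be an integer")
    return data


# Illustrative content string, standing in for response.choices[0].message.content.
person = parse_person('{"name": "Ada", "age": 36}')
print(person["name"], person["age"])
```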

Function Calling / Tool Use

from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather in Tokyo?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"]
                }
            }
        }
    ]
)
# Tool calls available in response.choices[0].message.tool_calls
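
Once the model requests a tool, the client runs it and sends the result back in a follow-up request. Below is a minimal sketch of that dispatch step, assuming the OpenAI-style tool call shape and using plain dicts (with the Python SDK, objects would first need converting, for example via `model_dump()`). The `get_weather` implementation, the `TOOLS` registry, and the sample call are all hypothetical.

```python
import json


def get_weather(location):
    """Hypothetical local tool; a real one would query a weather service."""
    return {"location": location, "forecast": "unknown"}


# Registry mapping tool names declared in the request to local functions.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each requested tool and build 'tool' role messages for the next request."""
    results = []
    for call in tool_calls:
        function = TOOLS[call["function"]["name"]]
        arguments = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(function(**arguments)),
        })
    return results


# Illustrative tool call in the OpenAI-compatible shape.
sample_calls = [{
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"},
}]

tool_messages = run_tool_calls(sample_calls)
print(tool_messages)
```

The returned messages would be appended to the conversation, after the assistant message that contained the tool calls, before calling the chat completions endpoint again.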

Vision (Image Understanding)

from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")

response = client.chat.completions.create(
    model="gpt-4o",  # or claude-3.5-sonnet for vision
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)
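
For images that are not publicly hosted, OpenAI-compatible APIs commonly accept a base64 data URL in place of an HTTP URL. Whether LLMWate and every routed provider accept this is an assumption to verify; `image_content_part` is a hypothetical helper.

```python
import base64


def image_content_part(image_bytes, mime_type="image/jpeg"):
    """Build an image_url content part that embeds the image as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# Tiny placeholder payload; in practice, read the bytes from an image file.
part = image_content_part(b"abc", mime_type="image/png")
print(part["image_url"]["url"])
```

The resulting dict can replace the `image_url` entry in the `content` list of the example above.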

Troubleshooting

Error	Cause	Solution
401 Unauthorized	Invalid API key	Check your API key in the dashboard
403 Forbidden	Model not available	Verify your plan includes the model
429 Rate Limited	Too many requests	Add delay between requests or upgrade plan
Connection Error	Network issues	Check your internet connection

API Marketplace

Browse all available models with real-time pricing. Click any model to open it in the Playground.
