Introduction

LLMWate provides a unified API gateway for accessing multiple AI models — including GPT-4o, Claude 3.5 Sonnet, Gemini, DeepSeek, Qwen, Meta Llama 3 and more — through a single, OpenAI-compatible interface.

One API key. 45+ models across 10 providers. One unified endpoint.

Base URL
https://api.llmwate.com/v1

Quick Start

Get up and running in 3 steps:

  1. Get Your API Key
     Sign up and create an API key from your dashboard.
  2. Choose a Model
     Browse models at AI Playground or via GET /v1/models.
  3. Make Your First Request
     Send a chat completions request with your API key.

cURL
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

Authentication

All API requests require authentication via Bearer token:

Request Header
Authorization: Bearer YOUR_API_KEY

Manage your API keys on the API Keys Management page.
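
As a minimal sketch, the header can be attached with Python's standard library alone. The build_request helper is an illustrative name, not part of any LLMWate SDK, and the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.llmwate.com/v1"

def build_request(api_key: str, path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated request; POST when a payload is given."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data)
    # Every endpoint expects the key as a Bearer token.
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req

req = build_request("YOUR_API_KEY", "/models")
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen would perform the call.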

Chat Completions

Send a conversation and receive an AI-generated response. Fully compatible with the OpenAI Chat Completions API format.

POST /v1/chat/completions

Request Body

Parameter    Type     Required  Description
model        string   Yes       Model ID (e.g., gpt-4o, claude-3.5-sonnet, deepseek-chat)
messages     array    Yes       Array of message objects with role and content
temperature  float    No        Sampling temperature (0-2), default 1.0
max_tokens   integer  No        Maximum tokens to generate
stream       boolean  No        Enable server-sent events streaming, default false

cURL
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "What is 2+2?"}], "temperature": 0.7, "max_tokens": 256}'
Python
from openai import OpenAI
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.llmwate.com/v1")
response = client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "What is 2+2?"}])
print(response.choices[0].message.content)
Node.js
import OpenAI from "openai";
const client = new OpenAI({apiKey:"YOUR_API_KEY", baseURL:"https://api.llmwate.com/v1"});
const resp = await client.chat.completions.create({model:"gpt-4o",messages:[{"role":"user","content":"What is 2+2?"}]});
console.log(resp.choices[0].message.content);
Response
{"id":"chatcmpl-abc123","object":"chat.completion","created":1715623456,"model":"gpt-4o","choices":[{"index":0,"message":{"role":"assistant","content":"2+2 equals 4."},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":8,"total_tokens":20}}

List Available Models

Get all available AI models. Supports optional category filtering.

GET /v1/models

Parameter  Type    Required  Description
category   string  No        Filter by category: general, coding, reasoning, vision, fast, cheap, chinese

cURL - All Models
curl https://api.llmwate.com/v1/models -H "Authorization: Bearer YOUR_API_KEY"
cURL - Coding Models
curl "https://api.llmwate.com/v1/models?category=coding" -H "Authorization: Bearer YOUR_API_KEY"
Response (partial)
{"models":[{"id":"gpt-4o","name":"GPT-4o","provider":"OpenAI","category":"general","context_length":128000,"pricing":{"prompt":0.0025,"completion":0.01}},{"id":"claude-3.5-sonnet","name":"Claude 3.5 Sonnet","provider":"Anthropic","category":"coding","context_length":200000,"pricing":{"prompt":0.003,"completion":0.015}}],"total":45}

45 models across 10 providers. Browse in the AI Playground.
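
A small sketch of how the model list might be grouped client-side. The field names come from the sample response above; models_by_category is an illustrative helper, and the embedded payload is a trimmed copy of that sample rather than live data.

```python
import json
from collections import defaultdict

# Trimmed copy of the sample response from this section.
SAMPLE = json.loads("""
{"models": [
  {"id": "gpt-4o", "provider": "OpenAI", "category": "general", "context_length": 128000},
  {"id": "claude-3.5-sonnet", "provider": "Anthropic", "category": "coding", "context_length": 200000}
], "total": 45}
""")

def models_by_category(payload: dict) -> dict:
    """Map each category to the list of model IDs it contains."""
    grouped = defaultdict(list)
    for model in payload.get("models", []):
        grouped[model["category"]].append(model["id"])
    return dict(grouped)

print(models_by_category(SAMPLE))
```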

Auto Router

Automatically select the best model for your task type. Returns recommended models ordered by preference.

GET /v1/models/auto?task=<type>

Parameter  Type    Required  Description
task       string  Yes       Task type: general, coding, reasoning, vision, fast, cheap, chinese

cURL
curl "https://api.llmwate.com/v1/models/auto?task=coding" -H "Authorization: Bearer YOUR_API_KEY"
Response
{"task":"coding","primary":{"id":"claude-3.5-sonnet","name":"Claude 3.5 Sonnet","provider":"Anthropic","reason":"Best overall coding performance"},"alternatives":[{"id":"gpt-4o","name":"GPT-4o","provider":"OpenAI"},{"id":"deepseek-chat","name":"DeepSeek Chat","provider":"DeepSeek"}]}

Task       Best For                            Recommended
coding     Code generation, debugging, review  Claude 3.5 Sonnet, GPT-4o
reasoning  Complex reasoning, analysis, math   Claude 3.5 Sonnet, DeepSeek R1
vision     Image understanding, OCR            GPT-4o, Claude 3.5 Sonnet
fast       Quick responses, low latency        GPT-4o-mini, Claude 3 Haiku
cheap      Cost-effective inference            DeepSeek Chat, Qwen Turbo
chinese    Chinese language tasks              Qwen 2.5, DeepSeek Chat
general    General conversation                GPT-4o, Claude 3.5 Sonnet
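
One way to use the router response is to turn it into an ordered list of model IDs to try. The sketch below assumes the response shape shown in the sample above; fallback_order is an illustrative name.

```python
def fallback_order(auto_response: dict) -> list:
    """Return model IDs with the primary first, then alternatives in the order given."""
    order = [auto_response["primary"]["id"]]
    order += [alt["id"] for alt in auto_response.get("alternatives", [])]
    return order

# Trimmed copy of the sample response for task=coding.
sample = {
    "task": "coding",
    "primary": {"id": "claude-3.5-sonnet"},
    "alternatives": [{"id": "gpt-4o"}, {"id": "deepseek-chat"}],
}
print(fallback_order(sample))
```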

Account Balance

Check your current account balance, usage, and quota.

GET /v1/balance
cURL
curl https://api.llmwate.com/v1/balance -H "Authorization: Bearer YOUR_API_KEY"
Response
{"balance":87.50,"plan":"enterprise","used_this_month":1250000,"quota_this_month":6000000,"quota_reset_at":"2026-06-01T00:00:00Z"}
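
The usage fields can be reduced to a remaining-quota figure. This is a sketch under assumptions: the unit of used_this_month and quota_this_month is not stated here, so the helper (quota_summary, an illustrative name) only does arithmetic on whatever unit the API reports.

```python
def quota_summary(balance: dict) -> dict:
    """Compute remaining quota and percentage used from a /v1/balance response."""
    used = balance["used_this_month"]
    quota = balance["quota_this_month"]
    percent = round(100 * used / quota, 1) if quota else 0.0
    return {"remaining": max(quota - used, 0), "percent_used": percent}

# Values copied from the sample response.
sample = {
    "balance": 87.50,
    "plan": "enterprise",
    "used_this_month": 1250000,
    "quota_this_month": 6000000,
    "quota_reset_at": "2026-06-01T00:00:00Z",
}
print(quota_summary(sample))
```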

Smart Routing

Per-API-key routing preferences let you prioritize specific providers and configure automatic failover chains. Set via the API Keys page.

Routing Modes

Mode       Behavior
auto       System defaults: use the best available healthy provider based on task type
preferred  Always try your preferred provider first; fall back to system chain on failure
failover   Only use the preferred provider; return error if it fails (no fallback)
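
To make the three modes concrete, the sketch below models which providers would be tried, and in what order, under each mode as described in the table. It illustrates the documented behavior only and is not the gateway's actual implementation; provider_order and the example chain are hypothetical.

```python
def provider_order(mode: str, preferred: str, system_chain: list) -> list:
    """Return the providers to try, in order, for a given routing mode."""
    if mode == "auto":
        return list(system_chain)
    if mode == "preferred":
        # Preferred provider first, then the rest of the system chain.
        return [preferred] + [p for p in system_chain if p != preferred]
    if mode == "failover":
        # No fallback: only the preferred provider is ever tried.
        return [preferred]
    raise ValueError(f"unknown routing mode: {mode}")

chain = ["openai", "siliconflow", "deepseek"]  # hypothetical system chain
print(provider_order("preferred", "deepseek", chain))
```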

Provider Health Circuit Breaker

Each provider has a circuit breaker: after 3 consecutive failures, it is marked unhealthy and skipped for 10 minutes before retry.
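
The rule above can be sketched as a small state machine. This is an illustration of the stated thresholds (3 consecutive failures, 10 minutes), not the gateway's code; the class and its clock parameter are assumptions made so the behavior can be exercised without waiting.

```python
import time

class CircuitBreaker:
    """Mark a provider unavailable after repeated failures, then allow a retry later."""

    def __init__(self, threshold=3, cooldown_seconds=600, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.consecutive_failures = 0
        self.opened_at = None  # time the breaker last tripped, if any

    def record_success(self):
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold:
            self.opened_at = self.clock()

    def is_available(self):
        if self.opened_at is None:
            return True
        # After the cooldown the provider may be tried again.
        return self.clock() - self.opened_at >= self.cooldown_seconds

breaker = CircuitBreaker()
for _ in range(3):
    breaker.record_failure()
print(breaker.is_available())
```

A failed retry after the cooldown trips the breaker again, because the failure count is only cleared by a success.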

GET /v1/models/health
Response
{"providers":{"openai":{"status":"healthy","consecutive_failures":0,"last_failure":null},"siliconflow":{"status":"healthy","consecutive_failures":0,"last_failure":null},"deepseek":{"status":"unhealthy","consecutive_failures":3,"last_failure":"2026-05-28T10:23:00Z"}},"timestamp":"2026-05-28T10:30:00Z"}
POST /v1/models/health/reset

Reset provider circuit breakers (admin). With no request body, all breakers are reset; name a provider in the body to reset only that one.

Request Body (optional)
{"provider": "deepseek"}

Chat Status and Provider Health

Check which provider APIs are configured and their current status.

GET /v1/chat/status
cURL
curl https://api.llmwate.com/v1/chat/status -H "Authorization: Bearer YOUR_API_KEY"
Response
{"providers":{"openai":{"status":"configured"},"anthropic":{"status":"unconfigured"},"siliconflow":{"status":"configured"},"google":{"status":"unconfigured"},"deepseek":{"status":"unconfigured"},"qwen":{"status":"unconfigured"},"meta":{"status":"unconfigured"},"mistral":{"status":"unconfigured"},"cohere":{"status":"unconfigured"},"xai":{"status":"unconfigured"},"perplexity":{"status":"unconfigured"}}}

Configure additional provider API keys to enable more models.
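
A short sketch for picking out the usable providers from this response, assuming the shape shown in the sample; configured_providers is an illustrative name.

```python
def configured_providers(status: dict) -> list:
    """Return the sorted names of providers whose status is 'configured'."""
    return sorted(
        name
        for name, info in status.get("providers", {}).items()
        if info.get("status") == "configured"
    )

# Subset of the sample response.
sample = {
    "providers": {
        "openai": {"status": "configured"},
        "anthropic": {"status": "unconfigured"},
        "siliconflow": {"status": "configured"},
    }
}
print(configured_providers(sample))
```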

Streaming Responses

Enable server-sent events (SSE) streaming for real-time token-by-token output. Set stream: true in the request body.

cURL - Streaming
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a story"}],
    "stream": true
  }'
Server-Sent Events Format
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: [DONE]
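
For clients that read the stream without an SDK, the event format above can be decoded line by line. The sketch below assumes each event arrives as a single data: line, as in the sample; collect_stream_text is an illustrative helper and works on any iterable of text lines.

```python
import json

def collect_stream_text(lines) -> str:
    """Concatenate delta content from SSE lines until the [DONE] sentinel."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)

sample = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]
print(collect_stream_text(sample))  # Hello world
```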

Error Codes

Code  Meaning                            Resolution
401   Invalid or missing API key         Check your API key in the dashboard
403   Model not available for your plan  Upgrade your subscription plan
422   Invalid request parameters         Check request body format and types
429   Rate limit exceeded                Wait and retry, or upgrade your plan
500   Internal server error              Retry or contact support
503   Service temporarily unavailable    Check provider status and retry
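
The table suggests retrying on 429, 500, and 503 but not on the other codes. A hedged sketch of that policy with exponential backoff follows; the delays and attempt count are arbitrary choices, and send stands for any zero-argument callable returning (status_code, body), so no particular HTTP library is assumed.

```python
import time

RETRYABLE = {429, 500, 503}  # codes whose resolution in the table includes retrying

def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # Wait 1s, 2s, 4s, ... between attempts.
        sleep(base_delay * 2 ** attempt)

# Example with a fake sender that succeeds on the third call.
responses = iter([(503, ""), (429, ""), (200, "ok")])
print(call_with_retry(lambda: next(responses), sleep=lambda seconds: None))
```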

Rate Limits

Rate limits are enforced per API key and vary by plan. Each plan has a limit on requests per minute and on tokens per day.

Plan        Requests/min  Tokens/day  Notes
Basic       60            500,000     -
Pro         120           2,000,000   -
Enterprise  300           6,000,000   -
Unlimited   Unlimited     Unlimited   -

Rate limit headers are returned on every response:

Response Headers
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1716891660

When exceeded, the API returns 429 Too Many Requests. Retry after the timestamp in X-RateLimit-Reset.
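
A minimal sketch of turning that header into a wait time. It assumes X-RateLimit-Reset is a Unix timestamp in seconds, which is what the sample value suggests; seconds_until_reset is an illustrative name.

```python
import time

def seconds_until_reset(headers: dict, now=None) -> float:
    """Return how long to wait before retrying, never less than zero."""
    now = time.time() if now is None else now
    reset_at = int(headers["X-RateLimit-Reset"])  # assumed Unix seconds
    return max(reset_at - now, 0)

headers = {"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1716891660"}
print(seconds_until_reset(headers, now=1716891600))  # 60
```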

SDK Integration Guide

LLMWate uses an OpenAI-compatible API. Use the official OpenAI SDK with your LLMWate API key and base URL.

Python

Install: pip install openai

Basic Request
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing in 2 sentences."}],
    temperature=0.7,
    max_tokens=512
)
print(response.choices[0].message.content)
Streaming Response
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a Python function to reverse a string."}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Async Request
import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(
        api_key="YOUR_LLMWATE_API_KEY",
        base_url="https://api.llmwate.com/v1"
    )
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is 2+2?"}]
    )
    print(response.choices[0].message.content)

asyncio.run(main())
With System Prompt
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LLMWATE_API_KEY",
    base_url="https://api.llmwate.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful Python expert."},
        {"role": "user", "content": "How do I sort a list in Python?"}
    ],
    temperature=0.5
)
print(response.choices[0].message.content)

Node.js / TypeScript

Install: npm install openai

Basic Request
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.LLMWATE_API_KEY,
  baseURL: "https://api.llmwate.com/v1"
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain quantum computing in 2 sentences." }],
  temperature: 0.7,
  max_tokens: 512
});

console.log(response.choices[0].message.content);
Streaming Response
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.LLMWATE_API_KEY,
  baseURL: "https://api.llmwate.com/v1"
});

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Write a Python function to reverse a string." }],
  stream: true
});

for await (const chunk of stream) {
  if (chunk.choices[0].delta.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}

cURL

Basic Request
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
Streaming Response
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Count to 5"}], "stream": true}'
With Parameters
curl https://api.llmwate.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LLMWATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of Japan?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100
  }'

Go

Install: go get github.com/sashabaranov/go-openai

Basic Request
package main

import (
    "context"
    openai "github.com/sashabaranov/go-openai"
)

func main() {
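    // The base URL is set on the client config, so build the client from a config.
    config := openai.DefaultConfig("YOUR_LLMWATE_API_KEY")
    config.BaseURL = "https://api.llmwate.com/v1"
    client := openai.NewClientWithConfig(config)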

    resp, err := client.CreateChatCompletion(
        context.Background(),
        openai.ChatCompletionRequest{
            Model: "gpt-4o",
            Messages: []openai.ChatCompletionMessage{
                {Role: "user", Content: "Explain quantum computing in 2 sentences."},
            },
        },
    )
    if err != nil {
        panic(err)
    }
    println(resp.Choices[0].Message.Content)
}

Java

Using Spring Boot RestTemplate:

Maven Dependency
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
Java Example
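import java.util.List;
import java.util.Map;

import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController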
public class LLMController {

    private final RestTemplate restTemplate;

    public LLMController() {
        this.restTemplate = new RestTemplate();
        this.restTemplate.getInterceptors().add((request, body, execution) -> {
            request.getHeaders().add("Authorization", "Bearer " + "YOUR_LLMWATE_API_KEY");
            request.getHeaders().add("Content-Type", "application/json");
            return execution.execute(request, body);
        });
    }

    @PostMapping("/chat")
    public String chat(@RequestBody Map<String, Object> request) {
        String url = "https://api.llmwate.com/v1/chat/completions";
        HttpEntity<Map<String, Object>> entity = new HttpEntity<>(request, new HttpHeaders());
        ResponseEntity<Map> response = restTemplate.postForEntity(url, entity, Map.class);
        Map body = response.getBody();
        List choices = (List) body.get("choices");
        Map message = (Map) ((Map) choices.get(0)).get("message");
        return (String) message.get("content");
    }
}

Ruby

Install: gem install ruby-openai

Basic Request
require "openai"

# The ruby-openai gem takes the key as access_token.
client = OpenAI::Client.new(
  access_token: ENV["LLMWATE_API_KEY"],
  uri_base: "https://api.llmwate.com/v1"
)

response = client.chat(
  parameters: {
    model: "gpt-4o",
    messages: [
      { role: "user", content: "Explain quantum computing in 2 sentences." }
    ],
    temperature: 0.7
  }
)

puts response.dig("choices", 0, "message", "content")

PHP

Basic Request (cURL)
<?php
$api_key = "YOUR_LLMWATE_API_KEY";
$url = "https://api.llmwate.com/v1/chat/completions";

$data = [
    "model" => "gpt-4o",
    "messages" => [
        ["role" => "user", "content" => "Explain quantum computing in 2 sentences."]
    ],
    "temperature" => 0.7
];

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    "Authorization: Bearer " . $api_key,
    "Content-Type: application/json"
]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
curl_close($ch);

$result = json_decode($response, true);
echo $result["choices"][0]["message"]["content"];
?>

SDK Configuration Reference

Setting        Value    Description
API Key        string   Your LLMWate API key from the dashboard
Base URL       string   https://api.llmwate.com/v1
Default Model  string   gpt-4o (or any available model)
Timeout        integer  Request timeout in seconds (default varies by SDK)

Environment Variables

Recommended .env setup
# .env file
LLMWATE_API_KEY=lmw_your_api_key_here
LLMWATE_BASE_URL=https://api.llmwate.com/v1
Python with python-dotenv
from dotenv import load_dotenv
from openai import OpenAI
import os

load_dotenv()

client = OpenAI(
    api_key=os.getenv("LLMWATE_API_KEY"),
    base_url=os.getenv("LLMWATE_BASE_URL", "https://api.llmwate.com/v1")
)

Common Patterns

Structured Output (JSON Schema)
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Return JSON with name and age fields."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"}
                },
                "required": ["name", "age"]
            }
        }
    }
)
# Access: response.choices[0].message.content (JSON string)
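
Because the content comes back as a JSON string, it still needs to be parsed and checked before use. A small sketch, assuming the person schema from the request above; parse_person is an illustrative helper and the input here is a hand-written example, not real model output.

```python
import json

def parse_person(content: str) -> dict:
    """Parse the message content and confirm the fields the schema requires."""
    data = json.loads(content)
    missing = [key for key in ("name", "age") if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not isinstance(data["name"], str) or not isinstance(data["age"], int):
        raise ValueError("name must be a string and age an integer")
    return data

print(parse_person('{"name": "Ada", "age": 36}'))
```

In a real call the argument would be response.choices[0].message.content.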
Function Calling / Tool Use
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather in Tokyo?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"]
                }
            }
        }
    ]
)
# Tool calls available in response.choices[0].message.tool_calls
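
Once tool calls come back, the client is responsible for running them and returning the results as tool messages. The sketch below shows one way to dispatch them, assuming the OpenAI SDK's tool-call shape (id, function.name, function.arguments as a JSON string). The get_weather body is a hypothetical stand-in, and a SimpleNamespace fakes a tool call so the example runs offline.

```python
import json
from types import SimpleNamespace

def get_weather(location: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"No live data in this sketch; requested location: {location}"

TOOLS = {"get_weather": get_weather}

def run_tool_calls(tool_calls) -> list:
    """Execute each requested tool and build the follow-up tool messages."""
    results = []
    for call in tool_calls or []:
        func = TOOLS[call.function.name]
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        results.append({"role": "tool", "tool_call_id": call.id, "content": func(**args)})
    return results

fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"location": "Tokyo"}'),
)
print(run_tool_calls([fake_call]))
```

The returned messages would be appended to the conversation, after the assistant message that requested them, before the next chat completions request.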
Vision (Image Understanding)
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.llmwate.com/v1")

response = client.chat.completions.create(
    model="gpt-4o",  # or claude-3.5-sonnet for vision
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)

Troubleshooting

Error             Cause                Solution
401 Unauthorized  Invalid API key      Check your API key in the dashboard
403 Forbidden     Model not available  Verify your plan includes the model
429 Rate Limited  Too many requests    Add delay between requests or upgrade plan
Connection Error  Network issues       Check your internet connection

API Marketplace

Browse all available models with real-time pricing. Click any model to open it in the Playground.

45 models, 10 providers, 7 categories, 8 free models.