Developer Guide

Everything you need to call the Lexora inference API from your app.

Quickstart

Lexora exposes an OpenAI-compatible API. Point your existing OpenAI client at the Lexora base URL, swap in a Lexora API key, and your requests route to distributed GPU nodes at a fraction of the cost.

1. Get API Key

Sign up and generate a key from the dashboard.

2. Change base URL

Point your OpenAI SDK at https://api.lexora.network/v1.

3. Send requests

Identical SDK, 95% lower cost.

Authentication

Every request must send an sk-lexora-… API key as a Bearer token in the Authorization header.

Get your key at Dashboard → API Keys.

bash
curl https://api.lexora.network/v1/chat/completions \
  -H "Authorization: Bearer sk-lexora-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"meta-llama/Llama-3.2-3B-Instruct","messages":[{"role":"user","content":"Hello"}]}'
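
The same headers can be built in Python. The helper below is a minimal sketch: the function name `auth_headers` and the `LEXORA_API_KEY` environment variable are hypothetical conveniences chosen for this example, not part of the Lexora API. It only checks the documented `sk-lexora-` prefix and does not verify that the key is valid.

```python
import os

KEY_PREFIX = "sk-lexora-"  # key prefix documented in this section


def auth_headers(api_key=None):
    """Build the headers used by the curl example above.

    Falls back to the LEXORA_API_KEY environment variable (a hypothetical
    name chosen for this sketch) when no key is passed in.
    """
    key = api_key or os.environ.get("LEXORA_API_KEY", "")
    if not key.startswith(KEY_PREFIX):
        raise ValueError("expected an API key starting with 'sk-lexora-'")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; pass the returned dictionary as `headers=` to an HTTP client such as `requests`.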

Chat Completions

Endpoint: POST /v1/chat/completions — identical to the OpenAI spec.

Python (openai SDK)

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.lexora.network/v1",
    api_key="sk-lexora-YOUR_KEY",
)

# Streaming (recommended)
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Explain quantum entanglement simply."}],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
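
If you need the full reply as one string rather than printing it as it arrives, the deltas can be accumulated. This is a minimal sketch: `collect_stream` and `fake_chunk` are hypothetical helper names, and the stand-in chunks only mimic the `choices[0].delta.content` shape used in the loop above so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None, e.g. on a role-only chunk
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streaming chunk (offline demo only)."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


demo = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo"), SimpleNamespace(choices=[])]
print(collect_stream(demo))  # prints "Hello"
```

With a real client, pass the `stream` object from the example above: `text = collect_stream(stream)`.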

Node.js / TypeScript

typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.lexora.network/v1",
  apiKey: process.env.DEPIN_API_KEY, // environment variable holding your sk-lexora-… key
});

const stream = await client.chat.completions.create({
  model: "meta-llama/Llama-3.2-3B-Instruct",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Non-streaming response

python
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    stream=False,
    max_tokens=512,
    temperature=0.7,
)

print(response.choices[0].message.content)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Required. See Available Models. |
| messages | array | Required. OpenAI message format. |
| stream | boolean | Default true. SSE streaming. |
| max_tokens | integer | Default 512. Max 32768. |
| temperature | float | Default 0.7. Range 0–2. |
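
Checking these limits on the client side can catch mistakes before a request is sent. The sketch below applies the documented defaults and ranges; `build_chat_payload` is a hypothetical helper name, and the lower bound of 1 for `max_tokens` is an assumption, since only the maximum is documented.

```python
def build_chat_payload(model, messages, stream=True, max_tokens=512, temperature=0.7):
    """Return a chat request body using the documented defaults and limits."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    # Upper bound is documented; the lower bound of 1 is an assumption.
    if not 1 <= max_tokens <= 32768:
        raise ValueError("max_tokens must be between 1 and 32768")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    return {
        "model": model,
        "messages": messages,
        "stream": stream,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


payload = build_chat_payload(
    "meta-llama/Llama-3.2-3B-Instruct",
    [{"role": "user", "content": "Hello"}],
    stream=False,
)
```

The resulting dictionary can be sent as the JSON body of a POST to /v1/chat/completions, or unpacked into `client.chat.completions.create(**payload)`.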

Image Generation

Endpoint: POST /v1/images/generations — returns base64 PNG.

Image generation runs on consumer GPUs and can take 15–90 seconds, so set a client timeout of at least 120 seconds. Each image costs $0.002, deducted from your balance on completion.

python
import base64, requests

resp = requests.post(
    "https://api.lexora.network/v1/images/generations",
    headers={"Authorization": "Bearer sk-lexora-YOUR_KEY"},
    json={
        "model": "black-forest-labs/FLUX.1-schnell",
        "prompt": "A futuristic cityscape at dawn, cinematic lighting",
        "width": 768,
        "height": 768,
        "num_inference_steps": 4,   # schnell default — do not increase beyond 4
        "guidance_scale": 0.0,      # schnell is guidance-free
        "n": 1,
    },
    timeout=120,
)

resp.raise_for_status()  # raise on HTTP errors (see Error Codes) before parsing the body
data = resp.json()["data"][0]["b64_json"]
with open("output.png", "wb") as f:
    f.write(base64.b64decode(data))

Image parameters

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Required. Only FLUX.1-schnell is live. |
| prompt | string | Required. Describe the image. |
| width / height | integer | 256–2048. Default 768x768. |
| num_inference_steps | integer | 1–50. Default 4 (schnell). |
| guidance_scale | float | 0–20. Default 0.0 (schnell is guidance-free). |
| n | integer | Always 1 (only one image per request). |
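
Because a rejected image request can cost up to the full timeout, it may help to validate the documented ranges locally first. This is a minimal sketch under that assumption; `build_image_payload` is a hypothetical helper name, and the API itself may enforce additional rules not listed here.

```python
def build_image_payload(prompt, width=768, height=768,
                        num_inference_steps=4, guidance_scale=0.0):
    """Return an image request body within the documented parameter ranges."""
    if not prompt:
        raise ValueError("prompt is required")
    for name, value in (("width", width), ("height", height)):
        if not 256 <= value <= 2048:
            raise ValueError(f"{name} must be between 256 and 2048")
    if not 1 <= num_inference_steps <= 50:
        raise ValueError("num_inference_steps must be between 1 and 50")
    if not 0 <= guidance_scale <= 20:
        raise ValueError("guidance_scale must be between 0 and 20")
    return {
        "model": "black-forest-labs/FLUX.1-schnell",  # the only live image model
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "n": 1,  # only one image per request
    }


image_payload = build_image_payload("A futuristic cityscape at dawn", width=1024, height=512)
```

Pass the result as `json=image_payload` in the `requests.post` call shown earlier.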

Available Models

Language Models

| Model ID | Context | Price | Status |
| --- | --- | --- | --- |
| meta-llama/Llama-3.2-3B-Instruct | 128K | $0.04 / 1M tokens | Live |
| meta-llama/Llama-3.1-8B-Instruct | 128K | $0.10 / 1M tokens | Pipeline |
| meta-llama/Llama-3.1-70B-Instruct | 128K | TBD | Pipeline |

Image Models

| Model ID | Output | Price | Status |
| --- | --- | --- | --- |
| black-forest-labs/FLUX.1-schnell | PNG | $0.002 / image | Live |
| black-forest-labs/FLUX.1-dev | PNG | TBD | Pipeline |
| stabilityai/stable-diffusion-xl-base-1.0 | PNG | TBD | Pipeline |

Pricing

All prices are beta pricing and may change as the network scales. Charges are deducted when a job completes; failed jobs are not charged.

| Model | Price | Notes |
| --- | --- | --- |
| Llama 3.2 3B | $0.04 / 1M tokens | Input + output combined |
| Llama 3.1 8B | $0.10 / 1M tokens | Pipeline |
| FLUX.1 schnell | $0.002 / image | Any resolution up to 2048px |

Add credits at Dashboard → Billing. Unused balance never expires.
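
To budget a workload, the listed prices can be turned into a rough estimate. The sketch below hard-codes the beta prices from this page, so it will drift if pricing changes; the function names are hypothetical, and `Decimal` is used to avoid floating-point rounding in small amounts.

```python
from decimal import Decimal

# Beta prices copied from this page; update them if pricing changes.
PRICE_PER_MILLION_TOKENS = {
    "meta-llama/Llama-3.2-3B-Instruct": Decimal("0.04"),
    "meta-llama/Llama-3.1-8B-Instruct": Decimal("0.10"),  # listed as Pipeline
}
PRICE_PER_IMAGE = {
    "black-forest-labs/FLUX.1-schnell": Decimal("0.002"),
}


def estimate_chat_cost(model, prompt_tokens, completion_tokens):
    """Estimate the USD cost of one chat job (input and output tokens combined)."""
    total_tokens = prompt_tokens + completion_tokens
    return PRICE_PER_MILLION_TOKENS[model] * total_tokens / Decimal(1_000_000)


def estimate_image_cost(model, images):
    """Estimate the USD cost of a number of successfully generated images."""
    return PRICE_PER_IMAGE[model] * images


chat_cost = estimate_chat_cost("meta-llama/Llama-3.2-3B-Instruct", 600_000, 400_000)
image_cost = estimate_image_cost("black-forest-labs/FLUX.1-schnell", 10)
```

Here one million combined tokens on Llama 3.2 3B come to $0.04 and ten images come to $0.02. Failed jobs are not charged, so real spend can be lower than the estimate.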

Error Codes

| HTTP | Cause | Fix |
| --- | --- | --- |
| 401 | Invalid or revoked API key | Check your key at Dashboard → API Keys |
| 402 | Insufficient balance | Add credits at Dashboard → Billing |
| 429 | Daily free limit reached (5/day) | Add credits or wait for daily reset |
| 503 | No nodes available for model | Retry in 30s; nodes may be loading |
| 504 | Job timed out (>120s) | Retry; queues can be long during peak hours |
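
The table suggests that only 503 and 504 are worth retrying automatically, while 401, 402, and 429 need action on your side. A minimal retry wrapper under that reading might look like the sketch below. `call_with_retry` is a hypothetical helper, the 30-second delay for 503 comes from the table, and the 5-second delay for 504 and the limit of three attempts are assumptions.

```python
import time

# Seconds to wait before retrying each retryable status.
# 30 s for 503 is from the table; 5 s for 504 is an assumption.
RETRY_DELAY_SECONDS = {503: 30, 504: 5}


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is any zero-argument callable that returns an object with a
    status_code attribute, such as a requests.Response.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        delay = RETRY_DELAY_SECONDS.get(response.status_code)
        # Return on success, on non-retryable errors, or on the last attempt.
        if delay is None or attempt == max_attempts:
            return response
        sleep(delay)
```

For example, wrap the image request as `call_with_retry(lambda: requests.post(url, headers=headers, json=payload, timeout=120))`, where `url`, `headers`, and `payload` are your own values, then inspect the returned response's `status_code`.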

Something missing? Open an issue on GitHub