Pay-per-token AI

Pay only when tokens are generated.

No idle GPU costs. No reserved instances. No minimums. Lexora bills per million tokens served — exactly what you used, nothing more.

Every request is billed by token count, not wall-clock GPU time. Idle = $0.
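
To make the per-token meter concrete, here is a minimal sketch of the arithmetic. The function name `token_cost` and the rate `EXAMPLE_PRICE` are illustrative assumptions, not Lexora's published API or pricing.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# Hypothetical rate in dollars per million tokens (an assumption for
# illustration only, not a published Lexora price).
EXAMPLE_PRICE = Decimal("0.50")


def token_cost(tokens: int, price_per_million: Decimal) -> Decimal:
    """Return the charge for `tokens` at a per-million-token rate."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return Decimal(tokens) * price_per_million / TOKENS_PER_MILLION


# An idle period serves zero tokens, so the charge is zero.
idle_charge = token_cost(0, EXAMPLE_PRICE)

# A busy period is charged only for the tokens actually served.
busy_charge = token_cost(2_500_000, EXAMPLE_PRICE)

print(idle_charge, busy_charge)
```

Wall-clock time never appears in the calculation, so under this model a request that takes longer to serve costs the same as a fast one with the same token count.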

No commit-or-lose pricing tiers. Burst from 0 to 1,000 requests, then scale back to 0. The bill matches the workload.

Knowledge bases cost nothing until you query them. Inference is the only meter that moves.

Inference jobs route to the worker network. You get hosted reliability with a pricing model that doesn't punish low utilization.

Lexora vs Reserved GPU rentals

Side-by-side breakdown of what matters.

Feature | Lexora | Reserved GPU rentals
Billing unit | Per million tokens | Per GPU-hour
Idle cost | $0 | Full hourly rate
Minimum commit | None | Hour blocks or monthly
Scale to zero | Yes, true zero | Manual teardown
Burst scaling | Automatic | Spin up new instance
Knowledge base | Included free | Build it yourself
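
The billing-unit and idle-cost rows can be turned into a rough break-even estimate. The sketch below uses made-up rates (`gpu_hourly_rate`, `price_per_million`) purely for illustration; substitute real quotes before drawing conclusions.

```python
def reserved_cost(gpu_hourly_rate: float, hours_reserved: float) -> float:
    """Reserved GPUs bill for every reserved hour, busy or idle."""
    return gpu_hourly_rate * hours_reserved


def per_token_cost(tokens: int, price_per_million: float) -> float:
    """Per-token billing charges only for the tokens served."""
    return tokens / 1_000_000 * price_per_million


def break_even_tokens(
    gpu_hourly_rate: float, hours_reserved: float, price_per_million: float
) -> float:
    """Token volume at which both models cost the same."""
    return reserved_cost(gpu_hourly_rate, hours_reserved) / price_per_million * 1_000_000


# Hypothetical inputs: one GPU reserved for a 720-hour month.
gpu_hourly_rate = 2.0      # dollars per GPU-hour (assumed)
price_per_million = 0.5    # dollars per million tokens (assumed)
monthly_tokens = 10_000_000

print("reserved:", reserved_cost(gpu_hourly_rate, 720))
print("per-token:", per_token_cost(monthly_tokens, price_per_million))
print("break-even tokens:", break_even_tokens(gpu_hourly_rate, 720, price_per_million))
```

Below the break-even volume, per-token billing is the cheaper of the two under these assumptions; above it, a reserved GPU may cost less, provided it can actually serve that many tokens in the reserved hours.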

Move your inference workload to a pricing model that matches your usage. $1 signup credit, no card required.
