v0.9.4 · 5 models
/ llm infrastructure

Open-source LLMs,
served fast. Priced honestly.

Production inference for Llama, Mistral, Qwen, and Phi. One API. Per-token pricing from $0.04/1M.

One endpoint.
Any open model.

OpenAI-compatible REST API. Bring your existing client, swap the base URL, pick a model. Streaming, tool use, structured output — all of it.

OpenAI-compatible
Drop-in replacement. No rewrites.
Streaming by default
Tokens as they're generated.
Self-hosted models
Your data stays on infrastructure we control.
from openai import OpenAI

# Reuse the standard OpenAI client; only the base URL and key change.
client = OpenAI(
    base_url="https://llmrack.com/v1",
    api_key="rl_live_...",
)

stream = client.chat.completions.create(
    model="llama-3.1-8b",
    stream=True,
    messages=[{"role": "user", "content": "Explain RAG in 2 sentences."}],
)

# Print each token as it arrives; some chunks carry no content.
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

Built for production
from the first token.

Low-latency inference

CPU-optimized quantized models. Warm keep-alive between requests.

Multiple open models

Llama, Mistral, Qwen, Phi. Pick by changing a string.
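
A minimal sketch of what "changing a string" could look like in practice. The helper names `chat_kwargs` and `ask` are illustrative, not part of any SDK, and any model ID other than `llama-3.1-8b` is a hypothetical placeholder; check the model list for the real identifiers.

```python
def chat_kwargs(model: str, prompt: str) -> dict:
    """Build chat-completion arguments; only `model` differs between calls."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def ask(client, model: str, prompt: str) -> str:
    """Send one non-streaming request and return the reply text.

    `client` is assumed to be an OpenAI-style client, e.g.
    OpenAI(base_url="https://llmrack.com/v1", api_key="rl_live_...").
    """
    response = client.chat.completions.create(**chat_kwargs(model, prompt))
    return response.choices[0].message.content
```

Calling `ask(client, "llama-3.1-8b", prompt)` and then `ask(client, "<another-model-id>", prompt)` should be the whole migration between models, assuming both IDs are served.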

Honest pricing

Per-token billing. No minimums. No egress fees.

Simple API

OpenAI-compatible. Streaming, tool calls, JSON mode — same schema.
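
As a rough illustration of "same schema", the request bodies below follow the OpenAI chat-completions format for JSON mode and tool calls. The `get_weather` tool and the helper names are hypothetical, and per-model support for each feature is an assumption worth verifying before relying on it.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def json_mode_kwargs(model: str, prompt: str) -> dict:
    """Arguments for a request that asks for a single JSON object back."""
    return {
        "model": model,
        "response_format": {"type": "json_object"},
        "messages": [
            # OpenAI's JSON mode expects the word "JSON" somewhere in the messages.
            {"role": "system", "content": "Reply with a single JSON object."},
            {"role": "user", "content": prompt},
        ],
    }


def tool_call_kwargs(model: str, prompt: str) -> dict:
    """Arguments for a request that lets the model call `get_weather`."""
    return {
        "model": model,
        "tools": [WEATHER_TOOL],
        "messages": [{"role": "user", "content": prompt}],
    }


def parse_tool_arguments(raw_arguments: str) -> dict:
    """Tool-call arguments arrive as a JSON string; decode them to a dict."""
    return json.loads(raw_arguments)
```

Either dict can be passed as `client.chat.completions.create(**kwargs)` with the client from the quick-start example.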

Self-hosted

Our infrastructure, your data. No third-party model vendors.

Private by default

Zero retention on responses. Your prompts never train anything.

Pay per token.
Nothing else.

Tiers unlock higher rate limits and throughput, not better rates.

Free
$0 forever
10,000 tokens / day
10 requests / minute
All open models
1 API key
Community support
Pro (popular)
$10 / month
500,000 tokens / day
100 requests / minute
Unlimited API keys
Email support
Usage analytics
Business
$50 / month
5M tokens / day
500 requests / minute
SSO + team seats
Priority support, 4h SLA
Dedicated throughput
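
To size a workload against these tiers, a small sketch like the one below may help. The limits are copied from the tiers above; the function names are illustrative, and the cost figure uses the advertised "from $0.04/1M" starting rate, so treat it as a lower bound rather than a quote. How the monthly tier fee combines with token charges is not modeled here.

```python
from typing import Optional

# (tier name, tokens per day, requests per minute), as listed above.
TIERS = [
    ("Free", 10_000, 10),
    ("Pro", 500_000, 100),
    ("Business", 5_000_000, 500),
]

# Advertised starting rate; actual per-model rates may be higher.
FLOOR_USD_PER_MILLION_TOKENS = 0.04


def smallest_tier(tokens_per_day: int, requests_per_minute: int) -> Optional[str]:
    """Return the first tier whose limits cover the workload, or None."""
    for name, token_limit, rpm_limit in TIERS:
        if tokens_per_day <= token_limit and requests_per_minute <= rpm_limit:
            return name
    return None


def min_token_cost_usd(tokens: int) -> float:
    """Lower-bound token cost in USD at the advertised starting rate."""
    return tokens * FLOOR_USD_PER_MILLION_TOKENS / 1_000_000
```

For example, `smallest_tier(200_000, 50)` selects Pro because the workload exceeds the Free limits but fits within 500,000 tokens per day and 100 requests per minute.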

Questions we
actually get asked.

Yes, the API is OpenAI-compatible. Point your OpenAI client at https://llmrack.com/v1 and swap the API key. Chat, streaming, tool calls, and JSON mode all use the same request/response schema.
/ ship today

Start building.
Free tier, no card.