Leading open-source models, served at record speed.
One API for every open model — running on the same agent-native inference cloud that powers Synaptix. Best-in-class latency and throughput, pay-as-you-go.
Four ways to run open models in production.
From a single API call to dedicated multi-region deployments — TokenFactory meets you wherever you are.
Inference service
Access and run powerful open-source AI models through a single OpenAI-compatible API. Sub-second latency, 99.9% uptime, pay per token.
Batch inference
Process millions of requests asynchronously at up to 50% lower cost. Ideal for evaluations, embeddings, document pipelines and offline workflows.
Post-training service
Fine-tune, distill and align open models on your proprietary data. LoRA, full SFT, DPO and RL — without managing GPUs.
Enterprise-grade inference
Deploy and scale models on dedicated infrastructure with guaranteed uptime, private networking and custom SLAs.
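For the batch service, asynchronous jobs are typically submitted as a JSONL file with one request per line. Here is a minimal sketch of building that input, assuming TokenFactory mirrors the OpenAI batch input format — the `custom_id`, `method`, `url`, and `body` fields follow that convention and are an assumption here, not a confirmed TokenFactory schema:

```python
import json

# Build a JSONL batch input: one chat-completion request per line.
# Field layout mirrors the OpenAI batch format; TokenFactory's actual
# batch schema may differ — treat this as an illustrative sketch.
documents = ["Q3 revenue report", "Incident postmortem", "Churn analysis"]

lines = []
for i, doc in enumerate(documents):
    request = {
        "custom_id": f"req-{i}",              # your own correlation ID
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-oss-120b",
            "messages": [
                {"role": "user", "content": f"Summarize: {doc}"}
            ],
        },
    }
    lines.append(json.dumps(request))

batch_input = "\n".join(lines)
print(batch_input.splitlines()[0])
```

The `custom_id` lets you match results back to inputs once the job completes, since batch responses generally return out of order.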
A catalog that ships with the frontier.
New open releases are evaluated, optimized and added within days — same API, no migration.
OpenAI-compatible. Drop-in in 3 lines.
from openai import OpenAI
client = OpenAI(
base_url="https://api.tokenfactory.ai/v1",
api_key="tf_live_…",
)
resp = client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": "Summarize this report."}],
)
print(resp.choices[0].message.content)

curl https://api.tokenfactory.ai/v1/chat/completions \
-H "Authorization: Bearer $TF_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-v3.2",
"messages": [{"role":"user","content":"Hello"}]
}'

Pay only for the tokens you use.
Start with a free API key. Pay only for tokens consumed.
- OpenAI-compatible API
- All open models
- Batch + real-time
- Community support
Higher rate limits, priority routing and analytics for production workloads.
- 10× higher rate limits
- Priority queue
- Usage analytics & budgets
- Email support · 24h SLA
Dedicated capacity, private networking and a 99.99% uptime SLA.
- Dedicated GPUs · reserved
- VPC peering · BYOC
- Fine-tuning included
- 99.99% SLA · 24/7 support
Token prices vary by model. Batch inference is up to 50% lower than real-time.
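As a back-of-envelope illustration of the batch discount, assume a placeholder real-time rate of $0.50 per million tokens — actual prices vary by model, and this number is invented purely for the arithmetic:

```python
# Illustrative cost comparison: real-time vs. batch at a 50% discount.
# The $0.50/M-token rate is a placeholder, not an actual TokenFactory price.
REALTIME_PRICE_PER_M = 0.50    # $ per million tokens (assumed)
BATCH_DISCOUNT = 0.50          # batch runs at up to 50% lower cost

tokens = 2_000_000_000         # e.g. a 2B-token embedding/eval job

realtime_cost = tokens / 1_000_000 * REALTIME_PRICE_PER_M
batch_cost = realtime_cost * (1 - BATCH_DISCOUNT)

print(f"real-time: ${realtime_cost:,.2f}  batch: ${batch_cost:,.2f}")
# real-time: $1,000.00  batch: $500.00
```

At scale, the discount compounds with job size, which is why evaluations, embeddings, and offline pipelines are the natural fit for batch.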
Go deeper on TokenFactory
Benchmarks, technical posts and a printable product brief.
TTFT, throughput and p95 vs. Bedrock, Vertex and Together
How TokenFactory compares across the open-model frontier.
TokenFactory and the open-model frontier
gpt-oss, Kimi-K2.5, Qwen3-Coder and GLM-5 in production.
Benchmarking agent latency: TTFT, p95 and what single-model benchmarks miss
Why agent workloads need a different benchmark methodology.
TokenFactory — product brief
Architecture, model catalog, pricing and SLA in one PDF.
Heterogeneous inference — product brief
Routing across NVIDIA, AMD, Cerebras, Groq and TPU.
Start shipping with open models today.
Spin up an API key in minutes, or talk to us about dedicated capacity and fine-tuning.