AI Gateway for Multi-Provider LLMs
Connect any AI-powered IDE or CLI tool through OmniRoute — free API gateway for unlimited coding.
📡 All agents connect via localhost:20128/v1 or cloud.omniroute.online/v1 — one config, unlimited models and quota
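Because the gateway speaks the OpenAI wire format, any HTTP client can talk to it. A minimal sketch of building a Chat Completions request against the local endpoint from above — the model name `gpt-4o` and the placeholder key `sk-local` are illustrative, not required values:

```python
import json
from urllib import request

# Local gateway endpoint from the docs; swap in cloud.omniroute.online/v1
# for the hosted variant.
BASE_URL = "http://localhost:20128/v1"

def build_request(messages, model="gpt-4o", api_key="sk-local"):
    """Assemble an OpenAI-compatible Chat Completions request (sketch)."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return f"{BASE_URL}/chat/completions", headers, body

def chat(messages, **kwargs):
    """Send the request and return the parsed JSON response."""
    url, headers, body = build_request(messages, **kwargs)
    with request.urlopen(request.Request(url, data=body, headers=headers)) as resp:
        return json.load(resp)
```

Point any OpenAI SDK or IDE integration at the same `BASE_URL` and it routes through the gateway unchanged.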
160+ providers, Memory System, Skills Framework, Vision Bridge, Responses API, 29 MCP tools, and much more.
Persistent conversational memory across sessions. Extraction, injection, retrieval, and summarization modules keep your AI context-aware between conversations.
Extensible skill registry with built-in and custom skills. Sandboxed execution, interception/injection pipeline, and 4 MCP skill tools for agent control.
Cross-provider image and vision support. Automatically translates image inputs between OpenAI, Anthropic, and Gemini formats; validated across 51 test scenarios.
Full OpenAI Responses API support. Requests are internally translated to Chat Completions format, dispatched, and response-streamed back in Responses API events.
Use your ChatGPT Plus, Grok, Perplexity Pro, Blackbox, and Meta AI subscriptions as API providers. Just paste your session cookie — no API key needed.
Perplexity, Serper, Brave, Exa, Tavily, Google PSE, Linkup, SearchAPI, You.com, and SearXNG. Ground AI responses with real-time web data.
22 core tools + 3 memory tools + 4 skill tools. Budget guards, route simulation, session snapshots, DB health, pricing sync, and agent-assisted memory/skill management.
Cross-session context relay strategy. Transfer conversation context between providers seamlessly. 13 routing strategies including context-relay and context-optimized.
Install once, connect providers, and code non-stop with automatic 4-tier fallback routing.
Run one command to install OmniRoute globally on your system.
Open Dashboard and add your API keys or OAuth connections. Free providers available!
Configure Claude Code, Cursor, Cline, or any OpenAI-compatible tool.
Run OmniRoute as a container with persistent data volume.
See how OmniRoute compares to alternatives.
| Feature | OmniRoute | LiteLLM |
|---|---|---|
| Providers Supported | 160+ | 200+ |
| Free Tier Routing | ✓ | ✗ |
| Dashboard UI | ✓ | ✗ |
| Semantic Cache | ✓ | ✗ |
| Circuit Breaker | ✓ | ✓ |
| 13 Routing Strategies | ✓ | ✗ |
| LLM Evaluations | ✓ | ✗ |
| Translator Playground | ✓ | ✗ |
| CLI Tools Manager | ✓ | ✗ |
| Custom Combos | ✓ | ✗ |
| MCP Server (29 tools) | ✓ | ✗ |
| A2A Protocol (Agent-to-Agent) | ✓ | ✗ |
| Desktop App | ✓ | ✗ |
| Usage Analytics | ✓ | ✓ |
| Cost Management | ✓ | ✓ |
| Docker Deploy | ✓ | ✓ |
| Media Playground (Image/Video/Audio/TTS) | ✓ | ✗ |
| Registered Keys API | ✓ | ✗ |
| Auto-Combo Engine (Self-Healing) | ✓ | ✗ |
| Per-Model Combo Routing | ✓ | ✗ |
| 160+ Provider Icons (SVG) | ✓ | ✗ |
| Memory System (Persistent) | ✓ | ✗ |
| Skills Framework (Extensible) | ✓ | ✗ |
| Vision Bridge (Cross-Provider) | ✓ | ✓ |
| Responses API | ✓ | ✓ |
| Web/Cookie Providers (5) | ✓ | ✗ |
| Search Providers (10) | ✓ | ✗ |
| Context Handoff / Relay | ✓ | ✗ |
| Self-hosted & Free | ✓ | ✓ |
Connect via OAuth, API Key, Web Cookie, or use completely free providers.
iFlow AI
Qwen Code
Kiro AI
Gemini CLI
Claude Code
OpenAI
Anthropic
Google AI
Antigravity
OpenClaw
Groq
DeepSeek
xAI (Grok)
Mistral
Together AI
Fireworks
Perplexity
Cerebras
Cohere
OpenRouter
GLM (ZhipuAI)
MiniMax
Moonshot
Nebius
NVIDIA
Sambanova
Novita AI
Chutes AI
Kluster AI
InfiniAI
Targon
AI21 Labs
Lambda
Lepton AI
Deepgram
Alibaba DashScope
LongCat AI
Pollinations
AI/ML API
Kimi Coding
Alibaba Coding
Ollama Cloud
Amazon Q
GitLab Duo
Cline (OAuth)
ChatGPT Web
Grok Web
Perplexity Web
Blackbox Web
CrofAI
Azure OpenAI
Amazon Bedrock
DeepInfra
Meta Llama API
Databricks
Snowflake Cortex
Venice.ai
Poe
Heroku AI
IBM watsonx
Runway
Brave Search
Exa Search
Tavily Search
Serper
Everything you need to route, monitor, and optimize your AI usage.
Subscription → API Key → Cheap → Free. Automatic switching when quota runs out, zero downtime.
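The 4-tier order above can be sketched as a simple priority scan. The availability flags here are a stand-in; the real gateway tracks quotas and errors per provider:

```python
# Tier order from the docs: Subscription → API Key → Cheap → Free.
TIERS = ["subscription", "api_key", "cheap", "free"]

def pick_tier(available):
    """Return the first tier that still has quota, in priority order.

    `available` maps tier name -> bool (True if the tier can serve traffic).
    """
    for tier in TIERS:
        if available.get(tier):
            return tier
    raise RuntimeError("all tiers exhausted")
```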
When a model is unavailable, automatically falls back to sibling models in the same family before returning an error.
Priority, weighted, round-robin, context-relay, fill-first, P2C, random, least-used, cost-optimized, strict-random, auto-combo, LKGP, and context-optimized. Per-combo or global.
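As one concrete example of these strategies, a weighted pick selects providers in proportion to their configured weight. Provider names and weights below are illustrative; the injected `rng` just makes the sketch deterministic to test:

```python
import random

def weighted_pick(providers, rng=random.random):
    """Weighted routing strategy (sketch).

    `providers` is a list of (name, weight) pairs; heavier entries are
    chosen proportionally more often.
    """
    total = sum(weight for _, weight in providers)
    r = rng() * total
    upto = 0.0
    for name, weight in providers:
        upto += weight
        if r <= upto:
            return name
    return providers[-1][0]  # guard against float rounding at the edge
```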
Per-provider circuit breakers open and close automatically with configurable cooldowns. Self-healing after failures.
Mutex + automatic rate-limiting for API key providers. Prevents quota exhaustion spikes.
5-second dedup window for duplicate requests. Saves tokens and prevents double-sends.
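The dedup window boils down to fingerprinting each request and comparing timestamps. A minimal sketch with an injected clock (the real gateway's keying is more involved):

```python
import hashlib
import json

class Deduper:
    """5-second dedup window: identical payloads seen again inside the
    window are flagged as duplicates."""

    def __init__(self, window=5.0):
        self.window = window
        self.seen = {}  # fingerprint -> last-seen timestamp

    def is_duplicate(self, payload, now):
        # Canonical JSON so key order doesn't change the fingerprint.
        key = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        last = self.seen.get(key)
        self.seen[key] = now
        return last is not None and now - last < self.window
```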
Two-tier cache (exact + semantic similarity) reduces cost and latency for repeated queries.
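The two tiers work as a fallthrough: exact lookup first, then a similarity scan. This sketch uses token-overlap (Jaccard) as a cheap stand-in for the embedding similarity a semantic cache would actually use:

```python
class TwoTierCache:
    """Tier 1 is an exact dict lookup; tier 2 matches near-identical
    prompts by token overlap (a stand-in for embedding similarity)."""

    def __init__(self, threshold=0.8):
        self.store = {}
        self.threshold = threshold

    def get(self, prompt):
        if prompt in self.store:                  # tier 1: exact hit
            return self.store[prompt]
        want = set(prompt.lower().split())        # tier 2: similarity
        for cached, answer in self.store.items():
            have = set(cached.lower().split())
            union = want | have
            if union and len(want & have) / len(union) >= self.threshold:
                return answer
        return None

    def put(self, prompt, answer):
        self.store[prompt] = answer
```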
Seamless OpenAI ↔ Claude ↔ Gemini format translation. Use any model with any client.
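One visible difference between the formats: OpenAI carries the system prompt as a message, while Anthropic's Messages API takes it as a top-level field. An illustrative slice of that translation (the gateway's real translator covers far more, including tools and vision content):

```python
def openai_to_anthropic(messages):
    """Translate OpenAI-style chat messages toward the Anthropic shape:
    system messages are lifted into a top-level `system` field."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    body = {"messages": turns}
    if system_parts:
        body["system"] = "\n".join(system_parts)
    return body
```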
Automatically parse and handle `<think>` tags from reasoning models like DeepSeek R1.
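At its core this is splitting the tagged reasoning from the visible answer, which a regex captures. A simplified take on the idea:

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Separate <think>...</think> reasoning from the visible answer."""
    reasoning = "\n".join(m.strip() for m in THINK_RE.findall(text))
    visible = THINK_RE.sub("", text).strip()
    return reasoning, visible
```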
Built-in protection against prompt injection attacks on your AI endpoints.
Automatically selects the best model based on content type — coding, analysis, vision, summarization. 7 task types.
Live token consumption, reset countdown, and cost estimation per provider.
Full dashboard with tokens, costs, trends over time. Filter by provider, model, or period.
Track spending with editable per-model pricing. Set budget alerts and limits.
Dashboard with healthcheck per provider, token validation, and auto-refresh status.
Golden set testing with 4 match strategies: exact, contains, regex, custom JS function.
Built-in Chat Tester and Test Bench. Test any model in real-time from the dashboard.
Configure Claude Code, Codex, OpenClaw, Kilo, Droid, and Cline directly from the dashboard.
Create unlimited model combinations with 6 balancing strategies. Fine-tune routing per combo.
Add multiple accounts per provider. Round-robin load balancing and automatic failover.
Full media generation: Image (NanoBanana, SD WebUI, ComfyUI), Video, Music, Audio Transcription (2GB, Deepgram, AssemblyAI), and Text-to-Speech (ElevenLabs, Cartesia, PlayHT).
Sync config across devices via Cloudflare Workers. 300+ global edge locations.
Create scoped API keys with model restrictions, time-based access schedules, and enable/disable toggles.
Organize provider connections by environment (dev/prod). Accordion view with smart auto-switch.
Model Context Protocol server with 29 agent-control tools (22 core + 3 memory + 4 skills). 3 transports: stdio, SSE, Streamable HTTP. 10 scoped auth levels.
Agent-to-Agent orchestration with JSON-RPC 2.0, task streaming, SSE heartbeat, and smart-routing skill.
Native Electron app for Windows, macOS, and Linux. System tray, auto-update, offline support, single-instance lock.
3-tier pricing resolution synced from LiteLLM. User overrides → synced → defaults. Opt-in via settings.
Persistent memory across sessions with extraction, injection, retrieval, and summarization. 3 MCP memory tools for agent-controlled recall.
Extensible skill registry with built-in and custom skills. Sandboxed executor, interception/injection pipeline, A2A integration, and 4 MCP skill tools.
Cross-provider image support. Translates vision inputs between OpenAI, Anthropic, and Gemini formats automatically; validated across 51 test scenarios.
Full OpenAI Responses API compatibility. TransformStream converts Chat Completions SSE chunks into Responses API event format.
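The gist of that transform, sketched over already-parsed streaming chunks. The event names (`response.output_text.delta`, `response.output_text.done`) follow OpenAI's Responses API streaming events; the gateway's actual SSE handling is more involved:

```python
def to_responses_events(chunks):
    """Convert Chat Completions streaming chunks (parsed from SSE) into
    Responses-API-style events (sketch)."""
    text = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            text.append(delta)
            yield {"type": "response.output_text.delta", "delta": delta}
    # Final event carries the assembled text, mirroring the done event.
    yield {"type": "response.output_text.done", "text": "".join(text)}
```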
Use ChatGPT Plus, Grok, Perplexity Pro, Blackbox, and Meta AI subscriptions as API providers via session cookies.
Provider audit module for compliance enforcement. Optional MITM proxy with cert management, DNS handling, and target routing.
Monitor everything in real-time. Manage providers, combos, analytics, and more.
Run locally, in a container, on a VM, or at the edge.
Install globally for local development
Container with persistent data volume
Deploy on Akamai, AWS, DigitalOcean
Edge deployment with D1 database