Building an AI app can now take just minutes. With developer tools like the AI SDK, teams can build both AI frontends and backends that accept prompts and context, reason with an LLM, call actions, and stream back results.
But going to production requires reliability and stability at scale. Teams that connect directly to a single LLM provider for inference create a fragile dependency: if that provider goes down or hits rate limits, so does the app. As AI workloads become mission-critical, the focus shifts from integration to reliability and consistent model access. Fortunately, there's a better way to run.
AI Gateway, now generally available, keeps your app available when a provider fails, routes around low rate limits, and delivers consistent reliability for AI workloads. It's the same system that has powered v0.app for millions of users: battle-tested, stable, and ready for production for our customers.
The case for flexibility
AI capabilities have grown and changed with incredible speed. Reasoning, barely a concept a year ago, is now a standard capability across models. Tool use and the Model Context Protocol (MCP) went from experiments to widely adopted standards in under a year.
Staying flexible and adapting to the new frontier is essential for success in AI. Whether AI-native or layering AI onto existing products, teams want the ability to use the best model available at any given time while keeping systems composable and avoiding lock-in.
Building in the flexibility to switch between models requires sophisticated engineering. APIs can fail without warning, rate limits vary by provider, and keys must be managed and secured per vendor. Even tracking spend means juggling multiple dashboards with different top-up programs and spend controls. Engineering around these failure modes is time-consuming and slows teams down.
AI Gateway handles this complexity for you so you can scale confidently and quickly.
Great developer experience, trusted infrastructure
The AI SDK now sees more than 2 million downloads per week and powers apps like Browserbase and Perplexity. While the AI SDK standardized provider APIs and made switching models as simple as changing a single line of code, the harder problem is ensuring availability when a provider fails. AI Gateway pairs that same developer experience with globally available, production-grade infrastructure.
Swap a model string and you’re testing a new provider in seconds:
import { streamText } from 'ai'

const result = streamText({
  model: 'xai/grok-4', // defaults to Vercel AI Gateway
  prompt: 'How does Vercel AI Gateway have no markup on tokens?',
})
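The response arrives incrementally. Continuing the example above, a minimal sketch of consuming it via the AI SDK's async-iterable text stream:

for await (const textPart of result.textStream) {
  // Print each chunk to the terminal as the provider streams it back
  process.stdout.write(textPart)
}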
AI Gateway supports hundreds of models; you can browse them all in the model library.
With AI Gateway you can use any vendor, any model, through a single API. Built on AI SDK 5, it supports hundreds of models today without requiring you to manage API keys, rate limits, or provider accounts. The Gateway handles authentication, usage tracking, failover, billing, and more.
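As a rough sketch of that single API, here is a one-shot call through the Gateway. The model id is illustrative, and authentication is assumed to come from an AI_GATEWAY_API_KEY environment variable (or OIDC when deployed on Vercel):

import { generateText } from 'ai'

// Any vendor's model is addressed by a single 'provider/model' string;
// auth is assumed to come from AI_GATEWAY_API_KEY (or Vercel OIDC)
const { text, usage } = await generateText({
  model: 'anthropic/claude-sonnet-4', // illustrative model id
  prompt: 'Why does failover matter for production AI apps?',
})

console.log(text, usage)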
Developers want speed, reliability, and choice. AI Gateway is built for teams creating AI apps, agents, RAG systems, or search and chat experiences who:
Evaluate or swap models dynamically (see the eval sketch after this list)
Need higher rate limits than a single vendor can provide
Want access to new frontier models as soon as they launch
Can’t afford a single point of failure
Need visibility into model usage and cost without juggling several dashboards
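For the first point, a quick eval harness can be as small as a loop over Gateway model strings. A minimal sketch, with illustrative model ids (check the model library for the exact strings):

import { generateText } from 'ai'

// Run the same prompt against several providers through one API;
// model ids are illustrative, see the model library for exact strings
const models = ['openai/gpt-4o', 'anthropic/claude-sonnet-4', 'xai/grok-4']

for (const model of models) {
  const { text, usage } = await generateText({
    model,
    prompt: 'Explain rate limiting in one sentence.',
  })
  console.log(`${model}: ${text} (${usage.totalTokens} tokens)`)
}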
Zero markup, high reliability
And all of this comes with zero markup: you pay the providers' listed token prices, and you can bring your own keys and contracts.
Just as CDNs transformed networking with redundancy, failover, and optimization, AI Gateway will transform inference reliability for your apps. Vercel's CDN serves trillions of requests a year[1] and sits at the core of AI Gateway, providing sub-20ms latency.
Getting started with AI Gateway
The AI Gateway is now generally available. Try it for free, or browse the model library to see supported providers and chat with a model of your choice.
Through the AI Gateway, we want to help you build fast, stay reliable, and keep up with the rapid pace of AI innovation without being slowed down by infrastructure or locked into any one model or provider.
[1] 115B weekly requests served by Vercel: https://vercel.com/blog/gartner-mq-visionary-2025