Request Collapsing
Vercel uses request collapsing to protect uncached routes during high traffic. It reduces duplicate work by combining concurrent requests into a single function invocation within the same region. This feature is especially valuable for high-scale applications.
When a request for an uncached path arrives, Vercel invokes the origin function and stores the response in the cache. In most cases, any following requests are served from this cached response.
However, if multiple requests arrive while the initial function is still processing, the cache is still empty. Instead of triggering additional invocations, Vercel's CDN collapses these concurrent requests into the original one. They wait for the first response to complete, then all receive the same result.
This prevents overwhelming the origin with duplicate work during traffic spikes and helps ensure faster, more stable performance.
Suppose a new blog post is published and receives 1,000 requests at once. Without request collapsing, each request would trigger a separate function invocation. This is known as a cache stampede, and it could overload the backend and slow down responses.
With request collapsing, Vercel invokes the function once for the first request and holds the remaining 999 requests until that response is ready. Once the response is cached, it is sent to all users who requested the post.
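The blog post scenario can be sketched as a small simulation. This is a toy, in-process model of the general pattern (often called single-flight), not Vercel's implementation: the `CollapsingCache` class, the `render_post` function, and the `/blog/new-post` path are hypothetical names chosen for illustration, and a 50 ms sleep stands in for the function's work.

```python
import asyncio


class CollapsingCache:
    """Toy in-memory cache that can collapse concurrent misses per path.

    Illustrative only: this is not how Vercel's CDN is implemented.
    """

    def __init__(self, origin, collapse=True):
        self._origin = origin      # async callable: path -> response
        self._collapse = collapse  # toggle to compare both behaviours
        self._cache = {}           # path -> cached response
        self._in_flight = {}       # path -> Task still producing the response

    async def get(self, path):
        # Cache hit: serve the stored response.
        if path in self._cache:
            return self._cache[path]

        # No collapsing: every concurrent miss invokes the origin itself.
        if not self._collapse:
            response = await self._origin(path)
            self._cache[path] = response
            return response

        # Collapsing: the first miss starts the invocation, later misses
        # for the same path wait on that same task.
        task = self._in_flight.get(path)
        if task is None:
            task = asyncio.create_task(self._fill(path))
            self._in_flight[path] = task
        return await task

    async def _fill(self, path):
        try:
            response = await self._origin(path)
            self._cache[path] = response
            return response
        finally:
            # Later requests are served from the cache, or retry on failure.
            self._in_flight.pop(path, None)


async def simulate(collapse, requests=1000):
    """Send `requests` concurrent requests for one path and count invocations."""
    invocations = 0

    async def render_post(path):  # hypothetical origin function
        nonlocal invocations
        invocations += 1
        await asyncio.sleep(0.05)  # pretend rendering takes 50 ms
        return f"<html>{path}</html>"

    cache = CollapsingCache(render_post, collapse=collapse)
    responses = await asyncio.gather(
        *(cache.get("/blog/new-post") for _ in range(requests))
    )
    return invocations, responses


if __name__ == "__main__":
    for collapse in (False, True):
        invocations, responses = asyncio.run(simulate(collapse))
        print(f"collapse={collapse}: {invocations} invocations, "
              f"{len(responses)} responses")
```

In this model, running with `collapse=False` invokes `render_post` once per request (1,000 times), while `collapse=True` invokes it once and hands the same response to every waiter. If the origin call raises, every waiting request sees the same exception and the next request retries. Whether a real CDN behaves the same way on errors is outside what this sketch shows.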
Request collapsing is supported for: