Bot Management

Bots generate nearly half of all internet traffic. While many bots serve legitimate purposes like search engine crawling and content aggregation, others originate from malicious sources. Bot management encompasses both observing and controlling all bot traffic. A key component of this is bot protection, which focuses specifically on mitigating risks from automated threats that scrape content, attempt unauthorized logins, or overload servers.

Bot management systems analyze incoming traffic to identify and classify requests based on their source and intent. This includes:

  • Verifying and allowing legitimate bots that correctly identify themselves
  • Monitoring bot traffic patterns and resource consumption
  • Detecting and challenging suspicious traffic that behaves abnormally
  • Enforcing browser-like behavior by verifying navigation patterns and cache usage

To effectively manage bot traffic and protect against harmful bots, various techniques are used, including:

  • Signature-based detection: Inspecting HTTP requests for known bot signatures
  • Rate limiting: Restricting how often certain actions can be performed to prevent abuse
  • Challenges: Using JavaScript checks to verify human presence
  • Behavioral analysis: Detecting unusual patterns in user activity that suggest automation
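
As an illustration of the rate limiting technique listed above, here is a minimal sliding-window limiter sketch. It is a generic, in-memory example and does not represent how Vercel's WAF implements rate limiting. The class name `SlidingWindowLimiter`, its parameters, and the idea of keying by a client identifier such as an IP address are all assumptions made for this sketch.

```typescript
// Illustrative sliding-window rate limiter (hypothetical, in-memory only).
// Each key (for example a client IP) may perform at most `limit` actions
// within any rolling window of `windowMs` milliseconds.
class SlidingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly hits: Map<string, number[]> = new Map();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the action is allowed, false if the key is over its limit.
  // `nowMs` is injectable so the behavior can be tested deterministically.
  allow(key: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(key, recent);
    return true;
  }
}
```

A limiter created with `new SlidingWindowLimiter(2, 1000)` would allow two requests per key in any one-second window and reject further requests until older ones age out. A production system would also need shared state across servers and eviction of idle keys, which this sketch omits.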

With Vercel, you can use:

  • Managed rulesets to challenge specific bot traffic
  • Rate limiting and challenge actions with WAF custom rules to prevent bot activity from reaching your application
  • DDoS protection to defend your application against bot-driven attacks
  • Observability and Firewall to monitor bot patterns, traffic sources, and the effectiveness of your bot management strategies

Bot protection managed ruleset is available on all plans

With Vercel, you can use the bot protection managed ruleset to challenge non-browser traffic before it reaches your applications. It filters out automated threats while allowing legitimate traffic.

  • It identifies clients that violate browser-like behavior and serves them a JavaScript challenge.
  • It stops requests that falsely claim to come from a browser, such as a curl request identifying itself as Chrome.
  • It automatically excludes verified bots, such as Google's crawler, from evaluation.
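
To make the idea of a request "falsely claiming to be from a browser" more concrete, the sketch below checks whether a request whose User-Agent claims Chrome also carries headers that recent Chrome versions typically send on HTTPS requests. This is only an illustrative heuristic and is not Vercel's detection logic, which is not described here. The function names, the `HeaderMap` type, and the choice of headers are assumptions made for this sketch.

```typescript
// Header names are assumed to be lowercased before they reach this code.
type HeaderMap = Record<string, string | undefined>;

// True when the User-Agent string advertises a Chrome version token.
function claimsChrome(userAgent: string): boolean {
  return /\bChrome\/\d+/.test(userAgent);
}

// Hypothetical heuristic: a client that claims to be Chrome but omits headers
// that Chrome normally sends over HTTPS is flagged as inconsistent.
// The header list is an assumption for illustration, not an exhaustive rule.
function looksLikeSpoofedChrome(headers: HeaderMap): boolean {
  const userAgent = headers["user-agent"] ?? "";
  if (!claimsChrome(userAgent)) {
    return false; // The request makes no Chrome claim, so there is nothing to contradict.
  }
  const expected = ["accept-language", "sec-fetch-mode", "sec-ch-ua"];
  const missing = expected.filter((name) => !headers[name]);
  return missing.length > 0;
}
```

A default curl request with only its User-Agent changed to a Chrome string would be flagged by this check, because it would not send the other headers. A single heuristic like this is easy to evade, which is why a real system would combine many signals and follow up with a challenge.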

To learn more about how the ruleset works, review the Challenge section of Firewall actions. To understand the details of what gets logged and how to monitor your traffic, review Firewall Observability.

For trusted automated traffic, you can create custom WAF rules with bypass actions that will allow this traffic to skip the bot protection ruleset.

You can apply the ruleset to your project in log or challenge mode. Learn how to configure the bot protection managed ruleset.

Bot Protection does not work when a reverse proxy (e.g. Cloudflare, Azure, or other CDNs) is placed in front of your Vercel deployment. This setup significantly degrades detection accuracy and performance, leading to a suboptimal end-user experience.

Reverse proxies interfere with Vercel's ability to reliably identify bots:

  • Obscured detection signals: Legitimate users may be incorrectly challenged because the proxy masks signals that Bot Protection relies on.
  • Frequent re-challenges: Some proxies rotate their exit node IPs frequently, forcing Vercel to re-initiate the challenge on every IP change.

AI bots managed ruleset is available on all plans

Vercel's AI bots managed ruleset allows you to control traffic from AI bots that crawl your site for training data, search purposes, or user-initiated fetches.

  • It identifies and filters requests from known AI crawlers and bots.
  • It provides options to log or deny these requests based on your preferences.
  • The list of known AI bots is automatically maintained and updated by Vercel.

When new AI bots emerge, they are automatically added to Vercel's managed list and will be handled according to your existing configured action without requiring any changes on your part.

You can apply the ruleset to your project in log or deny mode. Learn how to configure the AI bots managed ruleset.

Vercel maintains a comprehensive directory of known legitimate bots from across the internet and updates it regularly to include new legitimate services as they emerge. Attack Challenge Mode and bot protection automatically recognize these bots and allow them through without a challenge. You can block some or all of these bots by writing WAF custom rules that use the User Agent match condition or the Signature-Agent header. To learn how to do this, review WAF Examples.

To prove that bots are legitimate and verify their claimed identity, several methods are used:

  • IP Address Verification: Checking if requests originate from known IP ranges owned by legitimate bot operators (e.g., Google's Googlebot, Bing's crawler).
  • Reverse DNS Lookup: Performing reverse DNS queries to verify that an IP address resolves back to the expected domain (e.g., an IP claiming to be Googlebot should resolve to *.googlebot.com or *.google.com).
  • Cryptographic Verification: Using digital signatures to authenticate bot requests through protocols like Web Bot Authentication, which employs HTTP Message Signatures (RFC 9421) to cryptographically verify automated requests.
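
The reverse DNS method above can be sketched as follows. This is a generic illustration, not Vercel's implementation. It adds a forward-confirmation step (resolving the returned hostname back to the original IP), which is a common companion to reverse lookups but is an assumption beyond what is described above. The `Resolver` interface and function names are hypothetical; the resolver is injected so the logic can run without network access.

```typescript
// Hypothetical resolver interface, injected so the check is testable offline.
type Resolver = {
  reverse(ip: string): Promise<string[]>; // PTR lookup: IP -> hostnames
  resolve(hostname: string): Promise<string[]>; // forward lookup: hostname -> IPs
};

// True when `hostname` is a subdomain of one of the allowed domains,
// e.g. "crawl-1.googlebot.com" matches "googlebot.com".
function hostnameMatches(hostname: string, allowedDomains: string[]): boolean {
  const host = hostname.toLowerCase().replace(/\.$/, ""); // drop a trailing dot
  return allowedDomains.some((domain) => host.endsWith("." + domain));
}

// Verifies a claimed bot IP: the reverse lookup must yield an allowed hostname,
// and that hostname must resolve forward to the same IP.
async function verifyBotIp(
  ip: string,
  allowedDomains: string[],
  resolver: Resolver,
): Promise<boolean> {
  let hostnames: string[];
  try {
    hostnames = await resolver.reverse(ip);
  } catch {
    return false; // No PTR record or lookup failure: treat as unverified.
  }
  for (const hostname of hostnames) {
    if (!hostnameMatches(hostname, allowedDomains)) continue;
    try {
      const addresses = await resolver.resolve(hostname);
      if (addresses.includes(ip)) return true;
    } catch {
      // Forward lookup failed for this hostname; try the next one.
    }
  }
  return false;
}
```

In Node.js, a resolver could plausibly be built from `dns.promises.reverse` and `dns.promises.resolve4` for IPv4 addresses. For a request claiming to be Googlebot, the allowed domains would be `["googlebot.com", "google.com"]`, matching the example in the list above.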

Submit a bot request if you are a SaaS provider and would like to be added to this list.

Bot name | Category | Description | Documentation
adagiobot | advertising | Adagiobot is a web crawler that analyzes websites for advertising demand optimization, helping publishers maximize revenue through real-time bidding analysis and performance insights. AdagioBot fetches /ads.txt, /app-ads.txt and /sellers.json files to comply with IAB Supply Chain Validation. | View
adidxbot | advertising | AdIdxBot is the crawler used by Bing Ads for quality control of ads and their destination websites. It has multiple user agent variants including desktop, iPhone, and Windows Phone versions. | View
adsbot-google | advertising | AdsBot-Google is Google's web crawler used for quality control of Google Ads. | View
adsense | advertising | The AdSense crawler visits participating sites in order to provide them with relevant ads. | View
adyen-webhook | webhook | Adyen’s webhooks (Notification API) send encrypted, real-time HTTP callbacks for key payment and account events—automating order fulfillment, settlement reconciliation, and risk-management workflows. | View
ahrefsbot | search_engine_optimization | Powers the database for both Ahrefs, a marketing intelligence platform, and Yep, an independent, privacy-focused search engine. | View
ahrefssiteaudit | search_engine_optimization | Powers Ahrefs’ Site Audit tool. Ahrefs users can use Site Audit to analyze websites and find both technical SEO and on-page SEO issues. | View
algolia | search_engine_crawler | The Algolia Crawler extracts content from your site and makes it searchable. | View
amazon-kendra | ai_assistant | Amazon Kendra is a managed information retrieval and intelligent search service that uses natural language processing and advanced deep learning model. | View
amazon-q | ai_assistant | Amazon Q Business is a generative artificial intelligence (generative AI)-powered assistant that you can tailor to your business needs. | View
amazonbot | ai_crawler | Amazonbot is Amazon's web crawler used to improve our services, such as enabling Alexa to more accurately answer questions for customers. | View
apis-google | search_engine_crawler | Crawling preferences addressed to the APIs-Google user agent affect the delivery of push notification messages by Google APIs. | View
apple-podcasts | feed_fetcher | Apple Podcasts crawler that only accesses URLs associated with registered content on Apple Podcasts. Does not follow robots.txt. | View
applebot | ai_crawler | Applebot powers search features in Apple's ecosystem (Spotlight, Siri, Safari) and may be used to train Apple's foundation models for generative AI features. | View
artemis-web-crawler | aggregator | Artemis is a calm web reader with which you can follow websites and blogs. | View
baiduspider | search_engine_crawler | Baiduspider is Baidu’s web crawler that indexes websites for inclusion in its Chinese-market search results. | View
barkrowler | search_engine_optimization | Barkrowler is Babbar's web crawler that fuels and updates their graph representation of the web, providing SEO tools for the marketing community. | View
better-stack | monitor | Better Stack is a platform for monitoring and alerting on your applications. | View
bingbot | search_engine_crawler | Bingbot is Microsoft's web crawler used for indexing websites for Bing Search. | View
blexbot | search_engine_optimization | BLEXBot is SE Ranking's web crawler that helps analyze websites for SEO purposes, including backlink analysis, rank tracking, and website auditing. The bot is part of SE Ranking's all-in-one SEO platform used by marketing professionals and agencies. | View
brightbot | monitor | Brightbot is Bright Data's crawler layer that monitors the health of websites and enforces ethical web data collection. It prevents access to non-public information and blocks interactive endpoints that could be abused, acting as a guardian for ethical data collection. | View
buffer-link-preview-bot | preview | Helps Buffer users create better social media posts by generating rich previews when they share links | View
ccbot | ai_crawler | CCBot is operated by the Common Crawl Foundation to crawl web content for AI training and research. Common Crawl is a non-profit organization that maintains an open repository of web crawl data that is universally accessible for research and analysis. | View
chatgpt-operator | ai_assistant | Handles user-initiated requests from ChatGPT operator accessing external content; not used for automated crawling or AI training. | View
chatgpt-user | ai_assistant | Handles user-initiated requests in ChatGPT, accessing external content to provide real-time information; not used for automated crawling or AI training. | View
checkly | monitor | Checkly is a platform for monitoring and alerting on your applications. | View
chrome-lighthouse | analytics | PageSpeed Insights (PSI) reports on the user experience of a page on both mobile and desktop devices, and provides suggestions on how that page may be improved. | View
chrome-privacy-preserving-prefetch-proxy | page_preview | Chrome's Privacy Preserving Prefetch Proxy service that fetches /.well-known/traffic-advice to enable privacy-preserving prefetch hints. | View
claude-searchbot | ai_assistant | Claude-SearchBot navigates the web to improve search result quality for users. It analyzes online content specifically to enhance the relevance and accuracy of search responses. | View
claude-user | ai_assistant | Claude-User supports Claude AI users. When individuals ask questions to Claude, it may access websites using a Claude-User agent. | View
claudebot | ai_crawler | ClaudeBot helps enhance the utility and safety of our generative AI models by collecting web content that could potentially contribute to their training. | View
cookiebot | monitor | Cookiebot automates compliance with cookie laws and helps you manage your cookie consent preferences. | View
criteobot | advertising | CriteoBot is a crawler operated by Criteo that analyzes web content to serve relevant contextual ads. The bot respects robots.txt directives and crawl delays, and only accesses publicly available content. | View
customerio-webhooks | webhook | Customer.io's webhook service for event-driven marketing automation and customer data platform. | View
datadog-synthetic-monitoring-robot | monitor | Datadog's automated monitoring service that performs synthetic tests to verify website availability and performance. | View
dataforseobot | search_engine_optimization | DataForSeoBot is a backlink checker bot operated by DataForSEO that crawls websites to build and maintain their backlink database. The bot respects robots.txt directives and crawl delays, and is used to provide SEO data and analytics services. | View
detectify | monitor | Detectify is a web security scanner that performs automated security tests on web applications and attack surface monitoring. | View
duckassistbot | ai_assistant | DuckAssistBot is a web crawler for DuckDuckGo Search that crawls pages in real-time for AI-assisted answers, which prominently cite their sources. This data is not used in any way to train AI models. | View
duckduckbot | search_engine_crawler | DuckDuckBot is a web crawler for DuckDuckGo. DuckDuckBot’s job is to constantly improve search results and offer users the best and most secure search experience possible. | View
facebook-webhooks | webhook | Facebook's webhook service that delivers real-time event notifications for Meta platform events and changes. | View
facebookexternalhit | preview | Fetches content for shared links on Meta platforms to generate rich previews. | View
feedfetcher | feed_fetcher | Feedfetcher is used for crawling RSS or Atom feeds for Google News and PubSubHubbub. | View
geedoproductsearchbot | ecommerce | GeedoProductSearch is a web crawler operated by Geedo SIA that indexes product information from e-commerce websites. The crawler respects robots.txt directives and can be configured for crawl speed and behavior through standard crawl-delay settings. | View
gemini-deep-research | ai_assistant | Gemini Deep Research is Google's AI-powered research tool that performs comprehensive multi-step research on complex topics, analyzing web content to provide detailed insights and answers. | View
github-camo | preview | GitHub's image proxy service | View
github-hookshot | webhook | GitHub's webhooks for events like push, pull request, etc. | View
google-cloudvertexbot | ai_assistant | Crawling preferences addressed to the Google-CloudVertexBot user agent affect crawls requested by the site owners' for building Vertex AI Agents. It has no effect on Google Search or other products. | View
google-extended | ai_crawler | Google-Extended is a standalone product token that web publishers can use to manage whether their sites help improve Gemini Apps and Vertex AI generative APIs, including future generations of models that power those products. Grounding with Google Search on Vertex AI does not use web pages for grounding that have disallowed Google-Extended. Google-Extended does not impact a site's inclusion or ranking in Google Search. | View
google-image-proxy | preview | Google's image caching proxy service used by Gmail and other Google services to cache and serve images. | View
google-inspectiontool | monitor | Crawling preferences addressed to the Google-InspectionTool user agent affect Search testing tools such as the Rich Result Test and URL inspection in Search Console. It has no effect on Google Search or other products. | View
google-pagerenderer | page_preview | Upon user request, Google Page Renderer fetches and renders web pages. | View
google-publisher-center | feed_fetcher | Google Publisher Center fetches and processes feeds that publishers explicitly supplied for use in Google News landing pages. | View
google-read-aloud | accessibility | Upon user request, Google Read Aloud fetches and reads out web pages using text-to-speech (TTS). | View
google-safety | monitor | The Google-Safety user agent handles abuse-specific crawling, such as malware discovery for publicly posted links on Google properties. As such it's unaffected by crawling preferences. | View
google-site-verifier | verification | Google Site Verifier fetches Search Console verification tokens. | View
google-storebot | ecommerce | Crawling preferences addressed to the Storebot-Google user agent affect all surfaces of Google Shopping (for example, the Shopping tab in Google Search and Google Shopping). | View
googlebot | search_engine_crawler | Crawling preferences addressed to the Googlebot user agent affect Google Search (including Discover and all Google Search features), as well as other products such as Google Images, Google Video, Google News, and Discover. | View
googleother | search_engine_crawler | Crawling preferences addressed to the GoogleOther user agent don't affect any specific product. GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development. It has no effect on Google Search or other products. | View
gpt-actions | ai_assistant | Enables ChatGPT to interact with external APIs and retrieve real-time information from the web in response to user-initiated requests; allows access to up-to-date content without being used for automated crawling or AI training. | View
gptbot | ai_crawler | Crawls web content to improve OpenAI's generative AI models; respects 'robots.txt' directives to exclude sites from training data. | View
hetrixtools-uptime-monitoring-bot | monitor | HetrixTools Uptime Monitoring Bot is used by HetrixTools's monitoring services to perform various checks on websites, including uptime and performance monitoring. | View
hookdeck | webhook | A reliable Event Gateway for event-driven applications | View
hydrozen | monitor | Hydrozen is a tool for monitoring availability of your websites, Cronjobs, APIs, Domains, SSL etc. | View
imagesiftbot | ai_crawler | ImageSiftBot is a web crawler that scrapes the internet for publicly available images to support Hive's suite of web intelligence products. | View
inngest | webhook | Inngest is a platform for building event-driven applications. | View
linkedinbot | preview | LinkedInBot is a bot that renders links shared on LinkedIn. | View
lumar | search_engine_optimization | The Lumar website intelligence platform is used by SEO, engineering, marketing and digital operations teams to monitor the performance of their site’s technical health, and ensure a high-performing, revenue-driving website. | View
meta-externalagent | ai_crawler | The Meta-ExternalAgent crawler crawls the web for use cases such as training AI models or improving products by indexing content directly. | View
meta-externalfetcher | user_initiated | The Meta-ExternalFetcher crawler performs user-initiated fetches of individual links to support specific product functions. Because the fetch was initiated by a user, this crawler may bypass robots.txt rules. | View
microsoftpreview | preview | MicrosoftPreview generates page snapshots for Microsoft products. It has desktop and mobile variants, with Chrome version dynamically updated to match the latest Microsoft Edge version. | View
momenticbot | user_initiated | Momentic is a AI-powered platform for software testing. It allows you to write reliable end-to-end tests for web apps in a simple and intuitive way using natural language. | View
adsnaver | search_engine_crawler | Naver's ad crawler that periodically visits registered ad landing pages to collect on-page content for effective ad matching and ranking. It ignores robots.txt for URLs registered in the ad system. | View
naver-blueno | preview | Naver's preview-snippet crawler that fetches summary information (titles, descriptions, images) when users insert links in Naver services such as blogs or cafés. It operates on demand and respects robots.txt. | View
naverbot | search_engine_crawler | Naver's web crawler (also known as Yeti) is used by Naver, South Korea's largest search engine, to crawl and index web content. | View
newrelic-minions | monitor | New Relic Synthetic monitoring infrastructure that performs API checks and virtual browser instances to monitor websites and applications from global locations | View
oai-searchbot | ai_assistant | Indexes websites for inclusion in ChatGPT's search results; does not crawl content for AI model training. | View
paypal | webhook | PayPal delivers real-time event notifications for payments, subscriptions, and account updates. | View
perplexity-user | ai_assistant | Handles user-initiated requests in Perplexity, accessing external content to provide real-time information; not used for automated crawling or AI training. | View
perplexitybot | ai_assistant | Indexes websites for inclusion in Perplexity's search results; does not crawl content for AI model training. | View
petalbot | search_engine_crawler | PetalBot is a web crawler operated by Huawei's Petal Search engine. It crawls both PC and mobile websites to build an index database for Petal search engine and to provide content recommendations for Huawei Assistant and AI Search services. | View
pingdom-bot | monitor | Pingdom Bot is used by Pingdom's monitoring services to perform various checks on websites, including uptime and performance monitoring. | View
pinterest-bot | aggregator | Pinterest's web crawler that indexes content for their platform. It crawls websites to collect metadata for Pins, including images, titles, descriptions, and prices. The crawler also helps maintain Pin data accuracy and detect broken links. | View
pulsepoint-crawler | advertising | A web crawler used by PulsePoint, a digital advertising technology company, for content indexing and ads.txt verification. | View
qatech | monitor | The QA.tech web agent browses the website and identifies potential test cases, and executes tests against a web application | View
qstash | webhook | QStash is a platform for building event-driven applications. | View
razorpay-webhook | webhook | Razorpay’s webhooks enable merchants to receive secure, real-time HTTP callbacks for key payment events—automating reconciliation, notifications, and downstream workflows. | View
amazon-route-53-health-check-service | monitor | Amazon Route 53 Health Check Service | View
sanity-webhooks | webhook | Sanity's webhook service that delivers real-time event notifications for content changes and other events. | View
seekportbot | search_engine_crawler | SeekportBot is the web crawler for Seekport, a German search engine operated by SISTRIX. The bot crawls and indexes web content while respecting robots.txt directives and crawl delays. | View
semrush-site-audit | search_engine_optimization | Semrush Site Audit is a powerful website crawler that analyzes the health of a website by checking for on-page and technical SEO issues, including duplicate content, broken links, HTTPS implementation, hreflang attributes, and more. | View
semrush | search_engine_optimization | Semrush is a platform for SEO, content marketing, competitor research, PPC and social media marketing. | View
sentry-uptime-monitoring-bot | monitor | Sentry's Uptime Monitoring Bot performs health checks on configured URLs to monitor the availability and reliability of web services. | View
seobility | search_engine_crawler | Seobility is a browser-based online SEO software that helps you improve your website’s search engine rankings. | View
seznambot | search_engine_crawler | SeznamBot is the web crawler operated by Seznam.cz, the leading Czech search engine. The bot crawls and indexes web content for Seznam's search results, respecting robots.txt directives and crawl delays. | View
site24x7 | monitor | Site24x7 Bot is used by Site24x7's monitoring services to perform various checks on websites, including uptime and performance monitoring. | View
statuscake | monitor | StatusCake is a website monitoring service that checks the uptime and performance of your website. | View
stripe-webhooks | webhook | Stripe's webhook service that delivers real-time event notifications for payment processing and account updates. | View
svix | webhook | svix is a webhook service for sending events to webhooks. | View
twitterbot | preview | Fetches content for shared links on X/Twitter to generate rich previews. | View
uptime-robot | monitor | Uptime Robot is a platform for monitoring and alerting on your applications. | View
v0bot | ai_crawler | Bot for v0 services. | View
vercel-favicon-bot | preview | Vercel Favicon Bot | View
vercelflags | monitor | vercel flags | View
vercel-screenshot-bot | preview | Vercel Screenshot Bot | View
verceltracing | monitor | vercel tracing | View
yahoo-ad-monitoring | advertising | Yahoo Ad Monitoring crawls landing pages of URLs listed with Yahoo advertising services to analyze content quality, ensure ad relevance, and improve user experience by maintaining accurate ad listings. | View
yahoo-slurp | search_engine_crawler | Yahoo! Slurp is the web crawler (robot) used by Yahoo! Search to discover and index web pages for its search engine. | View
yandexbot | search_engine_crawler | YandexBot is a web crawler operated by Yandex, a major Russian search engine. | View
Last updated on June 9, 2025