robots.txt
Scan your website to see how ready it is for AI agents. We check multiple emerging standards — from robots.txt and Markdown negotiation to MCP, OAuth, Agent Skills and agentic commerce.
Below is a manually curated and researched list of user agents I have come across. It's striking how many of the bots active today flat out do not respect robots.txt settings, or claim to respect them but ignore them anyway. This list is updated regularly, whenever I spot new user agents and look into their behavior. There is no JavaScript here, no fancy search.
A list of AI agents and robots to block.
This list contains AI-related crawlers of all types, regardless of purpose. We encourage you to contribute to and implement this list on your own site. See information about the listed crawlers and the FAQ.
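As a rough illustration of what implementing such a list on your own site could look like, the sketch below turns a list of user-agent tokens into robots.txt rules and checks the result with Python's standard `urllib.robotparser`. The tokens `GPTBot` and `CCBot` are only illustrative picks, not taken from the list itself, and the helper names `build_robots_txt` and `is_allowed` are hypothetical. robots.txt is advisory, so this only affects crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_agents):
    """Return robots.txt text that disallows every path for each listed agent.

    All other crawlers fall through to a permissive `User-agent: *` group.
    """
    lines = []
    for agent in blocked_agents:
        lines.append(f"User-agent: {agent}")
        lines.append("Disallow: /")
        lines.append("")  # blank line separates groups
    lines.append("User-agent: *")
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, path="/"):
    """Check, per the standard-library parser, whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Illustrative tokens only; substitute the agents from the list you follow.
    rules = build_robots_txt(["GPTBot", "CCBot"])
    print(rules)
    print(is_allowed(rules, "GPTBot", "/article"))
    print(is_allowed(rules, "SomeOtherBot", "/article"))
```

Keeping the blocked tokens in a plain list makes it easy to regenerate the file whenever the upstream list changes. Crawlers that ignore robots.txt need to be handled at the server or firewall level instead.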
The open content licensing standard for the AI-first Internet
Really Simple Licensing (RSL) is an evolution of the early ideas behind the widely adopted RSS standard, which provided a machine-readable framework for publishers to syndicate content to third-party clients and crawlers in exchange for traffic.