Free Robots.txt Generator
Build a valid robots.txt file with a guided form. Select user-agents, configure allow/disallow rules, add your sitemap URL, and set crawl-delay — then copy or download the file, ready for your website.
Valid Syntax
Outputs correctly formatted robots.txt every time
Live Preview
See the robots.txt update in real time as you configure
Copy & Download
Copy to clipboard or download the file directly
Quick Presets
One-click presets for WordPress, Next.js, and more
How to Use Your robots.txt File
- Configure your user-agent groups, rules, and sitemap URL above.
- Click “Copy to Clipboard” or “Download” to get the generated file.
- Upload the robots.txt file to the root of your website (e.g. https://example.com/robots.txt).
- Verify it is accessible at /robots.txt and test with Google Search Console.
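A minimal generated file, following the steps above, might look like this (the paths and domain are placeholders):

```txt
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
```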
Why Robots.txt Matters for SEO
A properly configured robots.txt file controls how search engines crawl your site, protects sensitive areas, and ensures your crawl budget is spent on the pages that matter most.
Crawl Budget Benefits
- Block low-value pages (admin, staging, search results) so crawlers focus on your important content
- Prevent duplicate content issues by blocking parameterised URLs and internal search pages
- Point crawlers to your sitemap for faster discovery of new and updated pages
- Control AI bot access to protect your content from unauthorised training data usage
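A crawl-budget-focused configuration along these lines might look like the following (the paths and query parameter are illustrative):

```txt
User-agent: *
# Keep crawlers out of admin and internal search results
Disallow: /admin/
Disallow: /search
# Block parameterised duplicate URLs (the * wildcard is
# supported by major crawlers such as Googlebot and Bingbot)
Disallow: /*?sort=

Sitemap: https://example.com/sitemap.xml
```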
Implementation Best Practices
- Always place robots.txt at your domain root so crawlers can find it
- Use noindex meta tags instead of robots.txt if you need to prevent indexing entirely — and leave the page crawlable, or Google will never see the noindex
- Test your robots.txt with Google Search Console before deploying changes
- Never block CSS or JavaScript files that Google needs to render your pages
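For instance, if static assets sit under a blocked directory, more specific Allow rules can re-open the CSS and JavaScript Google needs for rendering (paths are illustrative; the * wildcard is supported by Google and Bing):

```txt
User-agent: *
Disallow: /assets/
# Re-allow render-critical assets inside the blocked directory
Allow: /assets/*.css
Allow: /assets/*.js
```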
Frequently Asked Questions
Everything you need to know about robots.txt files, crawler directives, and crawl management
What is a robots.txt file?
A robots.txt file is a plain text file placed at the root of a website that tells search engine crawlers which pages or sections of the site they are allowed or not allowed to access. It follows the Robots Exclusion Protocol and is the first file crawlers check before indexing a site. Getting this right is a fundamental part of any technical SEO strategy.
Where should I put my robots.txt file?
The robots.txt file must be placed at the root directory of your website so it is accessible at https://yourdomain.com/robots.txt. If it is placed in a subdirectory or has a different filename, search engine crawlers will not find or respect it. For subdomains, each subdomain needs its own robots.txt file.
Does robots.txt block pages from appearing in Google?
Not exactly. A Disallow directive in robots.txt prevents crawlers from accessing the page, but Google may still index the URL if other pages link to it — it will just appear without a description snippet. To fully prevent indexing, use a noindex meta tag or X-Robots-Tag HTTP header instead.
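Either alternative mentioned above is applied per page rather than in robots.txt (these snippets are illustrative; the page must remain crawlable so the directive can be seen):

```txt
<!-- In the page's <head> -->
<meta name="robots" content="noindex">

# Or sent as an HTTP response header
X-Robots-Tag: noindex
```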
What is the difference between Allow and Disallow in robots.txt?
The Disallow directive tells crawlers not to access a specific path or directory. The Allow directive overrides a Disallow for a more specific path. For example, you can use Disallow: /private/ but Allow: /private/public-page to let crawlers access just that one page within a blocked directory.
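The example above, written out as a complete rule group:

```txt
User-agent: *
Disallow: /private/
Allow: /private/public-page
```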
What is crawl-delay in robots.txt?
The Crawl-delay directive tells crawlers how many seconds to wait between requests. It is supported by Bing, Yandex, and some other crawlers, but Google does not honour it — Google uses its own crawl rate settings in Search Console. A crawl-delay of 10 means the bot should wait 10 seconds between each page request.
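For example, to ask Bing's crawler to wait 10 seconds between requests (Bingbot honours this directive; Googlebot ignores it):

```txt
User-agent: Bingbot
Crawl-delay: 10
```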
Should I include my sitemap URL in robots.txt?
Yes. Adding a Sitemap directive to your robots.txt file helps search engines discover your XML sitemap more quickly. While search engines can find sitemaps through other means (like Search Console), including it in robots.txt is a widely recommended best practice that ensures all compliant crawlers can locate your sitemap.
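The directive is a single standalone line: it sits outside any User-agent group and must use an absolute URL (the domain here is a placeholder):

```txt
Sitemap: https://example.com/sitemap.xml
```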
Can I use robots.txt to block AI crawlers like GPTBot?
Yes. Many AI companies have published their crawler user-agent names (such as GPTBot for OpenAI, Google-Extended for Google AI training, and CCBot for Common Crawl). You can add a User-agent block for each and set Disallow: / to prevent them from crawling your content. However, compliance depends on the crawler respecting the Robots Exclusion Protocol.
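A set of groups like the following, using the user-agents named above, blocks those AI crawlers site-wide:

```txt
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```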
How do I test if my robots.txt is working correctly?
You can test your robots.txt with the robots.txt report in Google Search Console (which replaced the legacy robots.txt Tester), confirming that Google can fetch the file and which rules it parsed, and use the URL Inspection tool to check individual URLs. You can also verify it by visiting yourdomain.com/robots.txt directly in a browser to confirm the file is accessible and formatted correctly.
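You can also sanity-check rules locally before deploying, for example with Python's standard-library robots.txt parser (the rules and URLs below are illustrative):

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules; a real crawler fetches /robots.txt itself.
# Allow is listed before Disallow so that first-match parsers
# (like Python's) and longest-match crawlers (like Googlebot) agree.
rules = """\
User-agent: *
Allow: /admin/help
Disallow: /admin/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Ordinary crawlers: blocked from /admin/ except the allowed page
print(parser.can_fetch("*", "https://example.com/blog/post"))    # True
print(parser.can_fetch("*", "https://example.com/admin/users"))  # False
print(parser.can_fetch("*", "https://example.com/admin/help"))   # True

# GPTBot: blocked site-wide by its own group
print(parser.can_fetch("GPTBot", "https://example.com/blog/post"))  # False
```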
Related Resources
More tools and services to improve your technical SEO
SEO Services
Technical SEO, crawl optimisation, and content strategy for B2B companies.
Explore Services →

Schema Generator
Generate structured data markup for articles, products, and more.
Try Tool →

SERP Preview Tool
Preview how your title tags and meta descriptions appear in Google search results.
Try Tool →

FAQ Schema Generator
Generate valid FAQ structured data in JSON-LD for Google rich results.
Try Tool →

Need Help with Technical SEO and Crawl Management?
Robots.txt is just one piece of the technical SEO puzzle. Our experts can audit your crawl configuration, fix indexing issues, and build a strategy that ensures search engines find and rank your most important pages.