Create a robots.txt File Online
robots.txt tells search engine crawlers which pages and directories to crawl or ignore. A misconfigured robots.txt can accidentally block your entire site from Google — here's how to do it right.
Key rules to know
- User-agent: * — applies to all crawlers
- Disallow: /admin/ — blocks crawling of /admin/ and everything under it, for the crawlers named in the User-agent line above it
- Allow: / — explicitly allows crawling from the root down; Allow is mostly used to carve out an exception inside a broader Disallow rule
- Sitemap: https://yoursite.com/sitemap.xml — tells crawlers where your sitemap is; use the full absolute URL
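- Crawl-delay: is ignored by Googlebot, though some other crawlers (Bingbot, for example) honor it. Google retired the Search Console crawl-rate setting in early 2024, so that is no longer an alternative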
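As a rough sketch, the rules above can be combined into one file and sanity-checked with Python's standard-library urllib.robotparser. The domain yoursite.com, the /admin/public/ exception, the delay of 10, and the bot name ExampleBot are all placeholders chosen for illustration. This parser applies the first matching rule in file order, which can differ from Googlebot's matching in edge cases, so treat it as a quick check rather than a definitive test.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt (every path and URL here is a placeholder).
ROBOTS_TXT = """\
User-agent: *
Allow: /admin/public/
Disallow: /admin/
Crawl-delay: 10

Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical crawler name; it falls under "User-agent: *".
admin_allowed = parser.can_fetch("ExampleBot", "https://yoursite.com/admin/users")
public_allowed = parser.can_fetch("ExampleBot", "https://yoursite.com/admin/public/help")
blog_allowed = parser.can_fetch("ExampleBot", "https://yoursite.com/blog/post")
delay = parser.crawl_delay("ExampleBot")
sitemaps = parser.site_maps()  # available in Python 3.8+

print(admin_allowed, public_allowed, blog_allowed, delay, sitemaps)
```

The Allow line is listed before the Disallow line so that a first-match parser like this one and a most-specific-match crawler reach the same answer.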
Common mistakes to avoid
- Disallow: / — blocks your ENTIRE site. Don't do this accidentally
- Blocking CSS and JS files — Google needs them to render your pages
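- Adding noindex here — robots.txt and noindex are different mechanisms. Google does not support a noindex rule in robots.txt, so put noindex in a meta robots tag or an X-Robots-Tag header on the page itself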
- Forgetting to declare your sitemap URL
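The mistakes above are easy to catch with a small script. Here is a minimal sketch of a checker, assuming a simple "field: value" line format. The function name lint_robots_txt and its warning messages are made up for this example, and the checks are heuristics: a Disallow: / aimed at one unwanted bot may be exactly what you intend.

```python
def lint_robots_txt(text: str) -> list[str]:
    """Return warnings for a few common robots.txt mistakes (heuristic only)."""
    warnings = []
    has_sitemap = False
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "sitemap":
            has_sitemap = True
        elif field == "noindex":
            warnings.append(f"'{line}': noindex does not belong in robots.txt")
        elif field == "disallow":
            target = value.lower().rstrip("$")  # ignore a trailing end-of-URL anchor
            if value == "/":
                warnings.append("'Disallow: /' blocks the entire site for that group")
            elif target.endswith((".css", ".js")):
                warnings.append(f"'{line}': blocks CSS or JS needed for rendering")
    if not has_sitemap:
        warnings.append("no Sitemap line found")
    return warnings


for warning in lint_robots_txt("User-agent: *\nDisallow: /\n"):
    print(warning)
```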
Frequently Asked Questions
Does robots.txt prevent pages from being indexed?
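Not reliably. Disallow stops crawling, not indexing. Google can index a URL it has never crawled if it finds links to it. To keep a page out of the index, use a noindex meta tag and leave the page crawlable: a crawler blocked by robots.txt never fetches the page, so it never sees the tag.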
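One practical consequence: a page that carries noindex should not also be disallowed. A small sketch of that check, using Python's urllib.robotparser, might look like this. The helper name blocked_noindex_pages and the example URLs are hypothetical, and you supply the list of noindex pages yourself.

```python
from urllib.robotparser import RobotFileParser


def blocked_noindex_pages(robots_txt: str, noindex_urls: list[str],
                          user_agent: str = "ExampleBot") -> list[str]:
    """Return the noindex URLs that robots.txt stops the crawler from fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


robots = "User-agent: *\nDisallow: /private/\n"
pages = ["https://yoursite.com/private/report", "https://yoursite.com/thanks"]
conflicts = blocked_noindex_pages(robots, pages)
print(conflicts)
```

Any URL this returns is one where the noindex tag cannot be read, so either the Disallow rule or the noindex tag needs to change.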
Where does robots.txt go?
Exactly at yourdomain.com/robots.txt. It must sit at the root of the host: a file in a subdirectory is ignored, and each subdomain needs its own robots.txt.
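Because the file is tied to the scheme and host, its location can be worked out from any page URL. A minimal sketch using Python's urllib.parse follows; the function name robots_txt_url and the example URL are illustrative only.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); replace path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


location = robots_txt_url("https://blog.yourdomain.com/posts/hello?page=2")
print(location)
```

Note that the subdomain is preserved, so a page on blog.yourdomain.com is governed by the robots.txt on blog.yourdomain.com, not the one on yourdomain.com.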
