A robots.txt file is a plain-text file placed in the root of your website (e.g. example.com/robots.txt). It tells search engine crawlers which pages or sections of your site they are allowed or not allowed to access.
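A minimal robots.txt illustrating the idea (the /admin/ path is just an example, not a required convention):

```
User-agent: *
Disallow: /admin/
```

This tells every crawler that reads the file to stay out of /admin/ while leaving the rest of the site open.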
Disallow prevents compliant crawlers such as Googlebot from fetching a URL, but it does not guarantee the page will be removed from search results. If another site links to a blocked page, Google may still index the URL itself, usually without a description, since it cannot read the page content. To fully remove a page from results, leave it crawlable and add a noindex meta tag (or an X-Robots-Tag HTTP header) — if the page is blocked in robots.txt, the crawler can never see the noindex directive.
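A sketch of the noindex approach described above (goes in the HTML of the page you want removed):

```html
<!-- Place in the <head> of the page. The page must NOT be Disallowed
     in robots.txt, or crawlers will never fetch it and see this tag. -->
<meta name="robots" content="noindex">
```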
Disallow tells a crawler not to visit the specified path. Allow explicitly grants access and is used to override a broader Disallow rule for a specific sub-path.
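For example, an Allow rule can carve an exception out of a blocked directory (paths here are illustrative):

```
User-agent: *
Disallow: /private/
Allow: /private/public-report.html
```

Everything under /private/ is off limits except the one explicitly allowed file.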
The asterisk (*) in a User-agent line is a wildcard that matches all crawlers. Rules under User-agent: * apply to any bot that does not find a more specific block addressed to it by name — crawlers obey only the most specific group that matches them.
The file must be placed in the root directory of your website — the same level as your homepage. For most sites the URL is https://yourdomain.com/robots.txt.
Yes. Adding a Sitemap: directive to your robots.txt lets any crawler discover your sitemap — and through it your URLs — even if you never submitted it via Search Console. The directive takes a full absolute URL and can appear anywhere in the file.
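The directive is a single line; the sitemap URL below is a placeholder for your own:

```
Sitemap: https://yourdomain.com/sitemap.xml
```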
Crawl-delay tells a bot to wait a set number of seconds between requests. This prevents aggressive crawling from overloading your server. Note: Googlebot ignores Crawl-delay — manage Google's crawl rate through Search Console instead.
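A sketch of a Crawl-delay rule aimed at a bot that honors it (Bing's crawler is one that has historically supported the directive):

```
User-agent: Bingbot
Crawl-delay: 10
```

This asks Bingbot to pause roughly ten seconds between requests; bots that ignore the directive, like Googlebot, will skip this line entirely.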
Yes. You can define separate rule blocks for different bots. For example, you can allow Googlebot everywhere while blocking a specific scraper bot entirely.
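For instance, the following sketch allows Googlebot everywhere while shutting out a hypothetical scraper (BadScraperBot is a placeholder name, not a real crawler):

```
# Googlebot: no restrictions (an empty Disallow permits everything)
User-agent: Googlebot
Disallow:

# Hypothetical scraper: blocked from the entire site
User-agent: BadScraperBot
Disallow: /
```

Each User-agent block stands alone; a bot follows the block that names it and ignores the rest.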
Yes, indirectly. Accidentally blocking important pages from crawling can prevent them from appearing in search results. Always review your robots.txt before uploading it to a live site.
Yes, the robots.txt generator is completely free. You can create and download as many files as you need with no account or subscription required.
No. This tool runs entirely in your browser using JavaScript. Nothing you configure is transmitted to SosialHits or any third party.