Create robots.txt files to control how search engines crawl and index your website.
Enter one path per line
Optional delay between requests
Add custom rules for specific bots
# Configure settings to generate robots.txt
* - All robots
Googlebot - Google search crawler
Bingbot - Bing search crawler
facebookexternalhit - Facebook crawler
Twitterbot - Twitter crawler
A robots.txt file is a plain text file placed in the root directory of a website that tells search engine crawlers which pages or sections of your site they may or may not access and index. It is part of the Robots Exclusion Protocol and helps control how search engines interact with your website content.
Select your default behavior (allow all or block all), add specific paths to disallow, enter your sitemap URL, optionally set a crawl delay, and add any custom rules for specific bots. The tool generates a properly formatted robots.txt file that you can download or copy to your website's root directory.
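A generated file combining these settings might look like the following sketch (the paths and sitemap URL are illustrative, not part of the tool's defaults):

```
User-agent: *
Disallow: /admin/
Disallow: /private/
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
```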
The robots.txt file must be placed in the root directory of your website. For example, if your website is https://example.com, the robots.txt file should be accessible at https://example.com/robots.txt. It won't work if placed in a subdirectory.
User-agent: * means the rules apply to all web crawlers and search engine bots. You can also specify rules for specific bots like 'User-agent: Googlebot' for Google's crawler or 'User-agent: Bingbot' for Bing's crawler to apply different rules to different search engines.
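For instance, a single file can give one crawler different access than all others; each User-agent group gets its own rules (the paths here are illustrative):

```
User-agent: Googlebot
Disallow: /search/

User-agent: *
Disallow: /
```

In this sketch, Googlebot is blocked only from /search/, while every other crawler is blocked from the entire site.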
Disallow tells search engine crawlers not to access specific paths or pages. Allow explicitly permits access to paths that might otherwise be blocked by a broader Disallow rule. For example, you might disallow /admin/ but allow /admin/public/ for public administrative resources.
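The /admin/ example above can be written as:

```
User-agent: *
Disallow: /admin/
Allow: /admin/public/
```

Major crawlers such as Googlebot resolve conflicts by applying the most specific (longest) matching rule, so /admin/public/ remains crawlable even though /admin/ is blocked.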
Yes, including your sitemap URL in robots.txt helps search engines discover and crawl your sitemap more easily. Use the format 'Sitemap: https://example.com/sitemap.xml'. You can include multiple sitemap URLs if you have several sitemaps for different sections of your site.
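With multiple sitemaps, each gets its own line (URLs are illustrative); Sitemap directives stand alone and are not tied to any User-agent group:

```
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog/sitemap.xml
```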
Crawl delay specifies the number of seconds a crawler should wait between requests to your server. While it can help prevent server overload, some major crawlers ignore the directive: Googlebot does not honor it, although Bingbot does. It's most useful for smaller sites or when dealing with aggressive crawlers that might burden your server.
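Because support varies by crawler, a crawl delay is often set for a specific bot rather than globally (the value here is illustrative):

```
User-agent: Bingbot
Crawl-delay: 10
```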
Robots.txt only provides suggestions to well-behaved crawlers; it's not a security mechanism. While most legitimate search engines respect robots.txt directives, malicious bots may ignore them. To truly protect content, use password protection, authentication, or server-level access controls instead.
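Well-behaved crawlers check robots.txt before fetching a URL. You can simulate that check yourself with Python's standard-library parser; this sketch uses a hypothetical rule set (note that Python's parser applies the first matching rule, so the Allow line is listed before the broader Disallow):

```python
from urllib import robotparser

# Hypothetical robots.txt content for illustration
rules = """
User-agent: *
Allow: /admin/public/
Disallow: /admin/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Ask whether a crawler identified as "*" may fetch each URL
print(rp.can_fetch("*", "https://example.com/admin/secret.html"))     # False
print(rp.can_fetch("*", "https://example.com/admin/public/faq.html")) # True
print(rp.can_fetch("*", "https://example.com/index.html"))            # True
```

This is exactly the check a polite crawler performs, and exactly the check a malicious bot can skip, which is why robots.txt is not an access control.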