Robots.txt Generator

Create robots.txt files to control how search engines crawl and index your website.

Robot Settings

Enter one path per line

Optional delay between requests

Add custom rules for specific bots

Important Notes:

  • robots.txt must be placed in your website's root directory
  • Rules apply to all search engines unless specified otherwise
  • Disallow rules are suggestions, not enforced security
  • Test your robots.txt with Google Search Console

Generated robots.txt

# Configure settings to generate robots.txt

Common User-Agents:

  • * - All robots
  • Googlebot - Google search crawler
  • Bingbot - Bing search crawler
  • facebookexternalhit - Facebook crawler
  • Twitterbot - Twitter crawler

Frequently Asked Questions

What is a robots.txt file?

A robots.txt file is a text file placed in the root directory of a website that tells search engine crawlers which pages or sections of your site they may or may not crawl. It's part of the Robots Exclusion Protocol (standardized as RFC 9309) and helps control how search engines interact with your website content. Note that it governs crawling rather than indexing: a blocked page can still appear in search results if other sites link to it.
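A minimal robots.txt might look like this (the /private/ path is illustrative; substitute your own directories and domain):

```text
# Apply the rules below to all crawlers
User-agent: *
# Ask crawlers not to fetch anything under /private/
Disallow: /private/
# Tell crawlers where to find the sitemap
Sitemap: https://example.com/sitemap.xml
```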

How do I use the robots.txt generator?

Select your default behavior (allow all or block all), add specific paths to disallow, enter your sitemap URL, optionally set a crawl delay, and add any custom rules for specific bots. The tool generates a properly formatted robots.txt file that you can download or copy to your website's root directory.

Where should I place the robots.txt file?

The robots.txt file must be placed in the root directory of your website. For example, if your website is https://example.com, the robots.txt file should be accessible at https://example.com/robots.txt. It won't work if placed in a subdirectory.

What does 'User-agent: *' mean?

User-agent: * means the rules apply to all web crawlers and search engine bots. You can also specify rules for specific bots like 'User-agent: Googlebot' for Google's crawler or 'User-agent: Bingbot' for Bing's crawler to apply different rules to different search engines.
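Rules are grouped by User-agent line, so different bots can receive different instructions. A sketch with hypothetical paths:

```text
# Default rules for every crawler
User-agent: *
Disallow: /tmp/

# Stricter rules just for Google's crawler
User-agent: Googlebot
Disallow: /tmp/
Disallow: /drafts/
```

A crawler follows the most specific group that matches its name and falls back to the * group otherwise.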

What's the difference between Allow and Disallow?

Disallow tells search engine crawlers not to access specific paths or pages. Allow explicitly permits access to paths that might otherwise be blocked by a broader Disallow rule. For example, you might disallow /admin/ but allow /admin/public/ for public administrative resources.
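The /admin/ example above would be written like this:

```text
User-agent: *
# Block the admin area...
Disallow: /admin/
# ...but carve out an exception for its public resources
Allow: /admin/public/
```

Most major crawlers resolve conflicts by preferring the more specific (longer) matching rule, so /admin/public/ stays crawlable.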

Should I include my sitemap URL in robots.txt?

Yes, including your sitemap URL in robots.txt helps search engines discover and crawl your sitemap more easily. Use the format 'Sitemap: https://example.com/sitemap.xml'. You can include multiple sitemap URLs if you have several sitemaps for different sections of your site.
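Sitemap lines take absolute URLs and can be repeated; for example, with two hypothetical sitemaps:

```text
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog/sitemap.xml
```

Sitemap directives are independent of any User-agent group, so they can appear anywhere in the file.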

What is crawl delay and should I use it?

Crawl delay specifies the number of seconds a crawler should wait between requests to your server. While it can help prevent server overload, most major search engines like Google ignore this directive. It's more useful for smaller sites or when dealing with aggressive crawlers that might burden your server.
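For crawlers that honor it (Google does not), the directive is a single line inside a User-agent group. A 10-second delay is shown here purely as an example value:

```text
User-agent: *
# Ask supporting crawlers to wait 10 seconds between requests
Crawl-delay: 10
Disallow: /search/
```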

Can robots.txt block all access to my site?

Robots.txt only provides suggestions to well-behaved crawlers; it's not a security mechanism. While most legitimate search engines respect robots.txt directives, malicious bots may ignore them. To truly protect content, use password protection, authentication, or server-level access controls instead.
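The closest robots.txt comes to "blocking everything" is disallowing the root path, which well-behaved crawlers will honor:

```text
# Ask all crawlers to stay out of the entire site
User-agent: *
Disallow: /
```

As noted above, this is a request, not access control; anything genuinely private should sit behind authentication.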