Common search engine FAQs answered by experts

What is robots.txt used for?

Robots.txt is used to give search engine crawlers instructions about which parts of your website they should or should not request.

For business owners and in-house marketers, robots.txt matters because crawl access can affect how quickly search engines find your pages, understand your site, and update search results. It does not directly create rankings, calls, form submissions, bookings, or sales, but a wrong rule can hide the pages that bring those results. A dental office, law firm, pest control company, or lawn care business can have strong service pages and still lose search visibility if crawlers are blocked from the wrong folder, script, or page type.

A robots.txt file lives at the root of your domain, for example yourdomain.com/robots.txt, and each subdomain needs its own file. Well-behaved search engine crawlers check it before crawling your site. The file can tell crawlers not to request admin areas, internal search results, duplicate filtered pages, staging folders, or other low-value URLs. This helps crawlers spend more time on the pages that matter: service pages, location pages, blog answers, proof pages, and contact paths.
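As a rough illustration, here is a minimal Python sketch of how a crawler reads those rules. The domain example.com, the folder names, and the helper is_crawlable are assumptions made up for this example, not taken from a real site. It uses Python's built-in urllib.robotparser, which matches simple path prefixes only and does not understand Google-style wildcards, so treat it as a sanity check rather than a copy of how Googlebot behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a small service business site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /search/
Disallow: /staging/
"""

# Parse the rules from text instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="*"):
    """Return True if the parsed rules let this user agent request the URL."""
    return parser.can_fetch(user_agent, url)


for url in (
    "https://example.com/services/ac-repair/",
    "https://example.com/wp-admin/options.php",
    "https://example.com/search/furnace",
):
    print(url, "->", "open" if is_crawlable(url) else "blocked")
```

In this sketch the service page stays open while the admin and internal search URLs are blocked.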

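Robots.txt is not a privacy tool. If a blocked URL is linked somewhere else, it may still appear in search results without a normal title or description. It also does not remove indexed pages. For that, you usually need a noindex tag, a 404 or 410 response, a redirect, or a removal request in Google Search Console. A noindex tag only works if crawlers can still fetch the page, so do not block a URL in robots.txt while you are waiting for it to drop out of the index. A Search Console removal request is temporary, so pair it with one of the permanent fixes.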
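Because a crawler has to fetch a page before it can see a noindex tag, it can help to check both signals together. The sketch below is one way to do that in Python, under stated assumptions: the URLs are hypothetical, the helper names has_noindex and noindex_is_visible are invented for this example, and it only looks for a robots meta tag in the HTML, not an X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None, so fall back to an empty string.
        attr = {name: (value or "") for name, value in attrs}
        is_robots_meta = attr.get("name", "").lower() == "robots"
        if is_robots_meta and "noindex" in attr.get("content", "").lower():
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    meta_parser = RobotsMetaParser()
    meta_parser.feed(html)
    return meta_parser.noindex


def noindex_is_visible(robots_txt, url, html, user_agent="*"):
    """Return True only if the page has noindex AND crawlers may fetch it."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return has_noindex(html) and rules.can_fetch(user_agent, url)
```

If noindex_is_visible returns False for a page you want removed, either the tag is missing or robots.txt is stopping crawlers from reading it.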

Use case	What it means	What to do
Block low-value crawl paths	Search engines do not need to crawl cart filters, internal search pages, or admin URLs.	Disallow those patterns after checking they are not needed for rankings or users.
Protect crawl budget	Large sites can waste crawler attention on duplicate or thin URLs.	Review crawl data in Google Search Console, Screaming Frog, Ahrefs, or Semrush.
Keep money pages open	Your main service and location pages need to be crawlable.	Test your highest-value URLs before and after editing robots.txt.
Point to your sitemap	The file can list your XML sitemap location.	Add the sitemap line so crawlers can find your main URL list faster.
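To confirm the sitemap line is actually present, a short Python check can read it back out of the file. This is a minimal sketch: the sitemap URL and the helper name sitemap_urls are assumptions for illustration, and the site_maps() method it relies on needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_txt):
    """Return the sitemap URLs listed in a robots.txt body (empty list if none)."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line exists.
    return rules.site_maps() or []


# Hypothetical file with a sitemap line after the crawl rules.
with_sitemap = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap.xml
"""

print(sitemap_urls(with_sitemap))
```

An empty list means the file has no sitemap line and one should be added.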

Good example: A local HVAC site blocks internal search result pages and WordPress admin URLs, but allows crawlers to access service pages, city pages, CSS, JavaScript, images, and its sitemap.

Bad example: A redesigned website launches with “Disallow: /” left over from staging. Google can no longer crawl the site, so rankings, quote requests, and phone calls can drop fast.
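A leftover staging rule like this is easy to catch before launch. The sketch below is one possible pre-launch check in Python, not a standard tool: the function name blocks_entire_site is made up for this example, and it tests the rules as plain text that you would copy from your own robots.txt.

```python
from urllib.robotparser import RobotFileParser


def blocks_entire_site(robots_txt, user_agent="Googlebot"):
    """Return True if the rules stop this crawler from requesting the homepage.

    Rules match by path prefix, so a rule that blocks "/" blocks every URL.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(user_agent, "/")


staging_rules = "User-agent: *\nDisallow: /\n"
live_rules = "User-agent: *\nDisallow: /wp-admin/\n"

print(blocks_entire_site(staging_rules))  # True
print(blocks_entire_site(live_rules))     # False
```

Running a check like this as part of a launch checklist gives you a clear stop signal before the site goes live.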

Before editing robots.txt, check the page type and business impact. Ask: Does this page bring qualified traffic? Does Google need it to understand a service? Does the page support a location, review, gallery, or conversion path? Blocking a weak tag archive is often fine. Blocking a “roof repair Orlando” page is not.

1. Open your robots.txt file and look for broad rules, especially “Disallow: /”.
2. Test your top service pages in Google Search Console URL Inspection.
3. Run a crawl in Screaming Frog to see which URLs are blocked.
4. Confirm CSS, JavaScript, and image folders are not blocked if they affect layout or page rendering.
5. Add your XML sitemap location if it is missing.
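Part of this checklist can be scripted as a first pass before you open the dedicated tools. The Python sketch below is a rough example under stated assumptions: the page and asset URLs are hypothetical, find_blocked is a helper name invented here, and Python's built-in parser applies the first matching rule and ignores wildcards, so results can differ from Google's on complex files.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs from the list that the rules block for this crawler."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [url for url in urls if not rules.can_fetch(user_agent, url)]


# Hypothetical rules that block the whole wp-content folder by mistake.
robots_txt = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/
"""

# Hypothetical high-value pages plus the assets they need to render.
urls_to_check = [
    "https://example.com/roof-repair-orlando/",
    "https://example.com/contact/",
    "https://example.com/wp-content/themes/site/style.css",
    "https://example.com/wp-content/uploads/roof-gallery.jpg",
]

for url in find_blocked(robots_txt, urls_to_check):
    print("Blocked:", url)
```

Here the service and contact pages pass, but the stylesheet and gallery image are blocked, which is the kind of rendering problem step 4 is meant to catch.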

Robots.txt works best as part of technical SEO, not as a quick fix for bad content. We use it to reduce crawl waste, protect the pages that drive leads, and keep search engines focused on the URLs that support your pipeline. If your site has crawl, indexing, or technical issues, our SEO services can help connect the fix to traffic and lead quality. If crawl issues come from bloated themes, unstable plugins, or server setup, our WordPress hosting work can remove the technical blockers behind them.

Does HTTPS affect SEO?

How long does SEO take to show results?

What makes a backlink high-quality?

Do blog posts help SEO?

What is SEO, and what does it stand for?

How does site speed affect SEO?