robots.txt is a simple text file on your website that tells search engine crawlers which URLs they may crawl and which URLs they should skip.
Think of it as crawl traffic control, not a privacy lock. We use robots.txt to reduce wasted crawling on pages that do not help customers, like admin areas, filtered internal search results, or endless URL variations that can pop up on WordPress, eCommerce, and booking sites. It also helps when a site is under heavy load, since it can cut back crawler requests to sections you do not want scanned.
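For instance, here is a sketch of rules that trim that kind of crawl waste on a WordPress site. The ?s= pattern is WordPress's default search parameter; the filter parameter is a hypothetical example, so substitute whatever parameter your theme or plugin actually generates.

```
User-agent: *
# WordPress internal site search results (the default ?s= query parameter)
Disallow: /?s=
Disallow: /search/
# Hypothetical faceted-filter parameter - replace with your site's own
Disallow: /*?filter=
```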
What robots.txt controls and what it does not
robots.txt controls crawling, not access. If a URL is blocked in robots.txt, Google may still show that URL in search results when other pages link to it; it just appears without a description because Google could not fetch the content. If you truly need something kept out of search results, use a noindex tag or header, or put the content behind login or password protection.
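As a quick illustration, this is the tag form of noindex, which keeps a crawlable page out of Google's index once the page is recrawled:

```
<!-- Placed inside the page's <head> -->
<meta name="robots" content="noindex">
```

For files that have no HTML, like PDFs, the equivalent is sending an X-Robots-Tag: noindex HTTP response header.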
It is also public. Anyone can visit /robots.txt, so it is not a place to "hide" sensitive folders. For Orlando and Central Florida businesses, this matters when you have patient portals, client intake forms, or proposal PDFs; those belong behind access controls, not behind a robots rule.
| Directive | What it does | Common use on business sites |
|---|---|---|
| User-agent | Targets a specific crawler (or all crawlers with *) | Set one rule set for all bots, and a separate set for a special crawler if needed |
| Disallow | Asks the crawler not to fetch matching paths | Block /wp-admin/, internal search pages, cart and checkout steps, staging folders |
| Allow | Overrides a broader Disallow for a specific path | Allow /wp-admin/admin-ajax.php while blocking the rest of /wp-admin/ |
| Sitemap | Lists the full URL of your XML sitemap | Help crawlers find your sitemap quickly |
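These directives are read in groups: each group starts with one or more User-agent lines and applies only to the crawlers named there. A minimal sketch of that grouping, with Googlebot-Image chosen purely as an example of a special crawler and /photos/ as a hypothetical path:

```
# Applies to every crawler
User-agent: *
Disallow: /wp-admin/

# A stricter group for one named crawler
User-agent: Googlebot-Image
Disallow: /photos/
```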
A practical example
Here is a common setup we see on WordPress:
```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://example.com/sitemap.xml
```

Two mistakes we see a lot during technical SEO work are blocking CSS/JS or image folders (which can hurt how Google renders and understands your pages) and blocking a page you later try to remove with noindex (Google cannot see the noindex if crawling is blocked).
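For example, rules like these still circulate in older WordPress advice, and they can keep Google from fetching the CSS and JavaScript it needs to render your pages:

```
# Risky on WordPress: hides theme and plugin assets from Google
User-agent: *
Disallow: /wp-content/
Disallow: /wp-includes/
```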
If you want the sitemap line done correctly, it helps to understand what the sitemap is and what it is not; our XML sitemap FAQ breaks that down in plain English.