How to Stop Search Engine Crawlers From Crawling Your Site

There are times when we don’t want Google, Bing, and other search engines to crawl our website and add it to their search indexes. This is a common concern for new websites in development, or for existing websites that are being redesigned. This “new” website typically resides under a development URL such as dev.mydepartment.ca.gov.

Fortunately, the web community solved this problem a long time ago. The solution is called robots.txt.

This is a simple text file you place in the root of your domain (for example, at https://dev.mydepartment.ca.gov/robots.txt), and it tells search engine crawlers which parts of the site they should not crawl. The major search engines do follow these directives. You can find more detailed information about robots.txt at: https://en.wikipedia.org/wiki/Robots_exclusion_standard
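To sanity-check that the file is being served from the right place, you can fetch it the same way a crawler would. Here is a minimal sketch using Python’s standard library, assuming the development URL from the example above:

import urllib.request

# robots.txt must live at the root of the host.
# dev.mydepartment.ca.gov is the hypothetical development site from the example above.
url = "https://dev.mydepartment.ca.gov/robots.txt"

with urllib.request.urlopen(url) as response:
    print(response.read().decode("utf-8"))

If the request prints your directives, crawlers will see them too; if it returns a 404, the file is not in the root where crawlers look for it.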

So, to block all crawling of a particular website, place these two simple lines in your robots.txt and you are done:

User-agent: *
Disallow: /
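If you want to confirm that those two lines really do block everything, Python’s standard library ships a parser for this exact format. This is just a quick sketch; the user agent names and URLs are illustrative:

from urllib import robotparser

# The two directives above, exactly as a crawler would read them.
rules = [
    "User-agent: *",
    "Disallow: /",
]

parser = robotparser.RobotFileParser()
parser.parse(rules)

# Any well-behaved crawler asking about any path should be told "no".
print(parser.can_fetch("Googlebot", "https://dev.mydepartment.ca.gov/"))        # False
print(parser.can_fetch("bingbot", "https://dev.mydepartment.ca.gov/some/page")) # False

Both calls print False, which is exactly what we want for a development site.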