IBM Support

How to prevent a Web crawler from indexing content on your Domino server

Technote (FAQ)


Question

Web crawlers are indexing content for a Lotus Domino Web server. How can you prevent this indexing from happening?

Answer

You can prevent indexing for the whole server or for certain pages, or prevent access by specific Web crawlers.

To prevent all content from being indexed, you can use a robots.txt file. Well-behaved Web crawlers follow the rules set in a special file named robots.txt on your Web site, although compliance is voluntary. The content of this file specifies what can and cannot be crawled on your Web server. Place robots.txt in the top-level directory of your domain, for example http://www.myserver.com/robots.txt. On a Domino server, you can put the file in ..\data\domino\html\robots.txt, or create a substitution rule that points to the same location.
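As a rough illustration of how such rules are interpreted, the following Python sketch parses a sample robots.txt with the standard library's urllib.robotparser module and checks whether a crawler may fetch a URL. The file content, the crawler name "ExampleBot", and the helper is_allowed are assumptions made for this example; they are not part of Domino.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: asks every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a hypothetical crawler name used only for illustration.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.myserver.com/index.html"))
```

A check like this only shows what a compliant crawler would do; it does not stop a crawler that ignores robots.txt.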

You configure this file to exclude the content on your server. Creating the file is beyond the scope of Lotus Domino configuration. For information about the syntax of the robots.txt file, refer to documentation on the Robots Exclusion Protocol.


If you want most of your Web site content indexed, except for a few pages, you can include a Robots META tag in the HTML source of each page that you want to exclude. For example:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

Adding this tag to a page's HTML tells compliant Web crawlers not to index that page (NOINDEX) and not to follow the links on it (NOFOLLOW). Because the tag is set per page, you can choose exactly which pages are excluded from indexing.
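To confirm that a page actually carries the tag, a check along the following lines may help. This is a minimal Python sketch using the standard library's html.parser module; the class RobotsMetaParser and the function robots_directives are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <META NAME="ROBOTS" ...> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # CONTENT is a comma-separated list such as "NOINDEX, NOFOLLOW".
        for part in attr.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the lowercased Robots META directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></head></html>'
    print("noindex" in robots_directives(page))
```

Running the check against the rendered HTML, rather than the design source, shows what a crawler would actually receive from the server.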

To deny access to a specific Web crawler, you can set fields that allow or deny certain IP addresses in the Server document. Refer to the topic "Restricting access by IP address on the Web server" in the Domino Administrator Help.

Document information

More support for: IBM Domino

Software version: 8.0, 8.5, 9.0

Operating system(s): AIX, IBM i, Linux, Solaris, Windows, z/OS

Reference #: 1225945

Modified date: 30 June 2006