Home > Features > Robot and Spider Control • Bookmark   • Newsletters   • Register Search Options

FEATURE

Robot and Spider Control

Robot and Spider Control

December 17, 2003
Text size: 
Get XBIZ News
XBIZ Research
Will virtual reality boost the paysite market?
Yes, it will soon
  40.12%
Yes, but in a few years
  37.21%
No
  22.67%
Out of 172 votes. Results based on votes submitted by members of XBIZ.net social network.

" The other way to stop most ‘bots’ from searching or indexing your page is to use META exclusion tags. This is often the only way that Webmasters on virtual or free hosts without full server access can hope to control a spider’s wanderings and reports... "

Editor’s note: Search engine spiders are typically the only kind of spiders that Webmasters want to see hanging around. These robots quietly crawl their way around the World Wide Web seeking out every page they can find, and reporting their contents back to their search engine masters. This is usually a welcome operation as it often leads to more ‘free’ traffic – but occasionally robots find their way into places we wish they wouldn’t, exposing sensitive information for the world to see… Here’s how to help prevent this from happening: ~ Stephen

Before submitting your site to the search engines, you will want to consider what pages and links you want the search engine "robot" (the program that indexes your site) to "spider" (follow), and what pages you don’t want it to follow – since you may have pages with sensitive information, a ‘scrap directory’ full of "work in progress," or a protected "members area" that you would not like listed.

This goal can easily be achieved in two ways. The first way is with a robots.txt file placed in the root directory of your Website, but you must have full domain privileges in order for this to work. While this article is not meant to deal with the intricacies of the robots.txt file, a quick word of warning is in order: never leave this file empty, as it will indicate to some robots that you do not want any part of your site indexed.

The other way to stop most ‘bots’ from searching or indexing your page is to use META exclusion tags. This is often the only way that Webmasters on virtual or free hosts without full server access can hope to control a spider’s wanderings and reports on a page-by-page basis. The syntax is simple:

<META name="ROBOTS" content="ALL,NONE,INDEX,FOLLOW,NOINDEX,NOFOLLOW">

The default value for the robots tag is "ALL" which allows the robot to index the page, then spider all links, indexing the linked pages too. "NONE" performs the opposite, disallowing the robot from either indexing the page, or spidering the links on it, in essence ignoring the page altogether.

"INDEX" indicates that robots should include this page in their search engines, while "FOLLOW" means that robots should follow (spider) the links on this page. Conversely, a value of "NOINDEX" allows links from the page to be spidered, even though the page itself is not indexed, while a value of "NOFOLLOW" allows the page to be indexed, but no links from the page are to be spidered.

Some Sample Snippets
Here’s some example robot controlling META tags, which would be put in between your document’s <HEAD> and </HEAD> tags:

<META name="ROBOTS" content="NOINDEX">
- This will prevent the bot from indexing that page.

<META name="ROBOTS" content="NOFOLLOW">
- This allows the page to be indexed, but any hyperlinks in that page will not be spidered.

<META name="ROBOTS" content="NOINDEX,NOFOLLOW">
- Is a combination of the two, where the page will not be indexed, and other links will not be followed. This tag may also prevent some mirroring software from downloading the page.

While there are many other META tags that can be used to improve your rankings, controlling what’s ranked is the first step, after which it’s wiser to invest your time in optimizing your description and keywords tags in order to boost your search engine rankings, which is the subject of my next article…


BUSINESS ANALYSIS

Camming Issue: Studio 20 Snapshot

As one of camming’s most vibrant capitals, Romania boasts a bevy of beautiful and talented models, along with an advanced technological infrastructure and professional cam studios that support this... More »

Camming Issue: An Evolving Business

One of the most visual examples of the growing maturity and success of today’s live webcam industry is found in the emergence of professional cam studios. These facilities evolved from simple locations... More »

Camming Issue: Models vs. Affiliates

One of the most contentious conversations among cam models today involves affiliate promotions that include unauthorized photos or videos of the models. This isn’t a matter of a model’s, network’s,... More »
XBIZ NEWSLETTERS
Stay informed of the latest industry developments. Get XBIZ newsletters delivered to your inbox. Subscribe today!
Enter email address:

* To manage existing subscriptions click here.






POPULAR PRODUCTS & SERVICES
Submit your press release to
multiple news outlets with 1 click.
Subscribe to RSS news feeds or
add free content to your website.
Access XBIZ news and articles
with your mobile device.
Subscribe to XBIZ Premiere magazine, the industry's leading adult retail trade publications, delivering the most timely and comprehensive business news and information to producers and retailers of adult products.

UPCOMING EVENTS

XBIZ 2017

Jan 09 - Jan 13
Hollywood, CA

XBIZ Awards 2017

Jan 12 - Jan 12
Los Angeles, CA

ANME Founders Show

Jan 14 - Jan 15
Los Angeles Marriott Burbank Airport

Everything To Do With Sex Show

Jan 20 - Jan 22
Montréal, Québec
Everyday thousands of business professionals browse XBIZ's industry directory for quality products and services. Not listed yet? Your company could be losing potential new business. Submit your company today!
Use XBIZ RSS feeds to stay informed of the latest industry developments or as a content syndication tool for your website!