Robot and Spider Control

Editor’s note: Search engine spiders are typically the only kind of spiders that Webmasters want to see hanging around. These robots quietly crawl their way around the World Wide Web seeking out every page they can find, and reporting their contents back to their search engine masters. This is usually a welcome operation as it often leads to more ‘free’ traffic – but occasionally robots find their way into places we wish they wouldn’t, exposing sensitive information for the world to see… Here’s how to help prevent this from happening: ~ Stephen

Before submitting your site to the search engines, you will want to consider what pages and links you want the search engine "robot" (the program that indexes your site) to "spider" (follow), and what pages you don’t want it to follow – since you may have pages with sensitive information, a ‘scrap directory’ full of "work in progress," or a protected "members area" that you would not like listed.

This goal can be achieved in two ways. The first is with a robots.txt file placed in the root directory of your Website, but you must have full domain privileges for this to work, since robots only look for the file at the top level of the domain. While this article is not meant to deal with the intricacies of the robots.txt file, a quick word of warning is in order: avoid leaving this file empty. The robots exclusion standard treats an empty file as placing no restrictions on robots, but not every robot can be relied on to interpret it that way, so it is safer to spell out your rules explicitly.
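As a rough illustration of the robots.txt approach, the sketch below uses Python's standard urllib.robotparser module to check which paths a well-behaved robot would be allowed to fetch. The robots.txt contents, the directory names (/scrap/ and /members/) and the robot name ExampleBot are hypothetical, chosen only to mirror the kinds of areas mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the directory names are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /scrap/
Disallow: /members/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it from a site."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant robot may fetch the home page but not the excluded directories.
print(parser.can_fetch("ExampleBot", "/index.html"))
print(parser.can_fetch("ExampleBot", "/members/login.html"))
```

Keep in mind that robots.txt is only a request: it keeps compliant robots out, but it does not protect the files themselves.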

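The other way to stop most ‘bots’ from searching or indexing your page is to use META exclusion tags. This is often the only way that Webmasters on virtual or free hosts without full server access can hope to control a spider’s wanderings and reports on a page-by-page basis. The syntax is simple: the content attribute takes one or more of the following values, separated by commas (this line lists every possible value and is not meant to be copied as-is):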

<META name="ROBOTS" content="ALL,NONE,INDEX,FOLLOW,NOINDEX,NOFOLLOW">

The default value for the robots tag is "ALL", which allows the robot to index the page and then spider all of its links, indexing the linked pages too. "NONE" does the opposite, disallowing the robot from either indexing the page or spidering the links on it, in essence ignoring the page altogether.

"INDEX" indicates that robots should include this page in their search engines, while "FOLLOW" means that robots should follow (spider) the links on this page. Conversely, a value of "NOINDEX" allows links from the page to be spidered, even though the page itself is not indexed, while a value of "NOFOLLOW" allows the page to be indexed, but no links from the page are to be spidered.
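The meaning of these values can be sketched as a small function that turns a content string into two yes/no decisions. This is only an illustration of the rules described above, not how any particular search engine implements them; the function name parse_robots_content is hypothetical, and letting a later value override an earlier conflicting one is an assumption made here for simplicity.

```python
def parse_robots_content(content: str = "ALL") -> dict:
    """Return whether a page may be indexed and whether its links may be followed."""
    index, follow = True, True  # the default, "ALL", permits both
    for token in content.upper().split(","):
        token = token.strip()
        if token == "ALL":
            index, follow = True, True
        elif token == "NONE":
            index, follow = False, False
        elif token == "INDEX":
            index = True
        elif token == "NOINDEX":
            index = False
        elif token == "FOLLOW":
            follow = True
        elif token == "NOFOLLOW":
            follow = False
        # Unrecognized values are ignored (an assumption of this sketch).
    return {"index": index, "follow": follow}


print(parse_robots_content("NOINDEX"))   # {'index': False, 'follow': True}
print(parse_robots_content("NOFOLLOW"))  # {'index': True, 'follow': False}
print(parse_robots_content("NONE"))      # {'index': False, 'follow': False}
```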

Some Sample Snippets
Here are some example robot-controlling META tags, which go between your document’s <HEAD> and </HEAD> tags:

<META name="ROBOTS" content="NOINDEX">
- This will prevent the bot from indexing that page.

<META name="ROBOTS" content="NOFOLLOW">
- This allows the page to be indexed, but any hyperlinks in that page will not be spidered.

<META name="ROBOTS" content="NOINDEX,NOFOLLOW">
- This is a combination of the two: the page will not be indexed, and its links will not be followed. This tag may also prevent some mirroring software from downloading the page.
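To check that a page really carries the tag you intended, something like the following sketch can help. It uses Python's standard html.parser module to pull the ROBOTS META value out of a page's markup; the class name RobotsMetaFinder, the helper find_robots_content and the sample page are all hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Remember the content of the first <META name="ROBOTS"> tag seen."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta" or self.content is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            self.content = attributes.get("content") or ""


def find_robots_content(html: str):
    """Return the ROBOTS META content string, or None if the page has no such tag."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.content


# Hypothetical page used only to exercise the sketch.
PAGE = """<HTML><HEAD><TITLE>Members Area</TITLE>
<META name="ROBOTS" content="NOINDEX,NOFOLLOW">
</HEAD><BODY><A href="private.html">Private</A></BODY></HTML>"""

print(find_robots_content(PAGE))  # NOINDEX,NOFOLLOW
```

A result of None means the page has no ROBOTS META tag, so robots would fall back to the default behavior of indexing the page and following its links.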

While many other META tags can be used to improve your rankings, controlling what gets ranked is the first step. After that, your time is better invested in optimizing your description and keywords tags to boost your search engine rankings, which is the subject of my next article…

Copyright © 2024 Adnet Media. All Rights Reserved. XBIZ is a trademark of Adnet Media.
Reproduction in whole or in part in any form or medium without express written permission is prohibited.
