
Sssdirect.com - How to keep robots out of your web site

THE ROBOTS.TXT FILE

Search engines were created to help people find information quickly on the Internet. They acquire much of that information through robots (also known as spiders or crawlers), programs that look for web pages on their behalf.

These robots explore the web, looking for and recording all kinds of information. They usually start from URLs submitted by users, from links they find on other web sites, from sitemap files, or from the top level of a site.

Once a robot has accessed the home page, it recursively accesses every page linked from that page. A robot may also check every page it can find on a particular server.
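The traversal described above can be sketched in a few lines of Python. This is only an illustration: the LINKS table is a made-up, in-memory stand-in for a real site, and the crawl function is a hypothetical helper, not the code of any actual search engine robot.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/": ["/about.html", "/e-books/"],
    "/about.html": ["/"],
    "/e-books/": ["/e-books/guide.html"],
    "/e-books/guide.html": [],
}


def crawl(start, links):
    """Return every page reachable from start, in the order it is found."""
    found = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # Skip pages already recorded so that link loops terminate.
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found


if __name__ == "__main__":
    for page in crawl("/", LINKS):
        print(page)
```

Starting from the home page, the sketch reaches every page in the table, including the ones inside the e-books directory, which is why a site owner may want a way to ask robots to stay out.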

After the robot finds a web page, it indexes the title, the keywords, the text, and so on. Sometimes, though, you may want to prevent search engines from indexing some of your web pages, such as news postings or specially marked pages (for example, affiliates' pages). Keep in mind that compliance with the conventions described here is purely voluntary on the part of each robot.

ROBOTS EXCLUSION PROTOCOL

If you want robots to keep out of some of your web pages, you can ask them to ignore the pages you don't want indexed. To do that, place a robots.txt file in the root directory of your web site.

For example, if you have a directory called e-books and you want to ask robots to keep out of it, your robots.txt file should read:

User-agent: *
Disallow: /e-books/
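One way to check such a rule before publishing it is Python's standard urllib.robotparser module. The sketch below assumes the file lists one directive per line and that the path begins with a slash. The host www.example.com, the user agent name ExampleBot, and the is_allowed helper are placeholders chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rule for the e-books directory, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /e-books/
"""


def is_allowed(url, user_agent="ExampleBot"):
    """Return True if a compliant robot may fetch url under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("http://www.example.com/e-books/guide.html"))
    print(is_allowed("http://www.example.com/index.html"))
```

The first call reports that the page inside e-books is off limits, and the second that the home page may still be fetched. This only models what a well-behaved robot would do; it does not stop a robot that ignores the file.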

If you don't have enough control over your server to set up a robots.txt file, you can try adding a META tag to the head section of an HTML document.

For example, a tag like the following tells robots not to index a particular page and not to follow its links:

<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
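To confirm that a page really carries the tag, you can read it back with Python's standard html.parser module. This is a minimal sketch: the RobotsMetaParser class, the robots_directives helper, and the sample PAGE are all invented for this example, and real robots may interpret the tag in their own way.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every robots META tag in a document."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        for directive in attributes.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives found in html, in lowercase."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for this example.
PAGE = (
    "<html><head>"
    '<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">'
    "</head><body>Affiliate offer</body></html>"
)

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```

For the sample page the helper finds both the noindex and the nofollow directive. A page without the tag yields an empty set.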

Support for the META tag among robots is not as widespread as support for the Robots Exclusion Protocol, but most major web indexes currently support it.

NEWS POSTINGS

If you want to keep the search engines out of your news postings, you can add an "X-no-archive" line to your postings' headers:

X-no-archive: yes
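If you compose postings from a script rather than a news client, the header can be set with Python's standard email.message module. The sketch below is an assumption about how such a script might look: the build_posting helper, the address poster@example.com, and the newsgroup name are placeholders, and actually sending the posting to a news server is left out.

```python
from email.message import EmailMessage


def build_posting(subject, body, newsgroup):
    """Build a news posting that asks archives not to keep it."""
    message = EmailMessage()
    message["From"] = "poster@example.com"  # placeholder address
    message["Newsgroups"] = newsgroup
    message["Subject"] = subject
    # The header from the article; header names are case-insensitive.
    message["X-no-archive"] = "yes"
    message.set_content(body)
    return message


if __name__ == "__main__":
    posting = build_posting("Test", "Please do not archive this.", "misc.test")
    print(posting.as_string())
```

Printing the posting shows the X-no-archive line among the other headers, ahead of the message body.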

Most common news clients allow you to add an X-no-archive line to the headers of your news postings, but some of them don't.

The problem is that most search engines assume that all information they find is public unless marked otherwise.

So be careful: although the robot and archive exclusion standards may help keep your material out of the major search engines, there are others that respect no such rules.

If you're highly concerned about the privacy of your e-mail and Usenet postings, you should use anonymous remailers and PGP. You can read about them here:

http://www.well.com/user/abacard/remail.html
http://www.io.com/~combs/htmls/crypto.html
http://world.std.com/~franl/pgp/

Even if you are not particularly concerned about privacy, remember that anything you write will be indexed and archived somewhere for eternity, so use the robots.txt file as much as you need it.

Written by Dr. Roberto A. Bonomi


