All About Robots, User-agent:, Crawlers & Spiders
What is robots text, what does it look like,
and how can you use it?
Webmasters and search engine optimization companies routinely
use the robots meta tag. Some people call these tags "Fresh Tags".
They are useful and can help you get crawled more often
by the bots and spiders.
You can place them anywhere in your meta data section,
which is located in the head section of your HTML code,
that is, between the <head> and </head> tags.
- <meta name="robots" content="index,follow">
- <meta name="robots" content="noindex,follow">
- <meta name="robots" content="index,nofollow">
- <meta name="robots" content="noindex,nofollow">
- <meta name="revisit-after" content="7 days">
If you would like to address a specific bot or crawler, replace "robots" with its name, for example "msnbot" or "googlebot".
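To check which of these tags a page actually carries, a short script can read them back out of the HTML. The following is a minimal sketch using Python's standard html.parser module; the class name RobotsMetaParser, the helper robots_meta and the sample page are made up for illustration, and the set of bot names is simply the three names mentioned above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the content of robots-style meta tags, keyed by tag name."""

    # Names taken from the text above; other crawler names work the same way.
    BOT_NAMES = {"robots", "googlebot", "msnbot"}

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in self.BOT_NAMES:
            self.found[name] = attributes.get("content") or ""


def robots_meta(html):
    """Return a dict such as {'robots': 'index,follow'} for an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.found


# Hypothetical sample page, not taken from a real site.
page = """<html><head>
<meta name="robots" content="index,follow">
<meta name="googlebot" content="noindex,follow">
</head><body></body></html>"""

print(robots_meta(page))
```

For the sample page this reports one general rule for all robots and a separate, stricter rule addressed to googlebot only.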
Command Googlebot not to crawl an outgoing link on a page
Meta tags can exclude all outgoing links on a page, but you can also command Googlebot not to follow individual text links by adding rel="nofollow" to a hyperlink.
When Googlebot sees the attribute rel="nofollow" on text links, those links won't get any votes when Google ranks the web sites for the search results.
For example, if you exchange links with SEO WATCH:
<a href="http://www.SEO-watch.com/">SEO for Google Marketing</a>
You could replace it with:
<a href="http://www.SEO-watch.com/" rel="nofollow">SEO for Google Marketing</a>
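If you want to see which links on a page already carry the attribute, the check can be sketched in a few lines of Python with the standard html.parser module. The names LinkCollector and classify_links are hypothetical, and the sample markup reuses the two links from the example with quoted href values.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Record every hyperlink together with a flag for rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return  # an anchor without href is not a link
        # rel may hold several space-separated tokens, e.g. "nofollow external".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def classify_links(html):
    """Return [(href, is_nofollow), ...] for all links in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


sample = (
    '<a href="http://www.SEO-watch.com/">SEO for Google Marketing</a> '
    '<a href="http://www.SEO-watch.com/" rel="nofollow">SEO for Google Marketing</a>'
)

for href, is_nofollow in classify_links(sample):
    print(href, "nofollow" if is_nofollow else "followed")
```

The first link is reported as followed and the second as nofollow, which matches the before and after versions of the link above.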
Robots.txt & User-agent files
The content of the robots Meta Tag contains directives separated
by commas.
The currently defined directives are [NO]INDEX and [NO]FOLLOW.
The INDEX directive specifies whether an indexing robot should index
the page. The FOLLOW directive specifies whether a robot is to follow
links on the web page.
The defaults are INDEX and FOLLOW.
The values ALL and NONE set all directives on or off: ALL=INDEX,FOLLOW
and NONE=NOINDEX,NOFOLLOW.
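The rules above (comma-separated directives, INDEX and FOLLOW as defaults, ALL and NONE as shorthands) can be sketched as a small Python function. The function name parse_robots_content is made up for this illustration, and ignoring unknown directives is an assumption of the sketch, not something the text specifies.

```python
def parse_robots_content(content):
    """Return (index, follow) booleans for a robots meta content string."""
    index, follow = True, True  # the defaults are INDEX and FOLLOW
    for token in content.split(","):
        directive = token.strip().lower()  # directives are matched case-insensitively here
        if directive == "all":
            index, follow = True, True
        elif directive == "none":
            index, follow = False, False
        elif directive == "index":
            index = True
        elif directive == "noindex":
            index = False
        elif directive == "follow":
            follow = True
        elif directive == "nofollow":
            follow = False
        # Assumption: anything else is ignored rather than treated as an error.
    return index, follow


print(parse_robots_content("noindex,follow"))  # (False, True)
print(parse_robots_content("NONE"))            # (False, False)
print(parse_robots_content(""))                # (True, True)
```

An empty content string falls back to the defaults, so a page without any robots directive is treated as index,follow.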
As an alternative, you can place a robots text file in the root
of your server to control which of your web pages will be listed
in the search engine databases and which directories
are forbidden for the crawler or bot. Anything you do not
want publicly displayed may become public if you do not
restrict the robots. Save the text file as robots.txt
and insert lines like the following to forbid specific files and
directories. The record below describes the default access policy
for any robot, symbolized by the *:
User-agent: *
Disallow: /cgi-bin
Disallow
The value of this field specifies a partial URL that is not
to be visited. This can be a full path, or a partial path; any
URL that starts with this value will not be retrieved. For example,
a value such as /cgi-bin disallows all files in that directory.
A value with a trailing slash, such as /help/,
would disallow /help/index.html but allow /help.html.
You do not need to allow specific files and directories, because
robots may access any file not declared as disallowed.
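One way to try out a record before uploading it is Python's standard urllib.robotparser module, which applies the same prefix matching. This is a minimal sketch under a few assumptions: the crawler name ExampleBot and the tested paths are invented, and the Disallow: /help/ line is inferred from the /help/index.html and /help.html example rather than quoted from a real file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the /cgi-bin record from the text plus
# a /help/ line inferred from the /help.html example.
rules = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /help/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())  # parse from memory instead of fetching a URL


def allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) crawler may fetch the given path."""
    return parser.can_fetch(agent, path)


for path in ("/cgi-bin/script.pl", "/help/index.html", "/help.html", "/index.html"):
    print(path, "allowed" if allowed(path) else "disallowed")
```

The first two paths start with a disallowed value and are rejected. The last two are allowed without any explicit rule, which shows that nothing needs to be allowed separately.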