For people asking “what is the reason not to show cache of a website”, here is a brief rundown of the possible explanations.
Someone left the following comment a week ago. I have tried
twice to send an email to the person but their mail service says the mailbox is
unavailable. Normally I would leave matters at that but this question comes up
from time to time so I thought I would write a brief article about it. To the
original commenter, I hope you see this. Your site should be fine.
The reason your site is not cached in Google is the “NoArchive” directive that you found in your settings. Turning that option off removes the “noarchive” directive from your pages, and Google and other search engines will then cache them.
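Concretely, a setting like that usually works by writing a robots meta tag into the head of every page. This is just a sketch of what to look for when you view a page’s source; the exact markup depends on your software:
<meta name="robots" content="noarchive">
Once the setting is off, that tag should no longer appear in your pages’ source.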
Does the “NoArchive” directive affect rankings or search visibility? Not that I have ever seen. People sometimes use this directive to prevent search engines from copying their content for redistribution. That is, even if you do something on your Web pages to prevent visitors from copying and pasting the content elsewhere, a search engine’s cached copy of the page can still be used to copy and paste it. Preventing the search engine from caching the page makes copying it a little more difficult.
I should point out to people who want to prevent copying of their content that every time a visitor hits your page, their browser copies everything to their local hard drive. The visitor can ALWAYS grab your content and republish it elsewhere. Streaming content doesn’t work that way, but there are programs available on the Web that will record streaming feeds.
Remember the old adage: If you put something on the
Internet, it’s out there to stay.
There is at least one other reason why a search engine may show a page in its SERPs without caching it. For example, in the comment above, the writer had checked his “robots.txt” file. If you block a URL in your “robots.txt” file but a search engine finds links pointing to that URL, the search engine may still show that URL in its results. These listings are sometimes referred to as “URL-only” or “uncrawled” listings.
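For illustration, here is what a blocking rule in a “robots.txt” file looks like (the path is hypothetical):
User-agent: *
Disallow: /private/
If another site links to a page under /private/, the search engine knows the URL exists but is not allowed to fetch the page, so it may list the bare URL without a title, snippet, or cached copy.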
Google won’t show these URLs in its safe search mode, but if you loosen the search restrictions you may see them from time to time. The search engine displays the uncrawled URL to let you know the page exists even though it knows nothing about the page’s content.
For this reason, when you want to remove content from a search engine’s listings, the search engines recommend that you allow them to crawl the pages but embed a “noindex” directive in the pages’ meta tags. Robots meta directives look like this:
<meta name="robots" content=""> (No restrictions)
<meta name="robots" content="noindex"> (Do not index this page)
<meta name="robots" content="noarchive"> (Do not archive/cache this page)
<meta name="robots" content="nofollow"> (Do not follow any links on this page)
<meta name="robots" content="noodp"> (Do not use the Open Directory Project description in SERPs)
<meta name="robots" content="noydir"> (Do not use the Yahoo! Directory description in SERPs)
You can combine them, separated by commas:
<meta name="robots" content="noindex,noarchive,nofollow,noodp,noydir">
You can give positive directives as well, although most SEOs
feel these are redundant and unnecessary:
<meta name="robots" content="index,archive,follow">
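To show where such a tag belongs, here is a minimal sketch of a page (the title and content are hypothetical placeholders):
<html>
<head>
<title>Example Page</title>
<meta name="robots" content="noindex,noarchive">
</head>
<body>
...page content...
</body>
</html>
The tag goes inside the head section. Search engines read it when they crawl the page, which is why the page has to remain crawlable for “noindex” to work.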
Another reason why a search engine may not show cache data for a page is that it has only recently crawled the page and has not yet fully integrated the data into its index. This situation is temporary and, in my experience, usually lasts only a few days.
Finally, if you serve your page content through AJAX or other JavaScript code, the search engine may cache only the JavaScript code and not the content that the code displays.
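Here is a simple sketch of the problem. The visible text below is inserted at run time, so a crawler that fetches only the raw HTML sees the placeholder, not the real content:
<div id="content">Loading...</div>
<script>
// This text exists only after the script runs in a browser.
document.getElementById("content").innerHTML =
"This paragraph was added by JavaScript.";
</script>
A cached copy of the raw HTML would show only “Loading...”.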