Archive for April, 2007

Google Sitemaps: Requesting removal of content from the Google index

April 21st, 2007

Hi all,

Want to remove a few URLs or some content from your site? Now you can do it right in your Google Sitemaps account. Check out the new Google Sitemaps feature.

As a site owner, you control what content of your site is indexed in search engines. The easiest way to let search engines know what content you don’t want indexed is to use a robots.txt file or robots meta tag. But sometimes, you want to remove content that’s already been indexed. What’s the best way to do that?

As always, the answer begins: it depends on the type of content that you want to remove. Our webmaster help center provides detailed information about each situation. Once we recrawl that page, we’ll remove the content from our index automatically. But if you’d like to expedite the removal rather than wait for the next crawl, the way to do that has just gotten easier.

For sites that you’ve verified ownership for in your webmaster tools account, you’ll now see a new option under the Diagnostic tab called URL Removals. To get started, simply click the URL Removals link, then New Removal Request. Choose the option that matches the type of removal you’d like.

Individual URLs
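Choose this option if you’d like to remove a URL or image. In order for the URL to be eligible for removal, one of the following must be true:

The URL must return a status code of either 404 or 410.
The URL must be blocked by the site’s robots.txt file.
The URL must be blocked by a robots meta tag.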

Once the URL is ready for removal, enter the URL and indicate whether it appears in our web search results or image search results. Then click Add. You can add up to 100 URLs in a single request. Once you’ve added all the URLs you would like removed, click Submit Removal Request.

A directory
Choose this option if you’d like to remove all files and folders within a directory on your site. For instance, if you request removal of the following:

http://www.example.com/myfolder

this will remove all URLs that begin with that path, such as:

http://www.example.com/myfolder

http://www.example.com/myfolder/page1.html

http://www.example.com/myfolder/images/image.jpg

In order for a directory to be eligible for removal, you must block it using a robots.txt file. For instance, for the example above, http://www.example.com/robots.txt could include the following:

User-agent: Googlebot
Disallow: /myfolder
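
Before submitting a directory removal request, it can help to confirm that the robots.txt rule really blocks the URLs you expect. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name blocked_for_googlebot and the sample URLs are illustrative assumptions, and the parser only approximates how Googlebot itself interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the example above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /myfolder
"""

def blocked_for_googlebot(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows `url` for Googlebot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Googlebot", url)

if __name__ == "__main__":
    for url in (
        "http://www.example.com/myfolder",
        "http://www.example.com/myfolder/page1.html",
        "http://www.example.com/myfolder/images/image.jpg",
        "http://www.example.com/otherfolder/page.html",  # hypothetical URL outside the directory
    ):
        print(url, "blocked" if blocked_for_googlebot(ROBOTS_TXT, url) else "allowed")
```

Because the rule is a simple path prefix, anything whose path begins with /myfolder is treated as blocked, so choose directory names with that in mind.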

Your entire site
Choose this option only if you want to remove your entire site from the Google index. This option will remove all subdirectories and files. Do not use this option to remove the non-preferred version of your site’s URLs from being indexed. For instance, if you want all of your URLs indexed using the www version, don’t use this tool to request removal of the non-www version. Instead, specify the version you want indexed using the Preferred domain tool (and do a 301 redirect to the preferred version, if possible). To use this option, you must block the site using a robots.txt file.
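
For the preferred-domain case, the 301 redirect is normally configured in the web server, but the mapping it performs is simple. The sketch below only illustrates that mapping in Python: the function name redirect_target and the default host www.example.com are assumptions for the example, and it ignores ports for brevity.

```python
from urllib.parse import urlsplit, urlunsplit

def redirect_target(url: str, preferred_host: str = "www.example.com"):
    """Return the URL a 301 redirect should point to, or None if no redirect is needed.

    Only the host is changed; scheme, path, query and fragment are kept as-is.
    """
    parts = urlsplit(url)
    if parts.hostname == preferred_host:
        return None  # already on the preferred version
    return urlunsplit(
        (parts.scheme, preferred_host, parts.path, parts.query, parts.fragment)
    )

if __name__ == "__main__":
    print(redirect_target("http://example.com/myfolder/page1.html"))
    print(redirect_target("http://www.example.com/myfolder/page1.html"))
```

A request for the non-www URL would then be answered with status 301 and a Location header set to the returned value, which keeps both versions out of the removal tool entirely.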

Cached copies

Choose this option to remove cached copies of pages in our index. You have two options for making pages eligible for cache removal.

Using a meta noarchive tag and requesting expedited removal
If you don’t want the page cached at all, you can add a meta noarchive tag to the page and then request expedited cache removal using this tool. By requesting removal using this tool, we’ll remove the cached copy right away, and by adding the meta noarchive tag, we will never include the cached version. (If you change your mind later, you can remove the meta noarchive tag.)
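
To double-check that a page actually carries the noarchive tag before requesting cache removal, something like the following sketch could be used. It relies on Python's standard html.parser module; the class and function names are made up for this example, and it treats both name="robots" and name="googlebot" meta tags as relevant, which is an assumption about what you want to check.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" ...> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, noarchive".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )

def has_noarchive(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noarchive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives

if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noarchive"></head></html>'
    print(has_noarchive(page))
```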

Changing the page content
If you want to remove the cached version of a page because it contained content that you’ve removed and don’t want indexed, you can request the cache removal here. We’ll check to see that the content on the live page is different from the cached version and if so, we’ll remove the cached version. We’ll automatically make the latest cached version of the page available again after six months (and at that point, we likely will have recrawled the page and the cached version will reflect the latest content) or, if you see that we’ve recrawled the page sooner than that, you can request that we reinclude the cached version sooner using this tool.

Checking the status of removal requests
Removal requests show as pending until they have been processed, at which point, the status changes to either Denied or Removed. Generally, a request is denied if it doesn’t meet the eligibility criteria for removal.

To reinclude content
If a request is successful, it appears in the Removed Content tab and you can reinclude it any time simply by removing the robots.txt or robots meta tag block and clicking Reinclude. Otherwise, we’ll exclude the content for six months. After that six month period, if the content is still blocked or returns a 404 or 410 status message and we’ve recrawled the page, it won’t be reincluded in our index. However, if the page is available to our crawlers after this six month period, we’ll once again include it in our index.
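
The rule for what happens once the six months are over can be written down as a small decision function. This is only a restatement of the paragraph above in Python, assuming the page has already been recrawled; the function name and parameters are hypothetical and say nothing about how Google implements it.

```python
def reincluded_after_expiry(blocked: bool, status_code: int) -> bool:
    """Return True if a recrawled page would be indexed again after the six-month exclusion.

    blocked:     still blocked by robots.txt or a robots meta tag
    status_code: HTTP status returned when the page was recrawled
    """
    gone = status_code in (404, 410)
    return not (blocked or gone)

if __name__ == "__main__":
    print(reincluded_after_expiry(blocked=False, status_code=200))  # page is available again
    print(reincluded_after_expiry(blocked=True, status_code=200))   # still blocked
    print(reincluded_after_expiry(blocked=False, status_code=410))  # page is gone
```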

Requesting removal of content you don’t own

But what if you want to request removal of content that’s located on a site that you don’t own? It’s just gotten easier to do that as well. Our new Webpage removal request tool steps through the process for each type of removal request.

Since Google indexes the web and doesn’t control the content on web pages, we generally can’t remove results from our index unless the webmaster has blocked or modified the content or removed the page. If you would like content removed, you can work with the site owner to do so, and then use this tool to expedite the removal from our search results.

If you have found search results that contain specific types of personal information, you can request removal even if you’ve been unable to work with the site owner. For this type of removal, provide your email address so we can work with you directly.

If you have found search results that shouldn’t be returned with SafeSearch enabled, you can let us know using this tool as well.

You can check on the status of pending requests, and as with the version available in webmaster tools, the status will change to Removed or Denied once it’s been processed. Generally, the request is denied if it doesn’t meet the eligibility criteria. For requests that involve personal information, you won’t see the status available here, but will instead receive an email with more information about next steps.

What about the existing URL removal tool?
If you’ve made previous requests with this tool, you can still log in to check on the status of those requests. However, make any new requests with this new and improved version of the tool.

Webmasters Can Now Auto-Discover With Sitemaps

April 13th, 2007

Since working with Google and Microsoft to support a single format for submission with Sitemaps, we have continued to discuss further enhancements to make it easy for webmasters to get their content to all search engines quickly.

All search crawlers recognize robots.txt, so it seemed like a good idea to use that mechanism to allow webmasters to share their Sitemaps. You agreed and encouraged us to allow robots.txt discovery of Sitemaps on our suggestion board. We took the idea to Google and Microsoft and are happy to announce today that you can now find your sitemaps in a uniform way across all participating engines. To do this, simply add the following line to your robots.txt file:

Sitemap: http://www.example.com/sitemap.xml

Please provide the complete URL for your Sitemap on this line. We will pick it up wherever you put it in your robots.txt file. This directive is not specific to user-agent. If you have multiple Sitemaps, you can point to your Sitemap index file on this line. Details about the Sitemaps protocol including this addition are available on the protocol website — http://www.sitemaps.org.
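
To see what a crawler would pick up from this directive, here is a rough Python sketch that collects Sitemap lines from a robots.txt file. The function name sitemap_urls is made up for the example, and matching the field name case-insensitively and stripping # comments are assumptions rather than something stated above.

```python
def sitemap_urls(robots_txt: str) -> list:
    """Return every Sitemap URL listed in the robots.txt text.

    The directive is not tied to a User-agent group, so every line is
    checked regardless of where it appears in the file.
    """
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop trailing comments
        field, sep, value = line.partition(":")  # split at the first colon only
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls

if __name__ == "__main__":
    robots = (
        "User-agent: *\n"
        "Disallow: /private\n"
        "\n"
        "Sitemap: http://www.example.com/sitemap.xml\n"
    )
    print(sitemap_urls(robots))
```

Splitting at the first colon only matters here, because the URL itself contains a colon after the scheme.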

If you prefer, you can continue to issue Sitemaps to Yahoo! Search by simply inputting the URL for your Sitemap and submitting. Or add feeds to a site you are already managing under ‘My Sites’ in Site Explorer. This also allows us to provide more feedback to you about what we are doing with the sitemap.

We’re also happy to have some east coasters, Ask and IBM, announce their support for Sitemaps. The more the merrier!

We’ll also be sharing more this week at SES NY.

If you have other thoughts about how we can collaborate with other search engines on standards such as robots.txt, we’d love to hear from you — visit our suggestion board.
Priyank Garg
Product Manager, Yahoo! Search

Source: http://www.ysearchblog.com/archives/000437.html

Google Algorithm’s Top 10 (Assumed) Positive Factors

April 6th, 2007

1. Keyword Use In Title Tags – “Notice number one – that you have HTML title tags that reflect the key terms you want your page to be found for. That’s been the advice since I first started writing about SEO back in 1996. Eleven years later – and even in the age of it’s all about links – it remains the top ranked tip by so many experts.” – Danny Sullivan, Search Engine Land.

2. Global Link Popularity of Site (The overall link weight/authority as measured by links from any and all sites across the web – both link quality and quantity) – “Think of a web page as a town. If a city has freeways, airports, train stations, bus shelters and a port, that’s a good indicator that it is an important hub. That orphaned web page with no links pointing to it? It may as well be a hidden tribe of Amazons that no one has discovered.” – Lucas Ng (a.k.a. shor), Fairfax Digital online marketing analyst.

3. Anchor Text of Inbound Link – “Anchor text of the inbound link is one of the most concise assessments another person can make about what your site/page is ‘about’.” – Mike McDonald, WebProNews

4. Link Popularity within Site’s Internal Link Structure (Refers to the number and importance of internal links pointing to the target page) – “As mentioned on my blog, you can pulse a page’s rankings by including and excluding links to it from your home page.” – Russ Jones, Virante CTO.

5. Age of Site (Not the date of original registration of the domain, but rather the launch of indexable content seen by the search engines) – “We have seen new sites flourish as long as they have a clear connection to the ‘parent’ site that has already gained trust.” – Chris Boggs, Search Engine Land Associate Editor.

6. Topical Relevance of Inbound Links To Site (The subject-specific relationship between the sites/pages linking to the target page and the target keyword) – “We seem to have moved from analysis of simply anchor text, to including surrounding text and probably even page theme.” – Caveman, SEO/SEM Consultant.

7. Link Popularity of Site In Topical Community (The link weight/authority of the target website amongst its topical peers in the online world) – “I’ve seen one of my sites go from #39 to #1 right after I got 1 link… from the #1 spot on the keyword I was trying to get.” – Guillaume Bouchard, CEO NVI Solutions.

8. Keyword Use in Body Text (Using the targeted search term in the visible, HTML text of the page) – “If you are writing about ‘dogs’ then you should naturally use keywords related to ‘dogs’ within your content. If you don’t have keywords within your content it can become hard to rank for those terms.” – Neil Patel, Pronet Advertising.

9. Global Link Popularity of Linking Site – “This is why people bought PageRank 7 site links for lots more than PageRank 6 links. The links were very valuable, and the information on how strong they were was very valuable (this is why it’s also very hard to GET an accurate read on anymore without an SEO shaman).” – Todd Malicoat, Stuntdubl SEO Consulting.

10. Rate of New Inbound Links to Site (The frequency and timing of external sites linking to given domain) – “I don’t think getting fifty links overnight will kill you. Especially if those links are bringing traffic and from quality sites. Getting 100K links overnight and having no visitors or search queries as a result smells a bit fishy no matter how you look at it.” – Rae Hoffman, Principal, Sugarrae SEO Consulting.