Getting Large Sites Indexed by Google and Other SEs

SEO Question: I have a 100,000+ page website. Is there any easy way to ensure all major search engines completely index my website?

Crawler-Friendly Sites Sit on Good Foundation

SEO Answer: Search engines are constantly changing their crawl priorities. If they crawl too deeply, they pick up many low-quality pages while increasing indexing time and costs. If they crawl too shallowly, they never get down to the relevant pages. Crawl depth is a balancing act.

There is no way to ensure all pages get and stay indexed, because search engines change their crawl priorities constantly. Having said that, you can set up your site to be as crawler-friendly as possible.

Five big things to look at are:

  • content duplication - are your page titles or meta description tags nearly duplicate (for example, thin content pages that are cross-referenced by topic and location)? do other sites publish the same content (for example, an affiliate feed or a Wikipedia article)? are search engines indexing many pages with similar content (for example, a separate page for each model color, or feedback for one item split across many pages)?
  • link authority - does your site have real, high-quality links? how does your link profile compare with leading competing sites? what features or interactive elements are on your site that would make people want to link to you instead of an older and more established competing site?
  • site growth rate - does your site grow at a rate consistent with its own history? how does your growth rate compare with the growth rate of competing sites in the same vertical?
  • internal link structure - is every valuable page on your site linked to from other pages on your site? do you force search engines to go through long loops rather than providing parallel navigation to similar-priority pages? do you link to low-value, noisy pages (sometimes a search engine indexing fewer pages is better than more)?
  • technical issues - don’t feed the search engines cookies or session IDs, and try to use clean descriptive URLs
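
As a rough illustration of the session ID point, here is a minimal Python sketch that strips session-style query parameters from a URL before it is used in a link. The parameter names in SESSION_PARAMS and the example.com URL are assumptions for illustration, not a complete list, so check what your own platform actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; adjust to match your own site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_params(url: str) -> str:
    """Return the URL without query parameters that look like session IDs."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    # Rebuild the URL; an empty query string drops the "?" entirely.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


if __name__ == "__main__":
    print(strip_session_params("http://example.com/widgets/blue?PHPSESSID=abc123&page=2"))
```

This only cleans URLs you generate yourself. It does nothing about session IDs your server adds for visitors who arrive without cookies, which would have to be fixed in the server or application settings.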

Some signs of health are:

  • pages you don’t want indexed are not getting indexed - wasting link equity on low-quality pages means you have less authority to spread across your higher-quality pages
  • most of the pages you want indexed are getting indexed, are actively crawled, and are not stuck in Google’s supplemental index - supplemental problems and / or reduced indexing or crawl priority are common on sites with heavy content duplication, wonky link profiles, or many dead URLs
  • your site is building natural link equity over time and people are actively talking about your brand - if you have to request every link you get, then you are losing market share to competitors who get free, high-quality editorial links
  • you see a growing traffic trend from search engines for relevant search queries - this is really what matters, and it includes getting more traffic, higher-quality traffic, and searchers landing on the appropriate page for their query

Things you can do if conditions are less than ideal:

  • focus internal link equity at important high value pages (for example, on your internal sitemap consider featuring new product categories, new and seasonal items, or link to your most important categories sitewide)
  • trim the site depth (by placing multiple options on a single page instead of offering many near duplicate pages) or come up with ways to make the page level content more unique (such as user feedback)
  • cut out the fat - if many low-value pages are getting indexed, stop them from being indexed by doing something like nuking them / not linking to them / integrating their information into other higher-value pages
  • use descriptive, page-relevant URLs / page titles / meta descriptions - this helps ensure the right page ranks for the right query and that search engines will be more inclined to deeply crawl and index your site
  • restructure the site to be more top / mid / bottom heavy - if a certain section of your site is overrepresented in the search results, consider changing your internal link structure to place more weight on other sections. in addition, you can add features or ideas which make the under-represented pages more attractive to link to
  • use Sitemaps - you should still link to all quality pages of your site from within your site and use internal link structure to show search engines which pages are important, but you can also help them understand page relationships using the open sitemap standard
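
To make the Sitemaps suggestion concrete, here is a minimal Python sketch that writes sitemap XML in the sitemaps.org format. The protocol caps each sitemap file at 50,000 URLs, so a 100,000+ page site needs several files plus a sitemap index that points to them. The function names and the example.com URLs are made up for illustration, and a real setup would likely pull the URL list from a database rather than generate it.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file URL limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into lists of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap(urls):
    """Return one <urlset> document listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        # ElementTree escapes characters such as "&" in the URL text.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document pointing at each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = loc
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page list for a large catalog site.
    pages = [f"http://example.com/item/{n}" for n in range(120000)]
    files = [build_sitemap(part) for part in chunk(pages)]
    names = [f"http://example.com/sitemap-{i}.xml" for i in range(len(files))]
    print(len(files), "sitemap files")
    print(build_sitemap_index(names))
```

A sitemap only tells search engines that the URLs exist. It does not replace internal links, and pages listed there can still go unindexed if they are thin or duplicated.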
