Sunday, June 5, 2011

Time taken for Google to de-index 301 redirect pages

I changed a bunch of the URLs on a site that's in the Alexa top 100,000, i.e. a moderately busy site that the Google bot visits every day. The old URLs now 301 redirect to the new structure. At the time of the switchover, and every day for the next 60 days, I took a snapshot of how many URLs were indexed by Google in each section. I used the following search command in Google:
inurl:/old/folder/pattern site:mysite.com
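If you are tracking several folder patterns, it can help to generate the query strings rather than retype them each day. Here is a minimal sketch in Python. The function names and the example patterns are my own placeholders, not the real folders from the site, and it only builds the query and a search URL to open by hand; it does not fetch or scrape any results.

```python
from urllib.parse import urlencode


def build_index_query(folder_pattern: str, domain: str) -> str:
    """Return the search string used to count indexed URLs for one folder."""
    return f"inurl:{folder_pattern} site:{domain}"


def build_search_url(folder_pattern: str, domain: str) -> str:
    """Return a Google search URL for the query, to open in a browser."""
    query = build_index_query(folder_pattern, domain)
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical folder patterns; substitute the ones you are tracking.
for pattern in ["/old/folder/pattern", "/old/other/pattern"]:
    print(build_search_url(pattern, "mysite.com"))
```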
I entered the number of URLs into a spreadsheet for each of 3 folder patterns. The patterns started off with 13, 18, and 87 URLs respectively in Google's index. The objective of the exercise was to see how long it would take Google to de-index these pages. Here is a chart of the results:




The folder pattern with 87 URLs is shown against the right axis and the other two against the left.
Expectations:
My expectation was that as soon as Google found the new URLs (it found almost all of them within 5 days), it would rapidly de-index the old ones. Remember that I'm telling Google that this is a permanent (301) redirect, not a temporary (302) one.
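Before drawing conclusions from an experiment like this, it is worth confirming that the old URLs really do answer with a 301 and not a 302. The following is a rough sketch using only Python's standard library. The helper names are my own, the example URL in the comment is a placeholder, and I am assuming the server answers HEAD requests the same way it answers GET.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def describe_redirect(status: int) -> str:
    """Classify an HTTP status code as a permanent or temporary redirect."""
    if status in (301, 308):
        return "permanent"
    if status in (302, 303, 307):
        return "temporary"
    return "not a redirect"


def fetch_status(url: str) -> Tuple[int, Optional[str]]:
    """Send one HEAD request and return (status, Location header).

    http.client does not follow redirects, so we see the first response.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example use (placeholder URL, needs network access):
#   status, location = fetch_status("http://mysite.com/old/folder/pattern/page")
#   print(status, describe_redirect(status), location)
```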
Actual results:
  1. It took around 55 days to naturally de-index all the pages. Much longer than I was expecting.
  2. The de-indexing for the 2 smaller collections of pages was linear.
  3. The de-indexing for the larger collection of pages was sudden, and it happened after 48 days.
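If you log the daily counts yourself, figures like these can be read off the log with a few lines of code instead of by eye. This is a sketch only: the function names are mine, and the sample list is invented purely to show the shape of the input. It is not the data from this experiment.

```python
from typing import Optional, Sequence, Tuple


def first_day_at_zero(daily_counts: Sequence[int]) -> Optional[int]:
    """Return the first day (counting from day 0) with no indexed URLs."""
    for day, count in enumerate(daily_counts):
        if count == 0:
            return day
    return None  # never fully de-indexed during the observation window


def largest_one_day_drop(daily_counts: Sequence[int]) -> Optional[Tuple[int, int]]:
    """Return (day, drop) for the biggest fall between consecutive snapshots.

    A drop that dwarfs all the others points to a sudden de-indexing
    rather than a steady, roughly linear decline.
    """
    best: Optional[Tuple[int, int]] = None
    for day in range(1, len(daily_counts)):
        drop = daily_counts[day - 1] - daily_counts[day]
        if best is None or drop > best[1]:
            best = (day, drop)
    return best


# Invented sample counts, one per day; replace with your own snapshot log.
sample_counts = [5, 5, 4, 2, 0]
print(first_day_at_zero(sample_counts))
print(largest_one_day_drop(sample_counts))
```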
There are other techniques for de-indexing pages from Google. For example, Google's Webmaster Tools has a place to enter the URLs you want removed, and you can also add the pattern to your robots.txt file; either might have de-indexed them faster. The objective of this exercise was not to rapidly de-index those pages but to see how Google naturally de-indexed them over time when given a 301 redirect directive.
What surprised me was how long that took.
I'm not going to show a chart of the indexing of the new URLs because it's exactly as you would expect, with the line rising rapidly to the previous values. As I mentioned, 93% of the new URLs had been indexed within 5 days of appearing on the site, and 100% had been indexed by day 13.