Zhivko
Member · Joined Mar 15, 2019
I am trying to convince Google to drop about 3000 URLs from their index.
  • Some of them I marked with 404,
  • some of them I marked with "noindex",
  • some of them I redirected.
This happened in February, more than two months ago. Google keeps crawling those URLs (even though there are NO internal links to them!) and keeps showing them in GSC.

I don't want to spend a few days manually submitting each of those URLs for removal in GSC. But I also don't want to wait years for Google to realise those URLs no longer exist.

Any ideas?
 
Use a 410 error code instead of 404.

404 means "not found".

410 means "permanently removed".

List of HTTP status codes - Wikipedia
410 Gone Indicates that the resource requested is no longer available and will not be available again. This should be used when a resource has been intentionally removed and the resource should be purged. Upon receiving a 410 status code, the client should not request the resource in the future. Clients such as search engines should remove the resource from their indices.

Add this to your .htaccess file:

Code:
# Return 410 Gone for each removed URL (the URL path only, starting with /)
Redirect 410 /path-to-removed-page
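With ~3000 URLs, listing each one is impractical. If the removed URLs share a common path pattern, a single `RedirectMatch` rule can cover all of them — a sketch, assuming Apache with mod_alias and a hypothetical `/old-section/` prefix:

Code:
# Hypothetical example: everything under /old-section/ is gone for good.
# Requires mod_alias (normally enabled); adjust the regex to your URL structure.
RedirectMatch 410 ^/old-section/.*

Unlike `Redirect`, `RedirectMatch` takes a regular expression, so one line can retire an entire removed section of the site.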
 
Google will check pages it knows about for a long time. It does not mean they are in the search index. It just means Google is aware of them and wants to see how they behave every once in a while.

If a page returns 404 or 410, then it is removed from the index. But Google will check every once in a while, because things change.

If it's marked as noindex, it is removed from the index, but Google will check every once in a while.

If it redirects... you get the picture: Google indexes the redirect target instead, but still rechecks the old URL occasionally.

You don't need to remove them. Google is reporting that it knows those URLs return 404, 410, 301, 302, noindex, disallow, etc., and that it has already removed them from the index.

Only worry about URLs reported that you do want in the index. Use the reports to determine if you have made a mistake.
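For reference, each of those signals can be emitted from .htaccess — a sketch assuming Apache with mod_alias and mod_headers enabled, using made-up example paths:

Code:
# Made-up example paths; adapt to your site.
Redirect 410 /removed-page            # gone: tells Google to drop it
Redirect 301 /old-page /new-page      # moved: index the new URL instead
<Files "legacy-report.pdf">
    # keep serving the file, but keep it out of the index (needs mod_headers)
    Header set X-Robots-Tag "noindex"
</Files>

The `X-Robots-Tag` header is the HTTP equivalent of a `noindex` meta tag, which is useful for non-HTML files like PDFs.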
 
@Tiggerito In my case, the URLs are still indexed. And that's about 3000 URLs, compared to about 1000 "real" URLs. So it sux.

I will try using 410, thank you all for the feedback.
 
How are you determining that they are indexed?

410 is a slightly stronger signal than 404 and may cause Googlebot to check them less frequently once they have seen the 410 status.

Having a valid XML sitemap can also push Googlebot to crawl your important pages more often.
 
Google keeps 410's, 404's, "noindex" directives in their index for much longer than you would expect. From our experience it's been months.
 
A quick update - about a month and a half later, GSC shows 1000 fewer URLs. There were 4000, now there are 3000.

For now, I am willing to believe that what I did works. It just takes time.
 

Yeah, it's just a matter of time.
 
