Getting Bulk PA Data for 404s with URL Profiler

I’ve been using URL Profiler on-and-off for a few months now, mainly for full-on link analysis – especially when it comes to penalty removal and disavow work. However, as I’m sure other folks have discovered, there are a few other cheeky ways that the software can be put to good use. I found one, and after a chat with Patrick (one of URLP’s founders), I thought it’d be a good idea to throw it up as a quick blog post.

The challenge – 404orama!

I have a client who – despite only having a 1,000-page website – has over 5,000 404 (Page Not Found) errors associated with it. Over 5,000! (Pity it’s not over 9,000, otherwise I could use this. Anyhow…)

The number is so high for a variety of reasons:

  • They’ve redesigned the site a few times in the past, changing URLs in the process, but have never redirected the old URLs to the new ones;
  • A lot of random and/or duplicate URLs have been auto-generated by a bug or two in their CMS;
  • Pages have been removed by the client’s internal teams (for archiving purposes) but never redirected.

When you’re dealing with such a high quantity of 404s, it’s difficult to know where to start. My plan was to get PA (Page Authority) data on every URL, so that I could at least work through the list bit-by-bit, starting with the URLs with the most SEO value – and therefore the most urgent to fix.

Enter URL Profiler. One of the many bits of data that it can grab is none other than PA. This gave me an idea…

The process

The process was dead simple. Instead of feeding it a list of external URLs (as one might when using it for link analysis), I put in the whole list of 5k+ internal URLs, collated from a mix of Google Search Console data and a full-site Screaming Frog crawl.
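If you’re collating the same way, the de-duplication step is easy to script. Here’s a minimal sketch that merges the two exports into one clean list – the file names are my assumptions, and it assumes each export has been saved as a plain text file with one URL per line:

```python
# Merge 404 URL lists (e.g. a Google Search Console export and a
# Screaming Frog export) into one de-duplicated list ready to paste
# into URL Profiler. File names below are assumptions.

def load_urls(path):
    """Read one URL per line, skipping blank lines."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

def merge_url_lists(*paths):
    """Union all the lists and return them sorted for easy review."""
    urls = set()
    for path in paths:
        urls |= load_urls(path)
    return sorted(urls)

# Example usage (hypothetical file names):
# combined = merge_url_lists("gsc_404s.txt", "screaming_frog_404s.txt")
# with open("urls_for_profiler.txt", "w") as f:
#     f.write("\n".join(combined))
```

Using a set means URLs that appear in both exports – which many will – only get profiled once.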

I asked URLP to find PA data on all of them, let it run, and boom: PA data on 5k+ URLs. Sort from highest PA to lowest and that’s your priority order sorted.
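You can do the high-to-low sort in the spreadsheet itself, of course, but if you’d rather script it, here’s a sketch that sorts a URL Profiler CSV export by PA. The column name "Page Authority" is an assumption – check the header row of your own export:

```python
import csv

def prioritise_by_pa(in_path, pa_column="Page Authority"):
    """Return rows from a URL Profiler CSV export sorted from highest
    PA to lowest, so the most valuable 404s surface first.
    The default column name is an assumption -- adjust to your export."""
    with open(in_path, newline="") as f:
        rows = list(csv.DictReader(f))

    def pa(row):
        # Treat missing or blank PA values as 0 so they sort to the bottom.
        try:
            return float(row.get(pa_column) or 0)
        except ValueError:
            return 0.0

    return sorted(rows, key=pa, reverse=True)
```

The blank-value handling matters here: some 404s will come back with no PA at all, and you want those at the bottom of the priority list, not crashing the sort.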

[Screenshot: URL Profiler results spreadsheet]
The only problem? I now have the delightful task of figuring out where they should be redirected to. Hopefully chunks of them will follow patterns, meaning I won’t need to run through all 5k+ individually(!), but either way – wish me luck…!
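One way to check whether chunks of them follow patterns is to group the 404 URLs by their first path segment – if hundreds of them sit under, say, an old blog directory, they may all map to one new section with a single redirect rule. A sketch (the right grouping depth will depend on your site’s URL structure):

```python
from collections import Counter
from urllib.parse import urlparse

def group_by_first_segment(urls):
    """Count 404 URLs by their first path segment, most common first,
    to surface candidate redirect patterns."""
    counts = Counter()
    for url in urls:
        path = urlparse(url).path
        segments = [s for s in path.split("/") if s]
        prefix = "/" + segments[0] + "/" if segments else "/"
        counts[prefix] += 1
    return counts.most_common()
```

If the top few prefixes account for most of the list, a handful of pattern-based redirects could clear thousands of 404s in one go.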
