There have been some great articles recently on the topic of white-hat vs. grey-hat vs. black-hat SEO, regarding what’s working and what Google’s doing about the situation, and so I wanted to weigh in with my recent observations.
While the likes of “SEO evangelist” (and one of my SEO heroes) Rand Fishkin argue that white-hat really does work and should be the way to go, others have vented their frustrations, saying that Google isn’t making it easy to be a pure white-hat anymore; that you may have to turn a little ‘grey’ in order to succeed. In the meantime, black-hats are the ones who are ultimately winning – while we all wonder, worry and play by the rules, they’re doing fine and reaping all the benefits the top of the SERPs can bring.
As that self-confessed link spammer put it in the SEOmoz comment I’ve linked to above: “Don’t blame me for doing it, blame Google for not fixing it.” And that’s part of the problem: it’s all well and good being white-hat, honest and ethical, and for Google to ask us to do it this way, but they’re not exactly doing us any favours if they’re not weeding out the link spammers as well.
I’ve been doing link building full-time for over 3 years. In all that time, I’ve looked at a lot of backlink profiles, mainly competitors of my clients. I have to say though, that of all the clients and industries I’ve looked into, I’ve seen more dodgy backlink profiles in the last few months than I’ve seen in all the years previously. What makes it worse is that they’re the ones who are currently ranking well in (and often at the top of) Google for their industry head terms.
In this post, I’m going to share my experiences – the types of links I’ve seen that seem to be working for these sites – and what I think Google should be thinking of doing in order to counter it (which is certainly easy for me to say, and easier said than done, I’m sure).
Disclaimer: Before I continue, I just want to say that I won’t be naming and shaming. Apart from the fact that I don’t want to get into any legal or libel trouble (*coughcowardcough*), it’s simply not my style. What I am hoping is that fellow SEOs will read this, nod along and say “yep, I’m seeing this too, and I’m equally annoyed and gutted by it.” Besides, when Eppie Vojt wrote a similar, excellent post outing a website and its crappy links, the site being scrutinised had been spotted and banned by Google before his post had been published (while the sites I’m talking about are still ranking).
Is black-hat victorious? A backlinks case study
The first (and main) site I will be looking at is in a similar vertical to the one Eppie outed – a very competitive financial sector. Similar to AutoInsuranceQuotesEasy.com (previously) ranking 2nd for “car insurance” in the US, my example was* ranking 2nd for its head term in the UK – it is an affiliate site, with a non-branded exact-match domain URL, fighting some of the biggest household-name brands in the UK (and – for the most part – beating them). I know this industry and its SERPs pretty well, as I used to work on a site that is currently one of the ones ranking near it (read: below it). I’ll call this site Example X (or EX).
* I say “was” because when I first started typing this post, it was 2nd. However, it’s now 8th. Typical eh? Even so, it’s still on the 1st page, and therefore it hasn’t been entirely nuked by Google, unlike Eppie’s example. Perhaps it took a hit from the recent blog network devaluation, if it’d been guilty of that practice, too – a quick glance at the overall backlink profile suggests that it could have been.
The second example I will be examining is a cluster of sites in a very competitive industry relating to law and legal services. I have been working on a client’s site and while we’re trying everything we can that is white-hat and by-the-book, the people ranking at the top are pretty much as black-hat as can be… I’ll call this group of sites Cluster Y (or CY).
I have mainly used link analysis tool Majestic SEO to analyse these sites’ ‘best’ backlinks, based on their ACRank score. I love using Open Site Explorer, but I find Majestic’s results more interesting in this instance. Besides, at a quick glance, many of Majestic’s top 100 links for Example X overlap with OSE’s top 100, which are ordered by Page Authority.
Example X’s strongest links
When Example X is thrown into Majestic SEO (root domain, Fresh index), its top 100 backlinks range from 7 to 4 in ACRank. The majority of them reveal themselves to be blog comments (similar to Eppie’s example: 43% of AutoInsuranceQuotesEasy.com’s backlinks were comments), but there’s more to it than that. These blog comments have a lot in common:
- The majority of the blogs are not at all relevant to EX’s industry; they include sites to do with technology, software, gardening, fashion, politics and even gambling,
- The majority of the comments are not relevant to EX’s industry either – some of them are dodgy in their own right, including gambling, adult and Viagra comment links,
- Some are in English, but a lot of them are in foreign languages, including French, Russian and Japanese, and therefore also on foreign domains (e.g. .fr and .jp),
- Many of them have hundreds (if not thousands) of blog comments, which are not only dofollow but are also seemingly unregulated (i.e. they go live instantly without needing approval). This also means that each of these pages links out to hundreds or even thousands of external sites,
- As you can imagine, the comments themselves are pointless and contribute absolutely nothing to the posts; they’re your typical “great post” type comments in broken English (possibly produced by spinning software),
- In some cases, you can actually see ‘broken’ HTML code, where either the commenter hasn’t configured the link properly, or the blog doesn’t allow – or isn’t able to show – in-comment links (i.e. the link is coming from the commenter’s ‘Name’ instead).
For those that still have a comment form and have the capability to add to the almost endless list of spammy, useless comments, most of them have a CAPTCHA form, which suggests one of two things: 1) that the blog commenting process isn’t entirely automated – someone still has to manually enter the data into the fields and type in the correct CAPTCHA code each time, or 2) it is automated, and CAPTCHA-filling software is being used (kudos to @Andrew_Isidoro for pointing out that such software exists, as I – somewhat naïvely – had no idea). Call me a sceptic, but my money’s on the latter.
Cluster Y’s general backlinks
I know what you’re thinking: for Example X, I’ve only really properly assessed 100 links. It might be what Majestic considers to be its 100 best links, but it is just 100. The site has more than 5,000 in total, so I’ve only really scrutinised 1-2% of its backlink profile.
Of course, I have trawled through the entire list of 2,500 backlinks that Majestic came back with (and, let’s be fair, it’s not much better)! But it is true that it might’ve been the case that the rest of its links are perfect, by-the-book and ticking all the boxes as far as Google is concerned, and to be fair, the likes of Rand argue that Google has to allow for a bit of spam, as no one has a 100% perfect link profile (after all, if your competitors get dodgy sites to link to you, it shouldn’t be considered your fault, but if the majority of your links are spammy, it’s more likely the case that it’s your own doing). That said, even that’s hotly debated at the moment (see Reason #1 of this post), but I digress…
Cluster Y is a different story though. I’ve looked at 5 sites all ranking for the industry head terms on page 1. They have only a few hundred links each. While there are some non-relevant blog comments à la Example X, there are also a lot of footer and blogroll links going on. And guess what? They’re all coming from completely off-topic sites. Out of the hundreds of backlinks each of the sites has, 4 out of 5 of them do not have what I would classify as ONE relevant link. Ironically, the one that does ranks the lowest out of all 5 of them, but it still has a lot of non-relevant links as well. The site I’ve worked on ranks lower than all 5 of them, and I’ve tried to keep it relevant as much as possible.
So what’s the problem?
The problem is that Google has guidelines and Google employees – like Matt Cutts and his Webmaster Help videos – like to recommend white-hat ways of doing SEO. For the most part, these sites are doing the absolute opposite and succeeding because of it. How so?
Relevance – Google say it themselves:
“Your site’s ranking in Google search results is partly based on analysis of those sites that link to you. The quantity, quality, and relevance of links count towards your rating. The sites that link to you can provide context about the subject matter of your site, and can indicate its quality and popularity.” (Emphasis added.)
Wil Reynolds has commented on this as well. As I’ve established above, there’s barely any instance of relevance in either Example X or Cluster Y’s backlinks. The only relevance I can see is the keyword anchor text: the site might be about blue widgets and the anchor text might be “blue widgets,” “blue widget,” “cheap blue widgets,” etc., but the sites these links are coming from are not about blue widgets – instead they’re blog posts on a different subject, with hundreds of blog comments linking out to completely different sites in completely different industries each time.
Large numbers of out-bound links – Back in 2009, Cutts stated that a page shouldn’t have more than 100 links on it – internal or external – if it can be helped. However, since then, it has been highlighted as more of a technical guideline than a quality guideline, and it’s particularly fine if “your page is sufficiently authoritative.” Now, going by Majestic’s ACRank, these pages are considered sufficiently authoritative, which is precisely the reason they’re being spammed to death (that and the fact that they’re dofollow, etc.).
But c’mon, Goog… These sites have links in the 100s, if not 1,000s – one even has about 4,000 out-bound links on it, due to the excessive number of comments on the page. Maybe it should be a quality guideline? You could argue that it’d be unfair for the website to become devalued by Google and to suffer because of something outside of its control, but then again, it is in the site’s webmaster’s control: they could consider deleting the comments and/or consider disabling comments from appearing whatsoever. After all, it’d probably be better to have no comments at all than a plethora of nonsense, adding absolutely squat to the topic of conversation. This also ties in with Google’s stance on ‘bad neighbourhoods’ and the fact that for these blogs, where you’re linking out to is also important, not just where you’re being linked from.
The white-hat’s dilemma
What really sucks about all this is the fact that these mad (black-)hatters are getting away with it.
Yes, you can be white-hat and work hard and slave away on getting 100% genuine, honest, by-the-book links, but when your competitors can spend 5 minutes using a bit of automated software to get really powerful – but really irrelevant and dodgy – links and succeed, then what’s the incentive for the white-hat? Work hard, be good and still not rank?! Sod that! Clients won’t thank their agencies for it, and Marketing Directors won’t thank their in-house team for it. And this is precisely the gauntlet Google is running by not tidying up this crap: “if you can’t beat ’em, join ’em”, may become the case for many honest, hard-working white-hats who just aren’t seeing the benefits from all their hard work.
But what’s a search engine giant to do?
Right, now we reach the part of the post where I get ridiculously self-righteous…
It’s easy for me to say “this is what Google should do” – especially as I don’t think I could ever work for them, as their crazy interview questions leave me utterly dazed, confused and feeling stupid – but that said, I think the above certainly highlights that there is a problem and that the problem needs to be seen to and fixed. Easier said than done, yes…
The first thing they should be thinking of doing is turning up the ‘relevancometer’. It’s true that completely irrelevant, off-topic sites can and do link out – for example, someone on a gardening forum might ask its members for advice on car insurance, just because they trust the community and value their opinion – but when every link is irrelevant and off-topic then that seems a little more than coincidence…
I think they should also consider making excessive numbers of links a quality guideline, not just a technical one. I can’t see why a page’s links overall – not just the most recent ones – cannot and shouldn’t be given slightly less value as more links flood onto a page. Obviously it’s already a part of Google’s PageRank algorithm – that more links on a page means less ‘link juice’ is passed through each one – but I’m talking about a limit where that drops even further; maybe not in terms of PageRank, but in terms of looking at a page: if it hits a certain limit, its value is affected slightly, and then if it hits another limit, it’s affected even more, and so on.
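To make the "limits" idea a bit more concrete, here's a rough sketch of what I have in mind. To be clear, this is purely my own illustration: the thresholds, the multipliers and the function name are all made up, and I have no idea how (or whether) Google would do anything like this.

```python
# Hypothetical tiers: (link-count limit, multiplier applied once a page passes it).
# These numbers are invented for illustration; nothing here comes from Google.
TIERS = [(100, 0.75), (500, 0.5), (1000, 0.25)]


def value_per_link(page_value: float, outbound_links: int) -> float:
    """Split a page's value evenly across its links, then apply tier penalties."""
    if outbound_links <= 0:
        return 0.0
    # The usual 'more links, less juice' split.
    share = page_value / outbound_links
    for limit, multiplier in TIERS:
        if outbound_links > limit:
            # Every limit the page has passed cuts the value further.
            share *= multiplier
    return share
```

With these made-up numbers, a page with 50 links just gets the normal even split, while the page I mentioned with about 4,000 out-bound links passes all three limits and has each link cut three times over.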
In fact, if this isn’t a consideration already, what about working out a ratio between page size/copy length and the number of links on the page? For instance, I’ve linked out to a lot of sites within this post, but then it’s also 3,000 words long. 30 links for every 3,000 words is a bit different to having 100s within less than that. Naturally, having comments following a blog post – genuine or otherwise – is going to affect that ratio significantly, but I think it’s safe to say that natural, genuine blog comments are going to be longer and more substantial than your typical, spammy “great post” style comments.
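For what it's worth, the ratio itself is simple to work out. Below is a minimal sketch using Python's built-in `html.parser`; the class and function names are my own, and it's deliberately crude (it counts every bit of text on the page as copy, including anything inside script or style tags).

```python
from html.parser import HTMLParser


class LinkAndWordCounter(HTMLParser):
    """Counts <a href=...> tags and whitespace-separated words in a page."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        # Only anchors that actually point somewhere count as links.
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1

    def handle_data(self, data):
        self.words += len(data.split())


def words_per_link(html: str) -> float:
    """Return how many words of copy there are for each link on the page."""
    counter = LinkAndWordCounter()
    counter.feed(html)
    counter.close()
    if counter.links == 0:
        return float("inf")  # no links at all, so nothing to dilute
    return counter.words / counter.links
```

A 3,000-word post with 30 links comes out at 100 words per link, whereas a page full of two-word "great post" comment links would come out at about 2. Where the cut-off should sit is anyone's guess.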
Not only should they consider turning up the ‘relevancometer’ from an on-site copy or general domain point of view, but also take into account the relevancy of other out-bound links. If you’re the only link about “blue widgets” on the page and you’re featured among dozens of other links, each one linking to a site on a completely different subject, then it really only spells one thing: link farm.
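As a very rough sketch of that last point, you could compare a link's anchor text with the anchor text of everything else on the page. This is only a toy: the function name is mine, and matching on shared words is a crude stand-in for whatever topic analysis a search engine could really do.

```python
def shared_topic_fraction(target_anchor: str, other_anchors: list[str]) -> float:
    """Fraction of the other links whose anchor text shares a word with ours."""
    if not other_anchors:
        return 1.0  # nothing to compare against, so no sign of a link farm
    target_words = set(target_anchor.lower().split())
    related = sum(
        1
        for anchor in other_anchors
        if target_words & set(anchor.lower().split())
    )
    return related / len(other_anchors)
```

If "cheap blue widgets" sits among links for "viagra online", "poker bonus", "blue widget reviews" and "car insurance", only one of the four shares a word with it, giving 0.25. A score near zero across dozens of links is the pattern I'm describing.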
While I’d say the above should be serious considerations, we’ll call the next lot ‘maybes’ – I don’t even know if it’d be feasible (or even possible) to assess some of these criteria…
By language/location: Arguably, this depends on industry. For example, a car insurance website is probably not going to get many international links because it’s specific to an individual country, although it might get international links if it does something worthy on a global scale, e.g. a successful viral campaign that’s admired worldwide. Other companies in other industries might be global. However, the EX and CY examples above are UK-specific, but have an unusual number of links from sites in European and Asian countries. It’d be great if Google could measure the authenticity of these links in a way that works out if the site being linked to warrants the link, i.e. are they the subject of the discussion, or just another random link appearing on the page?
By blogging platform: Speaking of foreign language sites, I noticed that some of the sites being spammed like crazy in the comments were French and that most of them had a very similar blog structure and set-up, so I’m wondering if Google has the ability to discount or discredit certain blogging platforms, assuming this is a blogging platform targeting the French-speaking market? Again, this is a bit unfair for all those who are using their blog properly and are strict when it comes to monitoring comments, but, as I’ve argued above, it could be the blogging platform developers’ fault for not configuring its features properly for its users, e.g. maybe the spam filter isn’t strong enough, or there isn’t even one in place. I don’t think this is too outrageous a consideration, especially given the fact that some blogging platforms are seen to be more SEO-friendly than others…
By the extent of the visibility of ‘bad’ code: By “visibility,” I mean exactly what I say: not just bad code by the search engine spiders’ standards, but bad code that people browsing the site can actually see for themselves, too. For example, when these spammy blog comments have been entered irrespective of the blog’s guidelines and you can actually see the <> of broken HTML code (I even saw the [] of broken BB code, in a couple of instances), then this could – and maybe should – be an indication of improper on-site technical practices. It goes without saying that Google likes clean code (insofar as the site is easy to navigate for both search engines and humans), although I can’t say that I’ve come across any reference to visibly bad code, but maybe that’s because it’s just so obvious that it’s probably not a good thing to have on your site.
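Spotting this sort of thing doesn't seem like it would be hard, at least in principle. Here's a minimal sketch, assuming you already have the text of a comment as a visitor sees it (i.e. after the browser has rendered the page), so any tags left in it are ones that failed to work. The pattern and the short list of BB code tags are my own guesses, not anything official.

```python
import re

# Leftover HTML tags like <a href="..."> or </a>, and a few common BB code
# tags like [url=...] or [/url]. The BB tag list is illustrative, not complete.
LEFTOVER_MARKUP = re.compile(
    r"</?[a-z][^>]*>|\[/?(?:url|b|i|img|quote)\b[^\]]*\]",
    re.IGNORECASE,
)


def visible_markup(rendered_text: str) -> list[str]:
    """Return any raw HTML or BB code fragments a visitor would actually see."""
    return LEFTOVER_MARKUP.findall(rendered_text)
```

An ordinary comment returns an empty list, while a comment where the visitor can see a raw `[url=...]` or `<a href=...>` returns the offending fragments, which could then be counted per page.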
Again, I’ll repeat: it’s easy for me to say all this – a Google engineer might read this and say “we KNOW, get off our case!” and it’s clear that – even recently – they are apparently working very hard to try and solve the problems associated with spam affecting their index. Even so, I’m hoping that some of my observations indicate the seriousness of the problem and help them in their quest for a search engine that’s swayed by spam as little as possible (if that is even possible and can ever be achieved). Heh, I was even going to say that maybe they should consider hiring someone who’s worked in the SEO industry as part of their engineering team, although it looks like they’ve already done something like this for their Google Webmaster Trends Analyst team.
Whatever the case and whatever the future holds, here’s to hoping that black-hats shall rue the day (one day) and that white-hats will see their efforts pay off…
[Image credits: Black-Hat Stormtrooper by Jay Malone; White-Hat R2D2 (a.k.a. one of the coolest pics I’ve ever seen) by Brittney Le Blanc]