The Noisy Little Monkey Blog


Ghost Referrer Spam

Posted in analytics by Steven Mitchell on 19-Oct-2015 14:00:15

Referrer spam is a real pain in all of its forms. It pollutes your lovely clean Google Analytics data with fake visits to your website, generated by automated scripts. Typically, these visits have no engagement – a 100% bounce rate, 0 second time on site and a really stupid referring domain such as “buy-cheap-online.info”.

This can really wreck your statistics by skewing your conversion rates and acquisition channel ratios, so it is worth taking action to filter it out.

Before we begin

As always, we recommend creating separate filtered and non-filtered views of your website in Google Analytics. You should use your filtered view for analysis, but keep the non-filtered around as a control group, to check that your filters are working.

We also recommend annotating the changes you make in Google Analytics, so that in two years when you’re scratching your head trying to work out what happened today to make your referral traffic drop off a cliff, you can see it at a glance.

The Obvious

  • Head to the admin section of your property
  • Click on “View Settings”
  • Check “Exclude all hits from known bots and spiders”


Filter the spam.

  • Head back to the admin section of your property.
  • Go to Tracking Info
  • Go to Referral Exclusion List
  • You will need to add referral exclusions for all known Referral Spammers.


There is a fairly comprehensive list of these to be found here: http://www.ohow.co/wp-content/uploads/Referrer_Spam_List.txt

However, it is useful if you are able to identify referral spam yourself.

  • Head to the reporting tab
  • Head to Acquisition > All Traffic > Referrals


If a referrer has a 100% bounce rate, 1.00 pages/session and an Avg. Session Duration of 00:00:00, this is a huuuuge red flag and the site is almost certainly spam. If you are suspicious but not totally sure, try Googling the domain and see if you get a load of results from disgruntled webmasters complaining about their analytics data. That’s usually a dead giveaway!
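If you export the referrals report, the same red flags can be checked in bulk. The following is only a rough sketch: the `ReferralRow` fields and the sample figures are made up for illustration and are not a Google Analytics export format.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    """One row of a referrals report (hypothetical field names)."""
    source: str
    sessions: int
    bounce_rate: float          # percentage, 0-100
    pages_per_session: float
    avg_session_seconds: float


def looks_like_spam(row: ReferralRow) -> bool:
    """Flag rows showing all three red flags at once:
    100% bounce rate, 1.00 pages/session and a 0 second session duration."""
    return (
        row.bounce_rate >= 100.0
        and row.pages_per_session <= 1.0
        and row.avg_session_seconds == 0.0
    )


# Invented sample data, purely for illustration.
rows = [
    ReferralRow("buy-cheap-online.info", 240, 100.0, 1.0, 0.0),
    ReferralRow("twitter.com", 85, 42.5, 2.3, 95.0),
]

suspects = [row.source for row in rows if looks_like_spam(row)]
print(suspects)
```

Anything this flags is a suspect rather than a verdict, so it is still worth Googling the domain before excluding it.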

Advanced "future proofing"

The problem with the above two steps is that new spammers are born all the time, so you will have to periodically check your Analytics for new referrers, which can be annoying.
We can’t automatically filter all of the new ones, but we can automatically filter the less sophisticated ones. Doing this also has the advantage of stopping Ghost Event Spam.

  • Head to the reporting tab
  • Select a really big timeframe so that we can look at lots of data
  • Head to Audience > Technology > Network.
  • Select Hostnames as the Primary Dimension
  • Show as many rows as you can

We are looking to whitelist valid hostnames and, by extension, blacklist everything else. The only hostnames you should accept are domains that legitimately carry your Google Analytics code. This means your own domain (including any subdomains) and any separate shop or blog domain that runs the same analytics code. Some unusual but valid examples include translate.google.com or any third party e-commerce shopping cart services, which you will want to keep. Forget everything else, even google.com or (not set).

  • Make a note of your valid domains.
  • Head back to the Admin tab.
  • Select Filters in the View column
  • New Filter
  • Filter Name > Valid Hostnames
  • Filter Type > Custom
  • Include
  • Filter Field > Hostname
  • Filter Pattern:

Populate the filter pattern field by listing your valid domains, separated by a pipe, with no spaces.

For example:

If your valid domains are:

  • shopdomain.com
  • noisylittlemonkey.com
  • secretnlmarea.com

We would enter them into this field as below:

shopdomain.com|noisylittlemonkey.com|secretnlmarea.com

Use the verification tool to check you’re happy with the results. If you are, save the filter!
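The filter pattern is treated as a regular expression, so an unescaped dot matches any character. The pattern above still works, but a stricter version escapes the dots. The sketch below builds such a pattern and tries it against a few hostnames offline. It assumes the filter does an unanchored, case-insensitive partial match; `build_hostname_pattern` and `hostname_is_valid` are illustrative helper names, not part of any Google tool.

```python
import re


def build_hostname_pattern(domains):
    """Join domains with pipes, escaping regex metacharacters such as '.'."""
    return "|".join(re.escape(d.strip().lower()) for d in domains)


def hostname_is_valid(hostname, pattern):
    """Mimic an Include filter: keep the hit if the pattern matches anywhere."""
    return re.search(pattern, hostname.lower()) is not None


valid_domains = ["shopdomain.com", "noisylittlemonkey.com", "secretnlmarea.com"]
pattern = build_hostname_pattern(valid_domains)
print(pattern)
# shopdomain\.com|noisylittlemonkey\.com|secretnlmarea\.com

# Subdomains of a valid domain pass, because the match is partial.
print(hostname_is_valid("www.shopdomain.com", pattern))   # True
# Hostnames that do not contain a valid domain are dropped.
print(hostname_is_valid("(not set)", pattern))            # False
print(hostname_is_valid("google.com", pattern))           # False
```

Note that translate.google.com or a third party checkout hostname would be dropped by this pattern unless you add it to the list.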

And you are done! Just remember: if you move domain, add a new domain or add a new third party checkout service, add it to the filter, or the data from that domain will be treated as spam!

Won't this stop me from seeing legitimate referral traffic?

No. What we are doing here is checking that the tracking information being sent to Google originates from your domain! Think about it - there is nothing to stop me from taking your Analytics tracking ID from your page source and adding it to every page of my own website, completely screwing up your data. This is the mechanism which some spammers use, particularly for Ghost Event Spam.
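To see why genuine referrals survive, it helps to treat the hostname (where the hit claims to have been recorded) and the referrer (where the visitor came from) as two separate fields. This is a simplified model with invented hits, not how Google Analytics stores data internally; `Hit` and `spammer-site.example` are hypothetical.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    hostname: str   # site the hit claims to have been recorded on
    referrer: str   # site the visitor supposedly came from


# Include filter on the Hostname field only (dot escaped, partial match).
VALID_HOSTNAMES = re.compile(r"noisylittlemonkey\.com")

hits = [
    Hit("www.noisylittlemonkey.com", "twitter.com"),        # genuine referral
    Hit("www.noisylittlemonkey.com", "(direct)"),           # genuine direct visit
    Hit("(not set)", "buy-cheap-online.info"),              # ghost hit, no hostname
    Hit("spammer-site.example", "buy-cheap-online.info"),   # tracking ID copied elsewhere
]

# The filter never looks at the referrer, so real referrals are untouched.
kept = [hit for hit in hits if VALID_HOSTNAMES.search(hit.hostname)]
kept_referrers = [hit.referrer for hit in kept]
print(kept_referrers)
```

Only the two hits recorded on the real site remain, and the twitter.com referral is among them.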

Thanks to the dudes @ http://www.ohow.co/what-is-referrer-spam-how-stop-it-guide/ for re-assuring us that our methods are valid.

Tags: analytics

Steven Mitchell

Ste likes to mess about with the techie side of SEO. As such his blogs are mainly about SEO or rants about bad web development practice.