Ghost Referrer Spam
Referrer spam is a real pain in all of its forms. It pollutes your lovely clean Google Analytics data with fake visits to your website, generated by automated scripts. Typically, these visits have no engagement – a 100% bounce rate, 0 second time on site and a really stupid referring domain such as “buy-cheap-online.info”.
This can really wreck your statistics by skewing your conversion rates and acquisition channel ratios, so it is well worth taking action to prevent it.
Before we begin
As always, we recommend creating separate filtered and non-filtered views of your website in Google Analytics. You should use your filtered view for analysis, but keep the non-filtered around as a control group, to check that your filters are working.
We also recommend annotating the changes you make in Google Analytics, so that in two years when you’re scratching your head trying to work out what happened today to make your referral traffic drop off a cliff, you can see it at a glance.
- Head to the admin section of your property
- Click on “View Settings”
- Check “Exclude all hits from known bots and spiders”
Filter the spam
- Head back to the admin section of your property.
- Go to Tracking Info
- Go to Referral Exclusion List
- You will need to add referral exclusions for all known Referral Spammers.
There is a fairly comprehensive list of these to be found here: http://www.ohow.co/wp-content/uploads/Referrer_Spam_List.txt
However, it is useful if you are able to identify referral spam yourself.
- Head to the reporting tab
- Head to Acquisition > All Traffic > Referrals
If a referrer has a 100% bounce rate, 1.00 pages/session and an Avg. Session Duration of 00:00:00, this is a huuuuge red flag and the site is almost certainly spam. If you are suspicious but not totally sure, try Googling the domain and see if you get a load of results from disgruntled webmasters complaining about their analytics data. That’s usually a dead giveaway!
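If you export your referral report, the same red-flag check can be run over every row at once. The snippet below is only a rough sketch: the `ReferralRow` fields, the `looks_like_spam` helper and the sample rows are all made up for illustration and are not part of Google Analytics.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    """One row of a referral report (hypothetical field names)."""
    source: str
    bounce_rate: float          # percentage, 0-100
    pages_per_session: float
    avg_session_seconds: float


def looks_like_spam(row: ReferralRow) -> bool:
    """Flag the red-flag combination: 100% bounce, 1.00 pages/session, 00:00:00 duration."""
    return (
        row.bounce_rate >= 100.0
        and row.pages_per_session <= 1.0
        and row.avg_session_seconds == 0
    )


# Invented sample rows, purely for illustration.
rows = [
    ReferralRow("buy-cheap-online.info", 100.0, 1.0, 0.0),
    ReferralRow("partner.example.org", 42.5, 3.2, 95.0),
]

suspects = [row.source for row in rows if looks_like_spam(row)]
print(suspects)  # ['buy-cheap-online.info']
```

Treat anything this flags as a suspect rather than a conviction, and still give the domain a quick Google before excluding it.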
Advanced "future proofing"
The problem with the above two steps is that new spammers are born all the time, so you will have to periodically check your Analytics for new referrers, which can be annoying.
We can’t automatically filter all of the new ones, but we can automatically filter less sophisticated ones. Doing this also has the advantage of stopping Ghost Event Spam.
- Head to the reporting tab
- Select a really big timeframe so that we can look at lots of data
- Head to Audience > Technology > Network.
- Select Hostnames as the Primary Dimension
- Show as many rows as you can
We are looking to whitelist valid hostnames – by extension, blacklisting everything else. The only hostnames you should be accepting are domains that legitimately have your Google Analytics code on them. This means your own domain (which would include any subdomains) and perhaps a separate shop or blog domain that runs the same analytics code. Some unusual but valid examples include translate.google.com or any third party e-commerce shopping cart services, which you will want to keep. Forget everything else, even google.com or (not set).
- Make a note of your valid domains.
- Head back to the Admin tab.
- Select Filters in the View column
- New Filter
- Filter Name > Valid Hostnames
- Filter Type > Custom
- Filter Field > Hostname
- Filter Pattern:
Populate the filter pattern field by listing your valid domains, separated by a pipe, with no spaces.
If your valid domains are:
We would enter them into this field as below:
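As a rough sketch of how that pattern is put together, assume two purely hypothetical valid domains, `example.com` and `shop.example.net`. The filter pattern is treated as a regular expression, so this sketch also escapes the dots; that escaping is our own addition for safety, not something the steps above require.

```python
import re

# Hypothetical valid domains, for illustration only. Use your own.
valid_domains = ["example.com", "shop.example.net"]


def build_hostname_pattern(domains):
    """Join domains with a pipe and no spaces, escaping regex characters such as '.'."""
    return "|".join(re.escape(domain) for domain in domains)


pattern = build_hostname_pattern(valid_domains)
print(pattern)  # example\.com|shop\.example\.net
```

The printed string is what would go into the Filter Pattern field.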
Use the verification tool to check you’re happy with the results. If you are, save the filter!
And you are done! Just remember: if you move domain, add a new domain or add a new third party checkout service, add it to the filter, or the data from that domain will be treated as spam!
Won't this stop me from seeing legitimate referral traffic?
No. What we are doing here is checking that the tracking information being sent to Google originates from your domain! Think about it: there is nothing to stop me from taking your Analytics tracking ID from your page source and adding it to every page of my own website, completely screwing up your data. This is the mechanism which some spammers use, particularly for Ghost Event Spam.
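To see why legitimate referrals survive, here is a small simulation of a hostname include filter. It is a sketch under assumptions: the pattern, the `keep_hit` helper and the sample hits are hypothetical, and we assume the filter does an unanchored regex match against the hostname only, never against the referrer.

```python
import re

# Hypothetical include pattern for the hostname field.
HOSTNAME_PATTERN = re.compile(r"example\.com|shop\.example\.net")

# Invented hits: "hostname" is where the tracking code claims to have run,
# "referrer" is where the visitor supposedly came from.
hits = [
    {"hostname": "www.example.com", "referrer": "news.example.org"},   # real visit
    {"hostname": "(not set)", "referrer": "buy-cheap-online.info"},    # ghost hit
    {"hostname": "google.com", "referrer": "buy-cheap-online.info"},   # ghost hit
]


def keep_hit(hit):
    """Keep a hit only if its hostname matches the whitelist; the referrer is ignored."""
    return HOSTNAME_PATTERN.search(hit["hostname"]) is not None


kept = [hit for hit in hits if keep_hit(hit)]
print([hit["referrer"] for hit in kept])  # ['news.example.org']
```

The real referral from news.example.org is kept because the visit happened on a whitelisted hostname, while the ghost hits are dropped whatever referrer they claim. A spammer who fakes a valid hostname would still get through, which is why this only catches the less sophisticated ones.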
Thanks to the dudes @ http://www.ohow.co/what-is-referrer-spam-how-stop-it-guide/ for reassuring us that our methods are valid.