Tracking Links with JavaScript

So you want to use JavaScript to track some links. Easy. Just attach a click handler and call some function that triggers a tracking beacon. Done. Oh, but what about the possible race condition between the execution of your function and the browser’s following of the link? No problem. You can get around that by preventing the default behavior on the link and setting window.location = url inside a 100ms setTimeout. That 100ms delay should be plenty of time to fire off the beacon. Hold on… IE doesn’t pass the referrer when setting a URL with window.location. So now we’ve got a different problem.

Wait. I know. We can inject a new anchor element into the DOM, with the original link’s href value, and then trigger the new link programmatically. Race condition and referrer problem solved. Boom goes the dynamite.

Ah, it turns out there is no easy cross-browser method of triggering a link with JavaScript that actually causes the browser to follow the link. So that’s not going to work either.

I dug into this problem a few days ago and learned a few things in the process. This post is my attempt to summarize what I figured out.

Is this necessary?

What are some of the cases where you’d want to use JavaScript to track links? The most common case is probably tracking external/outbound links. Because you can’t get analytics on the destination pageview, you need to use JavaScript to record the link click before the user leaves your site. The same is true for file downloads and mailto links.

But I think there are also worthwhile cases that apply to internal (same domain) links. What if you want to compare clicks on two different calls to action that point to the same URL? Or measure how many clicks your main navigation is getting? Or track some application event that results in a non-unique URL? JavaScript isn’t the only option but often it might be the best one.

Before we go on let’s lay out a few assumptions. First, let’s assume that intercepting links and using JavaScript to log click data is something you need to do — in other words, you’ve decided against using server-side variables or janking up your URLs with query parameters. Let’s also assume you’re aware of any accessibility issues that might arise when you use JavaScript to hijack links.

Browsers are in a hurry.

The main problem is that the browser is not interested in waiting for your tracking function to do its thing. If your function can finish executing in time, great; but the browser is going to move on with or without it. So we need a way to slow things down.

setTimeout

Probably the most common method is using setTimeout to insert a slight 100ms delay, a delay that gives your code plenty of time to execute while still allowing for the UI response to feel instantaneous. This is the method that Google recommends for tracking outbound links in Google Analytics.

Your code might look something like this:

$('#menu').delegate('a', 'click', function(event) {
 var url = this.href;
 
 //Code to log some data...
 
 setTimeout(function() { 
  window.location = url;
 }, 100);
 
 event.preventDefault();
});

This doesn’t guarantee execution but it gives you enough of a head start in the race.

_gaq queue

If you’re using Google Analytics’ asynchronous implementation (you are, right?), there’s another option that makes use of the _gaq queue. Inside your click handler you push the tracking event onto the _gaq array and then push the function that changes the URL.

The items that you push onto the _gaq array are executed immediately and synchronously. Even though the _trackEvent method triggers an asynchronous request for the __utm.gif tracking image, we don’t have to wait for the response before moving on to our function that sets the window location. Your code might look like this:

$('#menu').delegate('a', 'click', function(event) {
 var url = this.href;
 
 _gaq.push([ '_trackEvent', 'Page Actions', 'Navigation', url ]);
 _gaq.push(function() {
  window.location = url;
 });
 
 event.preventDefault();
});

Update 11/2/2011: Daniel Harcek pointed out in the comments that once the GA script loads, _gaq becomes an object and push() becomes a custom method. So when you use the code above you’re not actually pushing onto a queue but simply calling functions in sequence. The result is the same but I should point out that the GA script isn’t doing anything magical here — you could achieve the same effect by writing your own function that triggers a beacon, followed by a function that redirects to the new URL. Thanks Daniel!
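
As a rough sketch of that do-it-yourself version, the helper below starts an image beacon request and then redirects. The name trackThenNavigate, the beaconUrl endpoint, its url query parameter, and the deps object are all hypothetical, introduced only for illustration; the dependencies are passed in so the function can also be exercised outside a browser. Like the _gaq version, it does not wait for the beacon response, so it offers no stronger guarantee.

```javascript
// Hypothetical helper, not part of GA: start a beacon request, then navigate.
// deps.createImage and deps.navigate are injected so this runs outside a browser.
function trackThenNavigate(beaconUrl, targetUrl, deps) {
  var img = deps.createImage();
  // Assigning src starts the image request; we do not wait for a response.
  img.src = beaconUrl + '?url=' + encodeURIComponent(targetUrl);
  deps.navigate(targetUrl);
  return img;
}

// In a browser, the dependencies might look like this (assumed wiring):
var browserDeps = {
  createImage: function () { return new Image(); },
  navigate: function (url) { window.location = url; }
};
```

Inside the click handler this could be called as trackThenNavigate('/beacon.gif', this.href, browserDeps) alongside event.preventDefault(), where /beacon.gif stands in for whatever endpoint you log to.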

Anchor proxy

The idea here is that you create a new anchor element with the same href as the one you’re tracking, and then programmatically trigger the new link. As I mentioned at the beginning of the post, making this work cross-browser is non-trivial. It’s also pretty ugly. Let’s just move on.

Lock up the browser UI thread

Google Analytics does not offer external link tracking out of the box, but other analytics services like Clicky, Omniture, and Nielsen NetRatings do. How do they do it? They run a CPU-intensive function that locks up the browser UI for 300-500ms. Here’s the “pause” function from Clicky’s tracking JS:

this.pause = function (x) {
 var now = new Date();
 var stop = now.getTime() + (x || clicky_custom.timer || 500);
 while (now.getTime() < stop) var now = new Date();
};

Don’t do this. Don’t hijack your users’ CPUs to track clicks.

Trade-offs

Since the anchor proxy and UI lock-up are not really options, we’re left choosing between the setTimeout and _gaq queue methods.

In both of these we have to use window.location to execute the link and that raises two issues. One, IE8 and below do not pass a referrer when the page is navigated via window.location. This is a problem because even if you don’t need the referrer for what you’re tracking, having a lot of referrer-less pageviews could screw up your traffic source data.

Two, when links are processed this way the user loses the ability to open links in a new tab or window by holding a modifier key (command on a Mac, ctrl on Windows, shift for a new window). To me this is a major usability failure. Luckily it’s not too hard to work around: check whether a modifier key was held during the click and only run your code if none was pressed.
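
One way to sketch that check, assuming a jQuery click event (jQuery exposes metaKey, ctrlKey, shiftKey and altKey, and normalizes which to 1 for the primary button). The function name isModifiedClick is my own, and treating alt and non-primary buttons as "leave it to the browser" is an assumption you may want to adjust:

```javascript
// Returns true when the click should be left to the browser:
// a modifier key is held, or a non-primary mouse button was used.
function isModifiedClick(event) {
  return Boolean(
    event.metaKey ||   // command on a Mac
    event.ctrlKey ||   // ctrl on Windows/Linux
    event.shiftKey ||  // shift usually opens a new window
    event.altKey ||    // assumption: also leave alt-clicks alone
    (typeof event.which === 'number' && event.which > 1) // middle/right button
  );
}

// Hypothetical wiring, using the same pattern as the earlier examples:
// $('#menu').delegate('a', 'click', function (event) {
//   if (isModifiedClick(event)) { return; } // let the browser handle it
//   var url = this.href;
//   // ...log the click, then setTimeout + window.location as before...
//   event.preventDefault();
// });
```

Returning early before event.preventDefault() is what keeps the browser’s native new-tab behavior intact; those clicks are then tracked on a best-effort basis or not at all.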

The only other solution I know is to forget all of this, ignore the race condition, and just bind your tracking function to the link’s click event. You’ll lose some tracking events — I’m not sure what the expected average loss is — but in exchange you’ll maintain usability while keeping your code clean and simple.
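
A minimal sketch of that simpler approach, reusing the event category and action from the earlier example. makeClickTracker and its pushCommand parameter are hypothetical names; the callback indirection is there so _gaq is looked up at click time, since GA replaces the array with an object once it loads:

```javascript
// Hypothetical factory: returns a click handler that only records the event.
// It never calls event.preventDefault(), so the browser follows the link normally.
function makeClickTracker(pushCommand) {
  return function () {
    // `this` is the clicked anchor when used as a jQuery event handler.
    pushCommand(['_trackEvent', 'Page Actions', 'Navigation', this.href]);
  };
}

// Assumed wiring, mirroring the earlier examples:
// $('#menu').delegate('a', 'click', makeClickTracker(function (command) {
//   _gaq.push(command); // resolve _gaq when the click happens
// }));
```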

If you can afford the potential data loss, I think that’s the way to go. If you need more of a guarantee, you may have to accept the drawbacks associated with using window.location.

I’m sure there are other techniques but these were the only ones I could think of and the only ones I’ve seen discussed or practiced out in the wild. In the future we may have better options like the ping attribute. Do you know of any others?


Discussion

Rob, thanks very much for writing about this. I thought I was going crazy when my tracking code failed to execute about 90% – 95% of the time.

The setTimeout solution works fine, but, as you say, it’s a guess about how much time is optimal, which could vary from browser to browser and from computer to computer.

I found yet another solution for a special case of tracking, and this one actually is 100% reliable. The special case is when the tracking code uses $.ajax to record the tracking data server-side. The solution is to set async: false in the $.ajax function call. Yes, it creates a noticeable lag, but then so does setTimeout.

An extra added benefit of this tracking methodology is that it lets me track form submissions (which are obviously not amenable to the window.location method) just as easily as tracking link clicks.

Now if I could only find a way to track the “right-click / open in new tab” link action I’d be a very happy camper. As far as I can tell, the only way is to create a custom context menu and substitute it for the browser’s native context menu. Apparently not all browsers permit this. Any thoughts?

Steve Diamond

Why can’t you add the click event and not prevent the default event? I don’t understand.

Just run the default event (the link) in a callback to your click tracking code?

@Steve,
Thanks! I’m glad you liked the post. Your ajax idea is interesting. I think the downside is that by firing off a synchronous ajax request you’re still locking up the browser. If it only takes 100ms or so then I guess this doesn’t really matter. But something about generating an extra blocking request for every link seems not great… then again all of the techniques I mentioned in the post are hacks and less than ideal. :)

The obvious limitation is that you have to control the server-side data capture yourself. It wouldn’t work with the Google Analytics events API or another tracking service that uses image beacons.

Is there not still a possible race condition? Couldn’t the browser navigate from the page before jQuery was able to initiate the request?

@Luke,
Because the browser will not wait for your callback to execute before it navigates to the next page. It’s a race… there’s no guarantee that your script will have time to execute.

Ben Graham

I’m losing about 20% of my events that are fired as part of an onclick event on a form post. I’ve been trying to find a way around this race condition without using setTimeout (which I feel is inelegant and will never guarantee proper completion).

Because I’m doing posts and not just redirects / anchors I don’t really want to have to screw around with callback functions. We’re a transactional ecommerce site and the amount of debugging that we’d have to do to be sure that works reliably cross-browser is daunting.

I was hoping to find a way to expose _gaq’s queue, so I could just loop and watch the queue until it’s clear. _gaq.u seems like it may be the right one, but when I check its value it’s always 0 (when debugging in Chrome, which never seems to lose the race anyway). Rob, have you found a way to expose the queue?

Daniel Harcek

Hello Ben,
I think the reason why you are not able to “expose” the _gaq queue is that once the GA js lib is loaded, it is no longer a queue: it is converted to an object, and push becomes a method that immediately executes the command.

See
http://code.google.com/apis/analytics/docs/gaJS/gaJSApi_gaq.html

“This function is named push so that an array can be used in the place of _gaq before Analytics has completely loaded. While Analytics is loading, commands will be pushed/queued onto the array. When Analytics finishes loading, it replaces the array with the _gaq object and executes all the queued commands. Subsequent calls to _gaq.push resolve to this function, which executes commands as they are pushed.”

It may be possible to read this between the lines of Rob’s text, see “The items that you push onto the _gaq array are executed immediately and synchronously.”
The word “queue” might be misleading, since it is only half true. In fact not even half, because most of the time you are working with it as an object.

Daniel,
You’re absolutely right. I’ve updated the post to clarify. Thanks for pointing it out!