Commentary

Analytic Vs. Audited Data: Yes, There IS A Difference

To start, analytics are not an audit. An audit involves a series of checks and balances, beyond simply quality control, that tests data for accuracy. Further, audits provide standardized metrics and methodology, consistency of process, and transparency of results.

Traffic analytic tools certainly have their place as an important business resource, but they do not produce audited data. From their inception, reputable companies providing analytic tools have never claimed to be auditors, but they have consistently positioned themselves as "third party."

In the world of media, "third party" equates to an auditor. This has caused misunderstanding in the marketplace, whereby analytic data has been accepted as audited data.

This is not to disparage analytic tools. They do a very good job providing actionable information to help one manage a site and improve performance. But analytic tools were never intended to produce the audited traffic data upon which an ad buy/sell decision is made. And buyers of online media need to be aware that different tools produce different results.

Standardized Metrics and Methodology

Two years ago, Stone Temple Consulting issued a white paper entitled "Web Analytics Shootout." It is a comprehensive look at many of the popular analytic tools and explains why they produce different results.

An excerpt:

Web analytics packages, installed on the same web site, configured the same way, produce different numbers. Why?

1. By far the biggest source of error in analytics is implementation error. There are dozens (possibly more) of implementation decisions made in putting together an analytics package that affect the method of counting used by each package.

2. Placement of JavaScript on the site.

3. Differences in the definition of what each package is counting. The way that analytics packages count visitors and unique visitors is based on the concept of sessions. There are many design decisions made within an analytics package that will cause it to count sessions differently, and this has a profound impact on the reported numbers.

Makes sense, right? But it's also scary, because it means data results from a single site can be inaccurate due to several factors. It also means data results from different sites are not comparable, since there is no way to tell whether these factors are the same across all sites measured.

Let's look at a specific example of Point #3 above using sessions and duration as the metrics.

Using analytics Package A, a session begins when a visitor arrives at a site and ends when the visitor leaves. The session will also end after 30 minutes of inactivity from that visitor. So if a visitor arrives at a site, stays 5 minutes, leaves, and returns 10 minutes later for another 5 minutes, the package will report two sessions with a duration of 5 minutes each.

Using analytics Package B, a visitor exhibiting the same activity pattern (5 minutes on the site, 10 minutes away, 5 minutes back) will be reported as one session for 20 minutes. That's because this package allows any visitor returning within 30 minutes to count as part of the original visit.
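
To make the difference concrete, here is a minimal sketch of the two counting rules in Python. The timestamps, function names, and the 30-minute threshold are illustrative assumptions, not any vendor's actual implementation.

    from datetime import datetime, timedelta

    # One visitor's activity: on the site 9:00-9:05, away for 10
    # minutes, back on the site 9:15-9:20 (the pattern described above).
    visits = [(datetime(2010, 2, 12, 9, 0), datetime(2010, 2, 12, 9, 5)),
              (datetime(2010, 2, 12, 9, 15), datetime(2010, 2, 12, 9, 20))]

    TIMEOUT = timedelta(minutes=30)

    def package_a(visits):
        # A session ends whenever the visitor leaves the site (or after
        # 30 idle minutes), so each on-site interval is its own session.
        return [end - start for start, end in visits]

    def package_b(visits):
        # A return within 30 minutes is stitched into the prior session,
        # so the time away counts toward the reported duration.
        sessions = []
        cur_start, cur_end = visits[0]
        for start, end in visits[1:]:
            if start - cur_end <= TIMEOUT:
                cur_end = end  # same session: extend it
            else:
                sessions.append(cur_end - cur_start)
                cur_start, cur_end = start, end
        sessions.append(cur_end - cur_start)
        return sessions

    print(package_a(visits))  # two sessions of 5 minutes each
    print(package_b(visits))  # one session of 20 minutes

Same visitor, same behavior: one rule reports two 5-minute sessions, the other a single 20-minute session.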

For buyers of online media, this poses a significant problem: how does one reconcile the difference between Package A and Package B in order to evaluate activity and make the best possible buy?

Consistency of Process

A further variable in analytic packages is that the user of the package controls many of the processing functions. For example, the user can control the degree to which filters are set to exclude mechanical traffic from spiders/robots. The user also controls whether to set any filters at all.

A generally accepted best practice is to filter spiders/robots listed by the Interactive Advertising Bureau. Analytic tools certainly have the capability to filter according to the IAB list -- but to what extent? With the user of a tool controlling the filter settings, traffic results can be manipulated. Unless all sites follow a standardized process of applying filters, as occurs in an audit, results can be questionable and are clearly not comparable.
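
As a rough illustration, here is a minimal sketch of list-based filtering in Python. The patterns and log fields are stand-in assumptions; the actual IAB/ABCe spiders and bots list is a licensed product and far longer.

    # Stand-in patterns; the real IAB/ABCe list is much longer.
    KNOWN_BOT_PATTERNS = ["googlebot", "bingbot", "slurp", "msnbot"]

    def is_known_bot(user_agent):
        ua = user_agent.lower()
        return any(pattern in ua for pattern in KNOWN_BOT_PATTERNS)

    hits = [
        {"ip": "66.249.66.1", "ua": "Mozilla/5.0 (compatible; Googlebot/2.1)"},
        {"ip": "203.0.113.7", "ua": "Mozilla/5.0 (Windows NT 6.1; rv:1.9.2)"},
    ]

    # With the filter on, the bot request is excluded from the counts;
    # with it off, it is reported as a page view.
    filtered = [h for h in hits if not is_known_bot(h["ua"])]
    print(len(hits), len(filtered))  # 2 1

Whether this filter runs at all, and how aggressively, is a setting the site itself controls -- which is exactly why unaudited results are not comparable across sites.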

Transparency of Results

One function of an independent media auditing firm is to make results publicly available. This is typically done through the auditing firm's Web site, which buyers of online media can access to identify those audited sites in a specific vertical market. The availability of audited traffic data in a single location is a benefit to buyers, as it provides an easy-to-use resource that makes the search process quicker and more efficient.

In conclusion, here's another excerpt from the Web Analytics Shootout:

Don't get hung up on the basic traffic numbers. The true power of web analytics comes into play when you begin doing A/B testing, multivariate testing, visitor segmentation, search engine marketing performance tracking and tuning, search engine optimization, etc.

Not a single mention in the paper of using the data for ad selling/buying. And that's because analytic tools were never intended for this purpose. They are intended to help one better manage a site and they do a good job of that. But for an actual audit of traffic data, only a truly independent media auditor provides standardized, reliable data upon which a media evaluation can be made.

17 comments about "Analytic Vs. Audited Data: Yes, There IS A Difference".
  1. John Grono from GAP Research, February 12, 2010 at 3:53 p.m.

    Hallelujah!

    An extremely erudite and well thought-out post. Every website publisher needs to heed that ... "analytic tools were never intended to produce the audited traffic data upon which an ad buy/sell decision is made."

    Put another way - it's internal management data only. As soon as you see a staff member taking this data out the door to a client - rap their knuckles! Confiscate the data there and then!

    Peter, I only have one very small quibble, and it is a matter of semantics.

    Even calling these tools "web analytics packages" gives them a sense of scale, purpose or gravitas that is misleading. They are NOT analysing the web. They are analysing a single website, or a small grouping of websites.

    May I implore you to also start referring to them as WEBSITE analytics packages so that everyone is clear about their purpose.

    Once again, congratulations on a great post.

  2. Joshua Chasin from VideoAmp, February 12, 2010 at 4:28 p.m.

    I delayed in getting over here to comment and wouldn't you know it, Grono beat me to it.

    Here at comScore, with Media Metrix 360, we are now capturing and compiling site-centric data for many publishers. But there are multiple audit initiatives underway for that data. First, of course, is the MRC process, through which our Direct interface (where the beaconed data for a publisher will live) and ultimately all of Media Metrix 360 will be audited.

    But too, we recognize that publishers who wish to use our beaconed, census data in a standalone fashion externally would be well-advised to have that data (and its implementation) audited by a third party. Thus we expect to be working with both BPA and ABC, around the world, to facilitate that process. Peter, we couldn't agree more.

  3. Joshua Chasin from VideoAmp, February 12, 2010 at 4:29 p.m.

    Now that I think about it though, Grono had an advantage. It's already tomorrow in Australia.

  4. John Grono from GAP Research, February 12, 2010 at 5:55 p.m.

    The early bird catches the worm Josh! I can even look up the race results for you Josh and tell you who will win the last race so you can place a bet ... hehehe.

  5. Giada Noè, February 13, 2010 at 11:38 a.m.

    I'd like to know your opinion about the Italian Audiweb. My company is one of their census providers, but I admit that I agree with your analysis.

    Despite the fact that they have established common metrics and rules for the placement of JavaScript on the sites, who can ensure that every single implementation is correct?

    Analytic tools were never intended to produce audited traffic data because they are much more! And people should also know that, conversely, the standard tools for a traffic census must not be confused with analytics tools!

    I thought it was only an Italian problem...

  6. Peter Black from BPA Worldwide, February 13, 2010 at 5:03 p.m.

    John,

    Thanks for your comments and I'll be sure to follow your recommendation to be more precise in referring to the analysis of a website, not the web.

  7. Peter Black from BPA Worldwide, February 13, 2010 at 5:05 p.m.

    Josh,

    Appreciate the comment. I see we are both on the upcoming OMMA conference panel regarding measurement. I'm looking forward to it.

  8. Peter Black from BPA Worldwide, February 13, 2010 at 5:09 p.m.

    Hi Giada,

    I'm not familiar with the Italian Audiweb, but if it's an analytics package, my comments would apply to it too. I can't emphasize enough that analytic tools are extremely valuable and serve a distinct purpose. They were just never meant as a data source for buying/selling advertising.

  9. Jim Sterne from Target Marketing, February 14, 2010 at 8:47 p.m.

    Hi Peter -

    As the Chairman of the Web Analytics Association and founder of the eMetrics Marketing Optimization Summit I have to agree with you -- half-heartedly. I too have a semantic issue.

    Your depiction of audit data as accurate and web(site) analytics data as inaccurate is, well, inaccurate.

    You say that an audit tests data for accuracy. I say an audit tests data against a specific yard stick to ensure comparability to other websites. But to say that audit data are sacrosanct rather ignores the fact that one auditing firm's numbers have never matched another's.

    On the rest, we are in violent agreement. The act of measurement for ad buying and selling MUST be carried out by a third party. Should some publisher offer you internal web analytics reports as validation for their invoice, you have two choices: a) leave the room quietly, or b) leave the room laughing hysterically. I leave that to you.

  10. Judah Phillips from A Big Global Brand, February 14, 2010 at 10:41 p.m.

    Your company and one of your competitors audited the "web analytics package" I was managing for a large global media company and found the census data we generated to be 99.9% accurate. In fact, the audit uncovered problems with the audit methodology and software, which failed to correctly count cookies that were set in our log files differently than expected. It was only when I provided the actual discrete, individual cookie values, down to the exact one, that we uncovered the issue, which, I was told, raised questions internally over the accuracy of every other audit ever done, because other companies may have set cookies in a similar way and thus may have been undercounted.

    The reality is that audits, unless they have changed since I last did one 1.5 years ago, simply process company-provided server log files against non-consensus-based, proprietary standards specific to the auditing company. Now I'm not saying audits are useless or invalid - far from it - they have their place in the buying/selling of advertising, just like audience measurement tools and web analytics data, but they can't be accepted as unarguably accurate because they just aren't. They are just another prismatic view into the dark science/art of web measurement.

  11. Judah Phillips from A Big Global Brand, February 14, 2010 at 11:23 p.m.

    One more thing - the IAB Spiders and Bots list is certainly useful and money well spent, but it is not complete by any stretch of the imagination. In fact, if you have a web analytics tool that is able to report actual visitor-level detail data, not aggregates, processed out of log files, from a well-trafficked site, you'll find thousands of spiders unrepresented in the IAB list.

    The list is primarily useful for filtering bots if you are processing server logs (as is done for an audit). Remember, JavaScript page tags, the primary data collection method used by most in-house and SaaS web analytics tools, aren't generally executed by bots (though some bots execute them). Heuristics are needed to detect other robotic patterns (long visits, high view-to-visit ratios, inordinate #'s of page requests over short durations and so on). I know the BPA auditing software and some log-based and SaaS analytics software providers do use such algorithms to account for bots missed by the IAB list.
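
    To make that concrete, here is a rough sketch of the kind of behavioral filter I mean; the thresholds and field names are purely illustrative, not anyone's production values:

        # Flag visits whose behavior looks robotic; the thresholds are
        # illustrative only and would need tuning per site.
        MAX_PAGES_PER_MINUTE = 30   # inordinate request rate
        MAX_VISIT_MINUTES = 180     # implausibly long visit

        def looks_robotic(visit):
            # visit: {"pages": int, "duration_min": float}
            if visit["duration_min"] > MAX_VISIT_MINUTES:
                return True
            rate = visit["pages"] / max(visit["duration_min"], 1e-9)
            return rate > MAX_PAGES_PER_MINUTE

        visits = [{"pages": 12, "duration_min": 6.0},    # human-looking
                  {"pages": 900, "duration_min": 5.0}]   # 180 pages/min
        print([looks_robotic(v) for v in visits])  # [False, True]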

    The takeaway here is that an advertiser needs to examine multiple data points to fully qualify an ad buy. No source is truly 100% accurate or correct, nor should it be believed as the single source of truth, no matter how much the vendors want to sell you on their methods.

    At the end of the day Nielsen doesn't match comScore Panel data which doesn't match comScore 360 data which doesn't match AGOF data which doesn't match OWA data which doesn't match BPA data which doesn't match Omniture which doesn't match Unica which doesn't match Google which doesn't match Coremetrics which doesn't match WebTrends which doesn't match some homegrown or commercial log file parser which doesn't match Quantcast which doesn't match Alexa which doesn't match anything. That's the reality of our industry and why we're all working to optimize our shared methods and as a result improve Internet commerce. There's plenty of room on the table for all the data and plenty of seats for the analysts to dig through it all in our noble quest for truth!

  12. Mikhail Doroshevich from e-belarus.ORG, February 15, 2010 at 6:28 a.m.

    Great post, thanks!

  13. Guillaume Wolf from Alenty, February 15, 2010 at 8:31 a.m.

    I'm not sure you can affirm that Web Analytics can't deal with media plan optimization.
    Some services offer solutions dedicated to ad analytics. Alenty is an example that emphasizes how analytics can enhance advertising. This feedback is the true value of online advertising.

  14. Peter Black from BPA Worldwide, February 16, 2010 at 10:02 a.m.

    Jim, you make a fair point. I do not mean to suggest analytics data is inaccurate; sometimes it is and sometimes it isn't. That's what an audit reveals. And you hit the nail on the head with respect to an audit being a "yard stick" against which to measure. It's the standardization of the metrics, process, etc. that allows for comparability.

  15. Joshua Chasin from VideoAmp, February 16, 2010 at 10:04 a.m.

    Some great comments here.

    The primary point I take away is that internal data is internal data. It is not external data until it has been audited; that is the covenant that buyers and sellers have reached.

    One comment I'll add in response to Judah: in my experience, auditors don't audit to their own standards; generally they endeavor to audit to third-party published standards (e.g., here in the US, those from the IAB). In fact, if you navigate to the IAB ad impression guidelines, they even list 7 companies (including, it should be noted, BPA) who might conduct such an audit.

    I've managed to get this far on the topic of audience data that can be used externally without sticking in a plug for a hybrid of internal, census data with a robust audience measurement panel. But should you need such a plug, you know where to find me.

  16. Peter Black from BPA Worldwide, February 16, 2010 at 10:14 a.m.

    Judah, having been in the media auditing business for a number of years, I have always found it odd that there is a notion that an audit is intended to uncover what is wrong, when in most cases it uncovers what is correct! So your experience with the audit verifying 99.9% accuracy is not unusual.
    As for spiders/bots, we do follow the IAB list but also discover others that operate only in distinct vertical markets. We include those in our filtering as well.

  17. Richard Bennett from ImServices Group Ltd, February 17, 2010 at 2:12 p.m.

    Hmmmmmmm. I'll be a little controversial here.
    I wonder if we're not making too great of a distinction here between analytical systems and audited figures.
    If a company installs one of the better analytics systems, and configures it to count in a manner consistent with industry guidelines, then an audit should show that its metrics are accurate. And the "measuree" should feel confident that the system will measure consistently and accurately, unless there are significant changes to the system configuration or the web site.
    That said, I find it amusing to also point out that the US IAB RETRACTED their definition of a page view a few years ago. I also observe that NO commercially available measurement systems have had their audience measurement methods validated to the IAB Guideline (I do not consider the IAB unique browser metric to be of any use to anyone because it does not come close to audience). I also don't find anything on the IAB website to speak to any companies committed or audited to the audience guideline.
    So if IAB US is afraid to standardize page views (for good reason), and few, if any, companies have their audience metrics measured and audited to the IAB Guideline, what are we talking about? Maybe that is what Jim Sterne is getting at...
    Now, if we are talking about advertising activity: if set up correctly, we don't see analytical packages producing much different numbers from the audited ad-serving companies.
    On the subject of the IAB/ABCe spiders and bots list, I should point out that my company maintains this list for the IAB. The list was primarily developed to improve measurement of advertising activity, and then was expanded to cover page views and other audience metrics. IAB Guidelines and best practices mandate use of this list to filter "known" bots. We often see that it captures about 50% of normal robotic activity. IAB Guidelines also require behavioral filtration to remove robotic activity that does not "self identify". We receive regular update recommendations from IAB US members as well as from European companies via ABC Electronic. In our experience, any measurement system will not be accurate without these two filtration methods or an equivalent.
