Commentary

SOS: Publishers Have Sprung A Data Leak...Or Two...Or 40

If publishers don't start making the most of the data they collect on user behavior, then someone else already piggy-backing on their site will. As cookies start flying every which way to feed this complex ad-targeting infrastructure online, the big topic for content providers is "data leakage." Who is collecting data on a publisher's users via third-party cookies and without the publisher's knowledge or consent? We have already seen some yield optimizers this year try to address this worry among their publisher partners. A new category of tag/cookie containers has cropped up, promising to give publishers greater control over the tracking pixels and cookies that get planted on their sites.

How bad is the problem? While admittedly an interested party, one solution provider in the category, Krux Digital, claims in a new study that an analysis of the top 50 publishers online shows they may be losing between $850 million and $1.2 billion to third parties. The ad network ecosystem is monetizing data that the publishers should be using more effectively rather than sharing indiscriminately. "Publishers are being technologically outmatched on the buy side when it comes to targeting and leveraging their unique assets to their fullest purpose," says Ben Crain, vice president of marketing and corporate development, Krux. Crain says Krux has the data to prove it. In late summer the company started analyzing a sample of URLs across the top 50 sites to see what data collection was being done, whether it was on the page itself, in an iframe, or from an ad call. Krux found 167 different entities participating in data collection on these sites.

Almost a third (31%) of data collection on a page was being done by a third party, and in many cases by yet another party whose tracking tag or cookie was being "ushered" in by the most apparent third party. In fact, over half (55%) of the third parties also brought at least one other data collector with them. The sheer volume of the entities involved across so much inventory has to cause some concern among publishers about who is sharing with whom. How much knowledge does a content owner have of the embedded partners in a cookie?

To be fair, a number of the tracking codes on these pages involve analytics software and such. Overall, there were about ten data collection events occurring on every page studied. Crain says that in some cases with major publishers the number of tracking entities went up to 40 on a page. Even before we get to the revenue implications, the performance impact of having that many entities on a page can be palpable to the user.

But in Krux's scan, 27% of the data collectors were ad networks or exchanges, DSPs or widgets. That last category is interesting, because the widgets some publishers are partnering with to bring valuable content onto their sites may also be collecting data to leverage in their own ad networks. Crain says that the danger in letting third parties take a publisher's data is not just in direct lost revenue but in the creation of competitive products.

The ad networks and exchanges may of course argue that the data they are collecting improves the targeting and pricing within the entire ecosystem and so benefits the publishers who are contributing data. Crain argues that "the biggest media threat is the way in which this data can be used as an overlay to inventory and create secondary premium product sets that are available on the market that is priced considerably lower than direct sales of premium inventory." He sees a threat both from downward pressure being put on overall display pricing as well as loss of share to a secondary channel.

As we discovered at the last Mediapost Ad Nets conference in New York in early November, no one has a handle on what the data layer really is worth right now. So it is even harder for protective products like Krux's to demonstrate to publishers how much they really may or may not be losing through leaked data. "Step one is to give visibility so publishers can understand data collection and how it aligns with the partners," says Crain.

In other words, publishers may not be able to put a cash value yet on the amount of data being lost to third parties, but they can see where the ad ecosystem is placing its bets by understanding which parts of their own sites seem valuable to these entities. The Krux product gives them a way to track who is collecting data, and the company is in the beta stage with a tag management system that contains and controls third-party tags. Ultimately the goal is to give the publisher more flexible control over the data so they can combine it with a third-party source and even improve the value of their own direct sales inventory.

In other words, the sell side has to start arming itself with the kinds of technology the buy side has been cultivating now for years.

1 comment about "SOS: Publishers Have Sprung A Data Leak...Or Two...Or 40 ".
  1. Bruce May from Bizperity, November 19, 2010 at 1:12 p.m.


    Wow! How much of this do we not understand at all? Clearly, the technologies are moving faster than we can keep up with them. That is an unfortunate aspect of new and emerging technologies that are naturally integrated with other technologies in ways that the creators can not plan for or even know is happening. Krux reveals how a system naturally evolves on its own. No one is in charge of this anymore. All we can do is respond to the changes as we become aware of them. The system is slowly being refined but in a completely organic way (as in not being controlled by humans). That's pretty much how microorganisms evolve in the real world. The word "virus" is less a metaphor and simply a direct representation of the organic world in a machine environment. We need more anti-bodies!