We don’t talk about it much, but there is a growing bias in the data that feeds our digital marketing algos and measurement.
This emerging instability in the data assets that power our industry is amplified by browser tracking changes, government regulation, user consent frameworks and clean room ID redaction. The challenge also presents an opportunity for the industry to fix pervasive but previously unaddressed measurement and optimization issues.
Many of us have been focused on viewability, fraud, unfair auctions and supply-path optimization. These issues took priority, but now that they appear to be settling and new friction is emerging, I feel the time is right to address some fundamental measurement issues that make marketers uneasy about their digital investments and cast a shroud over the industry that keeps ad dollars away.
Shining light on the problem is the first step in correcting the behavior.
The 20,000 panelists in the Nielsen TV-rating survey are a limited sample, but they are thoroughly vetted to ensure that they are a balanced representation of the US population. The universe of the Nielsen sample is a methodological gold standard of unbiased measurement that is trusted by marketers.
Ad tech data scientists seem to rarely consider sample bias in the digital data assets of their platforms or walled gardens. They are tasked with optimizing the attributed cost per KPI as measured in their bespoke universe of data.
These are two different approaches to measurement and optimization that are separated by their vantage point.
The Nielsen approach assumes that learnings from its research are deployed to US households and relative to the household. It is a household-centric view.
The ad tech platform and walled garden approach assumes that the learnings are deployed in the same platform from which the data was extracted and are relative to the platform. It is a platform-centric view.
Both of these approaches are technically unbiased in the opinion of the researcher or data scientist. The distinction is irrelevant as long as the universe of data in the digital platform or walled garden is large enough that it is representative of the population. However, the sand is shifting under the foundation of the digital platforms, and I believe the assumption that the digital data is unbiased can no longer be accepted.
Bias created by ad blocking is a harbinger of what is to come from browser-level ID deletion, such as Safari’s Intelligent Tracking Prevention (ITP), and government-regulated opt-in, as required under the California Consumer Privacy Act (CCPA).
The underlying data that feeds digital machine learning algorithms or heuristic targeting decisions has always been biased to a degree, but the problem has become acute since the rise of ad blocking in 2013.
[Chart: the number of monthly active ad-blocking users, as estimated by PageFair]
The acceleration in ad blocking has skewed the digital sample used in ad tech data scientists' training sets, making the digital platform universe of data less representative of the US population. ITP, similar restrictions in Chrome and privacy regulation will further skew the data because some browsers will delete IDs more frequently than others, and some users will opt in to tracking consent and others will not.
Because some patterns of users are blocked or opt out of tracking before they are ever a member of the universe, their patterns, attributes and features are never seen by targeting or optimization models. Much like a black hole is only detected by the absence of light, the missing patterns are only detected when observed from an external vantage point.
Signs of bias or black holes
Research shows that 31% of Americans use ad blockers, mostly on their laptops and desktops. Men are 10 percentage points more likely than women to use ad blockers, and 18- to 34-year-olds are 10 percentage points more likely than other age groups to employ them.
ITP is creating a similar black hole for Apple’s Safari browser. Conversion rates for Apple browsers and devices have fallen since the introduction of ITP as Safari IDs’ half-life shortened. Overall CPAs are relatively unchanged as algorithms and media planners adjusted to the black hole, but this is unsustainable as the problem expands.
Safari browsers represent 15% of US usage, compared to 64% for Chrome. The average salary for an iPhone user is $53,251, more than 40% higher than the $37,040 average salary of Android users.
Their tastes in TV also diverge: Android users are fans of “NCIS,” “Law & Order” and “Saturday Night Live,” while iPhone users watch “Game of Thrones,” “Grey’s Anatomy,” “Friends” and “The Walking Dead.”
Pending government regulation will likely amplify the impact of ad blocking as proposed user consent frameworks add friction to persistent tracking and, by default, more sub-pockets of the population appear as black holes and increase sample bias.
The result is that digital marketing campaigns that are executed by digital platforms and designed to drive performance against measured CPA or target specific addressable audiences will not reach their targets.
These sub-pockets are less likely to be tagged with a persistent identifier, resulting in their underrepresentation in the sample, which will make it difficult for their distinct patterns to be identified by analysts or machines.
Mobile app data is not immune to the trend. The issue is not isolated to universally unique identifiers (UUIDs) in browsers – it will also likely skew mobile advertising ID (MAID) data. Some industry observers believe that Apple and Google will make similar moves that reset IDFA (Identifier for Advertisers) and AdID in mobile apps, respectively, and MAIDs could become as unstable as UUIDs in browsers. If they don’t, then the heavy hand of government regulation surely will cover mobile devices and mobility tracking. MAIDs are currently better than UUIDs, but they are not future-proof.
Identity resolution does not solve the problem ... yet
Identity resolution and onboarding providers that match identity to first-party or third-party digital IDs do not fix the problem today. These solutions rely on matching identity to digital IDs during log-in events, and digital IDs’ shortening half-lives require the log-in event to be verified more frequently to maintain persistent tracking.
Identity resolution needs to be coupled with a persistent log-in to make identity tracking more stable. Combined, they can go a long way to fix user-consented persistent identity that would assist in mitigating sample bias and other issues. They are just not ready yet.
There are many unknowns about how browser restrictions, government regulation, user consent, persistent log-ins and clean rooms will play out. However, what is clear is that due to the upheaval we have an opportunity to retool our measurement and optimization approaches. In that process we can correct where we have gone wrong and move the industry closer to a measurement gold standard that can be trusted by marketers.
If press coverage and major ad tech conference agendas are any indication, Supply Path Optimization, or SPO for short, is one of the hottest topics in ad tech.
But SPO means different things to different people at different times. In an attempt to demystify Supply Path Optimization, and highlight its recent popularity and most promising use cases, let’s go back to the beginning.
SUPPLY PATH FRAGMENTATION IS NOT NEW
SPO relates to the multitude of supply paths that an impression opportunity can take on its way from the publisher to the DSP. Even though the topic has become critical lately, due in part to the rapid adoption of header bidding, fragmentation in the supply path has existed since programmatic’s early days.
A common misconception at the start of programmatic was that the buying and selling of media was going to work like an ad exchange that provided “seats” to buyers and sellers that used the exchange to transact, like the simple market below:
[Diagram: a simple market, with buyers and sellers transacting directly through a single exchange]
What we got, though, looked more like the complex market below:
[Diagram: a complex market, with duplicated supply paths and layers of intermediaries between buyers and sellers]
BUYER BID STRATEGY IN A SIMPLE MARKET:
In an efficient market, it is widely accepted that the optimal transaction mechanic between a buyer and seller is the second-price auction, popularized by the Vickrey or Generalized Second Price auctions and made famous by Google Search auctions.
The second-price market incentivizes buyers to bid their true value for any given good. The true value should be close to a buyer’s “willingness to pay,” or the max price that buyer would be willing to pay for that good. Because the buyer will never pay more for the good than his or her competitor is “willing to pay,” the second-price auction transacts goods at a fair price, leaving the buyer without remorse from the transaction and willing to participate again.
In a DSP, this bid strategy can be represented by bid multipliers or a probability function: the probability that a given impression opportunity leads to a conversion event, multiplied by the buyer’s willingness to pay for that conversion event, such as a $50 CPA.
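To make that concrete, here is a minimal sketch of such a bid calculation; the model function and the $50 CPA are illustrative stand-ins, not any particular DSP’s implementation:

```python
def predict_conversion_probability(features: dict) -> float:
    """Placeholder for a trained conversion model (hypothetical)."""
    return 0.0002

def second_price_bid(features: dict, target_cpa: float = 50.0) -> float:
    """In a true second-price auction, bid full expected value:
    P(conversion | impression) * willingness to pay per conversion."""
    p = predict_conversion_probability(features)
    return p * target_cpa * 1000.0  # expressed as CPM (price per 1,000 impressions)

print(second_price_bid({"site": "example.com"}))  # 0.0002 * 50 * 1000 = $10.00 CPM
```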
In a first-price market, a buyer bids his true “willingness to pay,” and, should he win, pays what he agreed to bid. In this case, he may experience buyer’s remorse if he feels he could have offered less and still won the auction. The optimal bid strategy for a buyer in this situation is to understand his willingness to pay but to shade that bid and bid something else based on either guesswork or sophisticated modeling of what he thinks the competition will be willing to pay for the good.
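A rough sketch of that shading logic, assuming a hypothetical predicted clearing price produced by whatever guesswork or modeling the buyer trusts:

```python
def shaded_first_price_bid(willingness_to_pay: float,
                           predicted_clearing_price: float,
                           margin: float = 0.05) -> float:
    """In a first-price auction, bid below true value: just above where
    the competition is expected to clear, capped at willingness to pay."""
    bid = predicted_clearing_price * (1.0 + margin)
    return min(bid, willingness_to_pay)

# If we value the impression at $10 CPM but model the field clearing near $6,
# we bid ~$6.30 and keep the $3.70 a naive true-value bid would have paid.
print(shaded_first_price_bid(10.0, 6.0))  # 6.30
```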
For a once-in-a-lifetime good, a modeled 1st price bid strategy obviously won’t work. But for multiple (of the same) goods being sold via first-price auction mechanics in sequential auctions, the buyer has the luxury of multiple future attempts to understand his or her willingness to pay, evaluate the need for that good right now vs. sometime in the future, and predict the expected market clearing price for a future given opportunity.
THE SECOND PRICE UNICORN:
Many SSPs declare second-price exchanges as their auction mechanic methodology. This is mostly true, but the second-price can sometimes be augmented, in some cases explicitly by the SSP (such as buy-side fees), and, in other cases, naively by market conditions (as happens when an impression opportunity is duplicated by a publisher across SSPs, pushing the actual transaction closer to first-price).
In the current state of ad exchanges, there is no such thing as a pure second-price auction. There are many yield management tactics deployed by publishers and intermediaries to increase the clearing price, which in turn create a fragmented and duplicated market. What follows is a list of some of the most common tactics contributing to where we are today.
COMMON BUY-SIDE AND SELL-SIDE TACTICS IN A COMPLEX MARKET:
DSP Bid Withholding
Since the early days of programmatic, many DSPs have been unwilling to submit all advertiser bids for any given auction to an SSP. The DSP will instead hold an internal auction before submitting to the external auction. When the DSP withholds bids from the external auction, it is acting like a pricing cartel, deflating demand and incenting sellers to extract demand and price through other trading methods.
Publisher Ad Server Daisy Chaining
Publishers daisy-chain sources of demand, including SSPs and ad networks, in their ad servers. Each source of demand is given a floor price, and an impression opportunity falls down the daisy chain (also called a waterfall) at different floor prices until it is eventually picked up. In a daisy chain, an impression opportunity can be made available anywhere from one to six times, depending on how long it takes to get purchased.
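An illustrative simulation of that waterfall, with invented demand sources and floors, shows how a single impression is offered multiple times before it sells:

```python
def run_daisy_chain(demand_sources):
    """Offer the impression down the chain until a source clears its floor."""
    for calls_made, source in enumerate(demand_sources, start=1):
        bid = source["get_bid"]()          # ask this source for a bid
        if bid >= source["floor"]:
            return source["name"], bid, calls_made
    return None, 0.0, len(demand_sources)  # impression goes unsold

chain = [
    {"name": "SSP A",      "floor": 5.00, "get_bid": lambda: 3.10},
    {"name": "SSP B",      "floor": 3.00, "get_bid": lambda: 2.40},
    {"name": "Ad network", "floor": 1.00, "get_bid": lambda: 1.25},
]
winner, price, calls = run_daisy_chain(chain)
print(winner, price, calls)  # Ad network 1.25 3 -> the same impression was offered 3 times
```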
Publisher Header Bidding
With the adoption of header bidding, an impression opportunity can now be seen first in the header and then subsequently in Google AdX and/or the publisher’s daisy chain. When a publisher uses a header tag with multiple SSPs, the header creates duplication on its own; but once the header auction is resolved, the impression is still present in Google AdX and then again in the publisher daisy chain if floors aren’t met. The net effect is that some impression opportunities are now duplicated by a factor of 5-30x thanks to header bidding adoption.
Authorized Reselling
In some cases, publishers will contract with resellers to extract additional yield and bid density from the market. These network resellers typically have access to unique demand that sits outside the SSPs. They might also engage in tactics like format arbitrage, as happens when resellers receive a display impression from a publisher and attempt to re-sell it as a video impression at a higher price (in which case, the video ad is wrapped in a video player and inserted into the display ad slot).
Unauthorized Reselling
In tactics similar to authorized reselling, some intermediaries will harvest demand from multiple sources and arbitrage market inefficiencies like network latency and UUID syncing to fill a publisher’s supply from any source of demand. The key difference is that the unauthorized reseller has not contracted with the publisher or advertiser. The publisher still receives higher yield than it would have had the unauthorized reseller not sold the inventory, but the unauthorized reseller has also taken a fee and compounded the ad tech tax.
SSP-Directed Bid Duplication
SSPs will increase their yield by farming out publisher calls to alternate sources of demand, like other SSPs that have agreed to help each other duplicate and underwrite publisher supply.
SSP-Directed Re-solicitation
As an antidote to the DSPs’ withholding bids, some SSPs will ping the same DSP multiple times for the same ad impression in order to extract higher bid density from the DSP.
SSP Bid Tiering
Bid tiering is a form of SSP-directed re-solicitation in which the SSP sends the same request to the same DSP with different price floors. The tactic still offers some form of price reduction, as it can contractually be defined as a second-price auction, but the end result is closer to a first-price auction.
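A toy example, with invented floors and a hypothetical DSP that bids its true value whenever the floor clears, shows how the effective price creeps toward first price:

```python
def dsp_response(bid_request_floor, true_value=8.00):
    """A second-price-minded DSP bids true value whenever it clears the floor."""
    return true_value if true_value >= bid_request_floor else None

def tiered_solicitation(floors):
    """SSP sends the same impression to the same DSP at several floors and
    keeps the highest floor that still drew a bid as the clearing price."""
    cleared = [f for f in floors if dsp_response(f) is not None]
    return max(cleared) if cleared else None

# A true second price might be, say, $4; tiering at $4/$6/$7.50 clears at $7.50,
# just under the buyer's $8 true value -- effectively a first-price outcome.
print(tiered_solicitation([4.00, 6.00, 7.50]))  # 7.5
```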
SSP-Declared 1st- or 2nd-Price Auction Dynamics
Announcements around auction type declaration by OpenX, AppNexus and Rubicon now present DSPs with information they need to make a more informed decision on how to bid for the two types of exchanges that these SSPs now support. Some of the declared first-price auctions have some sort of price reduction built in depending on the given auction from the SSP (and others do not), making it very difficult for DSPs to understand the true value of any impression opportunity they’re bidding on.
TOOLS TO OPTIMIZE FOR SUPPLY PATH FRAGMENTATION:
The backdrop created by these buying and selling yield tactics has long supported a fragmented market position. Any method to treat this fragmented market generally falls under the phrase Supply Path Optimization. The most common of those treatments are explained below.
Bid Stream Reduction
This treatment involves blocking undesired supply paths through manual heuristic blocking, SSP-side filtering, early exit or third-party bid-shaping technology. When executed well, a DSP should be able to effectively reduce the bids it listens to without harming its ability to spend campaign budgets. When executed poorly, the DSP will miss out on impression opportunities it should have otherwise tried to purchase.
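As a minimal sketch, one plausible heuristic filter might look like the following; the path key and thresholds are invented for illustration, not any DSP’s actual rules:

```python
from collections import defaultdict

win_rate = defaultdict(lambda: {"bids": 0, "wins": 0})

def should_listen(bid_request, min_bids=1000, min_win_rate=0.001):
    """Drop supply paths we persistently lose on; keep everything we
    haven't yet sampled enough to judge."""
    path = (bid_request["ssp"], bid_request["publisher"])  # one possible path key
    stats = win_rate[path]
    if stats["bids"] < min_bids:
        return True  # still learning this path
    return stats["wins"] / stats["bids"] >= min_win_rate

def record_outcome(bid_request, won):
    """Update path statistics after each auction outcome."""
    path = (bid_request["ssp"], bid_request["publisher"])
    win_rate[path]["bids"] += 1
    win_rate[path]["wins"] += int(won)

req = {"ssp": "ssp_a", "publisher": "example.com"}
print(should_listen(req))  # True while the path is still being sampled
```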
Fair Price Discovery
This is a buying strategy that adapts a bid response designed for a second-price auction to a fragmented market or first-price auction. These systems are designed to help the DSP pay a fair price for a given impression opportunity, whether the auction is declared first-price or its dynamics are unknown due to fragmentation.
Header Bidding Fill Rate Improvement
A common problem in header bidding is that many auctions are cleared as second-price in the header but don’t win in the ad server, thereby suppressing publisher yield. Consider the scenario sketched below.
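The numbers here are hypothetical, but the mechanics are the ones just described: the header clears at a reduced second price, and that reduced price is what competes in the ad server.

```python
# Hypothetical scenario: a bid that wins the header as second price,
# then loses in the ad server.
header_bids = [5.00, 2.00]                              # top two header bids
header_clearing_price = sorted(header_bids)[-2] + 0.01  # second price + $0.01 = 2.01
ad_server_competing_bid = 3.00                          # e.g., a direct or AdX line item

if header_clearing_price < ad_server_competing_bid:
    # The $5.00 buyer loses at $2.01 to a $3.00 line item it could have beaten,
    # and the publisher earns $3.00 instead of a price closer to $5.00.
    print(f"Header loses at ${header_clearing_price:.2f} vs ${ad_server_competing_bid:.2f}")
```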
Ads.txt
Ads.txt is an IAB-backed policing tactic that lets publishers announce the supply players (networks, SSPs, etc.) they have authorized to resell their inventory, so DSPs can determine whether they are buying that inventory through legitimate channels.
Bid Flattening to Remove the Ad Tech Tax
This DSP bid strategy identifies the supply path of a given impression and, depending on the DSP’s need for that impression at the given time, enables the DSP to bid the same amount for the given impression across multiple supply paths. This method prevents the DSP from overpaying on any given supply path. All else being equal, the supply path that extracts the lowest ad-tech tax will win the auction, as that path should be able to present the highest bid to the publisher.
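A stripped-down sketch of that flattening logic, with invented take rates, shows why the lowest-tax path should win when the advertiser price is held constant:

```python
# Bid flattening: the DSP decides one advertiser price for the impression,
# and each path's fee determines what reaches the publisher.
advertiser_bid = 10.00  # the DSP's value for this impression, bid identically everywhere

supply_paths = {           # fee rates are invented for illustration
    "SSP A (direct)":     0.12,  # 12% take rate
    "SSP B via reseller": 0.25,
    "Exchange C":         0.18,
}

net_to_publisher = {path: advertiser_bid * (1 - fee) for path, fee in supply_paths.items()}
winner = max(net_to_publisher, key=net_to_publisher.get)
print(winner, net_to_publisher[winner])  # SSP A (direct) 8.8 -- the lowest tax wins
```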
There are naive approaches to SPO that shut down given supply paths simply to select low-fee paths, but this approach is dangerous. It trades off price at the expense of value, ignoring the dynamic and interwoven nature of a supply path engineered to maximize publisher yield.
The best approaches to supply path fragmentation, I'd argue, lie in how the buyer chooses when to bid and how much to bid, not in which paths to turn off or on. To build an effective solution, all of the supply-side tactics mentioned above need to be addressed. This falls squarely into the overlooked field of buyer bid strategy, which to date has been outsourced to DSPs with almost no buyer scrutiny of, or even insight into, how the function is handled inside a DSP. I believe it is time for this to change and for buyers to own bid strategies that drive toward their own KPIs.
I also believe that buyers will begin to focus on value over price, and as they do, the brands and agencies that can best control their bid strategy, whether within or across DSPs, will see the biggest gains. It is time for buy-side principals to engage and control their own optimization and bid strategy to maximize outcomes on their own behalf.
This category is one of my favorite topics. I'd love to hear additional ideas and suggestions from other people who are passionate about the topic.
I'm also happy to add any definitions or tactics that are not covered and I will republish and give you credit.
At ATS New York 2016, Nathan Woodman, general manager, Demand Solutions, IPONWEB delivered a keynote speech on the topic of first-party machine learning in programmatic, with a request to the industry: can we create a framework for ‘Open Machine Learning’?
With data growing at an unprecedented rate, we exist in a situation of extreme complexity, and the most complex models, with 50 million potential outcomes, far exceed human comprehension. Every day we create 2.5 quintillion bytes of data: more people are connected to the internet than ever before, more connected devices carry more sensors, and the meteoric rise of programmatic and RTB is adding more marketing-related data. It adds up fast. IDC estimates the amount of digital data being collected will grow from 4.4 zettabytes in 2013 to 180 zettabytes in 2025, with one zettabyte being the equivalent of one trillion gigabytes. “So we’re talking 180 trillion gigabytes’ worth of data by 2025”, explained Woodman. “That’s what people really mean when they say Big Data.”
According to Forrester: “There’s too much data. Marketing departments can’t deliver the analytics and deploy that level of agility that customers require. We’re reaching the limits of human cognitive power.” However, Woodman believes we’re already far beyond that: “A machine is infinitely superior when it comes to processing and applying information. Data is only really valuable if you can use it to identify patterns and make decisions about who to target, with what creative, in what context, and at what price, in real time”, explained Woodman. “When we’re talking about zettabytes of data, that task is simply beyond the realm of human comprehension and capability.”
Enter Machine Learning
This is where machine learning comes to the fore. The complexity spans three fields: data, analytics, and machine learning. Analysis involves processing the data for human consumption: “It dumbs it down into enough variables to be understood by a human”, explained Woodman. “Machine learning doesn’t dumb down data, it just applies it to a goal. We don’t know what the machine is doing, we just know it’s achieving its goal.” Machine learning looks for patterns among massive data sets, uncovers hidden insights, constructs algorithms to make data-driven predictions or decisions, acts in real time, and grows and changes when exposed to new data. While around since the 1950s, it is only really gaining popularity today because of the vast swathes of data we have available.
Machine learning is already redefining digital advertising, and programmatic specifically. According to Juniper Research, USD$3.5bn (£2.4bn) is being funnelled into machine learning today and that is set to increase to a phenomenal USD$42bn (£28.7bn) in 2021.
A Brand’s Competitive Advantage
Referring to the running theme throughout ATS New York of ‘Bring Your Own Algorithm’ and how brands should be building their own, Woodman explained the importance of machine learning to a brand’s competitive advantage: “Brands need to have their own machine learning and not use somebody else’s.” And, according to Woodman, a number of companies can already build first-party machine learning models on behalf of their clients, citing IBM Watson, TensorFlow, and PredictionIO as the key players in this space. Woodman explained that machine learning and algorithms are currently owned by the walled gardens. Brands are giving all of their data to the likes of Google and Facebook, who use it to build machine learning systems – despite the data belonging to the brands, the walled gardens own the black box responsible for their performance. “If brands are in charge of their own machine learning”, explained Woodman, “performance may be inferior, but it becomes their black box.”
What is the real competitive advantage of first-party machine learning? According to Woodman, the benefits are obvious, and it is both possible and necessary for brands to fully adopt machine learning systems. First-party machine learning brings unique first-party audience and media data into decisioning rules; it executes against a brand’s custom or blended KPI, modifying that KPI as the machine learns what works and what doesn’t; it creates distinct derivative attributes based on the brand’s knowledge of target segments; and it helps avoid fraud by optimizing to custom, hard-to-game KPIs.
“At this point, machine learning is mission-critical for marketers”, said Woodman. “That’s why you see major players like Google, IBM, and Apache entering the field and providing open tool kits that are empowering brands to deploy machine learning within their enterprises in a variety of ways.” Even DSPs are entering the space, offering their own form of programmable machine learning tools, such as AppNexus’ Bonsai decision trees or The Trade Desk’s bid multipliers – these players are empowering brands to develop first-party machine learning algorithms that leverage their own unique data sets and KPIs to buy media programmatically. However, this still brings its challenges in the form of closed implementation. “It’s very much a closed ecosystem”, explained Woodman. “The algorithms you’re able to build in one DSP are closed and not portable. What you build in one DSP can’t be used in another.”
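Bonsai is AppNexus’ own tree language, so the following is only a rough Python analogue of the kind of first-party decisioning such tools let a brand express; the segments and prices are invented:

```python
def brand_bid_tree(req):
    """Walk a hand-built decision tree to a bid price (CPM, in dollars)."""
    if req["user_segment"] == "loyal_customer":
        return 0.00                      # no bid: likely to convert anyway
    if req["device"] == "mobile":
        return 4.50 if req["recency_days"] <= 7 else 2.00
    if req["context"] in ("news", "sports"):
        return 3.25
    return 1.00                          # default branch

print(brand_bid_tree({"user_segment": "prospect", "device": "mobile", "recency_days": 3}))  # 4.5
```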
Open Machine Learning
This is where a request for Open Machine Learning comes into play. “We think the market is ripe and needs Open Machine Learning”, said Woodman. “It’s similar to what OpenRTB did for the bid stream. There is a need for a machine learning protocol – a superior class of learning model – for the industry to capture the potential USD$42bn (£28.7bn) of machine learning spend in the future.”
Woodman is attempting to socialize an idea: “We’re putting the idea out there and want the industry to adopt it. It would be a model output allowing algorithms to be transported across DSPs. The closest we have now exists in the statistical science space – a markup language called PMML, which can be an output from tools like SPSS.” According to Woodman, it is not as advanced, but there are protocols in the statistics industry that could be modified to handle the complex model classes that work in the programmatic space.
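To illustrate the portability idea (only as a sketch, not a proposed spec), a model artifact could be as simple as a serialized set of coefficients that any platform understanding the schema can score identically; the schema and numbers here are invented:

```python
import json, math

model_artifact = {
    "model_type": "logistic_regression",
    "target": "conversion",
    "intercept": -6.2,
    "coefficients": {"recency_days": -0.08, "site_category_sports": 0.45},
}

def score(artifact, features):
    """Any DSP that understands the schema can reproduce the same score."""
    z = artifact["intercept"] + sum(
        coef * features.get(name, 0.0)
        for name, coef in artifact["coefficients"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))  # logistic link

portable = json.dumps(model_artifact)  # ship this across platforms
print(score(json.loads(portable), {"recency_days": 2, "site_category_sports": 1}))
```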
If the industry is keen to adopt this, it won’t happen overnight. Woodman explains that if the concept of Open Machine Learning sticks, it will take three to four years to adopt and the enthusiasm with which brands embrace machine learning will depend on its transferability across multiple systems. “To receive a chunk of that USD$42bn (£28.7bn) in ad spend, the industry needs an open environment for brands to embrace this approach.”
There’s a scene in the 2002 Tom Cruise movie, “Minority Report,” that has become legendary in marketing and ad tech circles.
In the scene, Cruise’s character, John Anderton, walks through a crowded mall. Retina scanners and other technologies identify Anderton in real time and serve him entirely personalized ad experiences and product messages. Upon walking into a Gap store, a hologrammatic shopper bot immediately recognizes him, asks about previous transactions and recommends additional items he might like.
For advertisers, this looks like the holy grail of marketing: true one-to-one communication – though in a more dystopian, big brother kind of way. For mar tech companies, it represents the challenge everybody is trying to solve, and fast.
This race to personalization has resulted in an avalanche of disparate point solutions tackling highly specific pieces of the much larger problem: CRMs, data management platforms (DMPs), demand-side platforms (DSPs), email service providers, recommendation engines, search engine marketing bidding platforms, tag management solutions, measurement and attribution systems, marketing automation platforms and social listening tools – just to name a few.
The average enterprise uses more than 12 of these different solutions to power its digital marketing efforts, with some companies using more than 30 unique tools, according to a 2015 report from Winterberry Group and IAB. With 70% of companies planning to increase their mar tech spending in 2017, one can only assume those numbers will climb even higher.
Most of these tools are smart, powerful and effective at doing what they’re designed to do. Most leverage machine learning to create value, drive performance or both. What happens, though, when they aren’t talking to each other? How smart can they be when they don’t see what the other tools are doing?
Take a DSP, for example. A standard machine-learning application in a DSP may record and use a variety of attributes to decide which impression to buy and how much to bid on it. One such attribute might be the length of time since a user last saw an ad impression.
If an advertiser is using multiple DSPs – as most are – how would any single DSP know the last time a user was exposed to an ad without understanding what the other DSPs are doing? Now imagine compounding this situation with social, search, email, web visits, TV and all other forms of media exposure.
This lack of communication between systems creates massive data gaps, where each system is blind to the data of the other systems being used. This data blindness results in each system’s unique machine-learning applications making decisions and learning behaviors based on incomplete or inaccurate data.
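One way to picture the missing capability is a shared exposure log that every buying system writes to and reads from; the event shapes below are invented for illustration:

```python
from datetime import datetime

exposure_log = [  # one log all buying systems write to and read from
    {"user": "u1", "channel": "dsp_a", "ts": datetime(2017, 3, 1, 9, 0)},
    {"user": "u1", "channel": "email", "ts": datetime(2017, 3, 1, 12, 30)},
    {"user": "u1", "channel": "dsp_b", "ts": datetime(2017, 3, 1, 14, 0)},
]

def time_since_last_exposure(user, now):
    """Recency across every channel, not just one DSP's own impressions."""
    seen = [e["ts"] for e in exposure_log if e["user"] == user]
    return (now - max(seen)) if seen else None

print(time_since_last_exposure("u1", datetime(2017, 3, 1, 15, 0)))  # 1:00:00
```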
DMPs came along to fill this void, but the application of the data contained in DMPs has largely failed because the preferred method of distribution has been heuristic rules that guide segmentation, followed by one-way pushes of one-dimensional audience data into existing DSPs. This process flattens the contrast of the underlying data set and degrades performance.
One approach to solving the data gap problem, at least in the programmatic channel, is to use a single stack with an integrated DMP and DSP. This seems viable, but most brands and agencies use multiple DSPs to achieve mass reach and drive efficiencies. It also doesn’t solve for the problem of nonprogrammatic media channels.
Alternatively, a brand or publisher could build its own stack, tightly integrated with its own data sets. This would require an enterprise to build, replace and maintain the 12-plus tools it currently uses with homegrown technology – a daunting task, to say the least.
Perhaps the most viable path lies somewhere in the middle, between consolidation and building one’s own stack. With this approach, the same data that is already being captured in a DMP can be used to build a holistic decisioning engine that looks across an increasing number of digital marketing tools and consumer touch points to decide what message to present to what user in what channel.
The result is an enterprise-centric learning system that sits atop a series of ad tech stacks and executes against marketing directives in as close to real time as the points of distribution allow.
This is not yet another mar tech product, but rather a process. It requires analyzing current available data sets, including gaps between systems, as well as the distribution interfaces of the existing media channels. Think of it as marketing stack management, powered by enterprise machine learning (bring on the acronyms).
Wider adoption and support for open machine learning or the concept of brand-controlled decisioning algorithms (brandgorithms) that are portable across distribution channels would only accelerate this process. But first, brands and agencies need to ask for this level of openness of their mar tech and ad tech vendors.
Only when all channels are considered holistically can media decisions be made intelligently, bringing the promise of “Minority Report” and one-to-one marketing closer to reality.
AdExchanger Podcast: Open Machine Learning: Nate Woodman Says Brands Will Eventually Own Proprietary Machine-Learning Models
According to Nate Woodman, GM of demand solutions at IPONWEB, the deployment of brand data in the media-buying arena is at an early stage. His pet thesis: Now that CRM activation in programmatic is common, the next challenge will be the development of proprietary machine-learning models that are owned and controlled by brands.
"Most CRM data is activated through a DSP," Woodman says in this latest episode of AdExchanger Talks. "That supports a segmentation strategy, but to drive real performance out of a system requires a machine-learning model, which can hit performance targets in a vastly superior way to segment-based buying."
A tiny club of big marketers, such as Netflix, has initiatives in place today around proprietary algorithmic IP, and other performance-focused verticals, like banks, may be positioned to follow. But it's a steep climb.
"The challenge to the industry, and it's a daunting one, is to find a way to spread proprietary algorithms across programmatic platforms," Woodman said. "Most brands aren't even close to realizing this vision, but some are making overtures in the direction of proprietary machine-learning models.”
He added, "I don't know that it's going to go there, but it's a vision."
Also in this episode: Woodman talks about IPONWEB's unique place in ad tech history, its current strategy and the evolution of the agency trading desk model.
Lessons in Measurement and Attribution from World War II: Survivorship Bias And Why We Need Incremental Measurement
During World War II, Allied bomber planes faced a critical design problem. They were slow, lumbering and constantly shot down. The bombers needed to be reinforced with armor, but covering an entire plane made it too heavy to successfully fly a mission. Reinforcing them only in their most vulnerable areas, however, could solve this challenge.
Naval engineers mapped the location of bullet holes that had struck the returning planes to reinforce areas most often hit. Statistician Abraham Wald noted that this analysis only looked at planes that survived since it was impossible to assess damage on planes that were shot down. Wald concluded that the areas hit by bullets weren't vulnerable – these were areas where the planes could get hit and still return safely. Instead, they needed to secure areas where returning planes hadn’t been shot since those were more likely to fail if hit.
The naval engineers’ original plan to only reinforce the bullet-ridden areas is called survivorship bias. This sort of flawed thinking leads to false conclusions and misguided decision-making by only looking at people or things that survived some event or process, while overlooking those that did not.
In digital marketing, a similar mistake is being made with attribution. The current state of attribution is flawed because it only looks at completed campaigns to measure conversions and identify top performers. Traditional attribution models, such as last-click, first-click or multitouch, falsely assume that for a conversion to take place, an ad must be shown. This reasoning fails to acknowledge that certain conversions would have taken place regardless of ad exposure, and it causes even the savviest digital marketers to unjustly reward specific media partners and inflate ROAS.
Consider, for example, a campaign that is designed and executed to maximize post-click conversions. A sophisticated media-buying platform powered by machine learning is trained to find and target the lowest-hanging fruit – users most likely to convert – to ensure campaign goals are hit. In this case, the platform gets credit for the conversion.
In truth, many of those users would have converted without seeing an ad, perhaps because of seasonality or brand loyalty. But an intelligent system finds and targets these users excessively because it knows they have a high likelihood to convert.
Should marketers spend money to reach those people? Should revenue from those users be taken into account when evaluating ROAS? Or would marketers be better off redirecting those dollars to reach people who truly need to see their ad to convert?
There may even be some cases where users who would normally convert on their own would be less likely to convert after seeing an ad, potentially the result of misaligned creative or overexposure. But because traditional attribution models only measure and reward performance against impressions shown, the related campaign might still be deemed a success so long as it hit its CPA benchmarks.
This is obviously (and thankfully) not the case with every campaign and optimization strategy, or we would all be out of work. But, it does highlight the critical need for brands to rethink what they know about their audiences and how they plan and evaluate digital buys.
I get it: The notion that ad exposure is not always needed to drive business outcomes seems radical, especially for advertisers. But digital-first companies like Netflix are already adopting sophisticated technologies that enable them to more efficiently find and target users who drive incremental value, rather than those who would convert regardless. And traditional brands like Allen Edmonds are starting to question the real incremental value of their retargeting activities and reallocate those dollars to more efficient channels.
These companies have upended the way they think about digital advertising to gain greater clarity on ROAS and drive greater value for their businesses. They’re reinforcing their planes to ensure future survival.
Earlier this year at ATS NYC, Nate Woodman, GM, demand solutions at IPONWEB, gave a fascinating keynote presentation on incremental lift and demand-side attribution, a topic the industry flirts with but never really dives into. His goal was to plant a seed among the audience and prompt them to re-evaluate the disconnect between ad tech solutions and brand goals.
IPONWEB have been around for roughly 15 years; they came to market after Google and before Facebook and have seen many different ad tech companies come and go during that time.
IPONWEB have two primary products, U Platform and BidSwitch, which are built and maintained by IPONWEB for other players in the market. These products allow IPONWEB to plug into inventory from across the industry and thus provide a unique view of trading data.
Using their proprietary data, IPONWEB have been able to understand the incremental value of retargeting ad exposure in relation to search clicks. The image below, taken from ATS NYC 2015, shows that many of the retargeting campaigns run today do not benefit advertisers in the way that was intended. The research shows that it is not until one day after a search click that retargeting starts to provide incremental lift. However, the majority of display media budget is spent reaching consumers in the first six hours following a search click.
This pattern of spend and performance is rather unsettling. IPONWEB dug deeper into the data to understand why a DSP would spend money that does not create incremental value for advertisers. One hypothesis was that the incentives between the DSP, the agency and the brand are misaligned, which seems unlikely. Another was that the DSP didn’t have access to all the data and didn’t know these search clicks were occurring. Finally, and most probably, the DSP is doing exactly what it is told to do.
IPONWEB set up a controlled experiment. They segmented users into a number of categories: users who had visited a brand page and not purchased; users who had visited a brand page and purchased; users who were new to the site and unknown; and users who were not exposed to any of the ads (the control group). They then took those macro-segments and divided them again by exposure to different forms of marketing (email, retargeting, search, paid search).
For each segment they measured incremental lift from programmatic retargeting.
The traditional campaign (not designed to deliver incremental lift) spent USD$30,000 in media and generated USD$42,000 in post-click attributed revenue.
The hypothesis of the experiment was that exposing a user to an RTB ad increases the probability that this user converts / makes a purchase. Therefore, RTB adds a lift to the baseline of purchases that is influenced by external factors and other marketing channels.
The results, pictured below, showed that from a macro view RTB does create incremental lift (total users). Going deeper by segmenting the users shows that incrementality can be negative. Serving ads to people who had been to the site before but not purchased (previous pure visitors) was not just a waste of money; it actually made them less likely to purchase compared to the control group.
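The arithmetic behind such a read is simple. Here is a minimal sketch with placeholder counts (not IPONWEB’s actual experiment data) showing how negative lift surfaces:

```python
# Placeholder (users, conversions) counts per arm, invented for illustration.
segments = {
    "previous pure visitors": {"exposed": (10000, 180), "control": (10000, 220)},
    "new / unknown users":    {"exposed": (10000, 130), "control": (10000, 100)},
}

for name, arms in segments.items():
    (n_e, conv_e), (n_c, conv_c) = arms["exposed"], arms["control"]
    rate_e, rate_c = conv_e / n_e, conv_c / n_c
    lift = (rate_e - rate_c) / rate_c  # incremental lift over the unexposed baseline
    print(f"{name}: exposed {rate_e:.2%} vs control {rate_c:.2%} -> lift {lift:+.0%}")
# previous pure visitors: 1.80% vs 2.20% -> lift -18% (negative incrementality)
```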
In the TraderTalk video featured in this piece, Nate Woodman delves further into the sub-segments of users and the impact RTB ads had on their propensity to purchase, and answers the question: “What are the segments of users plus media exposure that do buy?”
Ad exchanges aren't second-price auctions.
About a year ago my colleague Edward Montes wrote a column titled “The First Rule of Advertising Exchanges – There Are No Advertising Exchanges” on ClickZ.
In the column, Ed concludes that there is no such thing as an ad exchange because of the lack of price transparency to the buyer, especially when the seller can see buyer bids.
I conclude that the second rule of advertising exchanges is that ad exchanges are not second-price auctions.
Most of the research I’ve seen around advertising auction types is focused on the Google paid search auction. The common misconception is that the Google exchange is a Vickrey-Clarke-Groves auction, but most experts conclude that it is in fact a generalized second-price auction. Both auctions support some type of price reduction where the winning buyer pays slightly more than the next highest bid.
The Vickrey auction is designed for the single sale of a tangible asset. The expected buyer behavior in this type of auction is for the buyer to bid the known value of what the good is worth to them. This is called the buyer’s willingness to pay. This behavior is expected because there is no risk of overpaying since the price is set by the value of the second place bid.
The GSP auction also offers price reduction; however, the asset is not a single expiring good but one that is sold multiple times. In the Google paid search auction, the multiple goods are the multiple winners of the auction, organized by order of ad slot.
Most search marketers know that the ad slot’s relative performance is mostly invariant. This means that the performance of ad slot two will be about the same as it is for ad slot one. There are branding reasons to be in ad slot one, but that is a feature that is not given much value by the search buyer, so the “performance” drives the valuation of the search buyer.
The behavior of a buyer in a GSP auction is different than in a Vickrey auction: the buyer should not bid their willingness to pay, but instead just enough to win at least slot three.
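For reference, a simplified GSP allocation (bids only, ignoring quality scores) looks like this:

```python
def gsp(bids, n_slots):
    """Each winner pays the bid of the bidder ranked one below them.
    Simplification: the lowest winner needs a next bid to set its price."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for slot in range(min(n_slots, len(ranked) - 1)):
        bidder, _ = ranked[slot]
        price = ranked[slot + 1][1]  # next-highest bid sets the price
        results.append((slot + 1, bidder, price))
    return results

print(gsp({"A": 4.00, "B": 3.00, "C": 1.50, "D": 1.00}, n_slots=3))
# [(1, 'A', 3.0), (2, 'B', 1.5), (3, 'C', 1.0)]
```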
In a traditional English auction, commonly called a first-price auction, the winning bidder is the max bidder and they pay what they bid. The expected behavior of the buyer in this type of auction is to bid below the buyer’s willingness to pay and just enough to win the auction but as little above the second place bidder as possible. This type of bidding involves a lot of uncertainty that fuels analysis into forecasting, valuation, and gamesmanship where the buyer tries to figure out the value of the asset to all likely bidders.
The English auction and the GSP auction both encourage a bidding strategy known as bid shading, in which the buyer hides their willingness to pay for the asset to avoid paying too much. The more uncertain the buyer is about how much to shade, the more likely they are to bid lower than they would in a Vickrey auction. The end result is a lower average bid for each ad impression than there would be under a Vickrey auction.
In the real-time bidding (RTB) ad exchanges, each impression has only one winner and there is also price reduction to the second highest bid, so you would expect buyers to bid their willingness to pay like they would in a Vickrey auction. However, there are alternate market dynamics at play that encourage RTB buyers to embrace bid shading.
Consider the distributions of winning-bid prices we observe across two exchanges: on Exchange 1 the distribution is roughly normal, while on Exchange 2 it is left-skewed. With such a vast difference, it is likely that sellers on Exchange 2 (left skew) are setting floor prices in line with our advertisers’ bids. When buyers bid less than average, the impression leaves Exchange 2 and ends up in a fill exchange (normal distribution) like Exchange 1.
These observations lead our algorithms, our traders and our advertisers to bid lower or outright block domain/exchange combinations where we observe cross-selling and/or odd behavior.
Over time, the publisher’s use of daisy chaining will lead more and more buyers to shade their willingness to pay and create a set of market dynamics that anchor the clearing price of digital media.
More transparent and market-driven exchanges will allow natural buyer competition to reveal an advertiser’s willingness to pay and naturally drive up pricing. This is where most analysts feel the RTB marketplace will go once the publishers feel there is enough buyer competition in the marketplace. This is not reality in the current state.
As a representative of media buyers, I am indifferent as to whether publishers truly adopt market-driven pricing. The more games media sellers play, the more advantage buyers with the data and analytics horsepower to react can and will take.
I want to leave this column with a question. There are a lot of complex decisions that need to be made inside filtering strategies and algorithms that only a few in the industry truly understand. There are different companies in our space: some represent buyers, some are networks, some are trading desks, some are sellers. There are also some that say they represent all. Which one represents you?
Cookie Deletion and Upper Funnel Targeting: What will the impact of cookie decay be on digital marketers?
A few months ago I read an article in Ad Age written by Jag Duggal, vice president of product management at Quantcast. I agree with most of what Mr. Duggal covers. In fact, I wrote a column for ClickZ that supports his second claim that CTR is a poor optimization metric.
It is Mr. Duggal’s first claim, about the impact of cookie deletion on upper-funnel targeting, that caused me to pause.
My company took a sample of ~100 million RTB cookie IDs on Day 0 and then scanned a discrete day at seven-day intervals for four weeks. The further we got away from Day 0, the less likely we were to see the cookies again.
(Author Note: Please do not interpret this data as a true cookie decay rate. We measured discrete days. It is very likely that cookies in the 100 million at Day 0 may have appeared on days that we did not measure. Therefore this estimate is directional and we do not present it as an actual decay rate.)
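For what it’s worth, the retention check itself is straightforward. Here is a minimal sketch of the calculation, with toy IDs standing in for the ~100 million:

```python
# Directional retention check, mirroring the methodology above: which
# day-0 IDs reappear on single scanned days at 7-day intervals.
day0_ids = {"c1", "c2", "c3", "c4", "c5"}  # stand-in for ~100M RTB cookie IDs
seen_on_day = {                             # IDs observed on each scan day (invented)
    7:  {"c1", "c2", "c4"},
    14: {"c1", "c4"},
    21: {"c4"},
    28: {"c4"},
}

for day, seen in seen_on_day.items():
    still_seen = len(day0_ids & seen) / len(day0_ids)
    print(f"day {day}: {still_seen:.0%} of day-0 cookies observed")
# As the note warns, absence on a scanned day is not proof of deletion.
```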
We categorized the data into multiple segments to see if we could find any significant skews. It turns out that Brazilians and Chrome and Mac users tend to hold on to their cookies a bit longer, but not enough to justify a deeper investigation.
We also studied a variable we call surfing behavior: how many times we see groups of cookies within a given time frame. These numbers showed a significant skew.
Chart 2 shows that we saw 20 percent of the users only once, 61 percent were observed at a low frequency, and 19 percent were observed at a high frequency.
Chart 3 establishes that there is a significant difference in cookie decay rates when classified by surfing behavior. In this case, the 0,1 surfing behavior category decays at ~2x the rate of the high surfing behavior category.
This data suggests that there is a small subset of browsers that are classified into 0,1 to low frequency buckets that churn their cookies at a much higher rate than the average browser.
The observation is supported by the 2007 comScore study on cookie deletion. In the study, comScore concludes that 7 percent of the browsers are responsible for 35 percent of the cookies over a 30-day period.
This is an important point. The impact of a small subset of what comScore calls “Serial Cookie Deleters” could be very significant. Here is an example to help illustrate:
Assumptions in the example for Chart 4: there are 100 total browsers; 80 of them hold stable cookies throughout; and the remaining 20 systematically delete and reset their cookies over a three-day period, generating a new cookie ID with each reset.
Chart 4 demonstrates that even if a small percent of the total number of browsers remove their cookies in a systematic manner, the total number of cookies available can be much higher than the actual number of browsers.
If you think through this a bit more, after three days there are 500 dead cookies (600 cookies – 100 browsers). This example illustrates that a small subset of browsers with high churn can make a huge impact on actual counts. The example is not real data, but the comScore data is likely much closer to current reality.
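A toy simulation in the same spirit makes the inflation easy to reproduce; the churn rate here is an invented parameter, not the article’s exact assumption:

```python
def simulate_cookie_counts(stable=80, deleters=20, resets_per_day=3, days=3):
    """Count distinct cookie IDs observed when a minority churns daily."""
    cookies = stable                             # stable browsers keep one cookie each
    cookies += deleters * resets_per_day * days  # each reset mints a new ID
    return cookies

total = simulate_cookie_counts()
print(total, "cookie IDs from", 80 + 20, "real browsers")  # 260 cookie IDs from 100 real browsers
```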
The real potential reach of the Chart 4 example is 100 browsers. Eighty of them are stable throughout the example. Any cookie-based targeting and tracking model will naturally skew toward stable browsers, as the cookie IDs of unstable browsers are quickly removed from the tracking and targeting pool.
What Does This Mean for Digital Marketers?
If the majority of cookie decay is the result of a minority of browsers that remove their cookies frequently, then the majority of browsers have stable cookies even though a significant minority of cookies may have a limited life.
According to the comScore study, any upper-funnel targeting window longer than 30 days would effectively eliminate 7 percent of the browser target audience. Mr. Duggal’s statement implies that upper-funnel segmentation models will be skewed toward the 93 percent of browsers that do not frequently remove their cookies. I feel this still fulfills the promise of upper-funnel cookie targeting.
It also means that the current state of cookie-based attribution slightly under-represents the top of the funnel in favor of the lower funnel, but again, not enough to throw away the practice entirely. Media buyers need to understand that upper-funnel results are undervalued and lower-funnel results are overvalued.
The most tangible impact is on unique reach and frequency calculations: real reach is vastly overstated, while real average frequency is greatly understated.
An educated media buyer should understand these results and their implications for what they buy and how they measure. I think the information in Mr. Duggal’s first statement is interesting and directionally correct, but not significant enough to entirely abandon upper-funnel cookie targeting and attribution models.