We keep hearing companies confess to spending huge sums on marketing without really knowing the value of what they are doing. It seems that under market pressure it is often safer to push for more sales through tried and tested 'spray and pray' marketing approaches than it is to agonize over petty details like: is our ROAS measurement accurate, incremental, and reflective of the contribution of all marketing channels? Here I will explore the broad reasons for this.
Playing it safe
The demands of business growth, competitors, seasonality, and company cash flow do not allow the luxury of waiting for perfect measurement, so companies tend to try multiple strategies at once in the hope that if they throw enough marketing budget at the wall, some of it will eventually stick. This broad marketing mix approach actually makes sense for spreading overall marketing budget risk, but does so at the expense of optimal ROI.
Yet in all these firms, there is usually at least some attempt at marketing effectiveness measurement. Measuring ROI / ROAS properly is very hard because it requires sound marketing attribution. The gold standard of marketing measurement is to run properly structured experiments with control and treatment groups and statistical confidence measures of uplift. Yet there are so many possible marketing strategies and channels that this is not practical when under market pressure. You cannot test everything at once, and often you cannot afford to 'go dark' on even a part of your market to create that all-important experimental control group.
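To make that 'gold standard' concrete, here is a minimal sketch of a two-proportion uplift test between an exposed (treatment) group and a held-out control group. All conversion counts are invented for illustration:

```python
# Hedged sketch: a two-proportion z-test of uplift between treatment
# (ad-exposed) and control (held-out) groups. Figures are hypothetical.
from math import sqrt, erf

def uplift_z_test(conv_t, n_t, conv_c, n_c):
    """Return (relative uplift, two-sided p-value) for conversion counts."""
    p_t, p_c = conv_t / n_t, conv_c / n_c
    p_pool = (conv_t + conv_c) / (n_t + n_c)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    # Two-sided p-value from the normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return (p_t - p_c) / p_c, p_value

uplift, p = uplift_z_test(conv_t=560, n_t=10_000, conv_c=500, n_c=10_000)
print(f"relative uplift = {uplift:.1%}, p = {p:.3f}")
```

Even this simple calculation shows why experiments are demanding: a 12% relative uplift across 10,000 users per group is still only borderline significant.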
If only there were a way, your ad spend would align with incremental sales impact and you would grow faster with greater marketing efficiency.
The paradox of cutting ROAS analytics to focus on growth
More scalable approaches to marketing measurement, such as marketing effectiveness analytics and attribution, are still complex and can be expensive to implement. You need to address issues of data quality, and you also need well-managed teams of analysts and data scientists. All this costs money and takes time. So perhaps it's not surprising that the requirements for sound marketing measurement take a lower priority than the more pressing requirement for sales growth. It seems paradoxical to treat measuring success as secondary to achieving it when there is so much growth to gain, but this is how many organizations regard investment in marketing measurement and analytics.
Given the complexity, advertisers fall back on softer, media-independent measures of ad effectiveness such as surveys of brand awareness and purchase intent. These softer measures provide reassurance that ad money is not completely wasted, but they do not actually attribute any value or ROI to different marketing channels, and so leave major questions unanswered.
Actor fragmentation and lack of team focus
Another major reason why companies waste money on marketing is the fragmentation of marketing teams and agencies. Channels as diverse as TV, search and social require very different strategies and skillsets for sound execution. For larger brands and larger budgets, specialization within marketing channels makes sense, but too often this comes at the cost of poor integration between channels. Once again measurement tends to take a low priority. By their nature, different marketing channels have different systems of measurement and so increase the cost and complexity of getting to an integrated view.
Agency marketing measurement is of mixed quality. It can sometimes be excellent but at heart, agency measurement has a conflict of interest, whereby the agency is essentially marking its own homework. Such bias is rarely malicious or deliberate but more often is a case where there is a systematic tendency to prefer more positive accounts of media effectiveness.
Once again if different agencies are handling your different marketing channels then you cannot expect much insight into how these channels work together. The result is marketing inefficiency through wasted and duplicated spend and also missed opportunities for finding synergy between channels.
The challenge of cross-channel measurement is solvable, but it requires focus and resources at the level of senior management. The CMO typically needs support from engineering and finance to push through a properly integrated approach. In many organizations, there is a lack of understanding of the nature of the challenge. It's often not that there isn't enough data, but that there is a lack of expertise around how to leverage that data for improved decision-making in marketing. Without that expertise, the effort is likely to fail, since this is not easy to do well. Sound multi-channel measurement means different things for different businesses and so often requires a unique approach for each organization. Just as soon as you think you have nailed it, business and market conditions change.
Do not let perfect be the enemy of good
One piece of advice we give is just to stick with it and make a gradual transition to a more data-driven approach. It has to be better to make decisions within a rational data-driven framework than to simply burn the marketing budget. Some analysis and experimentation are better than none, and more is better still. Strive to improve and do not let perfect be the enemy of good.
The basic idea could not be simpler: money wasted on an activity that is not working can be better spent leveraging activity that IS working. This increases marketing efficiency and sales growth.
Be less wrong!
Gabriel Hughes PhD
Can we help unlock the value of your analytics and marketing data?
Metageni is a London UK-based marketing analytics and optimisation company offering support for developing in-house capabilities.
Why do companies waste millions on marketing? (Gabriel Hughes, 19 November 2018)
Attribution across devices is one of the major measurement challenges for digital advertising particularly affecting how mobile and upper-funnel activity is valued. Many choose to ignore the problem and plump for guesswork. Is this wise? And what are the best strategies for dealing with this key source of measurement bias?
What exactly is the problem here?
The challenge is simply that we cannot measure the marketing influences a user was actually exposed to before a sale when that user switches devices. For small purchases, most people research and buy in a single visit on a single device. But for a larger, more complex purchase, such as a holiday, a TV, or an insurance policy, often a little research is done on the phone, maybe a bit more on a tablet, and then the purchase might be made on a laptop.
A website visitor is anonymous unless they explicitly sign in at some point. So to be visible as the 'same' person on more than one device, a user already has to be an existing customer. Heavily used online services like Amazon, Google and Facebook have done a good job of streamlining user management, but the norm for the rest of the internet is that sites cannot tell whether visits from different devices are made by the same person or by different people. This is a major challenge for advertising measurement, since an ad click followed by a purchase on a different device makes the ad show up as leading nowhere, with no apparent ROI.
This measurement bias generally gets worse with more complex and expensive purchases which involve ‘upper funnel’ channels and have longer multi-touch paths to purchase. The early touchpoints are already penalised by last-click attribution, and then the cross-device challenge penalises them even further.
How big is the issue?
Comscore produces global stats on this using their usage panels and reports that globally around 46% of users use multiple devices within any given month, but the figure is much larger in more developed markets, at around 50-60% in countries including the US, Germany and the UK. The more people use multiple devices in this way, especially during a purchase process, the greater the issue.
Although the growth in devices is a relatively recent phenomenon, the tracking issue is not new. For as long as internet use has been the subject of research, analysts have worried about 'home versus work' use, whereby someone might do some shopping research during a lunch break and then pick it up later at home – in fact, this is still a major measurement issue. Also note that the cross-device issue is, strictly speaking, a cross-browser issue: a user who switches browser or app on the same device cannot usually be linked across events either.
The bias in marketing measurement is clear: user journeys appear much shorter and less multi-touch than they really are.
Most companies still overvalue the last click and undervalue earlier touchpoints. With cross-device switching, even a first click positional model will give too much credit to the last click. This is because a single click is both first and last, and for many advertisers, these are the largest single type of user journey visible in their data. In reality, many of these single touch journeys will be from users who have visited before, maybe even very recently, but just on another device.
Multi-touch journeys will also get broken by a switch to a different device. Maybe you make 2-3 clicks doing some research on your phone, then 2-3 more choosing and then purchasing on another device, with a short break in between. Once again the first few touches via upper funnel channels get undervalued and their true contribution to marketing ROI is partially hidden.
How to ‘solve’ cross-device attribution
Even just thinking about taking this issue into account is a major step that the majority of advertisers don’t take.
The first step is to consider how big a problem it is for your business specifically. As with other attribution issues, the more considered and complex purchases, often the higher value ones like holidays, high tech or financial services, tend to have a long path to purchase and therefore a greater attribution challenge. You can use published research to get cross-device estimates, but it is better to get some idea of your own customers' cross-device behaviour, for example by looking at the cross-device reports in Google Ads, which leverage Google's own cross-device tracking.
Many companies have some customer tracking across devices thanks to user logins, sign-up apps, and email/CRM – and while this data is heavily skewed towards purchasers (more on this below), it does at least provide a window into your customers' cross-device behaviour, which you can use to explore just how big a challenge this is for your attribution.
Since most companies just ignore the issue, educated guesswork may actually be a better-than-average approach. In an attribution course which I periodically run, we do an exercise where participants estimate the size of the cross-device bias, simply by considering the proportion of sales that are likely to be affected by the issue and using this estimate to upweight upper-funnel attribution models. Maybe you guess that around 40% of your sales involve multi-touch cross-device journeys. This suggests that when comparing first and last click models, the shifts that occur for each channel, whether up or down, are weaker than they should be – in reality shifting by a factor of a further 40% or so.
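This back-of-envelope exercise can be written down in a few lines. The numbers below are purely hypothetical, and the linear upweighting is just one reasonable way to apply the estimate:

```python
# A sketch of the upweighting exercise described above.
# All figures are hypothetical illustrations, not benchmarks.

def adjust_for_cross_device(last_click_share, first_click_share,
                            cross_device_rate=0.40):
    """Scale up the observed first-vs-last click shift for a channel,
    assuming `cross_device_rate` of journeys are broken by device
    switching and so hide part of the true shift."""
    observed_shift = first_click_share - last_click_share
    adjusted_shift = observed_shift * (1 + cross_device_rate)
    return last_click_share + adjusted_shift

# Example: a display channel earns 10% of credit under last click and
# 16% under first click; adjusting for broken journeys pushes the
# first-click estimate higher still.
print(adjust_for_cross_device(0.10, 0.16))
```

Crude as it is, this kind of adjustment at least puts a number on the direction and rough size of the bias, rather than ignoring it.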
This kind of simple analysis may be enough for you to give an apparently low ROI channel the benefit of the doubt, as your estimates could show the channel driving more upper-funnel activity than initially appears.
Device graphs and other technical solutions
Technical solutions mainly fall into 'deterministic' or 'probabilistic' methods, and are generally a mixture of both. This is not as scientific as it sounds. Deterministic means you can actually track a user using some kind of technology, while probabilistic means you make inferences (guesswork) as robustly and correctly as possible given the data you have. Crucially, the probabilistic approach depends on some level of deterministic linking, since it relies on using information about 'known' cross-device behaviour to try to infer the unknown cross-device behaviour.
So the basic idea is to link as many users as you can using a tracking database, and then make a well educated statistical guess as to the rest.
When you consider a cross-device solution you will no doubt encounter platforms that promise you can leverage their massive 'device graph' database to join users up. It sounds like they have all the information you need. A 'graph' in this context means a dataset describing a network of relationships, in this case between different devices. The tactic employed by these companies is to draw on data from multiple companies and networks to map cookies to a wide set of more robust and persistent login-based IDs. They therefore 'know' that when there are visits from two different browsers, they actually belong to the same user: this is deterministic linking.
Technology providers also use their linked data to train a model to predict what happens when these ID mappings are not available: which visits are likely to be associated with other visits from other browsers and locations, following a similar pattern to that observed in the deterministic linking. This modelling process is called probabilistic linking.
Before you decide to shell out the large sums required to use these solutions, there are some really major challenges you need to be aware of.
GDPR and user consent, and other challenges
First, the only way a third party can track someone who is not logged in is by mapping IDs based on shared logins and tracked users from other sources. This type of large-scale mapping is almost certainly a violation of user privacy and identity under the GDPR legislation, and it is only a matter of time before these platforms have to delete this data. User data which has been properly consented to is likely to be very small relative to the total universe of users out there. After all, would you consent to a company you deal with sharing usable cross-device data with a whole network of other companies?
Second, the data gets out of date quite quickly. Users frequently change logins and suppliers, and also change and upgrade devices. So a large proportion of users who show up have not already been linked to the device graph ID set, which is only ever a small subset of the total number of users who log in. They cannot track everyone, so, for the most part, they have to estimate.
This brings us to the third issue – and this is actually the biggest one – which is that claims about probabilistic linking are almost certainly overstated. Claims about highly accurate matching rates tend to fudge the difference between precision and recall (look these terms up if you want to understand more). When you dig into the problem, a basic fact hits you square in the eyes: there are many occasions when users do something and then switch devices, and many occasions where users do the same thing, but then do not switch devices. No amount of probabilistic modelling can change this fact.
For example, suppose I see 100 people make 4 clicks on a mobile device, and my data tells me that there is a 3% chance that their next click is on a desktop. This suggests that 3 people should now be modelled as switching to desktop. But which 3 people? There is no way of knowing from the data you have. If you knew, you wouldn't be using probabilistic linking in the first place! In technical terms, the 'recall' is fine, but the 'precision' is very weak.
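A quick simulation makes the point. Here a 'linker' knows the true switch rate but has no signal about which individuals switch, so it flags users at random; the population size and rates are illustrative assumptions:

```python
# Simulating a probabilistic linker with no individual-level signal.
# All rates are illustrative assumptions, not real match rates.
import random

random.seed(0)

N = 100_000          # users observed making 4 clicks on mobile
SWITCH_RATE = 0.03   # true share whose next click is on desktop
truth = [random.random() < SWITCH_RATE for _ in range(N)]

def random_linker(flag_rate):
    """Flag users as device-switchers at random, with no real signal.
    Returns (precision, recall) against the simulated ground truth."""
    pred = [random.random() < flag_rate for _ in range(N)]
    tp = sum(t and p for t, p in zip(truth, pred))
    return tp / sum(pred), tp / sum(truth)

# Flag enough users to achieve high recall...
precision, recall = random_linker(0.90)
# ...and precision collapses towards the 3% base rate.
print(f"precision = {precision:.3f}, recall = {recall:.2f}")
```

However many users are flagged, precision stays pinned near the base switch rate, which is exactly the weakness described above.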
Deterministic linking is clearly superior, since this is where we simply have all the data we need to match user to user across different devices. What solution providers do, then, is effectively offer deterministic linking based on their database, with probabilistic linking used as a fallback option to 'plug the gaps' left by this method.
Again, if you are piggybacking on someone else’s user data, then there are major privacy concerns to consider. However, it is worth noting that most companies already have some data allowing them to join up users across devices – what we might call ‘Do It Yourself’ (DIY) linking. For example, if your website asks users to log in each time, and collects an email address, then if they read their email from you on a phone, you can potentially match them from device to device. It seems there should be a way to leverage that.
Of course, the challenge is that this is always going to be a limited percentage of users, representing a big gap in your knowledge of cross-device usage. However deterministic linking is never 100% complete. So one way to leverage it is to try your own probabilistic matching, using the data you can match to make inferences about the data you cannot match.
If you do go down this route you should be aware of another major challenge with this partially linked data, which is that it’s heavily biased towards fully signed up customers and against non-customers. It would be very easy to use your own sign-in data to conclude that people who buy from you have complex cross-device usage, whereas people who do not buy from you have simpler single device interactions. The problem with this is that the people who sign up and thus become trackable tend to be the buyers, and so you inevitably have more visibility on their complex user journey data than for the non-buyers. This kind of data bias can easily generate misleading conclusions.
An alternative solution
Here is where you have to forgive me for plugging our own Metageni solution, which we call a 'cross-device inference model'. We are keen to hear feedback from the community, so let me explain the principles.
The basic idea is that the known or 'matched' cross-device data can be viewed in both its matched and non-matched forms, and we can observe how the process of matching itself changes the data distribution. The distributions of interest are relevant data features such as user path length, device use frequency, and device position in the user path. We use this information to resample from our raw data, creating a sub-sample of non-matched data that has a similar distribution of these features.
Thus, unlike probabilistic matching, we give up on the attempt to somehow create new cross-device data out of the unknown and instead settle for trying to make the overall sample more representative of what the complete cross-device data set would look like.
For example, we might find that when known data is linked, there tends to be a reduction in single clicks from tablet devices, as these get matched to multiple clicks on other devices. Let's say these types of interaction fall by 10%. So before we use the raw data which has no matching ID, we create a subsample that randomly drops around 10% of these single-click tablet interactions. We do this for many different features of the data which we know shift around when the data can be matched. We end up with a data set that is not fully linked across devices, but which includes only non-linked data that is considered representative.
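As a rough sketch of this resampling step, consider dropping 10% of single-click tablet journeys from the unmatched data, as in the example above. The journey data and drop rate here are invented for illustration:

```python
# Hedged sketch of the resampling idea: drop a share of journey types
# that matching is known to reduce. Data and rate are illustrative.
import random

random.seed(1)

# Unmatched journeys as (device, path_length) pairs, illustrative only.
unmatched = [("tablet", 1), ("tablet", 1), ("mobile", 3),
             ("desktop", 2), ("tablet", 1)] * 200

# Matched data suggested linking removes ~10% of single-click tablet
# journeys (they merge into longer cross-device paths).
DROP_RATE = 0.10

def resample(journeys, drop_rate):
    """Randomly drop a share of single-click tablet journeys so the
    unmatched sample better resembles a fully linked data set."""
    kept = []
    for device, path_length in journeys:
        if device == "tablet" and path_length == 1 \
                and random.random() < drop_rate:
            continue  # drop this journey
        kept.append((device, path_length))
    return kept

representative = resample(unmatched, DROP_RATE)
print(len(unmatched), len(representative))
```

A production version would repeat this across many features at once, with drop rates estimated from the matched data rather than assumed.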
We would love to hear what other experts think about this less aggressive and more GDPR compliant approach.
Cross-device use continues to evolve but is not going away
Whatever you decide to do, hopefully you can now see that it is best not to ignore the problem. Recent research suggests we may be past the peak of cross-device usage for some users, who now use smartphones for almost all their internet use. Mobile has become dominant over desktop, vindicating those companies that adopted a mobile-first strategy a few years back.
However cross-device use is in many ways a symptom of a continuing rise in multi-task activity, as we sit listening to music on Spotify, watching a show on YouTube on the TV and playing with holiday ideas on our mobile phone, often all at the same time. Measuring how users move through their digital universe continues to be vital to understanding their behaviour. Marketing analysts ignore this problem at their peril.
Gabriel Hughes PhD
Can we help unlock the value of your analytics and marketing data? Metageni is a London UK based marketing analytics and optimisation company offering custom solutions for in-house capabilities.
One of my proudest achievements while working at Google was the award in 2011 of a patent for Multi-Touch Attribution (MTA). From 2007 I led a small team of data scientists and engineers working on the exposure-to-conversion journey using DoubleClick data. The solution we eventually patented was the culmination of months of research and client analytics support, and formed the basis of the Attribution Modelling Tool which is still the most advanced attribution feature available in Google Analytics.
Working in the field of marketing analytics and attribution today I am struck by how many people still fundamentally misunderstand the purpose of rules-based attribution modelling as we had originally conceived it back then. So let me put the record straight.
The purpose of rules-based attribution is not to provide you with the best alternative to last-click wins attribution. Indeed, as many have observed, the problem that MTA highlights is that there are many different ways to understand and measure the contribution of different marketing touchpoints in the user journey to sale. This means it is not immediately obvious what the relative contribution of different marketing channels could be.
It has always been clear that a definitive answer to that question lay in the direction of some kind of data-driven or statistical approach. Indeed this is what my current company attempts to do using machine learning and what Google and others attempt to do with sophisticated algorithmic models. So what is Multi-Touch Attribution good for?
Challenging hidden assumptions
Any alternative rules-based model highlights the extent of potential bias in last-click attribution. The background to the development of MTA was the deeply ingrained adherence to the flawed last-click model of marketing value. Indeed, even today, over a decade after rules-based attribution was invented, a surprisingly large number of major brands still rely on a last-click view of the world. Last-click is baked into most advertising reporting systems, and as a result companies continue to under-invest in upper-funnel sales growth opportunities and fail to realise marketing efficiencies.
Companies use last-click attribution without even knowing about the implicit assumptions that are embedded within this logic. When I give talks to larger groups of agencies and advertisers, I often open by asking for a show of hands from all those who are 'doing attribution'. Naturally, it's a trick question. If you are not 'doing attribution', then you are almost certainly relying on last-click wins attribution without realising it.
Indeed, if you make any claim at all about the sales impact of your advertising, then you are ‘doing attribution’, and so the question advertisers need to ask themselves is how they can do a better job of challenging and testing their attribution assumptions.
Around a decade ago Google and others started to address this huge hidden measurement bias through the introduction of metrics such as the assisted click count. So, for example, in AdWords 'Search Funnels' it became possible to see how brand last-click sales are influenced by generic search campaign 'assist clicks'. This was a major step forward but was not a game-changer. Just as the magician's assistant is a secondary presence on the stage compared to the main performer, so the assist click is a secondary player to last-click attributed sales.
This is where alternative models come in: they start with equivalent status to the last click. Using an alternative attribution rule such as, say, first click wins demonstrates the equivalent status of any one attribution rule to another. The point is that without experimental methods or statistical analysis, one attribution rule is no better than any other.
Indeed, last-click is quite a poor choice compared to other positional models. The specific problem with last-click is that it maximizes the risk of attributing a sale too late within a marketing process which influences the user in a series of steps. If the marketing has any value at all, it is there to help a brand convince and influence a person to purchase from them in preference to spending their money elsewhere. Somewhere within this multi-step process, the consumer makes a choice. By the time customers make the last click, many of them have already made up their minds. Thus many last-click actions are purely the final navigational steps in a considered research and purchase process.
So, alternative attribution models help us expose the flaws in last-click wins attribution.
Is it worth investing in accurate attribution? – Attribution gap analysis
Ultimately we all want to know the 'true' attribution picture: what is each marketing touchpoint and channel actually contributing to sales? If we know this, we can allocate budget for maximum efficiency and sales growth. Getting to that accurate picture is a complex challenge, however – at Metageni we advocate a custom data-driven model incorporating customer signals and trained using machine learning on your own data. We are often asked whether it is worth all that effort. One way to answer is to measure the uncertainty itself, by comparing a range of different attribution models and measuring the difference, the 'gap', between each model. Attribution gap analysis does this, measuring the range of revenue which is subject to ambiguous attribution. If your customers generally use just one marketing channel and click only once or twice on those ads before purchase, your attribution gap may be quite small. In most cases, when we measure the gap between MTA models, we expose a large variation in the value of key channels, especially in the upper funnel of the customer journey.
So one thing rules-based models are good for, then, is exploring the size of the attribution problem and measuring the range of uncertainty. This helps CMOs and finance teams work out how much to invest in more accurate attribution, by quantifying how much uncertain attributed revenue and potentially wasted marketing budget there is in any given year.
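As an illustration of gap analysis, here is a minimal sketch that scores the same journeys under three rules-based models and measures the spread of credit per channel. The paths and revenue figures are invented:

```python
# Hedged sketch of attribution gap analysis on invented journey data.
from collections import defaultdict

# Illustrative journeys: (ordered channel path, revenue).
journeys = [
    (["display", "generic_search", "brand_search"], 100.0),
    (["social", "brand_search"], 80.0),
    (["brand_search"], 50.0),
]

def attribute(journeys, rule):
    """Distribute each journey's revenue across channels per `rule`."""
    credit = defaultdict(float)
    for path, revenue in journeys:
        for channel, share in rule(path):
            credit[channel] += revenue * share
    return credit

rules = {
    "first_click": lambda p: [(p[0], 1.0)],
    "last_click": lambda p: [(p[-1], 1.0)],
    "linear": lambda p: [(c, 1.0 / len(p)) for c in p],
}

results = {name: attribute(journeys, rule) for name, rule in rules.items()}

# The attribution 'gap' per channel: spread of credit across models.
channels = {c for r in results.values() for c in r}
for ch in sorted(channels):
    values = [r.get(ch, 0.0) for r in results.values()]
    print(f"{ch}: gap = {max(values) - min(values):.1f}")
```

Even on three toy journeys, the revenue credited to brand search swings widely between models; that swing is the ambiguous revenue the gap quantifies.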
Understanding the customer journey
Alternative rules-based models do much more than just move us beyond the last click. They also help users understand the complete measured journey to sale. If you compare a first click model to a last-click model, you can directly see the patterns of where users start their journeys compared to where they end them. If you introduce a linear model, you can understand all the touchpoints that had an influence along the way. A time decay model helps you to understand the closing stages of the process without focusing solely on the final stage. And so on. In other words, rules-based models are an excellent means of summarising the data on multiple complex paths to purchase online.
The need to be able to summarise this data is clear as soon as you try to look at the raw data about each user journey. Many paths are incredibly complex and unique. As an analyst, you need a way to summarise this information so you can understand at a macro level what the most important patterns are. Rules-based attribution models give you precisely this.
One key feature of the attribution modelling tool in Google Analytics is that it allows the user to compare models one against the other. This is because the aim is not to encourage the user to pick a single alternative to last click wins but to compare multiple models for insight into how marketing works throughout the user journey. This feature was central to the original attribution analytics prototype that we created.
So can rules-based attribution tell us anything about the actual value of marketing? Of course, it can. Think of each rules-based attribution model as a hypothesis about how value is created and then you can check this hypothesis against the other hypotheses and against your marketing strategy.
For example, if all models agree that a particular marketing channel drives positive ROI then it looks like a pretty safe bet for ongoing investment. If all models show negative ROI then maybe you should look for a cost-saving. If only the upper funnel models show positive ROI for a channel then consider whether your marketing strategy is to use this channel to drive the early stages of purchase such as research, for example, a review based affiliate.
Put simply rules-based models allow you to consider whether your digital marketing is really working in the way that it is supposed to work.
Ultimately decision-makers want simple answers. Multi-Touch Attribution was an attempt to simplify what is, in fact, an extremely complex analytical problem. This has proved to be both the strength and the weakness of MTA, as marketers continue to struggle with knowing exactly what to do with these models.
Data-driven models still involve hard choices
There was never any doubt that a robust data-driven approach or experimental framework would be preferable to multiple rules-based MTA models. The promise of an approach that could yield a single definitive answer is too tempting to ignore. Yet even here the analyst has a duty to ask questions of such models and try to understand the underlying patterns in the data.
One major challenge with data-driven attribution models is that navigational actions like brand search clicks are a good predictor of sales even though they are not a reflection of marketing cause and effect. The issue is essentially that marketing influences sales through a psychological mechanism that is not directly observable in behavioural data.
Furthermore, experienced data scientists know that multiple data-driven models can be obtained, leading to different results, and that a simple objective criterion for selecting between them is not easily available. So even if you can get to decent data-driven models, it is not nearly as easy as you might imagine to arrive at a single unambiguous solution.
We should always be open to multiple interpretations of the data and try to avoid oversimplification. So even if you believe you have the perfect data-driven model I would urge any marketer to use the rules-based models as a benchmark and guide to interpretation.
As Einstein said, ‘Everything should be made as simple as possible, but not simpler.’
Gabriel Hughes PhD
What is multi-touch attribution good for? (Gabriel Hughes, 3 October 2018)
Sounds too good to be true? Maybe so, but as machine learning and cloud data technology become more accessible and scalable, building your own data-driven Multi-Touch Attribution (MTA) model is becoming increasingly realistic. The advantage of a predictive machine learning model is that it can be objectively assessed on a blind hold-out data set, so you have clear criteria as to whether one model is 'better' than another.
What you need for an in-house approach
First, you need data, and lots of it. Fortunately, digital marketing is one of the most data-rich activities on the planet. Chances are that if you are a B2C online business, you have hundreds of thousands, or even millions, of site visit logs generated every week. These logs contain referrer data which helps you analyse and understand the customer journey to sale, the first essential step in building your attribution approach.
Second, you need a solid, experienced data science and marketing analytics team. Go for a mixture of skills. Typically, some data scientists are strong on theory but weaker on application, while others are great communicators and see the strategic angle but are weaker at data-related programming. You also need domain expertise in marketing analytics, as well as visualization and data engineering experts. The fabled ‘unicorn’ data scientist is all but impossible to hire, so instead you should go for a team with the right mix of skills, with strong leadership to move the project forward.
Third, you need patience. The truth is, getting to an attribution model using machine learning is not easy. It is not a case of throwing some data at a model and waiting for the answers to pop out by magic. Your team needs to decide what data to use, how to configure it as a feature set, what types of model to use, what an appropriate training set is, how to validate the model and so much more besides. You will need to make multiple attempts to get to a half-decent model.
Choosing the final model
The best candidate ML models depend on your data – we have had good results with well-optimized classifiers and regression models, which we find often outperform even high order Markov models. While a black box or ensemble method may get better predictive accuracy, you need to consider the trade-off in terms of reduced transparency and interpretability. The best advice is not to commit to a particular modelling approach or feature set too early in the process, but to compare multiple methods.
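As a toy illustration of that objective selection criterion, the sketch below generates entirely synthetic journey data (all probabilities invented) and compares two deliberately simple candidate ‘models’ on a blind hold-out set. A real project would compare properly trained classifiers in exactly the same way: whichever scores best out of sample wins.

```python
import random

random.seed(0)

# Synthetic journey data (probabilities invented for illustration):
# each row is (saw_search_click, saw_display_ad, converted).
def simulate(n):
    rows = []
    for _ in range(n):
        search = random.random() < 0.5
        display = random.random() < 0.5
        p = 0.05 + 0.30 * search + 0.10 * display
        rows.append((search, display, random.random() < p))
    return rows

holdout = simulate(2000)  # blind hold-out set, never used to build the models

# Two deliberately simple candidate 'models' standing in for trained
# classifiers: each predicts conversion from a single touchpoint signal.
def model_search(search, display):
    return search

def model_display(search, display):
    return display

def accuracy(model, data):
    return sum(model(s, d) == y for s, d, y in data) / len(data)

# Objective selection criterion: out-of-sample accuracy.
scores = {m.__name__: accuracy(m, holdout) for m in (model_search, model_display)}
print(scores)
```

Because the synthetic data gives search clicks a bigger conversion uplift, the search-based candidate should win on the hold-out set; the point is that the comparison is settled by out-of-sample performance, not by taste.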
But what then? An advanced machine learning model does not speak for itself. Once you have a model, you then need to be able to interpret it in such a way as to be able to solve the marketing mix allocation problem. What exactly is the contribution of each channel? What happens if spend on that marketing channel is increased or decreased? How does the model handle the combined effects of channels?
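One common way to start answering the contribution question, shown here as a hedged sketch, is a simple ‘removal effect’: compare the model’s predicted conversions with and without each channel present in the journeys. The fitted probabilities and journeys below are invented for illustration.

```python
# Hypothetical fitted model: P(convert) given the set of channels that
# touched the user. Base rate and uplifts are invented for illustration.
def p_convert(touched):
    base = 0.02
    uplift = {"search": 0.12, "display": 0.04, "email": 0.06}
    return base + sum(uplift[c] for c in touched)

journeys = [
    {"search", "display"},
    {"search"},
    {"display", "email"},
    {"search", "email"},
]

def expected_conversions(paths):
    return sum(p_convert(p) for p in paths)

baseline = expected_conversions(journeys)

# Removal effect: predicted conversions lost if a channel had been absent.
removal = {
    channel: baseline - expected_conversions([p - {channel} for p in journeys])
    for channel in ("search", "display", "email")
}
print({c: round(v, 3) for c, v in removal.items()})
```

The same counterfactual logic extends to ‘what if spend were increased or decreased’, though with a real model you also have to worry about interaction effects between channels rather than the additive uplifts assumed here.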
All of this will take months, so it is small wonder that many companies ignore the problem or else go for a standard out-of-the-box vendor approach. It is worth remembering, then, that there are some key benefits of a ‘do it yourself’ approach to consider.
Benefits of an in-house model
If you create your own model, you will discover a great deal about your marketing activity and data in the process. This can lead to immediate benefits. For example, with one major international services firm we worked with, we found significant marketing activity occurring in US states and cities where the company had no local service team. Even with no attribution model defined at that stage, the modelling effort uncovered this issue and saved the company huge sums right away. The point is that your data quality is tested and will become cleaner and more organised through the process of using it, and this, in turn, supports all your data-driven marketing.
Another beneficial side effect is that if you create your attribution model you will also learn about your business and decision making. This process will force your marketing teams to collaborate with data scientists and engineers to work out how to grow sales. Other teams need to be involved, such as finance, and your agencies, and this will often spawn further opportunities to learn and collaborate across all these marketing stakeholders.
Attribution is all about how different marketing channels work together, so your various marketing teams and agencies need to collaborate as well – social, search, display as well as above the line, and brand and performance more broadly. Again, this provides intrinsic and additional value over and above the modelling itself.
Finally, it is worth pointing out that you will never actually arrive at the final model. This is quite a fundamental point to bear in mind. By its nature, a machine learning approach means you need to train the model on fresh data as it comes in. Your marketing and your products are also changing all the time, and so are your customers. So really you need to build a marketing attribution modelling process more than you need to build a single attribution model.
So, go ahead, build your model, be less wrong than before, and then when you have finished, start all over again. As we say at Metageni, it is all about the journey.
Gabriel Hughes PhD
Build your own attribution model with machine learning (Gabriel Hughes, 3 September 2018)
If you do not know how bad your analytics data is, then the chances are it is much worse than you think. With data analytics, it is not the known data quality issues that will cause you the most trouble, the known unknowns, but the ‘unknown unknowns’: those issues you only discover as you explore and analyse your data.
Usually, it is only the practitioners who are very close to the data who understand the full extent of the data quality problem. Too often the poor quality of data is kept as something of a dirty secret not to be fully shared with senior management and decision makers.
Common issues in web and marketing analytics
Let’s look at just some of the most common issues affecting web and marketing analytics data. To begin with, do not assume that the data sources provided by the most common analytics solutions are robust by default. Even the best ones are prone to big data quality issues and gaps. Take Google Analytics referrer traffic, which often reports an unrealistic level of ‘Direct’ traffic: supposedly visits made by users typing in URLs or using bookmarks, both low-frequency methods of site access. The reason is that ‘Direct’ is, in fact, a default bucket used where no referrer data is available to the analytics server.
Experienced web analysts know that high levels of direct traffic usually mean high levels of broken or missing tags, or other technical issues, that have caused the true referrer data to be lost.
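A quick illustrative check along these lines, with made-up session counts: if the default ‘Direct’ bucket dominates your source report, treat it as a tagging alert rather than a genuine surge in type-in traffic.

```python
# Made-up session counts by source / medium, in the shape of a typical
# analytics referrer report.
sessions_by_source = {
    "google / organic": 41200,
    "google / cpc": 18900,
    "(direct) / (none)": 35700,  # the default bucket for missing referrers
    "facebook / referral": 6100,
    "newsletter / email": 4300,
}

total = sum(sessions_by_source.values())
direct_share = sessions_by_source["(direct) / (none)"] / total
print(f"Direct share of sessions: {direct_share:.1%}")

# A pragmatic alert threshold; the right figure depends on your business.
if direct_share > 0.20:
    print("Warning: high Direct share - audit your tags and redirects")
```

The 20% threshold here is an assumption, not a standard; the useful habit is simply to track the Direct share over time and investigate whenever it jumps.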
The major providers also contribute to that other major source of poor data quality: fragmented and disjoint data sources. Google search marketing tags will track conversions, but only from the Google search and display network. Facebook similarly provides tags which only link Facebook marketing to sales, ignoring all other channels. Affiliate networks do the same thing, leading to widespread duplication and over-attribution of sales to multiple sources. The challenge is exacerbated by look-back windows and attribution rules that differ between platforms.
Having worked with multiple brands of all sizes, I have yet to come across a brand that does not have some level of tagging issue. A typical issue is a big mismatch between emails delivered and emails opened and clicked. Another is social campaigns which are delivered by third-party solutions and then appear as referral sources, due to the use of redirect technology.
Tagging and tracking
Tag management systems help manage this, but unfortunately not by linking the data, just by de-duplicating tag activity at source, which is hardly satisfactory if your goal is to understand Multi Touch Attribution (MTA) and marketing channel synergy.
Assuming you solve all your tagging issues and have well-structured, soundly applied tags, you should not forget that the tag is only as good as the tracking itself. A great challenge here is the gap that exists in tracking users across devices. You cannot link visits by the same user on different devices without sophisticated tracking that users have signed up to beforehand. This means your tags cannot tell the difference between the same user visiting twice on two different devices and two different users visiting once each.
The idea that every one of us can be closely tracked and monitored online is an illusion, even for the biggest technology companies – and perhaps we should be glad of that. Indeed, unique ID tracking and linking is now under closer scrutiny in the age of data security breaches, increased concern over user privacy, and the GDPR. This is yet another source of difficulty for companies looking for a 360-degree view of the customer. Companies have to work with fully consented and well-defined samples of data to make progress in understanding their customers.
For the analyst, this is yet another reason why having huge volumes of data is not enough for user insight and data-driven decision making.
So what can you do about all these data quality challenges?
Data quality is perhaps like muscle memory in sport. You use it or you lose it. It’s only by trying to analyse and find patterns in your data that you uncover the issues that need to be addressed. Where there is a need, strategies can be devised to manage these gaps in data quality and take steps for improvement. It is a process.
The best advice is to get stuck in. Pick one data source and run with it, making sure to compare it to others and ask if the data makes sense given what you know about your customers. There are always discrepancies between data sources which in theory should report the same numbers: in my experience, this is a kind of law of all data analytics, so you need to get used to it. Use these differences to help you validate your sources, understand why differences might arise, and just accept that there is an acceptable level of difference – say 2-3%.
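As a sketch of that reconciliation habit, the snippet below compares daily order counts from two hypothetical sources and flags only the days where the discrepancy exceeds a 3% tolerance; all the figures are invented.

```python
# Invented daily order counts from two systems that should, in theory,
# agree: a web analytics tool and the order backend.
analytics_orders = {"2021-05-01": 412, "2021-05-02": 388, "2021-05-03": 455}
backend_orders   = {"2021-05-01": 420, "2021-05-02": 401, "2021-05-03": 452}

TOLERANCE = 0.03  # treat differences under 3% as the normal level of noise

def reconcile(a, b, tol=TOLERANCE):
    """Return the days where the two sources disagree beyond tolerance."""
    flagged = {}
    for day in a:
        diff = abs(a[day] - b[day]) / b[day]
        if diff > tol:
            flagged[day] = round(diff, 3)
    return flagged

print(reconcile(analytics_orders, backend_orders))
```

Which source you treat as the denominator, and what tolerance you accept, are judgement calls; the value is in making the comparison routine so that a sudden widening of the gap stands out.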
In data analytics, as in life, you must not let perfect be the enemy of the good. Be wary of the massive data technology project which promises to link all data together in one big data lake and thereby solve your challenges. Bad data plus more bad data does not equal good data. Face up to your terrible data quality, and tackle the ugliest issues head-on. If you ignore the problem, it can only get worse and you will continue to struggle forwards in the dark.
Gabriel Hughes PhD
Just how bad is your analytics data? (Gabriel Hughes, 3 September 2018)
‘What attribution model should we use?’ is now a question being asked not just by marketing analysts, but by the CMO and even the CDO, CFO and CEO.
It would be fantastic if there were a simple formula to answer that question, but of course life is never so easy. Here are some simple guidelines to help you answer it.
Differences in attribution reflect differences in your business
If your business offers a higher-value considered purchase, like a holiday or a financial product, then you should expect a longer, more complex research-to-purchase journey, and therefore a higher value given to the earlier touch points. This means attribution models with longer ‘look back’ windows and rules favouring earlier clicks.
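As a small illustration of what a look-back window does in practice, this sketch (dates and channels invented) filters the same touchpoint history with a short window and with a longer one suited to a considered purchase.

```python
from datetime import date, timedelta

# Invented touchpoint history for one converting user.
touchpoints = [
    (date(2021, 4, 1), "display"),
    (date(2021, 4, 20), "search"),
    (date(2021, 4, 28), "email"),
]
conversion_date = date(2021, 4, 30)

def in_window(tps, conversion, days):
    """Keep only the channels touched within the look-back window."""
    cutoff = conversion - timedelta(days=days)
    return [channel for when, channel in tps if when >= cutoff]

print(in_window(touchpoints, conversion_date, 7))   # impulse-buy window
print(in_window(touchpoints, conversion_date, 30))  # considered purchase
```

With the 7-day window only the final email touch survives; the 30-day window keeps the whole research journey, which is exactly why window length changes how much credit early touch points can ever receive.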
If your product is in an especially competitive market, with lots of similar alternatives available to your customers, then you need to focus on the research and comparison phase, again valuing earlier clicks, and thinking about how to address the cross-device attribution challenge. Users who shop around do their research on phones and tablets, flipping through several options before settling on a final choice. Your product needs to be part of that journey if you are to have a chance of being selected, so leaving it all to the last click is leaving it too late.
If your business relies heavily on repeat custom, then you should attach higher attributed value to new customer acquisition and retention, exploring a cautious application of lifetime value. You cannot assume customers will stay loyal without work, but a repeat purchase is almost always easier and less costly to achieve than the first one.
Do you have a diverse portfolio of products, targeting different customer needs and with a wide range of price points? In this case, you may need different models for different product categories. It sounds complex, but if you think about it, it makes no sense to treat the buyer of, say, an expensive hi-fi system the same as the buyer of a replacement phone charger. Different customers also behave in different ways. Customer segmentation should therefore map on to differences in marketing attribution.
Is your market heavily driven by brand perception and do you invest in TV, outdoor or print? In this case, you will need to explore how to link above the line analysis to your digital attribution.
And so it goes on. The truth is, each business is unique, with unique complexity around the product offer, its customers, how they engage, and the value of each sale. The only way to address the uniqueness of your business is to develop unique attribution models to match.
Incorporate these unique features into your model
Once you accept that your attribution challenge and therefore your attribution model is unique, everything becomes easier. For example, instead of plugging into a standard tag or data collection framework, you can leverage the special and unique features of your data as inputs into your model.
You can structure your attribution model to reflect the different customer journeys that you see for each customer segment and product group. As you learn more and more about your customer journeys, your data-driven modelling can adapt.
At Metageni we believe a customised data-driven approach unique to your business is the only way to get to grips with this complex business challenge and turn it into opportunities to increase marketing efficiencies and grow sales. You will not look back!
Gabriel Hughes PhD
Can we help you with your custom attribution model? We collaborate with our clients’ organisations, helping marketing & data science teams create integrated on & offline investment analytics solutions for optimising sales growth.
Your attribution model is unique (Gabriel Hughes, 3 September 2018)
You interact with your users, customers and potential customers in so many ways, through many channels and at all stages in their journey. So, how do you use data to create insights which improve their experience and grow your business?
There are three broad approaches to this problem out there. These are:
the digital path approach
the statistical modelling approach
the user experience approach
Let’s explore each approach and think about how it can lead to new insight into the user journey.
1. Digital path analysis of user journeys
At Google in 2008, a small team in London started looking at custom DoubleClick ad server platform data tracing the ad exposure and click path across digital channels to conversion on an advertiser web site (the team was headed up by yours truly). At around the same time others at Google were looking at measuring the assisting role of generic search clicks to brand search conversions. These analyses became the underpinnings of user path analysis, search funnels and attribution modelling, now built into Google Analytics.
The first task of the attribution pioneers was challenging the built-in assumption that ‘last click wins’ when measuring digital conversion journeys. Today most companies still rely on this flawed conversion tracking logic inherited from the earliest days of internet marketing. The issue is that when the customer sees an ad or searches around to research a product, they often decide what to buy first and then only after that do they make the final last click to the sale: last click does not in fact ‘win’.
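The flaw is easy to see with a small example. The sketch below scores a few invented conversion paths under ‘last click wins’ and under a simple even-split (linear) rule: last click hands all the credit to the closing channel and the assists vanish.

```python
# A few invented conversion paths, each a time-ordered list of channels.
paths = [
    ["display", "search"],
    ["email", "display", "search"],
    ["search"],
]

def last_click(paths):
    """All credit goes to the final channel before conversion."""
    credit = {}
    for p in paths:
        credit[p[-1]] = credit.get(p[-1], 0.0) + 1.0
    return credit

def linear(paths):
    """Each touch in the path shares the conversion credit equally."""
    credit = {}
    for p in paths:
        for channel in p:
            credit[channel] = credit.get(channel, 0.0) + 1.0 / len(p)
    return credit

print("last click:", last_click(paths))  # the assisting channels vanish
print("linear:", linear(paths))
```

Neither rule is the ‘right’ answer, of course; the linear split is just as arbitrary as last click, which is exactly the argument for data-driven attribution made below.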
More recently, user path analysis based on behavioural data has become widespread and is now serviced by a small army of marketing attribution technology companies. Data on marketing and web analytics has the potential to provide a complete view of the user path both off and on the website. Still, many companies are only just waking up to these new ways of measuring marketing impact, and a recent eConsultancy survey found that only 39% of marketers believe they understand customer journeys and adapt the channel mix accordingly.
The analytical challenge today is finding the most robust method for data driven attribution, ensuring that data on the non-converting paths are used to infer the contribution of each channel, campaign and ad. Several companies claim to do this, and clients of Google Analytics Premium can leverage Google’s own data driven modelling approach to optimise their digital marketing spend.
If you want to know what role a digital click typically plays in user journeys for your industry in your part of the world, you can even access a free data tool by Google to do this at a very high level.
In this view user journey analysis would seem to be a solved problem, at least for marketing ROI, but of course it is not that simple, as there are huge challenges remaining. Foremost, the data is generally incomplete: users are rarely tracked across the whole journey, and even if the tracking is set up right, they use different devices and browsers, which breaks up the picture. People also have a habit of being influenced by channels you cannot so easily track in the same data set as your trackable analytics, like offline ads, call centres and, not forgetting, competitor offers. In addition, the data you get is usually anonymous, with limited data on who exactly you are influencing, which matters a lot because different types of people are influenced in very different ways.
So much for method one: it’s a great data story when it can tell you how to better allocate your online spend, and optimise your web site, but it is only as good as the incomplete data that goes into it.
2. Statistical modelling for user journeys
While digital path analysis takes the problem from the bottom up, an older method looks from the top down, tracking sales and channel influences over time, and using regression modelling to identify the role of each channel. The same techniques can be used to compare steps in a user journey on a site, tracking views and clicks over time and modelling how they influence each other in aggregate.
A great advantage of this approach is that it can look at relationships beyond what can be tracked at user level. Notably the role of offline channels can be included. The statistical techniques are well understood and applied methods like market mix modelling are taught at many business schools and universities. Given the potential to get more of a complete picture, some are advocating this type of approach (for example this post).
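To show the top-down idea in miniature, the sketch below regresses weekly sales on channel spend using ordinary least squares via the normal equations. The spends and coefficients are invented, and real market mix models add noise terms, adstock and diminishing returns, but the mechanics are the same: the regression recovers each channel’s contribution per unit of spend.

```python
# Invented weekly channel spends, in arbitrary units.
weeks = [
    (10, 4), (12, 6), (8, 9), (15, 5), (9, 11), (14, 7), (11, 8),
]
# Generate sales from known (invented) coefficients so we can check
# that the regression recovers them: sales = base + 2.0*tv + 3.5*search.
TRUE_BASE, TRUE_TV, TRUE_SEARCH = 50.0, 2.0, 3.5
sales = [TRUE_BASE + TRUE_TV * tv + TRUE_SEARCH * s for tv, s in weeks]

# Design matrix with an intercept column.
X = [(1.0, tv, s) for tv, s in weeks]

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [v] for row, v in zip(A, b)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[pivot] = M[pivot], M[i]
        for r in range(i + 1, 3):
            f = M[r][i] / M[i][i]
            M[r] = [a_ - f * b_ for a_, b_ in zip(M[r], M[i])]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x

# Normal equations: (X'X) beta = X'y
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * y for r, y in zip(X, sales)) for i in range(3)]
base, tv_coef, search_coef = solve3(XtX, Xty)
print(round(base, 2), round(tv_coef, 2), round(search_coef, 2))
```

Because the synthetic sales contain no noise, the fit recovers the coefficients exactly; with real, noisy data the same machinery gives estimates with confidence intervals, which is where the expert statistician warned about below earns their keep.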
The downside is that while you gain in non-tracked data, you lose granularity in the tracked data. The data about each step a user takes is aggregated with the steps of all other users, leaving out the important path information. As a result the technique has somewhat fallen out of fashion, despite the advantages it potentially offers in linking all online and offline channels. Also, if you use statistical modelling, beware: a non-expert cannot tell the difference between a spurious regression model and a robust one, but the former kind is all too commonly peddled, even in large organisations. So make sure you have plenty of data and a statistician who knows what they are doing.
3. User experience for customer journey mapping
The third approach, which many companies follow, is to storyboard the customer journey across all touchpoints. The idea is to create a visual map of how different types of customer interact at all stages of the journey through to sale and ongoing engagement. Each type of customer is identified by a customer persona. Developed by UX design professionals, the map then becomes a useful way of exploring the blocking steps or friction points in the customer journey, and of making sure that all channels are working together in sync.
The main upside of a customer journey map is the ability to summarise and communicate the user experience to key folks in product management, marketing and senior management. A good map can create empathy for the customer and focus minds on the central problem of improving their experience.
Because mapping is a communication tool, it makes for a great data story. However, of the three approaches outlined here it runs the obvious risk of being strongest on the story but weakest on the data to back it up. Customer journey maps can be data driven, provided they are grounded in research-derived customer personas and each stage in the journey is linked to metrics and KPIs that can be tracked to understand how the journey is building new and repeat business.
The true actionability of a customer journey map lies in the way it highlights the bottlenecks in the customer journey, so that these can be given due attention. Using market research personas, metrics and KPIs grounds the focus on the steps which matter, and guides data-driven priorities. Not a bad approach; however, unlike the other two, it is unlikely to yield information about the true incremental impact of any changes you make.
For more information on the customer mapping approach take a look at this post by Adobe and this one by webdesignviews.
The best of all worlds
So, is it possible to get the best of all worlds? The simple but expensive answer is that you could do all the above and explore areas of agreement and inconsistency. A more manageable answer would be that if you are mainly a digital play, path analysis and attribution make sense. If you have significant offline to online interactions, and your business is high volume, you should explore statistical modelling. And whatever your business model, a high-level customer journey map can help you to understand the overall picture.
Also, if you do make changes to how you interact with your customers, do not forget to apply experimentation and AB testing to validate and optimise the journey.
Finally, if you really want to understand your customer journey, you must try going through the process yourself. After that, try the journey again but this time imagine you know nothing about your product, you are on your iPad and you are in a big hurry. You will soon find room to improve.
Find insight from the user journey (stuart, 23 October 2015)