Welcome to our blog series on decoding identity resolution. This nine-part series offers a friendly, comprehensive view of how to think about identity resolution, as well as how to interpret the way it is represented in marketing and sales materials by companies across the tech landscape. The other articles in the series can be found here:
The data management space is convoluted. Every single platform has a diagram with a bunch of boxes on the left, some magic in the middle, and a bunch of boxes on the right. This is because modern companies build a collection of applications to solve use cases that present near-infinite possibilities.
Below are some common narratives to look out for when talking about identity resolution with a software company you are interested in learning more about.
“Probabilistic” and “machine learning”
Technically, the term “probabilistic” just means that the process makes some educated guesses about the data. By that definition, even the “score table” concept described earlier in this series for rules-based identity resolution is probabilistic, because it involves making some assumptions that could be wrong.
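To make that concrete, here is a minimal sketch of a score-table match. It is not the table from earlier in the series: the field names, weights, and threshold are all hypothetical values chosen for illustration.

```python
# Hypothetical score table: fields, weights, and threshold are illustrative
# assumptions, not values taken from any real platform.
SCORE_TABLE = {"email": 60, "phone": 30, "last_name": 10}
MATCH_THRESHOLD = 60


def match_score(a: dict, b: dict) -> int:
    """Sum the weights of every field on which both records agree."""
    score = 0
    for field, weight in SCORE_TABLE.items():
        if a.get(field) and a.get(field) == b.get(field):
            score += weight
    return score


def is_same_person(a: dict, b: dict) -> bool:
    """An educated guess: call it a match once the score reaches the threshold."""
    return match_score(a, b) >= MATCH_THRESHOLD


# Two different people sharing a household email address.
pat = {"email": "family@example.com", "phone": "555-0100", "last_name": "Ng"}
sam = {"email": "family@example.com", "phone": "555-0199", "last_name": "Ortiz"}
```

With these made-up weights, `is_same_person(pat, sam)` returns `True` even though the records describe two people. The rule is a guess that happens to be wrong here, which is all that “probabilistic” needs to mean.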
In the market, companies take advantage of how complex identity resolution is and throw words like “probabilistic” or “machine learning” into their materials. The claim is technically true, but it is clearly not what the customer thinks they are buying.
Companies also commonly refer to “machine learning” in the same messaging as their profile resolution, but they only use actual machine learning algorithms on behaviors, and only after identity is resolved. This is a much less complex problem, but it makes their platform sound more sophisticated than it is.
“Unified profile” and “Customer 360”
Terms like “unified profile” or “Customer 360” get thrown around in almost every application regardless of the underlying truth. In reality, most applications have a specific view of a subset of customer data they specialize in, which means their platform will only support a summarized view.
Common gaps that these phrases obscure:
Not being able to store historical data
Only storing summary data that you will need to process outside of the application
Storing people as “visitors”, a model biased toward digital identity, with little to no support for other data
Being unable to store or manage ALL behaviors
There’s little worse in this space than adopting a platform based on a polished pitch, only to have the vendor tell you they can’t load part of your data and that you will have to rely on expensive custom ETLs or an entirely separate tool to use it.
Rebranding third-party identity resolution
Identity resolution is such a hard problem to unravel successfully that many applications on the market implement only a very rudimentary, deterministic algorithm. When their customers continue to see problems, they simply partner with a third-party provider and claim “advanced identity resolution” of some form. In reality, what they’re saying is that they will resell or integrate you with a costly third-party provider that will match your data deterministically against its own graph, with absolutely no transparency as to how it reached its conclusions.
This is expensive, creates an unintentional reliance on third-party data, and is basically a software company throwing in the towel because identity resolution was just too hard for them.
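For a sense of what a “very rudimentary, deterministic algorithm” can look like, here is a minimal sketch that merges records only when a normalized email matches exactly. The function name and the sample records are hypothetical, and no specific vendor's implementation is being described.

```python
from collections import defaultdict


def resolve_by_email(records: list[dict]) -> dict[str, list[dict]]:
    """Group records by exact (normalized) email; no other evidence is used."""
    profiles = defaultdict(list)
    for record in records:
        key = record["email"].strip().lower()
        profiles[key].append(record)
    return dict(profiles)


# Illustrative data: one person, three records, two email addresses.
records = [
    {"name": "Dana Lee", "email": "dana@example.com"},
    {"name": "Dana Lee", "email": " Dana@Example.com "},
    {"name": "Dana Lee", "email": "dlee@work.example"},
]
profiles = resolve_by_email(records)
```

Here `profiles` ends up with two entries for what is plausibly one person, because the work address never matches the personal one. Gaps of this kind are the problems customers keep seeing.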
Simple stories hiding gaps
It’s hard to talk about identity resolution without very quickly getting into the “algorithms” weeds. Those details are often far beyond the technical depth of whoever wants to buy a piece of technology, and companies have become very good at crafting simple stories that hide functional gaps in their platforms.
For example, many platforms that frame themselves as a “Customer Data Platform” are simply online event-routing platforms. They do not store a “Customer”; they store a “Visitor”, and they do not have first-class support for any data not coming directly from your website.
If a company cannot give you a deep breakdown of how their platform works, it is because they are hiding a gap they think you won’t understand.
Large scale data
Next time you are in a demo and a company is showing off its product, pay attention to the counts that show up on screen. A staggering number of companies demo their platform with fewer than 100k profiles’ worth of data in it.
Always be sure to dig into how a platform handles scale. A platform that is snappy and easy to use with 100k records might be near unusable when there are 100 million or more. A modern data solution should be able to handle hundreds of millions of profiles and hundreds of billions of events, transactions, and other behaviors.
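A rough back-of-the-envelope calculation shows why the jump matters. The sketch below counts distinct record pairs under the assumption of a naive approach that compares every record with every other one. That assumption is made here for illustration only; real platforms use techniques to avoid comparing every pair, and how well they do so is exactly what to ask about.

```python
def candidate_pairs(n: int) -> int:
    """Number of distinct record pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Compare the demo-sized dataset with a production-sized one.
for n in (100_000, 100_000_000):
    print(f"{n:>11,} records -> {candidate_pairs(n):,} pairs")
```

Under that assumption, 100k records produce about 5 billion pairs, while 100 million records produce about 5 quadrillion. A thousand times more records means roughly a million times more comparisons.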
Next up, I will dig into some of the Amperity-specific offerings and what makes us the best in the market at identity resolution (the time had to come eventually, right?). Click here to advance to part eight!