Customer Data Unification

Lost in Translation

All The Right Tools and People, Why Can’t They Communicate?

Consumer brands today have enormous potential to build meaningful relationships with their customers. With a multitude of channels, enormous amounts of digitized data, and a glut of powerful martech investments, they have all the building blocks they need to understand, engage, and satisfy their customers as individuals. Yet despite these investments, when we talk to our customers, what we hear is that they are still struggling to pull all of their systems and data together in a way that works.

Marketers can’t get the customer data they need into the tools they use, even though they know the data exists in distant silos. Data scientists and analysts can’t produce meaningful insights because they spend 80% of their time munging data, and even then they’re working from a fragmented customer view. Customer service staff who interact directly with customers lack the information they require to create seamless experiences. IT teams are frustrated because, despite their heroic work, they continue to be overwhelmed by more and more requests for data, and at the same time confront new regulations around compliance and governance. And most importantly, consumers are frustrated because they are treated like strangers by brands they shop with continually. It seems like everything is being lost in translation. 

The Traditional Approach – Pick A Standard Data Language

Most brands have already attempted to solve these customer data challenges using the traditional approach to data unification – force all the systems to talk to one “master” system, and then use that system to drive all conversations. This approach involves picking a central system, defining a fixed schema, writing connectors that connect all other data sources to this system, and using deterministic logic to match records.

This approach looks good on paper, and given the technologies available when this approach was first invented, it was a decent solution to a fairly gnarly problem. That said, there are some serious limitations to managing customer data in this way, and the result is that a lot of vital information gets lost in translation.  

First, data is lost when it comes into the system. The reason for this is that all systems speak their own distinct data language, and a system with a fixed schema is designed to only speak one. This means that each system has its own idea of what a customer is, its own way of structuring data, and its own set of optimizations based on how it uses data. For example, a point-of-sale system is organized around the notion of transactions, not customers. Site personalization tools are optimized for real-time interactions and build cookie-based profiles. Email service providers store email preferences and response data, and distinguish individuals by email addresses. Trying to house all of these detailed, nuanced, and distinct notions of the customer in a rigid data schema is impossible. The data must be reformatted, trimmed, flattened, aggregated, and aligned to the fixed schema.  Every time data passes through one of these connectors, it loses richness, and an essential part of the customer is lost with it.
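To make that loss concrete, here is a minimal sketch of a connector that maps point-of-sale transactions onto a fixed schema. The `MasterCustomer` schema, the `pos_connector` function, and the sample records are all hypothetical, invented purely for illustration; real connectors and schemas will differ.

```python
from dataclasses import dataclass

# Hypothetical fixed "master" schema: one row per customer, two columns.
@dataclass
class MasterCustomer:
    email: str
    total_spend: float

def pos_connector(transactions):
    """Collapse transaction-level records into the fixed customer schema."""
    totals = {}
    for t in transactions:
        totals[t["email"]] = totals.get(t["email"], 0.0) + t["amount"]
    return [MasterCustomer(email=e, total_spend=s) for e, s in totals.items()]

# Illustrative point-of-sale data, organized around transactions.
transactions = [
    {"email": "kj@example.com", "store": "Seattle", "sku": "BOOT-9", "amount": 120.0},
    {"email": "kj@example.com", "store": "Portland", "sku": "SOCK-2", "amount": 15.0},
]

master_rows = pos_connector(transactions)

# Fields the source system knew about that the fixed schema cannot hold.
dropped_fields = set(transactions[0]) - set(vars(master_rows[0]))

print(master_rows)
print(sorted(dropped_fields))
```

In this sketch, where the customer shopped and what they bought never make it into the master record; only an aggregate survives.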

Second, deterministic approaches miss links between customers, resulting in incomplete identities. This is less about the structure of the data, and more about the fact that the data doesn’t align, given the lack of definitive linking keys across systems. Without a primary key/foreign key relationship, how can you tell for sure that the Kelcey Jones who bought at your retail store in Seattle is the same Kelcey Jones in your loyalty database? You can’t. And because most systems lack these linking keys, it’s impossible to bring data together to form the complete customer profiles you need to drive competitive marketing, analytics, and customer experiences. Instead, you end up operating on a subset of customer data, leaving valuable customer data unused simply because it doesn’t fit the rigid underlying architecture.
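A small sketch shows how exact-key matching behaves when the linking key is missing. The `deterministic_match` function, the record layout, and the choice of email as the key are assumptions made for illustration only.

```python
def deterministic_match(pos_records, loyalty_records, key="email"):
    """Link records only when the shared key is present and exactly equal."""
    by_key = {r[key]: r for r in loyalty_records if r.get(key)}
    linked, unlinked = [], []
    for r in pos_records:
        if by_key.get(r.get(key)) is not None:
            linked.append(r)
        else:
            unlinked.append(r)
    return linked, unlinked

# Illustrative records: the Seattle purchase has no email on file.
pos_records = [
    {"id": "pos-1", "name": "Kelcey Jones", "city": "Seattle", "email": None},
    {"id": "pos-2", "name": "Ana Ruiz", "city": "Seattle", "email": "ana@example.com"},
]
loyalty_records = [
    {"id": "loy-1", "name": "Kelcey Jones", "email": "kelcey@example.com"},
    {"id": "loy-2", "name": "Ana Ruiz", "email": "ana@example.com"},
]

linked, unlinked = deterministic_match(pos_records, loyalty_records)
print([r["id"] for r in linked])
print([r["id"] for r in unlinked])
```

The Kelcey Jones purchase stays unlinked even though the name appears in both systems, because the rule only trusts an exact key.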

All of that said, this solution, while imperfect, was still best-in-class before the cloud. However, once storage and compute stopped being limiting factors and machine intelligence improved, a revolutionary new approach to customer data management became possible.

Enter the Cloud

In the last several years, there has been a tremendous amount of rapid invention and innovation. Hadoop was first released as open source in 2006, suddenly making cost-effective scalable storage and compute a reality for businesses. Cloud providers like Amazon Web Services and Microsoft Azure took the concept further, offering operationally robust capabilities that were more reliable and easier to use than anything that had come before. Then, in 2014, Apache Spark was released, enabling a more sophisticated distributed computing model built to take advantage of modern, public cloud-era hardware.

In parallel, academics were inventing new learning algorithms that could replicate human intelligence for certain tasks like speech recognition, image analysis, and machine translation. Those algorithms, hungry for data, got smarter the more data they were given and the more computing power they could leverage. A powerful synergy was born: more data + more compute + better algorithms = better intelligence. Researchers realized these same approaches could also be used to solve identity resolution and translation between data languages.

The New Approach – A Universal Translator

Inspired by these innovations, we invented a new approach to solving the customer data problem. This new approach rapidly brings data together from systems that weren’t designed to integrate with one another, and then makes sense of it all using the intelligence of machine learning. All of this is possible at the massive scale and complexity of today’s consumer brand data, and the result is complete, unified, and flexible customer profiles that are unique to your company and can be used to fuel any downstream system. Let’s dig into the details.

Full datasets from each of your sources are ingested raw. No trimming, flattening, or reformatting needed. There’s also no limit to the size of the data (the platform handles trillions of entries concurrently), and it’s fast. Historical data can be onboarded in a few days, with daily incremental refreshes, no matter the volume or frequency. Once data is inside the platform it goes through a series of specialized steps to tag, normalize, and process the data, preparing it for machine learning-driven identity resolution. And the next step is where the magic really happens.

Next, specialized algorithms look deeply and comprehensively at your data. Not the metadata. Not select columns. But all the data, at the cell level – in fact, at the feature level. This means the algorithms understand your data in the same way a human would if they could pore through all of it bit by bit. For example, the algorithms consider rare but matching email aliases with the understanding that even when the domain is different, it’s likely that the emails belong to the same person. The algorithms compare names with the understanding of how rare those names are in that particular zip code. They’re also making transitive connections simultaneously across 3, 4, or more different datasets. The result of all of this is the richest customer 360 profiles possible. These profiles are built using all your customer data – with no valuable information (like in-store purchases, or email responses, or survey data) left on the cutting room floor.
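The signals described above can be sketched in a few lines. This is a toy illustration and not the platform's actual algorithm: the hand-set weights, the `NAME_FREQ` table, the `COMMON_ALIASES` set, the threshold, and the sample records are all assumptions, and a simple union-find stands in for whatever the real system uses to make transitive connections.

```python
from itertools import combinations

# Hypothetical share of residents with a given name, per zip code.
NAME_FREQ = {("98101", "kelcey jones"): 0.0001, ("98101", "john smith"): 0.02}
# Hypothetical aliases too common to count as evidence on their own.
COMMON_ALIASES = {"info", "admin", "sales"}

def pair_score(a, b):
    """Score the evidence that two records describe the same person."""
    score = 0.0
    # A rare email alias that matches, even across different domains.
    if a.get("email") and b.get("email"):
        alias_a = a["email"].split("@")[0].lower()
        alias_b = b["email"].split("@")[0].lower()
        if alias_a == alias_b and alias_a not in COMMON_ALIASES:
            score += 0.6
    # A matching name counts for more when it is rare in that zip code.
    if a.get("name") and a.get("zip") and a["zip"] == b.get("zip"):
        if a["name"].lower() == (b.get("name") or "").lower():
            freq = NAME_FREQ.get((a["zip"], a["name"].lower()), 0.01)
            score += 0.6 if freq < 0.001 else 0.2
    return score

def resolve(records, threshold=0.5):
    """Cluster records, following links transitively with union-find."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if pair_score(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i, r in enumerate(records):
        clusters.setdefault(find(i), []).append(r["id"])
    return sorted(sorted(c) for c in clusters.values())

records = [
    {"id": "pos-1", "name": "Kelcey Jones", "zip": "98101"},
    {"id": "loy-1", "name": "Kelcey Jones", "zip": "98101",
     "email": "kelceyj.1987@example.com"},
    {"id": "esp-1", "email": "kelceyj.1987@example.org"},
    {"id": "pos-2", "name": "John Smith", "zip": "98101"},
    {"id": "loy-2", "name": "John Smith", "zip": "98101",
     "email": "jsmith@example.com"},
]

clusters = resolve(records)
print(clusters)
```

In this sketch the store purchase and the email record share no field at all, yet they land in the same profile because each links to the loyalty record. The two John Smith records stay apart, since a common name alone is weak evidence.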

And finally, because storage isn’t limited, the platform stores not one version of your customer data in a single, rigid database, but as many as you need, in the formats and structures you require. This allows you to connect seamlessly with all of your systems of engagement and analysis, using the precise data language each of them needs to make full use of your unified data.
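As a rough sketch of the idea, one unified profile can be rendered into several shapes, one per downstream system. The profile layout and the `to_esp_rows` and `to_analytics_rows` functions are hypothetical names chosen for illustration, not part of any real product API.

```python
# Illustrative unified profile; field names are assumptions.
profile = {
    "customer_id": "c-001",
    "name": "Kelcey Jones",
    "emails": ["kelcey@example.com"],
    "transactions": [
        {"store": "Seattle", "amount": 120.0},
        {"store": "Portland", "amount": 15.0},
    ],
}

def to_esp_rows(p):
    """Email-keyed rows, the shape an email service provider might expect."""
    first_name = p["name"].split()[0]
    return [{"email": e, "first_name": first_name} for e in p["emails"]]

def to_analytics_rows(p):
    """One row per transaction, tagged with the unified customer id."""
    return [{"customer_id": p["customer_id"], **t} for t in p["transactions"]]

esp_rows = to_esp_rows(profile)
analytics_rows = to_analytics_rows(profile)
print(esp_rows)
print(analytics_rows)
```

Both views are derived from the same profile, so neither system has to give up detail to accommodate the other.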

Power Today’s Tools on a Foundation for the Future

This approach works. It’s different from the older techniques because it takes full advantage of recent innovations to turn customer data management on its head. As a result, brands can finally get the rich, connected, and usable view of their customers that they need, delivering results quickly while establishing the foundation for the future. And this matters for three core reasons:

First, the teams you’ve invested in become more productive in the tools and systems they use today. The full view of the customer, powered by insights from all of your systems, can now be used to personalize and optimize every engagement you’re running today. By finally unleashing that data, people are happier, more effective, and more creative, and all the existing analyses and campaigns you’re already driving can become more targeted, more personalized, and more accurate.

Second, this approach helps to future-proof your entire customer data management, MarTech, AdTech, and analytics stacks. Because the system is flexible and adaptable, you can connect new systems at any time without breaking anything, so you can innovate and experiment while running your business day to day. The data can be used for more than just marketing, campaigns, ads, and web personalization;  it is now available to power customer support, customer experience, and analytics.

Third, with a flexible and powerful way to manage your customer data, you enable the types of intelligence and innovation that internet-first brands are already bringing to the table. You’re finally able to see trends and patterns that have never been visible, because your data was lost in translation.  As you start to unlock that data, you’ll inform not just marketing, CX, and advertising, but product development, operations, and more.

The Results Speak for Themselves

We’ve been privileged to work with passionate people who are working to redefine the customer experience for their brands, and in a few short months they are seeing results. Their people and tools are more effective: a 200% uplift in conversions for a popular retailer, a 90% improvement in acquisition rates for a leading airline, and a 130% increase in email revenue at a CPG brand. They adapt as their needs change – whether it’s new use cases, new data sources, or new best-of-breed engagement tools. And they are starting to see their customers differently – finding patterns and trends that were invisible before. As one of our customers said recently, “When we installed Amperity it gave us insights that led to questions we never would have had before. It’s helping us think differently about who our customers really are, and how best to engage with them.”

If that’s the translation you’ve been looking for, we’re ready to get started. And that leads to another benefit from being on the cloud: no deployment required. Get your first customer 360 profile and unified customer identities within 90 days. Have your business use cases up and running in the first few months. And accomplish all of this while building on your existing tools and investments, with your unique customer profile, and using a common language that can finally connect your systems, your people, and most importantly, your customer.
