The Dirty Secret Hiding in Your Dirty Data – and its Impact on Your Best Customers

When it comes to customer data we’ve seen it all: from brands with the world’s largest loyalty programs to ones launching D2C initiatives for the first time. We’ve worked with retailers, travel and hospitality brands, fast-casual restaurants, and even car dealerships. What these brands have in common may be surprising: they think they know their best customers, but they don’t. In fact, they are more likely to misunderstand their best customers – including who they are, what channels they engage on, and how much they spend – than any other customer segment.

The numbers are pretty shocking. We found that, on average, businesses misidentify the 23% of customers who account for over 50% of all revenue.

Your best customers have the dirtiest data

How can this be the case? It’s simple – your best customers have the messiest data. This is because they engage using multiple online and offline channels, use a variety of email addresses and other identifiers, ship products to different family members’ addresses, and, perhaps the trickiest issue of all, change fundamentally over their long relationship with your brand. They move, they get married and change their last name, they start a new job and use a new email address, all leaving a trail of data behind them. They sometimes sign up multiple times for your loyalty program (a fact that surprises many CRM leaders). Finally, because they shop and engage more, they also have more chances to enter their own data (or have it entered by your staff) incorrectly, riddling your records with typos and errors few systems can understand. 

All of this means that each of your different systems and channels has one or many different “versions” of an individual’s identity. These fractured profiles have a ripple effect throughout your business: poor personalization as important signals are missed, inflated marketing spend as campaigns pay to acquire the same customer again and again, and, worst of all, disjointed and frustrating customer experiences.

Brands have long attempted to build a complete view of the customer across systems using match-and-merge business rules, loyalty programs, and more. And while that’s a good start, it doesn’t solve for the customers that matter most.

“Tried-and-True” methods have tried and failed

At first glance all of these problems seem within reach of a loyalty program that incentivizes customers to self-identify: just consolidate all of your purchases on one account to reap the best rewards. But what loyalty programs don’t consider is that customers may not use their account because their card isn’t handy at the moment, they forgot their password, or because a store associate miskeyed their phone number. For example, a hospitality brand found that it had captured under 50% of eligible loyalty stays on customers’ loyalty accounts. Customers may choose not to use the loyalty program for a specific reason – like differentiating work and personal spending, or simply because they don’t want to. As we mentioned above, they also sometimes sign up more than once, either because they forgot they already had an account, or occasionally for more nefarious reasons such as wanting more than one birthday or other reward.  

Because loyalty programs have failed to provide a solution for disconnected customer identities and records, brands continue to fall back on writing business rules and deterministic ETL to try to join sources together. While this is the predominant approach, it unfortunately isn’t enough to solve the core data problems. It leaves bits of data orphaned in secondary and tertiary accounts and limits your ability to make smarter decisions about the customers that matter most. This is because data is messy, and business rules aren’t smart enough to see past the errors and inconsistencies in it. For a deeper exploration of why this is the case, check out our whitepaper, which explores the fundamental data challenges in depth and looks at their impact on your business.
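To make the limitation concrete, here is a minimal sketch in Python of how an exact-match business rule can miss two records that belong to the same person. Everything in it is an assumption made for illustration: the sample records, the field names, the helper functions, and the 0.85 similarity threshold are all hypothetical, and the tolerant comparison uses simple string similarity from the standard library as a stand-in. It is not machine learning and not any particular vendor's method.

```python
from difflib import SequenceMatcher

# Hypothetical records for one customer, captured by two different systems.
# The name is spelled differently, the email differs (personal vs. work),
# and the phone number is formatted differently.
records = [
    {"id": "pos-001", "name": "Katherine Oneil",
     "email": "kate.oneil@example.com", "phone": "206-555-0142"},
    {"id": "web-417", "name": "Katharine O'Neil",
     "email": "koneil@work.example.com", "phone": "2065550142"},
]


def exact_rule_match(a, b):
    """Deterministic rule: link records only if email or phone is identical."""
    return a["email"] == b["email"] or a["phone"] == b["phone"]


def normalize(text):
    """Lowercase the text and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def tolerant_match(a, b, threshold=0.85):
    """Illustrative comparison: similar names plus the same normalized phone.

    The threshold is an arbitrary value chosen for this example.
    """
    name_score = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    same_phone = normalize(a["phone"]) == normalize(b["phone"])
    return name_score >= threshold and same_phone


if __name__ == "__main__":
    first, second = records
    print("exact rule links the records:", exact_rule_match(first, second))
    print("tolerant comparison links the records:", tolerant_match(first, second))
```

On these two sample records, the exact rule does not link them, while the tolerant comparison does. A hand-tuned comparison like this still breaks down as the number of sources and error patterns grows, which is the gap the next section addresses.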

Enter Machine Learning

Modern brands need a more comprehensive approach to managing their customers’ evolving identities so they can better serve them. Luckily, with advancements in AI, a new approach is possible. Machine learning algorithms can be trained to understand who is who across all your customer data (yes, really – all of it), leaving you with customer profiles that are more complete and accurate, built in a fraction of the time, and that actually get better over time.

If you’re on a path to be more customer-centric or are struggling to get the data you need for personalized marketing, look for a modern data management solution that can handle the complexities of your real-world, dirty customer data. Your best customers will thank you.

