
A Deeper Dive: How Smart Brands Get the Most from Customer Data

October 20, 2020


Brands need a customer data strategy that resolves, analyzes, and operationalizes first-party data. And they need the highest quality third-party data to enhance and expand on this first-party foundation. Experts from Infutor and customer data platform Amperity recently discussed why this is true now more than ever in a webinar exploring how smart brands are getting the most out of their data.

The discussion also touched on how to combine first- and third-party data for more accurate and efficient identity management, how brands systematically use data to better connect with customers, and compliant use cases. You can watch the full webinar here.

After presenting the webinar, Infutor VP of Partnership Development Jason Ford and Amperity Director of Product Management Yoni Friedman took time out to discuss additional questions from webinar attendees and dove further into the secrets of better marketing outcomes and longer-term, higher-value customers.

Q: Yoni, you mentioned in the webinar that third-party data is often a topic of conversation with clients. Can you expand on what they’re typically looking for holistically in third-party data?

YF: At a very high level, our customers ask about third-party data in the context of interest in expanding their marketable audiences. Different companies approach this with different needs depending on their business’s characteristics (e.g., how they generally serve and market to consumers) and specific priorities and campaigns. Some of the needs we see most often include enriching consumer profiles with additional attributes like age and gender to create better personalized offers, and performing address hygiene to be able to send mail to customers that may have moved since last providing their address.

Given these needs, we typically see brands evaluate potential third-party data on two key criteria: data quality and pricing. When we help our customers run these evaluations, we break down data quality, which is a general indicator, into the following specific metrics:

  1. Person coverage / match: the number of individuals provided by the brand that the third-party data has additional information on. For example, if we test with 10,000 randomly selected consumer files, how many of those individuals are found in the third-party dataset?

  2. Attribute coverage: the number of attributes the dataset contains overall; for example, how many attributes (such as age, gender, etc.) are generally available in the dataset.

  3. Attribute fulfillment: the share of available attributes that actually receive a value. This is important because there are cases when an attribute is theoretically available, but for some share of the population it does not actually contain any information.

  4. Accuracy: the share of fulfilled attributes that contain accurate information about the person. For example, is the age returned actually the person’s age?
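To make the four metrics concrete, here is a minimal Python sketch of how they might be computed on a small test sample. The function name, the data layout (dictionaries keyed by a person ID), and the hand-verified "truth" values are all assumptions made for illustration; this is not how Amperity or Infutor run their evaluations.

```python
def quality_metrics(sample_ids, vendor, truth):
    """Compute the four data quality metrics for a test sample.

    sample_ids: person IDs the brand sent to the vendor (hypothetical layout)
    vendor:     ID -> {attribute: value or None} returned by the vendor
    truth:      ID -> {attribute: verified value}, used for the accuracy check
    """
    # 1. Person coverage: share of sampled people the vendor knows about.
    matched = [pid for pid in sample_ids if pid in vendor]
    person_coverage = len(matched) / len(sample_ids)

    # 2. Attribute coverage: how many distinct attributes the dataset offers.
    attributes = sorted({a for rec in vendor.values() for a in rec})
    attribute_coverage = len(attributes)

    # 3. Attribute fulfillment: share of available slots that hold a value.
    slots = len(matched) * len(attributes)
    filled = [(pid, a) for pid in matched for a in attributes
              if vendor[pid].get(a) is not None]
    fulfillment = len(filled) / slots if slots else 0.0

    # 4. Accuracy: share of filled values that agree with verified values.
    checkable = [(pid, a) for pid, a in filled if a in truth.get(pid, {})]
    correct = sum(1 for pid, a in checkable if vendor[pid][a] == truth[pid][a])
    accuracy = correct / len(checkable) if checkable else 0.0

    return {
        "person_coverage": person_coverage,
        "attribute_coverage": attribute_coverage,
        "attribute_fulfillment": fulfillment,
        "accuracy": accuracy,
    }


# Made-up sample: four people sent, three found, some values missing or wrong.
sample_ids = ["a", "b", "c", "d"]
vendor = {
    "a": {"age": 34, "gender": "F"},
    "b": {"age": None, "gender": "M"},
    "c": {"age": 51, "gender": None},
}
truth = {"a": {"age": 34, "gender": "F"}, "b": {"gender": "F"}, "c": {"age": 51}}

metrics = quality_metrics(sample_ids, vendor, truth)
print(metrics)
```

In this made-up sample, three of the four people are found, four of the six attribute slots are filled, and three of the four filled values are correct, which shows why a dataset can look broad on paper and still deliver less usable information than expected.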

Q: Jason, you spoke about brands prioritizing customer retention and lifetime value, especially with regard to five- to 10-year relationships. What do you suggest in terms of data hygiene cadence?

JF: Obviously, a lot of change can happen in half a decade or a decade. Data can become stale rapidly, and poor data or matching is a very quick way to end a long relationship. There isn’t a universal rule for how often any given organization should cleanse its data. But I would say that at a minimum it should be done upon data ingestion and prior to the activation of the data. Beyond that, we see it done anywhere from monthly to quarterly, so it really depends on the given workflow. In today’s data-driven world, organizations need to look to data as currency, as an asset that’s dynamic and not static, and as a foundational element to lifetime value.

Q: It was mentioned on the webinar that a brick-and-mortar brand added two simple attributes in age and gender and saw a 3x increase. What are some other commonly valuable attributes that a third-party provider brings?

YF: The most common ones we see customers ask for are addresses, both mailing and email. Email is particularly important because it is often used as an identifier in internal databases as well as in advertising systems. We also often see other forms of basic contact and demographic information, and more and more companies are using additional attributes, such as an affinity for buying a certain type of product.

JF: It is really dependent upon the organization. We’ve seen more traditional attributes like age and income commonly requested, but more and more brands are looking deeper into the unique data attributes that can move the needle. Many are leveraging machine learning with hundreds of behavioral attributes to let AI tools help define the data that makes the biggest impact. More innovative brands are using attributes as the raw fuel for modeling. In simpler cases they’re looking at age and gender, and in more complicated cases at property and auto ownership, income, life stage information, etc. We’ve seen some really interesting results, with attributes that the brand would never have thought of turning out to be key predictors of lead conversions and customer lifetime value.

Q: Jason, you talked about the importance of deterministic or “declared” data. What about the role of probabilistic data as a complement for marketers?

JF: Leveraging probabilistic data typically depends on a marketer’s use case. Probabilistic data is often easier to acquire than deterministic data and is available at a larger scale. I think much of the value of probabilistic data lies in data science and modeling initiatives that benefit from larger amounts of data. In those cases, probabilistic data can provide scale for modeling initiatives that deterministic data might not be able to provide.

Q: Yoni, what is your view on the role of probabilistic data?

YF: Probabilistic data can have different meanings in different contexts. In the Customer Data Platform context, we use the terms probabilistic and deterministic to describe two different approaches to identity resolution. The probabilistic approach means that machine learning algorithms are used to assign a probability that two records are tied to the same individual; this approach enables systems to cluster records together even when the probability of the records being related to the same individual is less than 100%. For example, this is the case when there is no unique identifier tying the two records together, as in these two (fake) records:

Record 1

First name: Amanad
Last name: Davidson
Email:
Phone: (262) 555-5877
Address: 565-66 Hoffman Estates Ave.
City: North Samantha
State: New York
Postal: 11461
Birthdate: 1900-01-01

Record 2

Name: AMANDA
Last name: WARREN
Email address:
Phone: +1-262-555-5877x172
Address: 5656 Ho Estates – Unit 66
City: North Samantha
State: New Hampshire
Zip code: 11461
Birthdate: 1982-04-09

The average human would probably say that there’s a high chance that these two are the same person, and the machine learning algorithms are trained to arrive at the same conclusion.
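As a rough illustration of the idea, the Python sketch below scores the two fake records with a hand-weighted string similarity. This is a deliberate simplification: it swaps a trained machine learning model for fixed weights and the standard library's difflib, and the weights, field names, and helper functions are assumptions for illustration rather than any vendor's actual method.

```python
import re
from difflib import SequenceMatcher

# Hypothetical field weights, chosen by hand for illustration only.
# A production system would learn these from labeled examples.
WEIGHTS = {"first": 0.20, "last": 0.15, "phone": 0.30,
           "address": 0.15, "city": 0.05, "postal": 0.15}


def normalize_phone(phone):
    """Drop any extension and a leading US country code; keep digits only."""
    main = re.split(r"[x×]", phone, maxsplit=1)[0]
    digits = re.sub(r"\D", "", main)
    if len(digits) == 11 and digits.startswith("1"):
        return digits[1:]
    return digits


def similarity(a, b):
    """Case-insensitive string similarity between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(r1, r2):
    """Weighted similarity over the fields that both records populate."""
    total = weight_sum = 0.0
    for field, weight in WEIGHTS.items():
        v1, v2 = r1.get(field), r2.get(field)
        if not v1 or not v2:
            continue  # a missing value is treated as no evidence either way
        if field == "phone":
            sim = 1.0 if normalize_phone(v1) == normalize_phone(v2) else 0.0
        else:
            sim = similarity(v1, v2)
        total += weight * sim
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


record_1 = {"first": "Amanad", "last": "Davidson", "phone": "(262) 555-5877",
            "address": "565-66 Hoffman Estates Ave.", "city": "North Samantha",
            "postal": "11461"}
record_2 = {"first": "AMANDA", "last": "WARREN", "phone": "+1-262-555-5877x172",
            "address": "5656 Ho Estates – Unit 66", "city": "North Samantha",
            "postal": "11461"}

score = match_score(record_1, record_2)
print(f"match score: {score:.2f}")
```

The phone number, city, and postal code agree once formatting is normalized, and the first names differ only by a transposition, so the score stays well above the midpoint even though the last names disagree. Whether a score like this is high enough to merge the records is a threshold decision that depends on the use case.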

A deterministic approach, on the other hand, will only cluster two records together if it is absolutely certain that they are related to the same person via matching unique identifiers, which can be a customer ID or a combined key such as email address + phone number.
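By contrast, a deterministic check can be sketched in a few lines of Python: two records are linked only when they share an exact identifier. The field names and the example email address below are hypothetical, and the normalization rules (lowercasing the email, keeping phone digits only) are assumptions rather than a description of any particular platform.

```python
def deterministic_keys(record):
    """Return the exact-match keys a record exposes (hypothetical field names)."""
    keys = set()
    if record.get("customer_id"):
        keys.add(("customer_id", record["customer_id"]))
    email, phone = record.get("email"), record.get("phone")
    if email and phone:
        # Combined key: both parts must be present and must match exactly.
        digits = "".join(ch for ch in phone if ch.isdigit())
        keys.add(("email+phone", email.strip().lower(), digits))
    return keys


def same_person(r1, r2):
    """Link two records only if they share at least one unique key."""
    return bool(deterministic_keys(r1) & deterministic_keys(r2))


# Records with no customer ID and no email expose no key, so they are not linked.
no_id_1 = {"first": "Amanad", "phone": "(262) 555-5877"}
no_id_2 = {"first": "AMANDA", "phone": "262-555-5877"}

# Made-up records that share the combined email + phone key.
keyed_1 = {"email": "Amanda@Example.com", "phone": "(262) 555-5877"}
keyed_2 = {"email": "amanda@example.com ", "phone": "262-555-5877"}

print(same_person(no_id_1, no_id_2))  # no shared unique key
print(same_person(keyed_1, keyed_2))  # shared email + phone key
```

The trade-off is visible in the first pair: records with a matching phone number but no email or customer ID are left unlinked, because nothing ties them together with certainty.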

The two approaches have pros and cons, which should be weighed differently depending on the use case. There are cases when being wrong can be very expensive, for example, when sending a customer a highly specific communication such as a notice that their vehicle lease is expiring soon, and others when a mistake isn’t a big deal, like targeting the same customer with an ad for a new lease on Facebook.

Public awareness of and sensitivity to privacy have increased in general, and around the use of third-party cookies specifically. In the context of using third-party data, it is becoming increasingly important to use deterministically matched information to make sure that your brand’s efforts to improve relationships with customers do not result in erosion of trust.

About Infutor and Amperity

Infutor and Amperity have partnered to provide end-to-end consumer identity management. Discover more about how Infutor’s consumer identity data will enhance Amperity’s resolved and unified first-party customer profiles to create an enriched foundational identity data management solution here. To learn more about Infutor and their industry-leading consumer identity management, visit

This post originally appeared on Infutor's blog.