The Lakehouse CDP: Data Quality and Governance Without the Silos

May 16, 2024

An image of the Amperity Lakehouse CDP user interface showing someone setting up a Bridge connection to share data between the CDP and the lakehouse.

Today Amperity is proud to announce updates that solve the persistent challenges of Customer Data Platforms (CDPs): data quality and governance across all the tools in the stack. 

These updates turn Amperity into what we’re calling a Lakehouse CDP — a CDP plugged directly into a Data Lakehouse to take advantage of its open format, so users can easily share customer data among tools and platforms without having to copy data or build and manage connections. 

Lakehouse CDP represents an evolution in customer data management, moving beyond the limitations of earlier eras of CDPs. A brief history (with a more in-depth exploration linked here):

  • First era: Packaged CDP – An all-in-one platform with a closed environment, meaning it stores a copy of the data it collects to optimize performance. The problem was that Packaged CDPs weren’t designed as part of a broader data management strategy and often required extensive ETL pipelines and connectors to move data anywhere, creating another silo.

  • Second era: Composable CDP – An approach championed by component products (primarily Reverse ETL companies) where businesses build a CDP with a combination of SaaS offerings and custom code aimed at getting these components to work together. It was more flexible than a Packaged CDP and built around the data warehouse, but it lacked features to solve the fundamental problem of building a unified profile, as well as operations features that span the multiple components. This resulted in hidden costs and a massive maintenance burden on data engineering teams.

  • Now: Lakehouse CDP – An open, cross-platform solution that shares and activates data across a tech stack without replication. Instead of relying on complex business logic, a Lakehouse CDP unifies and enriches customer data in a Lakehouse for activation, analytics, and AI use cases, with no custom code required. This results in true composability, combining the benefits of Packaged and Composable CDPs without any of the limitations.

Amperity introduced the world’s first Lakehouse CDP with data sharing powered by our new feature, Amperity Bridge, to address the critical concerns of data quality and governance through four key benefits:

  • Automating identity resolution

  • Building data assets quickly

  • Syncing enriched data to any tool

  • Keeping data secure and flowing

A data architecture diagram showing Amperity connected to a Data Lakehouse through the data sharing interface Amperity Bridge, with various sources sending customer data into Amperity and the Lakehouse from the left and various destinations where enriched data can go from Amperity and the Lakehouse on the right.

Read on to understand why we re-platformed our CDP for the Lakehouse environment specifically, how we use Amperity Bridge to support open sharing of data through the lakehouse and across the stack, and what users can expect from the four benefits listed above. 

Or if you’re already excited to see it in action, watch the demo now!

Why the focus on the Data Lakehouse

We believe a Customer Data Platform should be a unified application that handles the entire end-to-end transformation of raw data into unified and enriched profiles. At the same time, Lakehouses are becoming the new standard in enterprise data management because they decouple data storage from compute. Databricks and Snowflake each offer a data-sharing capability (Databricks through its open-source Delta Sharing protocol) so teams can decide where to store and process data. This flexibility lets data teams build cross-platform workflows without needing to copy data or build and maintain ETL pipelines, so businesses can use the best tools for each job. We believe more and more SaaS tools will embrace Lakehouse architecture, as evidenced by the investments in Lakehouse features from Snowflake, Microsoft Fabric, and GCP's BigLake, as well as the major shift in this direction on the part of the marketing clouds.

However, customer data that has been centralized in a Lakehouse still needs to be cleaned, unified, and aggregated into insights for business users. Because Amperity’s specialized CDP tools for AI-powered identity resolution, customer profile unification, and governance work within a Lakehouse-based architecture, businesses can create the best data asset and use it wherever it is needed. No friction, no silos.

An image showing the architecture of the Data Lakehouse Landscape. At the bottom is a storage layer where data is kept in Iceberg, Delta Lake, and Parquet tables; above that is a Metadata and Governance layer. Connecting into these are various apps, tools, and use cases where compute happens. The main points of the diagram are that compute is separated from storage, and that Amperity accounts for more than half of the functions that would need to interface with stored data, including ETL, Data Engineer Tools, Operations Tools, Identity Resolution, and Reverse ETL.

What makes it work: Amperity Bridge

Amperity Bridge is the feature that lets users stop integrating and start sharing data to and from a Data Lakehouse, allowing Amperity’s AI-powered components to enhance Lakehouse data. It uses each Lakehouse’s open, industry-standard data formats so businesses can choose zero-copy sharing, zero-ETL access, or a full copy of the data as appropriate for each use case, optimizing costs and preventing unnecessary network calls and processing. Amperity Bridge is generally available for Databricks and available to early adopters for Snowflake.

Advantages of Amperity Bridge:

  • Fast setup – Configure a share to any supported Lakehouse in minutes.

  • Zero copy – Access shared tables without replicating data.

  • Scalable – Process larger amounts of data since nothing needs to be moved or transformed.

  • Live data – View data at rest in each environment. 
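To make the difference between a zero-copy share and a full copy concrete, here is a small in-process analogy in Python. It is only an illustration, not how Amperity Bridge or any lakehouse sharing protocol is implemented: the table, its keys, and the field names are all hypothetical, and a standard-library read-only view stands in for a shared table.

```python
import copy
from types import MappingProxyType

# Hypothetical lakehouse table, keyed by customer ID (all names are illustrative).
lakehouse_table = {
    "cust-1": {"email": "ana@example.com", "lifetime_value": 120.0},
}

# Full copy: a snapshot taken once. It duplicates storage and can go stale.
copied_table = copy.deepcopy(lakehouse_table)

# Zero copy: a view over the same underlying storage. MappingProxyType blocks
# adding or removing rows through the view, and nothing is duplicated.
shared_view = MappingProxyType(lakehouse_table)

# The lakehouse receives a new record after the share was set up.
lakehouse_table["cust-2"] = {"email": "bo@example.com", "lifetime_value": 45.5}

print(sorted(shared_view))   # the view reflects the new row
print(sorted(copied_table))  # the snapshot does not
```

The view always shows the current state of the source, while the copy has to be refreshed by another pipeline run. That is the trade-off behind choosing a share or a copy per use case.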

Key benefits of the Amperity Lakehouse CDP

Amperity's modular capabilities allow brands to use any combination of its offerings to optimize customer data operations. Data teams can seamlessly transform data across Amperity and a Lakehouse with processes including identity resolution and enhancement through AmpID, generative AI with AmpAi, and pre-built data assets with Amp360 – all of which provide the visibility to establish and maintain governance. Business teams can use AmpIQ as a reverse ETL, enabling deep personalization with data from a Lakehouse.

Automate identity resolution

Amperity’s platform uses a patented AI process for identity resolution that ingests raw customer data, unifies it, enriches it, and assigns a common identifier. Customer profiles remain stable as new and conflicting data is captured, saving IT teams countless hours of profile maintenance. 
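As a rough illustration of what "unify and assign a common identifier" means, the sketch below links records that share an exact email or phone value and gives every linked group one ID. This is a deliberately simplified, deterministic stand-in and not Amperity's patented AI process; the field names, the matching rule, and the `cust-` ID format are assumptions made for the example.

```python
def resolve_identities(records):
    """Group records that share an email or phone and give each group one ID.

    A deterministic toy using union-find. Production identity resolution is
    far more involved; the field names here are hypothetical.
    """
    parent = list(range(len(records)))  # union-find forest, one node per record

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller index as root so IDs are reproducible.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # (field, normalized value) -> index of first record with it
    for idx, rec in enumerate(records):
        for field in ("email", "phone"):
            value = (rec.get(field) or "").strip().lower()
            if not value:
                continue  # missing values never link records together
            key = (field, value)
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    return [f"cust-{find(i)}" for i in range(len(records))]


records = [
    {"source": "pos", "email": "Ana@Example.com", "phone": None},
    {"source": "web", "email": "ana@example.com", "phone": "555-0100"},
    {"source": "loyalty", "email": None, "phone": "555-0100"},
    {"source": "web", "email": "bo@example.com", "phone": None},
]
customer_ids = resolve_identities(records)
print(customer_ids)
```

The first three records end up with the same ID even though the first and third share no field directly, because the second record links them. Later records that match an existing group join it without changing the IDs already assigned, which is the stability property the paragraph above describes.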

Build data assets quickly 

Amperity enriches data in a Lakehouse by populating pre-built data models and out-of-the-box attributes to accelerate the creation of data assets. This saves hours of gathering business requirements and provides IT teams with a model to build on in a Lakehouse. 

Sync enriched data to any tool

Amperity provides a business-friendly reverse ETL tool, AmpIQ, so marketers don’t need to understand the underlying data or SQL to access customer data. They can easily segment by profile attributes or based on marketing engagement history, customer purchase behaviors, channel-level opt status, web interactions, and other profile information.  
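For readers curious what such a segment amounts to underneath the UI, here is a minimal sketch of attribute-based segmentation over unified profiles. It does not use or represent AmpIQ's API; the `Profile` fields, the `build_segment` helper, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Profile:
    # Hypothetical unified-profile attributes, chosen for illustration.
    customer_id: str
    email_opted_in: bool
    last_purchase: date
    lifetime_value: float


def build_segment(profiles, *, min_lifetime_value, purchased_since):
    """Return IDs of opted-in customers who meet both thresholds."""
    return [
        p.customer_id
        for p in profiles
        if p.email_opted_in
        and p.lifetime_value >= min_lifetime_value
        and p.last_purchase >= purchased_since
    ]


profiles = [
    Profile("cust-1", True, date(2024, 4, 20), 540.0),
    Profile("cust-2", False, date(2024, 5, 1), 910.0),   # opted out of email
    Profile("cust-3", True, date(2023, 11, 2), 760.0),   # no recent purchase
    Profile("cust-4", True, date(2024, 3, 15), 80.0),    # below value threshold
]

segment = build_segment(
    profiles,
    min_lifetime_value=500.0,
    purchased_since=date(2024, 1, 1),
)
print(segment)
```

Only the first profile satisfies all three conditions, so the segment contains a single customer ID that could then be sent to a downstream marketing tool.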

Keep data secure and flowing

Data governance has been a major gap in every type of CDP deployment. The customer data operations necessary to test workflows, monitor data transformation, and make changes eat up thousands of valuable hours that would be better spent building solutions to drive revenue and cost savings. Amperity provides native data observability and monitoring capabilities to help brands keep data secure and compliant.

The new era of CDP is here

If your business wants to centralize data and use it more effectively, a Lakehouse CDP provides you with the flexibility of a composable CDP and the comprehensive infrastructure of a packaged CDP. Amperity’s Lakehouse CDP helps data teams choose how to store and process data to improve data quality and governance in a Lakehouse, boosting results and lowering the total cost of ownership. 

See how it works in our demo video and get in touch to learn how it can work for you.