May 6, 2026 | 11 min read

Self-Service Predictive Models: A Data Science Toolkit Inside the Customer Data Cloud

Pre-built pCLV and product affinity for data teams, with hyperparameter control, transparent evaluation, and exportable holdout datasets.

Your data team can build a customer lifetime value model, but should they spend six months doing it?

Predictive modeling inside most customer data platforms falls into one of two buckets. Either the models live inside a marketer-facing interface with no visibility into how they're trained, no control over hyperparameters, and no way to validate them independently. Or the vendor hands you a set of SQL tables and wishes you luck building from there.

Neither option serves an enterprise data science team well. The first is opaque. The second is redundant: your team could build customer lifetime value (CLV) or product affinity from scratch, but it probably shouldn't because those aren't the models that differentiate your business. This spring, we released self-service predictive models in the Amperity Customer Data Cloud built on a different assumption: pre-built doesn't have to mean black box. Amperity has been delivering predicted customer lifetime value (pCLV) and product affinity to enterprise customers for years through services engagements, on the same Identity Resolution foundation that powers the rest of the platform. What's new this spring is that data teams configure, train, version, and validate them directly.

The predictive modeling problem for data teams

Tooling has matured, feature stores are standard, and most data teams can, given enough time, train a respectable CLV model against their warehouse. What hasn't changed is the opportunity cost. Time spent rebuilding table-stakes customer intelligence is time not spent on the work that differentiates your business: custom propensity models, attribution, pricing elasticity, experimentation infrastructure.

But most CDPs solve this the wrong way, in one of two failure modes. The first is to build predictive models for marketers and hide them behind a UI that doesn't give data scientists enough visibility to trust the output. The second is to hand data scientists a limited set of complex tools that aren't meaningfully faster than what they'd build themselves in a warehouse. One failure mode is opaque, the other is slow. What we built this spring is neither: a toolkit designed to accelerate data scientists while giving them the visibility and levers to iterate quickly and trust the outputs.

What's in the release

Two of Amperity's models are now self-service: predicted customer lifetime value and product affinity. Data teams configure, version, train, and validate both directly in the platform without a services engagement. Both are ensemble models, both run inside your Amperity tenant on configurable training and inference cadences, and both write outputs to database tables you can query in the Segment Editor or export to Databricks, Snowflake, or BigQuery.

pCLV answers the question: what's the total value a customer is likely to generate if they return to make another purchase in the next 365 days? Its outputs support value-tier segmentation (Platinum, Gold, Silver, Bronze, Medium, Low), winback targeting, and customer value migration analysis.

Product affinity answers a narrower question: which customers are most likely to purchase a specific product, category, or brand next? Its outputs are per-customer-per-product, ranked, with recommended audience sizes calibrated to capture roughly 50%, 70%, or 90% of future purchasers.

Both models train on the same unified inputs: the Merged Customers, Unified Transactions, and Unified Itemized Transactions tables produced by Amperity's Identity Resolution and data modeling. If your tenant has those tables, you have what the models need.

Inside the pCLV model

Three questions, three models

A customer's future value is a function of three separable questions: will they come back, how often will they buy, and how much will they spend when they do. pCLV trains one model for each question rather than trying to predict a single dollar value end-to-end.

The Return Classifier is a binary classifier outputting a probability between 0 and 1 that a customer will make at least one purchase during the prediction horizon (by default, the next 365 days). The Order Frequency Regressor predicts how many orders a returning customer will place during that window. The Average Order Value (AOV) Regressor predicts the average revenue per order.

Each sub-model trains independently and can be configured separately. You can run the Return Classifier as Random Forest, Gradient Boosted Trees, or Logistic Regression; the regressors support Random Forest and Gradient Boosted Trees. Random Forest is the recommended default; the other options are available when your data or use case calls for them.
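To make the decomposition concrete, here's a minimal illustrative sketch in scikit-learn. It is not Amperity's implementation; the feature matrix, the label definitions, and the choice to train the regressors on returning customers only are assumptions made for the example.

```python
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Assumed inputs: X is a (customers x features) matrix built from history
# *before* the prediction window; the labels describe behavior *during*
# the 365-day prediction window.
#   returned: 1 if the customer purchased at least once in the window
#   n_orders: order count in the window
#   aov:      average order value in the window
def train_pclv_submodels(X, returned, n_orders, aov):
    # 1. Will they come back? Binary classifier over all customers.
    return_clf = RandomForestClassifier(n_estimators=200, max_depth=10)
    return_clf.fit(X, returned)

    # 2. How often will they buy? Trained on returning customers only
    #    (an assumption for this sketch).
    returners = returned == 1
    freq_reg = RandomForestRegressor(n_estimators=200, max_depth=10)
    freq_reg.fit(X[returners], n_orders[returners])

    # 3. How much per order, on average?
    aov_reg = RandomForestRegressor(n_estimators=200, max_depth=10)
    aov_reg.fit(X[returners], aov[returners])

    return return_clf, freq_reg, aov_reg
```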

The multiplicative ensemble

The three sub-model outputs combine as a product:

Predicted CLV = P(return) × Predicted Order Frequency × Predicted AOV

After the raw product is computed, a sigmoid rescaling step compresses extreme outliers so the distribution of predicted values is smoother and more usable downstream. A customer with a 95% return probability but modest order frequency and average order value lands in the middle of the distribution rather than the tail. A customer with high numbers across all three factors lands at the top.
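Continuing the sketch above, the combination step looks roughly like this. Amperity doesn't publish the exact rescaling function, so the tanh squashing here is a stand-in for the idea: near-identity for typical values, with the long tail compressed toward a ceiling.

```python
import numpy as np

def predict_clv(return_clf, freq_reg, aov_reg, X):
    p_return = return_clf.predict_proba(X)[:, 1]  # P(return) in [0, 1]
    pred_freq = freq_reg.predict(X)               # expected orders, if they return
    pred_aov = aov_reg.predict(X)                 # expected revenue per order

    raw_clv = p_return * pred_freq * pred_aov     # the multiplicative ensemble

    # Illustrative tail compression: values far above a high percentile are
    # squashed toward that ceiling so they don't dominate downstream tiering.
    scale = np.percentile(raw_clv, 99)
    return scale * np.tanh(raw_clv / scale)
```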

The feature surface

Each sub-model trains on roughly 200 derived features from the unified transaction tables. The feature set includes lifetime totals, recency metrics, and frequency bands, plus the same measures over 12-month, 6-month, 3-month, and 30-day windows. That rolling structure lets the model distinguish between customers who are genuinely declining and those whose long-term averages obscure a recent resurgence. Holiday-specific features (order count and total amount during holiday windows) let the models capture customers whose behavior concentrates around seasonal events.
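As a simplified illustration of that rolling structure (the actual ~200 features aren't enumerated here), the sketch below assumes a transactions DataFrame with amperity_id, order_datetime, and order_revenue columns and derives lifetime, recency, and windowed measures as of a cutoff date.

```python
import pandas as pd

def rolling_features(tx: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """tx: one row per order, with amperity_id, order_datetime, order_revenue."""
    tx = tx[tx["order_datetime"] <= as_of]
    out = tx.groupby("amperity_id").agg(
        lifetime_orders=("order_datetime", "count"),
        lifetime_revenue=("order_revenue", "sum"),
        days_since_last_order=("order_datetime", lambda s: (as_of - s.max()).days),
    )
    # Same measures over trailing windows, so a recent resurgence isn't hidden
    # behind a long, flat purchase history.
    for label, days in [("12m", 365), ("6m", 182), ("3m", 91), ("30d", 30)]:
        window = tx[tx["order_datetime"] > as_of - pd.Timedelta(days=days)]
        agg = window.groupby("amperity_id").agg(**{
            f"orders_{label}": ("order_datetime", "count"),
            f"revenue_{label}": ("order_revenue", "sum"),
        })
        out = out.join(agg, how="left")
    return out.fillna(0)
```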

Inside the product affinity model

Product affinity is structured differently. It's an ensemble of a random forest classifier and a beta-geometric distribution that together answer a narrower question: for a given product attribute (a category, a subcategory, or a brand), which customers are most likely to buy next, and at what audience sizes does the model's signal outperform a naive baseline?

How the ensemble works

The random forest classifier learns historical purchase patterns for each value of the product attribute you select at model creation. One model is built per attribute field; the field is fixed at creation and cannot be changed afterward. Training uses 450 days of historical purchase data with a 365-day exponential half-life decay, so recent purchases weigh more heavily than older ones. The beta-geometric distribution sits on top of the classifier and handles the audience-sizing layer. It generates a purchase curve from observed transactions and calibrates how large an audience needs to be to capture a target share of future purchasers.
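Two pieces of that design are easy to sketch: the half-life weighting of training transactions and the capture-curve idea behind audience sizing. The snippet below is a simplified stand-in (ranking customers by score against a holdout), not the beta-geometric machinery itself, and the function names are ours.

```python
import numpy as np

def recency_weight(age_days: np.ndarray, half_life_days: float = 365.0) -> np.ndarray:
    # Exponential decay: a purchase 365 days old counts half as much as one today.
    return 0.5 ** (age_days / half_life_days)

def audience_cutoffs(scores: np.ndarray, purchased: np.ndarray,
                     targets=(0.5, 0.7, 0.9)) -> dict:
    """Smallest audience sizes (ranked by score) that capture each target share
    of observed purchasers in a holdout window."""
    order = np.argsort(-scores)                       # highest affinity first
    captured = np.cumsum(purchased[order]) / purchased.sum()
    return {t: int(np.searchsorted(captured, t) + 1) for t in targets}
```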

Scores, ranks, and recommended audience sizes

Each customer-product pair gets three outputs in the Predicted Affinity table.

Score is an uncalibrated probability between 0 and 1 indicating the strength of the customer's affinity for that product attribute value. Scores are only comparable within the same product attribute value. Don't use scores in segments, and don't use them as absolute thresholds across products.

Ranking is a per-customer, per-product integer where 1 is the highest affinity. Ranking is the recommended way to build "top N customers for category X" audiences, since scores shouldn't be used in segments.

Audience size flags are three Boolean fields (Small, Medium, Large) indicating whether the customer falls into the recommended audience for that size. Small captures roughly 50% of future purchasers with the fewest non-purchasers. Medium captures roughly 70%. Large captures roughly 90%. Audience sizes are inclusive: everyone in Small is also in Medium and Large.
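In practice, once the Predicted Affinity table is in your warehouse or loaded locally, those outputs turn into simple filters. The column names below are assumptions for illustration; check the schema in your tenant.

```python
import pandas as pd

# affinity: exported Predicted Affinity table, one row per customer x attribute value.
def audience_by_rank(affinity: pd.DataFrame, category: str, top_k: int = 1) -> pd.DataFrame:
    """Customers for whom `category` ranks among their top_k affinities."""
    rows = affinity[affinity["product_attribute_value"] == category]
    # Filter on ranking, not score: scores aren't comparable across attribute values.
    return rows[rows["ranking"] <= top_k][["amperity_id", "ranking"]]

def audience_by_size(affinity: pd.DataFrame, category: str) -> pd.DataFrame:
    """The recommended Medium audience, calibrated to ~70% of future purchasers."""
    rows = affinity[affinity["product_attribute_value"] == category]
    return rows[rows["audience_medium"]][["amperity_id"]]
```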

The levers you control

For pCLV, each of the three sub-models has its own model-type selection and hyperparameters. Random Forest exposes max depth, number of trees, max bins, feature subset strategy (auto, all, sqrt, one-third, log2), and split impurity metric. Gradient Boosted Trees adds max iterations, step size, and loss type. Logistic Regression, available only for the Return Classifier, adds regularization strength and the elastic net parameter balancing L1 and L2 penalties. At the model level, you set the lookback window (default 4 years), max training size (default 50 million records), whether to balance labels for the Return Classifier, and whether to filter or flag outliers above the 99.5th percentile on lifetime order value, AOV, or order frequency.

For product affinity, hyperparameters are configurable only during initial version setup: max depth, number of trees, max bins, feature subset strategy, audience size thresholds, and customer exclusions from the Customer Attributes table. Product group selection can be rules-based (automatically including values with at least 100 purchases in the last 30 days and 250 in the last 365) or managed manually.

Every model supports multiple versions, each with its own configuration. That's where a lot of the practical value lives. Want to compare Random Forest against Gradient Boosted Trees for the Return Classifier on your data? Create two versions, run them through validation, compare the metrics side by side, and activate the one that performs better. The same approach works for tuning hyperparameters within a model type, testing different lookback windows, comparing label balancing strategies, or evaluating outlier handling. Iteration is fast, results are directly comparable, and there's no version drift to untangle.

You don't spend weeks exploring data, engineering features, and building a feature generation pipeline. You also don't spend weeks building out model architecture, setting up ML Ops, or wiring retraining and inference schedulers. All of that is handled. The models come with roughly 200 derived features validated across thousands of training cycles, a configurable retraining cadence, a configurable inference cadence, and a scoring pipeline that writes results to queryable database tables. What's left for your team is the work that actually benefits from your judgment: which hyperparameters to tune, which versions to test against each other, which outputs to activate, and what to build on top.

How validation works

Every model version must pass a validation step before activation. Validation runs the model against a holdout period and compares its predictions to actual customer behavior over the same window.

Metrics

Four validation metrics come out of every pCLV run. Mean Absolute Error (MAE) captures the average dollar difference between predicted and actual CLV. Spearman's Rank Correlation measures how well the model orders customers from highest to lowest value, which matters more than absolute accuracy for most segmentation use cases. The Brier Score measures calibration of the Return Classifier's probability predictions. F1 is the harmonic mean of precision and recall for the Return Classifier. You configure the churn baseline (default 90 days without a purchase) that defines what counts as a non-returning customer for validation purposes.
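If you want to reproduce the four pCLV metrics yourself from the exported holdout results (described below), the standard SciPy and scikit-learn implementations line up with them; the column names here are assumptions for illustration.

```python
import pandas as pd
from scipy.stats import spearmanr
from sklearn.metrics import mean_absolute_error, brier_score_loss, f1_score

def pclv_validation_metrics(holdout: pd.DataFrame, threshold: float = 0.5) -> dict:
    """holdout: one row per customer in the evaluation window. Assumed columns:
    predicted_clv, actual_clv, predicted_p_return, actual_returned."""
    return {
        "mae": mean_absolute_error(holdout["actual_clv"], holdout["predicted_clv"]),
        "spearman": spearmanr(holdout["predicted_clv"], holdout["actual_clv"]).correlation,
        "brier": brier_score_loss(holdout["actual_returned"], holdout["predicted_p_return"]),
        "f1": f1_score(holdout["actual_returned"],
                       (holdout["predicted_p_return"] >= threshold).astype(int)),
    }
```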

Product affinity reports precision (how the model performs against random sampling) and recall by audience size (how it performs against the naive baseline). Recall is reported separately for small, medium, and large audience sizes.

The naive baseline

Both validation workflows compare the model against a naive baseline. For pCLV, the baseline carries forward each customer's spend during the period preceding the prediction horizon. For product affinity, the baseline is everyone who has purchased the product within the 450-day training window. Data science teams evaluating the release should look at both the model's absolute metrics and the delta to the naive baseline. A product affinity model with precision under 10%, or one that loses to the baseline on three of four recall metrics, shouldn't be deployed.

Exportable holdout results

Validation results export to Databricks, Snowflake, or Google BigQuery using an outbound bridge with the predictive_tables dataset selected. The holdout export for pCLV includes, for every customer in the evaluation window, the predicted probability of transaction, predicted order frequency, predicted AOV, predicted CLV, the naive baseline prediction, and the actual values observed in the holdout period. That's everything you need to run your own error analysis, compare against your team's preferred loss function, or benchmark against an in-house model. Results are retained for the most recent run of each model version.
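As a sketch of the error analysis that export enables, the snippet below compares the model against the naive baseline under your own choice of loss. Column names are again assumptions, and the asymmetric loss is just one example of a team-specific preference.

```python
import numpy as np
import pandas as pd

def compare_to_baseline(holdout: pd.DataFrame) -> pd.DataFrame:
    """holdout columns (assumed): actual_clv, predicted_clv, naive_baseline_clv."""
    def asymmetric_loss(actual, predicted, under_weight=2.0):
        # Penalize under-predicting high-value customers more than over-predicting.
        err = actual - predicted
        return np.mean(np.where(err > 0, under_weight * err, -err))

    rows = {}
    for name, col in [("model", "predicted_clv"), ("naive_baseline", "naive_baseline_clv")]:
        rows[name] = {
            "mae": np.mean(np.abs(holdout["actual_clv"] - holdout[col])),
            "asymmetric": asymmetric_loss(holdout["actual_clv"], holdout[col]),
        }
    return pd.DataFrame(rows).T   # one row per predictor, one column per loss
```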

Where to find the models

Both models are available to any database that contains the Merged Customers, Unified Transactions, and Unified Itemized Transactions tables. From Customer 360, open the database, select Predictive models, and click Add model. From there, pCLV and product affinity each have their own configuration flow. Only one pCLV model is allowed per database; product affinity supports one model per product attribute field (one for Product Category, one for Brand, and so on).

What changes for your data team

The reason this release matters for enterprise data teams isn't that pCLV and product affinity are new ideas. They aren't. The reason it matters is that owning them end-to-end was previously the only way to have confidence in them, and owning them end-to-end was a bad use of most data teams' time. The architecture is inspectable. The evaluation is yours to run against your own benchmarks. The outputs are queryable in the Segment Editor and exportable to your warehouse in the format your downstream systems already consume.

Time that used to go to building the table-stakes models goes to the models only your team can build.

For more detail on configuration, hyperparameter ranges, and validation metrics, see the predicted CLV documentation and the product affinity documentation. To see the models in the context of a full Customer Data Cloud deployment, request a demo.

Self-Service Predictive Models FAQs