
How Data Sandboxes Change the Game

November 10, 2022


Peter has a very stressful job. He’s a data architect in an IT firm, which means he’s responsible for the systems that move data around, including bringing in new data in different formats and from varying sources. Every time he has to add a new source, it takes a long time because he has to make sure it doesn’t interfere with ongoing work. And when he’s finally finished setting up the connectors from the new sources, Peter is filled with dread because he knows something will go wrong the minute he hits go. As he finishes his sixth cup of coffee, he closes his eyes to try and soothe his ragged nerves by thinking of something that always calms him down: a safe testing environment where he can play around with new data sources and try them out before pushing them live. If only it were a reality. What Peter needs is a sandbox. 

Shifting sands 

A sandbox is a sophisticated parallel environment that replicates the entire customer data workflow, from ingestion and resolution through segmentation and orchestration. It allows technical teams to safely and efficiently change any element and manipulate data flows to see what yields the best results.

This type of setup is important in an environment that deals with messy data. Data is messy because there’s a lot of it, it comes from different sources and in varied formats, and it’s always changing as people go through life. Robust data architecture needs the flexibility to easily add new channels and destinations, and there are many: point of sale, email, ecommerce, and service center calls on the way in; email platforms, marketing clouds, and advertising onboarders on the way out, to name just a few. As businesses evolve, they need to add new information and change the way things fit together. And as with any complex, evolving system, there needs to be support for making changes in isolation, the option to review them before they go live, and the ability to pivot quickly in case something goes wrong.

In a customer data workflow, a sandbox needs to address: 

  • Running tests: the isolated environment lets users try out code, run programs, and open applications without affecting the production system.

  • Validating changes: sandboxes track changes across the entire data management system and check whether a change in one place affects its dependencies. This reduces the complexity of testing plans by letting the application handle large portions of the work.

  • Forecasting impact: users can see how these changes will affect their system and evaluate whether they want to push them live (the sketch after this list shows how these three ideas fit together).
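
To make these ideas concrete, here is a minimal, hypothetical sketch in plain Python. None of the names below come from any particular product; they exist only to illustrate the shape of the workflow: changes happen on an isolated copy, the full dependency tree is checked, and the impact is visible before anything is promoted.

    # Hypothetical sketch: clone the pipeline config, change it in isolation,
    # validate every dependency, and only then decide whether to promote.
    from copy import deepcopy

    # A toy pipeline: each step lists the upstream steps it depends on.
    production = {
        "ingest_pos":   {"depends_on": []},
        "ingest_email": {"depends_on": []},
        "resolve_ids":  {"depends_on": ["ingest_pos", "ingest_email"]},
        "segment_vips": {"depends_on": ["resolve_ids"]},
    }

    def make_sandbox(config):
        # Running tests: work on an isolated copy, never on production.
        return deepcopy(config)

    def validate(config):
        # Validating changes: walk the whole dependency tree and flag steps
        # whose upstream dependencies no longer exist.
        problems = []
        for step, spec in config.items():
            for dep in spec["depends_on"]:
                if dep not in config:
                    problems.append(f"{step} depends on missing step '{dep}'")
        return problems

    sandbox = make_sandbox(production)
    del sandbox["ingest_email"]        # try removing a source, in isolation

    # Forecasting impact: see what would break before pushing anything live.
    issues = validate(sandbox)
    if issues:
        print("Change stays in the sandbox; production untouched:")
        for issue in issues:
            print(" -", issue)
    else:
        production = sandbox           # promote only when everything validates

A real sandbox operates on the full customer data workflow rather than a dictionary, but the loop is the same: isolate, change, validate, preview, then promote or discard.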

Driving with no safety belt 

Trying to manage change without the stability of a sandbox is frustrating, but more than that, it’s risky. You won’t know how the changes you’re making will affect the rest of your data ecosystem until you push them live, and the majority of Software-as-a-Service tools expect users to make changes in production, which means there’s a high chance of something going wrong. Sandboxes give you a safe, non-public environment to confirm that the changes you’re making actually work.

There are different ways that working without a sandbox can prove difficult, like:

  • Maintaining multiple environments with copies of all data infrastructure: most development environments use only test datasets for security reasons, so bugs can surface with real-world data that never appeared in testing

  • Any time you make a change, you need to make it in the development environment first, update the transfer process to include that change, account for any testing required, and then migrate the whole process

  • Building a system to migrate changes between environments requires a lot of custom code. Manual processes are dangerous territory for companies because they can break things in ways that are hard to roll back, and if something goes wrong, the impact may not be immediately apparent (the sketch after this list shows how a partial migration can happen)

  • Working across multiple silos, which adds time and introduces chances for mistakes along the way
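
As a deliberately simplified, hypothetical illustration of why that custom migration code is dangerous territory (every name below is made up), this sketch pushes changes one step at a time with no validation and no version history, so a failure partway through leaves production in a mixed state that is hard to roll back:

    # Hypothetical hand-rolled migration script: each change is copied one at
    # a time, so a failure partway through leaves production partially migrated.
    dev = {
        "ingest_pos":   "v2",
        "ingest_email": "v3",   # new connector, only tested on sample data
        "segment_vips": "v5",
    }
    prod = {
        "ingest_pos":   "v1",
        "segment_vips": "v4",
    }

    def deploy(step, version):
        # Pretend the new email connector fails against real-world data.
        if step == "ingest_email":
            raise RuntimeError("schema mismatch on real-world data")
        print(f"deployed {step} {version}")

    def migrate(source, target):
        for step, version in source.items():
            deploy(step, version)
            target[step] = version   # no transaction: committed one step at a time

    try:
        migrate(dev, prod)
    except RuntimeError as err:
        # Production is now half old, half new, with no history to roll back to.
        print("migration failed:", err)
        print("production left in a mixed state:", prod)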

Playing in the sandbox

Sandboxes give data technicians a safe environment to play with code without worrying about potentially breaking something. 

Working with sandboxes brings myriad benefits, including:

  • Easy rollback if something goes wrong, thanks to robust version history

  • The ability to run tests with production-grade infrastructure, security, and data, which eliminates the risk of issues arising from test data or development environment infrastructure 

  • Simple to iterate and build on, because every change you make happens in a parallel environment

  • Less risk of breaking things

  • Less chance of partial migrations

  • More transparency on change history

  • The ability to validate all changes, including the entire tree of dependencies

Bolstering your data architecture

A whole data pipeline, from ingest to analysis to egress, needs architecture that can evolve without crashing. Getting value from customer data depends on keeping it up to date and bringing in new information without fear of everything breaking.

Sandboxes provide a secure environment where users can exercise creativity, test new code, and try out programs, giving them the ability to account for changes in data flows, architecture, sources, and destinations. That’s why we built Amperity sandboxes: so that people like Peter, our stressed-out data architect tasked with managing change, can do their work safely and quickly without worrying about anything breaking. If you find you have a lot in common with Peter and want to make your work easier, learn more about sandboxes here.