Chapter 3. Stable Transformations

In this chapter, you will learn about data transformations and how they help you convert a non-private data analysis into a differentially private (DP) one. Knowing whether a data transformation is stable helps you determine whether your data analysis can be converted into a DP data analysis.

Data transformations encompass any function from a data set to a data set. In the context of transformations, consider a data set to be any form of data that has not been made private. Transformations are mathematical abstractions that represent any manipulations, modifications, and computations performed on a data set.

Non-private data analysis pipelines can typically be broken down into three distinct phases: data preprocessing, a statistical query, and postprocessing. In the pipeline shown in Figure 3-1, data passes sequentially through each phase.

Figure 3-1. A data processing pipeline from both the non-DP perspective and the DP perspective
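As a rough illustration of the three phases from the non-DP perspective, the following sketch chains a preprocessing step, a statistical query, and a postprocessing step. The function names, the `income` field, and the sample records are hypothetical choices made for this example, not part of any library.

```python
from statistics import mean


def preprocess(rows):
    # Preprocessing: operate on microdata; here, drop records with a missing income.
    return [row for row in rows if row["income"] is not None]


def query(rows):
    # Statistical query: aggregate the microdata into a single statistic.
    return mean(row["income"] for row in rows)


def postprocess(value):
    # Postprocessing: adjust the computed statistic; here, round to a whole number.
    return round(value)


def pipeline(rows):
    # Data passes sequentially through each phase.
    return postprocess(query(preprocess(rows)))


# Hypothetical microdata: one row per individual.
rows = [{"income": 40000}, {"income": None}, {"income": 52000}]
result = pipeline(rows)
```

Each phase is an ordinary function, so the pipeline as a whole is just their composition.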

In a non-DP context, preprocessing consists of any modifications you may make to microdata. Microdata is a data set where each row corresponds to data from one individual.

An example of preprocessing is a function that modifies each record in a data set, one at a time, while preserving the dimensions of the data. Consider a preprocessing step that doubles the value of each numeric element in a column. In differential privacy, ...
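The doubling preprocessing described above could be sketched in plain Python as follows. The function name `double_column`, the `age` column, and the sample records are assumptions made for illustration only.

```python
def double_column(rows, column):
    # Build a new data set in which each output record depends only on the
    # corresponding input record; the number of rows and columns is unchanged.
    return [{**row, column: row[column] * 2} for row in rows]


# Hypothetical microdata: one row per individual.
data = [{"id": 1, "age": 20}, {"id": 2, "age": 35}]
doubled = double_column(data, "age")
```

Because the function returns new records instead of mutating its input, the original data set is left intact, which makes it easy to compare the data before and after the transformation.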
