Chapter 3. Planning Your Prep

So you know your data isn’t suited to the purpose for which you need it. This is probably because you have tried to analyze it but hit a roadblock early on. Maybe the information is spread across multiple data fields where you would expect just one. Maybe the data you are analyzing has gaps. Or maybe the data you want to analyze comes from multiple sources.

What do you do after reaching this realization? How do you develop a solution when all you see are the issues in front of you?

This chapter recommends a staged approach to help you plan your data preparation, define the outcome, and build a workflow to solve your challenges. The four stages in the proposed framework are as follows:

  1. Know your data (KYD).

  2. Identify the desired state.

  3. Determine the required transitions from KYD to the desired state.

  4. Build the workflow.
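As a rough illustration outside of any particular tool, the four stages can be sketched in plain Python. This is a minimal sketch, not the book's method: the sample rows, the field names, and every function name below (`profile`, `drop_blank_sales`, `sales_to_number`, `total_by_product`, `run_workflow`) are hypothetical and exist only for this example.

```python
# Hypothetical sample rows for illustration only; not the data in Figure 3-1.
rows = [
    {"order_id": "1", "product": "Soap Bar", "sales": "12.50"},
    {"order_id": "2", "product": "Liquid Soap", "sales": ""},
    {"order_id": "3", "product": "Soap Bar", "sales": "8.00"},
]


def profile(rows):
    """Stage 1 (KYD): list the fields and count blank values in each."""
    fields = list(rows[0])
    blanks = {f: sum(1 for r in rows if r[f] == "") for f in fields}
    return {"row_count": len(rows), "fields": fields, "blanks": blanks}


# Stage 2: the desired state, written down as field name -> type (assumed).
DESIRED_FIELDS = {"product": str, "total_sales": float}


# Stage 3: each required transition is one small function.
def drop_blank_sales(rows):
    """Remove rows whose sales value is missing."""
    return [r for r in rows if r["sales"] != ""]


def sales_to_number(rows):
    """Convert the sales text into a number."""
    return [{**r, "sales": float(r["sales"])} for r in rows]


def total_by_product(rows):
    """Aggregate sales to one row per product."""
    totals = {}
    for r in rows:
        totals[r["product"]] = totals.get(r["product"], 0.0) + r["sales"]
    return [{"product": p, "total_sales": t} for p, t in totals.items()]


# Stage 4: the workflow applies the transitions in order.
def run_workflow(rows, steps):
    for step in steps:
        rows = step(rows)
    return rows


summary = profile(rows)
result = run_workflow(rows, [drop_blank_sales, sales_to_number, total_by_product])
```

Here `summary` reports one blank value in the `sales` field, which is what motivates the `drop_blank_sales` transition. Keeping each transition as its own small step makes it easy to check that the final rows match the fields written down in stage 2.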

To illustrate this process, we’ll walk through a simple example data set: sales data from Chin & Beard Suds Co. (Figure 3-1).

Figure 3-1. Sample data from Chin & Beard Suds Co.

Stage 1: Know Your Data

Without understanding your data set as it currently stands, you will not be able to deliver the results you need. For small data sets, sometimes it’s very easy to develop this understanding and a corresponding plan. With larger data sets, the planning process can take longer, but it is arguably more important since ...
