4
Tidy Data
Hadley Wickham, PhD,[1] one of the more prominent members of the R community, introduced the concept of tidy data in a Journal of Statistical Software paper.[2] Tidy data is a framework for structuring data sets so they can be easily analyzed and visualized. It can be thought of as a goal to aim for when cleaning data. Once you understand what tidy data is, your data analysis, visualization, and collection will become much easier.
1. Hadley Wickham, PhD: http://hadley.nz
2. Tidy Data paper: http://vita.had.co.nz/papers/tidy-data.pdf
What is tidy data? Hadley Wickham’s paper defines it as meeting the following criteria: (1) Each row is an observation, (2) Each column is a variable, and (3) Each type of observational ...
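As a rough illustration of the first two criteria (each row is an observation, each column is a variable), the sketch below reshapes a small wide table into a long one with pandas. The data, the column names (`name`, `year`, `score`), and the choice of `melt()` are assumptions made for this example, not something taken from the paper.

```python
import pandas as pd

# Hypothetical wide table: one row per person, one column per year.
# The year values live in the column headers, so "year" is not yet a variable.
wide = pd.DataFrame(
    {
        "name": ["Ana", "Ben"],
        "2019": [10, 20],
        "2020": [11, 21],
    }
)

# melt() keeps "name" as an identifier and stacks the year columns,
# so each row becomes one (name, year) observation with a single score.
tidy = wide.melt(id_vars="name", var_name="year", value_name="score")

# Sort for readability; the row order carries no meaning in tidy data.
tidy = tidy.sort_values(["name", "year"]).reset_index(drop=True)

print(tidy)
```

In the reshaped table, `year` and `score` are ordinary columns, so filtering, grouping, and plotting by year no longer require knowing which column headers hold year values.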