Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining

4.2 TABLES

4.2.1 Data Tables

The most common way of looking at data is through a table, where the raw data is displayed in familiar rows of observations and columns of variables. A table is essential for reviewing the raw data; however, it can be overwhelming with more than a handful of observations or variables. Sorting the table on one or more variables helps organize the data, but it remains virtually impossible to identify trends or relationships by looking at the raw data alone. An example of a table describing different cars is shown in Table 4.1.
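As a rough illustration of sorting a table on more than one variable, the sketch below orders a few car records by number of cylinders and then by fuel efficiency. The rows and the field names (name, cylinders, mpg) are made up for this example and are not the records in Table 4.1.

```python
from operator import itemgetter

# Hypothetical rows for illustration only; not the records in Table 4.1.
cars = [
    {"name": "car A", "cylinders": 8, "mpg": 15.0},
    {"name": "car B", "cylinders": 4, "mpg": 31.0},
    {"name": "car C", "cylinders": 4, "mpg": 24.0},
    {"name": "car D", "cylinders": 6, "mpg": 19.0},
]

# Sort by cylinders first, then by mpg within each group of equal cylinders.
sorted_cars = sorted(cars, key=itemgetter("cylinders", "mpg"))

# Print the sorted table: one observation per row, one variable per column.
print(f'{"name":<8}{"cyl":>4}{"mpg":>7}')
for row in sorted_cars:
    print(f'{row["name"]:<8}{row["cylinders"]:>4}{row["mpg"]:>7.1f}')
```

Sorting groups similar observations together, which makes the table easier to scan, but the reader still has to spot any pattern by eye.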

4.2.2 Contingency Tables

Contingency tables (also referred to as two-way cross-classification tables) provide insight into the relationship between two variables. Both variables must be categorical (dichotomous or discrete) or be transformed into categorical variables. The variables are often dichotomous; however, a contingency table can also represent variables with more than two values. Table 4.2 describes the format for a contingency table where two variables are compared: Variable x and Variable y.
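One way to build such a table is to count how often each pair of values occurs and to total the counts for each variable irrespective of the other. The sketch below is a minimal version using only the Python standard library; the function name contingency_table and the five observations are assumptions made for this example, not data from the book.

```python
from collections import Counter

def contingency_table(observations):
    """Return cell counts and per-variable totals for a list of (x, y) pairs."""
    cell_counts = Counter(observations)             # count for each (x, y) pair
    x_totals = Counter(x for x, _ in observations)  # totals for x, ignoring y
    y_totals = Counter(y for _, y in observations)  # totals for y, ignoring x
    return cell_counts, x_totals, y_totals

# Hypothetical (Variable x, Variable y) observations for illustration only.
observations = [
    ("Value 1", "Value 1"),
    ("Value 1", "Value 2"),
    ("Value 2", "Value 2"),
    ("Value 1", "Value 1"),
    ("Value 2", "Value 1"),
]

cells, x_totals, y_totals = contingency_table(observations)

# Lay out the table with Variable x across the columns and Variable y down the rows.
x_values = sorted(x_totals)
y_values = sorted(y_totals)
print(f'{"":<12}' + "".join(f"x={x:<10}" for x in x_values) + "Total")
for y in y_values:
    row = "".join(f"{cells[(x, y)]:<12}" for x in x_values)
    print(f"y={y:<10}" + row + str(y_totals[y]))
print(f'{"Total":<12}' + "".join(f"{x_totals[x]:<12}" for x in x_values)
      + str(len(observations)))
```

A Counter returns zero for a pair that never occurs, so empty cells need no special handling. Continuous variables would first have to be binned into categories before counting.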

Table 4.1. Table of car records

Count₊₁: the number of observations where Variable x has “Value 1”, irrespective of the value of Variable y.
Count₊₂: the number of observations where Variable x has “Value 2”, irrespective of the value of Variable y.
Count₁₊: the number of observations where Variable y has “Value 1”, irrespective of the ...

Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining by Glenn J. Myatt