2 Measuring Variation

2.1 What Is Variation?

If you measure your body height 10 times on a single day, you will not measure the exact same value every time. If you measure the head circumference of all newborns on a given day at a hospital, you will obtain a different value for every baby you measure. Even if you weigh plastic parts that are mass produced by a 3D printer, you will get a range of values if your balance is accurate enough. Variation is so inherent in both the natural and the human‐made world that a deep understanding of its nature, its causes, and the ways to describe it is fundamental. In fact, assigning variation to certain causal drivers can be viewed as the core purpose of statistical analyses. In this chapter, we will look at how variation is partitioned, the two different types of variation, explained vs. unexplained variation, and ways of describing variation both visually and numerically. We will rely on the basic R skills acquired in Chapter 1.
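
The repeated height measurements described above can be imitated with a few lines of base R. This is only a minimal sketch: the true height of 175 cm and the measurement error (a standard deviation of 0.4 cm) are invented values chosen for illustration, and the variable names are hypothetical.

```r
# Hypothetical example: ten repeated height measurements of one person.
# The true height (175 cm) and the measurement error (SD = 0.4 cm)
# are assumed values, not data from a real study.
set.seed(1)                               # make the simulated values reproducible
true_height <- 175
heights <- true_height + rnorm(10, mean = 0, sd = 0.4)

round(heights, 1)                         # the ten measurements, rounded to 0.1 cm
height_range <- range(heights)            # smallest and largest measurement
height_sd <- sd(heights)                  # standard deviation as a numerical summary

height_range
height_sd
```

Even though the same person is measured every time, the simulated values spread around the true height. The range and the standard deviation are two simple ways of putting a number on that spread.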

2.2 Treatment vs. Control

Any statistical analysis is ultimately comparative. The (scientific) statement ‘This treatment helps prevent arthritis’ actually means ‘On average, the odds of getting arthritis for people receiving this treatment are lower compared to people in a control group who do not receive the treatment’. The second (comparative) part of the sentence is often not expressed explicitly, yet it is extremely important. Without it, we find ourselves in a ‘no‐control’ scenario, or, ...
