Two-by-Two Contingency Tables

Count data are often classified by more than one categorical explanatory variable. When there are two explanatory variables and both have just two levels, we have the famous two-by-two contingency table (see p. 309). Returning to the example of Mendel's peas, we first need to convert the vector of observed counts into a matrix with two rows (the matrix function fills the matrix column by column):

observed<-matrix(observed,nrow=2)
observed
     [,1] [,2]
[1,]  315  108
[2,]  101   32
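
The conversion can be reproduced from scratch with a minimal sketch like the following. It assumes the original vector held the four counts in the order implied by the printed matrix. The row and column labels are illustrative assumptions; the text does not say which trait is on the rows.

```r
# Counts in the order implied by the printed matrix
# (matrix() fills column by column, so 315 and 101 form column 1)
observed <- c(315, 101, 108, 32)

# Assumed labels, for illustration only: rows = seed shape, columns = colour
observed <- matrix(observed, nrow = 2,
                   dimnames = list(shape  = c("round", "wrinkled"),
                                   colour = c("yellow", "green")))
observed

# Marginal totals are a quick check that the table was filled as intended
rowSums(observed)
colSums(observed)
```

Naming the dimensions does not change the result of either test used here; it only makes the printed table easier to read.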

Fisher's exact test (p. 308) can take such a matrix as its sole argument:

fisher.test(observed)
Fisher's Exact Test for Count Data

data: observed
p-value = 0.819
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.5667874   1.4806148
sample estimates:
odds ratio
 0.9242126
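
The value returned by fisher.test is a list of class "htest", so individual results can be extracted by name rather than read off the printout. The sketch below rebuilds the matrix from the printed counts; the names ft and cross.product are illustrative only. It also computes the simple cross-product odds ratio for comparison, because fisher.test reports a conditional maximum likelihood estimate that need not equal the cross-product ratio exactly.

```r
# Rebuild the two-by-two table from the counts printed above
observed <- matrix(c(315, 101, 108, 32), nrow = 2)

ft <- fisher.test(observed)

ft$p.value     # two-sided p-value
ft$estimate    # conditional maximum likelihood estimate of the odds ratio
ft$conf.int    # 95 percent confidence interval for the odds ratio

# Simple cross-product (sample) odds ratio, for comparison
cross.product <- (observed[1, 1] * observed[2, 2]) /
                 (observed[1, 2] * observed[2, 1])
cross.product
```

Because the confidence interval comfortably includes 1, the conclusion is the same whichever of the two odds ratio estimates is quoted.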

Alternatively, we can use Pearson's chi-squared test. For a two-by-two table, chisq.test applies Yates' continuity correction by default:

chisq.test(observed)

               Pearson's Chi-squared test with Yates' continuity correction

data: observed
X-squared = 0.0513, df = 1, p-value = 0.8208
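
To see where the test statistic comes from, the calculation can be sketched by hand. The expected counts under independence are (row total × column total) / grand total, and Yates' correction subtracts 0.5 from each absolute deviation before squaring. The names expected and yates are illustrative. This simplified version assumes every absolute deviation exceeds 0.5, which is the case for these data.

```r
# Rebuild the two-by-two table from the counts printed above
observed <- matrix(c(315, 101, 108, 32), nrow = 2)

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
expected

# Chi-squared statistic with Yates' continuity correction
yates <- sum((abs(observed - expected) - 0.5)^2 / expected)
yates

# Upper-tail p-value on 1 degree of freedom
pchisq(yates, df = 1, lower.tail = FALSE)

# The uncorrected statistic, for comparison
chisq.test(observed, correct = FALSE)$statistic
```

The correction makes the statistic slightly smaller, and the p-value correspondingly larger, than in the uncorrected test.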

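Again, the p-values differ slightly between the two tests, but the interpretation is the same: there is no evidence of any association between the two traits, coat colour and seed shape. This is what we expect if these pea plants behave in accordance with Mendel's predictions of two independent traits, each segregating 3:1. Note that both tests address only the independence of the two traits; neither tests the 3:1 ratios themselves.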
