Let’s apply what we’ve learned in Chapter 9 in R.
Objectives
R commands:
| R Command | Notes |
|---|---|
| matrix_name <- matrix(data, nrow, byrow = T) | creates a matrix from numbers you provide; matrix_name is the name you assign to your matrix, data is your vector of values (e.g. c(1, 2, 3, 4)) entered row by row, nrow is the number of rows in the matrix, and byrow = T (short for TRUE) specifies that the matrix should be filled row by row |
| rownames(matrix_name) <- c("row name 1", "row name 2") | specifies a name for each row in the matrix, where matrix_name is the name you assigned to your matrix |
| colnames(matrix_name) <- c("col name 1", "col name 2") | specifies a name for each column in the matrix, where matrix_name is the name you assigned to your matrix |
| chisq.test(matrix_name) | runs a chi-squared test of independence between the rows and columns of a matrix; matrix_name is the name you assigned to your matrix |
| fisher.test(matrix_name) | runs Fisher’s exact test for a 2x2 contingency table when the data violate chi-squared assumptions; matrix_name is the name you assigned to your matrix |
Reminder: Save your script for practice at home! :)
Shortcuts:
| Symbol or Command | Keyboard Shortcut |
|---|---|
| <- | Alt + - |
| # | Shift + 3 |
| Run one line in script | Ctrl + Enter |
| Run entire script | Ctrl + Shift + Enter |
| Open new script | Ctrl + Shift + N |
Exercise 7
1. Open RStudio and prepare a new script.
# Filename: (what you will save the script as)
# Title: (give script a title)
# Author: (write your full name here)
# Date: Month Day Year (write the actual date here)
# Description: (describe what this script is for)
2. We’ll go through a few examples of the chi-squared contingency test together.
Example 1: Does toxoplasmosis infection in humans increase risky behaviour leading to car accidents?
| | Infected | Uninfected |
|---|---|---|
| Drivers with accidents | 61 | 124 |
| Drivers without accidents | 16 | 169 |
Step 1: Make your matrix in R, and check data for correct placement.
toxo <- matrix(c(61, 124, 16, 169), nrow = 2, byrow = T)
toxo
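To see why checking placement matters, here is a small sketch that enters the same four counts with and without filling by row (`TRUE` is the long form of `T`). The names `counts`, `by_row`, and `by_col` are made up for this illustration.

```r
# Same counts as the toxoplasmosis table, typed in row by row
counts <- c(61, 124, 16, 169)

# byrow = TRUE fills across each row first, matching the printed table
by_row <- matrix(counts, nrow = 2, byrow = TRUE)

# The default (byrow = FALSE) fills down each column first,
# so 16 lands where 124 belongs
by_col <- matrix(counts, nrow = 2)

by_row
by_col
```

If the printed matrix does not match your data table cell for cell, the test will be run on the wrong counts.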
Step 2: Add row and column names. Check matrix.
rownames(toxo) <- c("drivers with accidents", "drivers without accidents")
colnames(toxo) <- c("infected", "uninfected")
toxo
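As an optional alternative, `matrix()` also accepts a `dimnames` argument, so the counts and the labels can be set in one call. This is only a sketch; `toxo_named` is a name chosen here so it does not overwrite the matrix you just made.

```r
# Build the matrix and label it in a single step
toxo_named <- matrix(
  c(61, 124, 16, 169),
  nrow = 2,
  byrow = TRUE,
  dimnames = list(
    c("drivers with accidents", "drivers without accidents"),  # row names
    c("infected", "uninfected")                                # column names
  )
)

toxo_named
```

Either approach gives the same labelled matrix; use whichever you find easier to read.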
Step 3: Run chi-squared test. If you receive a warning message, you have violated assumptions and will need to use Fisher’s exact test instead.
chisq.test(toxo)
Step 4: Assumptions met! State your conclusion based on your p-value.
- p-value = 1.753e-08, meaning p < 0.05. Reject H0. Toxoplasmosis infection is associated with risky behaviour (car accidents).
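If you want to look at the pieces behind this conclusion, the result of `chisq.test()` can be stored and inspected. The sketch below assumes the usual rule of thumb that expected counts should be at least 5 (check your Chapter 9 notes for the exact wording of the assumption); `toxo_result` is a name made up here.

```r
# Rebuild the matrix so this block runs on its own
toxo <- matrix(c(61, 124, 16, 169), nrow = 2, byrow = TRUE)

# Store the test result instead of only printing it
toxo_result <- chisq.test(toxo)

# Expected counts under H0 (no association)
toxo_result$expected

# The p-value as a number, and the comparison against 0.05
toxo_result$p.value
toxo_result$p.value < 0.05
```

Here every expected count is well above 5, which is consistent with R giving no warning.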
Example 2: Was survival on the Titanic associated with gender?
Step 1: We have a data file for this. All we need to do is read the data into R and convert it into a table. Check your table.
surv <- read.csv("https://github.com/lczawadzki/biostats/raw/main/data/survival-titanic.csv")
surv_table <- table(surv$gender, surv$survival)
surv_table
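To see what `table()` does with two columns, here is a tiny made-up data frame with one row per person. The data frame `toy` and its values are hypothetical and are not taken from the Titanic file.

```r
# Hypothetical stand-in data: one row per passenger
toy <- data.frame(
  gender   = c("female", "female", "male", "male", "male"),
  survival = c("yes", "no", "no", "no", "yes")
)

# table() counts how many rows fall into each gender/survival combination
toy_table <- table(toy$gender, toy$survival)
toy_table
```

The first argument becomes the rows and the second becomes the columns, giving a contingency table that `chisq.test()` can take directly.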
Step 2: Run chi-squared test. If you receive a warning message, you have violated assumptions and will need to use Fisher’s exact test instead.
chisq.test(surv_table)
Step 3: Assumptions met! State your conclusion based on your p-value.
- p-value < 2.2e-16, meaning p < 0.05. Reject H0. Survival was associated with gender on the Titanic (why do you think this is the case?).
Example 3: Do vampire bats prefer cows in estrus?
| | Cows in estrus | Cows not in estrus |
|---|---|---|
| Bitten by vampire bat | 15 | 6 |
| Not bitten by vampire bat | 7 | 322 |
Step 1: Make your matrix in R, and check data for correct placement.
cows <- matrix(c(15, 6, 7, 322), nrow = 2, byrow = T)
cows
Step 2: Add row and column names. Check matrix.
rownames(cows) <- c("bitten by vampire bat", "not bitten by vampire bat")
colnames(cows) <- c("cows in estrus", "cows not in estrus")
cows
Step 3: Run chi-squared test. If you receive a warning message, you have violated assumptions and will need to use Fisher’s exact test instead.
chisq.test(cows)
## Warning in chisq.test(cows): Chi-squared approximation may be incorrect
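To see where this warning comes from, you can look at the expected counts. This sketch assumes the common rule of thumb that expected counts below 5 are a problem; `cows_expected` is a name made up here.

```r
# Rebuild the matrix so this block runs on its own
cows <- matrix(c(15, 6, 7, 322), nrow = 2, byrow = TRUE)

# suppressWarnings() only hides the warning we have already seen
cows_expected <- suppressWarnings(chisq.test(cows))$expected

# Expected counts under H0, and which of them fall below 5
cows_expected
cows_expected < 5
```

One cell (bitten cows in estrus) has an expected count far below 5, so the chi-squared approximation is not trustworthy for these data.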
Step 4: We have violated assumptions! Run Fisher’s exact test instead.
fisher.test(cows)
Step 5: State your conclusion based on your p-value.
- p-value < 2.2e-16, meaning p < 0.05. Reject H0. There is an association between being in estrus and being bitten by a vampire bat.
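The decision made across these examples (run the chi-squared test, and switch to Fisher’s exact test when expected counts are too small) could be sketched as a small helper. `association_test` is a hypothetical function written for this illustration, not part of R, and the cutoff of 5 is an assumption.

```r
# Hypothetical helper: choose the test based on expected counts
association_test <- function(counts, min_expected = 5) {
  # Get expected counts without printing the warning
  expected <- suppressWarnings(chisq.test(counts))$expected
  if (any(expected < min_expected)) {
    fisher.test(counts)   # small expected counts: use Fisher's exact test
  } else {
    chisq.test(counts)    # assumptions look fine: use chi-squared
  }
}

toxo <- matrix(c(61, 124, 16, 169), nrow = 2, byrow = TRUE)
cows <- matrix(c(15, 6, 7, 322), nrow = 2, byrow = TRUE)

# $method reports which test was actually run
association_test(toxo)$method
association_test(cows)$method
```

For your own scripts it is still worth running each step by hand, so you can see the warning and explain why you changed tests.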
Now it’s time to practice on your own.
| | White-rumped | Blue-rumped |
|---|---|---|
| Predated on | 9 | 92 |
| Not predated on | 92 | 10 |
| | 1st male eaten | 1st male escapes |
|---|---|---|
| 2nd male accepted | 3 | 22 |
| 2nd male rejected | 6 | 1 |