R Exercise 5: Analyzing Proportions (Ch 7)

Let’s apply what we’ve learned in Chapter 7 using R.

Objectives

Be able to estimate a proportion from data.
Be able to calculate the standard error of a proportion.
Be able to conduct a binomial test (hypothesis test about proportions).
Be able to calculate the 95% confidence interval of a proportion.

Reminder: Save your script for practice at home! :)

Shortcuts:

Symbol or Command	Keyboard Shortcut
<-	Alt + -
#	Shift + 3
Run one line in script	Ctrl + Enter
Run entire script	Ctrl + Shift + Enter
Open new script	Ctrl + Shift + N

Exercise 5
1. Open RStudio and prepare a new script.

Open a new script. All of your code for today’s exercises, and your notes and comments will go in this script. Write your filename, title, author, date, and description of the script.

# Filename: (what you will save the script as)
# Title: (give script a title)
# Author: (write your full name here)
# Date: Month Day Year (write the actual date here)

# Description: (describe what this script is for)

2. Let’s revisit the tumbling toast and Murphy’s Law example, using data from: Matthews, RAJ. (1995). Tumbling toast, Murphy’s Law and fundamental constants. Eur. J. Phys. 16:172-176. Matthews (1995) found that 6101 out of 9821 slices of toast thrown in the air landed butter side down. The null hypothesis would be that toast thrown in the air would land butter side up and down in equal proportion.

Pt I: Calculating Proportions and Standard Error
If you are asked, “What is the best estimate of the proportion?”, you can calculate this directly in R by dividing the number of successes, X, by the number of trials, n. What is the best estimate of the proportion of toast that lands butter side down when dropped?

     6101/9821

If you are asked to “calculate the standard error of the sample proportion,” use the formula from Chapter 7: SE = sqrt((sample proportion * (1 - sample proportion)) / n). What is the standard error of the sample proportion of toast that lands butter side down? (The code below uses the sample proportion rounded to 0.62.)

     sqrt((0.62*(1-0.62))/9821)
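As an optional variation, the same two calculations can be done by storing the values in named objects, so that the unrounded proportion is carried into the standard error. This is only a minimal sketch; the object names (successes, trials, p_hat, se) are illustrative choices and not part of the exercise.

```r
# Toast data from the exercise: 6101 of 9821 slices landed butter side down
successes <- 6101
trials <- 9821

# Best estimate of the proportion (X / n)
p_hat <- successes / trials

# Standard error of the sample proportion: sqrt(p_hat * (1 - p_hat) / n)
se <- sqrt(p_hat * (1 - p_hat) / trials)

# Print both values
p_hat
se
```

Because p_hat is not rounded here, the standard error may differ very slightly from the one calculated with 0.62.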

Pt II: Conducting a binomial test
What if we want to test the hypothesis that toast lands butter side down more often than it lands butter side up? You would need to conduct a hypothesis test. For proportion data, where there are only two possible outcomes, we use the binomial test.

In R, we use the function binom.test(). You do NOT have to read in any data. You only need to enter three pieces of information:

x - the number of successes (X in our formula)
n - the number of trials (n in our formula)
p - the expected probability from our null hypothesis (H0)

Note: This test will be a two-tailed test by default!

Using the toast example, our three values for the binomial test are:

x = 6101 successes
n = 9821 trials
p = 0.5; toast will land up/down equally (half and half!)

Try conducting a binomial test with the above information.

     binom.test(x = 6101, n = 9821, p = 0.5)
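The call above relies on the default two-tailed alternative. As a sketch of how the alternative hypothesis can be stated explicitly, binom.test() accepts an alternative argument that can be set to "two.sided", "less", or "greater". The object names below (two_tailed, one_tailed) are illustrative, and the one-tailed version is shown only for comparison; the rest of this exercise uses the two-tailed test.

```r
# Two-tailed test, written out explicitly (same as the default)
two_tailed <- binom.test(x = 6101, n = 9821, p = 0.5, alternative = "two.sided")

# One-tailed test: HA is that the true proportion is greater than 0.5
one_tailed <- binom.test(x = 6101, n = 9821, p = 0.5, alternative = "greater")

# Check which alternative each test used
two_tailed$alternative
one_tailed$alternative
```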

Pt III: Understanding your output
Let’s read through the output. The output tells you a number of things about your data:

Line 1 tells you about your data. You have X successes and n trials.
Line 2 states your number of successes (X), the number of trials (n), and your P-value. Remember, the P-value is the probability that a sample from the null model would be as extreme as, or more extreme than, your sample. That is, it is the probability of obtaining data at least as extreme as yours if the null hypothesis were true.
Line 3 tells you your alternative hypothesis (HA).
Line 4 tells you the 95% confidence interval of the proportion: the range of values within which you are 95% confident that the true population proportion lies. This function uses the Clopper and Pearson (1934) method to calculate the confidence interval.
Line 5 tells you information about your sample estimates, specifically, the probability of success in your sample data.
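If you would rather pull individual pieces out of the output than read them off the screen, one option is to save the test result as an object and use $ to extract its parts. This is a minimal sketch using the toast data; the object name toast_test is an illustrative choice.

```r
# Save the result of the binomial test instead of only printing it
toast_test <- binom.test(x = 6101, n = 9821, p = 0.5)

toast_test$p.value      # P-value of the test
toast_test$alternative  # alternative hypothesis ("two.sided" by default)
toast_test$conf.int     # lower and upper limits of the 95% confidence interval
toast_test$estimate     # sample proportion (probability of success)
```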

So, from our output, we know that:

Our data consists of 6101 successes and 9821 trials.
Our P-value < 2.2e-16.
HA: p is not equal to 0.5.
We are 95% confident that the true population proportion lies between 0.61 and 0.63.
Our sample proportion (the estimate) equals 0.62.

Pt IV: Interpreting your results
The point of running a binomial test is to test if there is an effect. This is a hypothesis test! Running the test alone is insufficient; we now need to draw our conclusions. What conclusions can you draw from your output?

“P < 0.05, therefore our results are statistically significant, and we reject H0. Our data do not match our null expectation and would be unlikely to occur by chance alone if H0 were true. We conclude that toast lands butter side down more often than we would expect by chance, with an estimated proportion of 0.62. Additionally, we are 95% confident that the true population proportion lies between 0.61 and 0.63.”

Now it’s time to practice on your own.

Many hospitals still have signs posted everywhere banning cell phone use. These bans originated from studies on earlier versions of cell phones. In one such experiment, out of 510 tests with cell phones operating at near-maximum power, 6 disrupted a piece of medical equipment enough to hinder interpretation of data or cause equipment to malfunction. (question adapted from: Whitlock & Schluter 2020)

a. Based on this experiment, what is the best estimate of the proportion of equipment disruption by cell phones in hospitals?

b. How precise is this estimate? Calculate the standard error of the sample proportion. Recall, standard error for proportions is: sqrt((sample proportion*(1-sample proportion))/n).

Marzoli and Tomassi (2009) had a researcher approach and speak to strangers in a noisy nightclub. An observer scored whether the person turned either the left or right ear towards the questioner. Of 25 participants, 19 turned the right ear towards the questioner and 6 turned their left ear towards the questioner. (question adapted from: Whitlock & Schluter 2020)

a. What is your best estimate of the proportion of individuals that turn their right ear to a questioner?

b. What is the observed value of the test statistic?

c. Researchers hypothesised that individuals would prefer one ear over the other. Test this hypothesis using the data. Report your null hypothesis, your P-value, and your biological conclusion.

d. What is the 95% confidence interval? Report as a statement. If your confidence interval is broad, what can researchers do to increase precision in their next experiment?

e. Using the above data, what is the probability that exactly 19 out of 25 participants favored their right ear? NOTE: You will need to calculate this directly in R, because we are asking for Pr[19] alone. Recall the binomial formula: Pr[X] = (n choose X) * p^X * (1-p)^(n-X), where (n choose X) equals n! / (X!(n-X)!). To calculate factorials in R, use the function factorial().

A binomial test, on the other hand, compares the sample to the sampling distribution. It calculates the probability of a sample from the null distribution being as extreme as, or more extreme than, our data. If we were to run the binomial test, we would essentially be calculating the probability of obtaining 19 or 20 or 21 or 22 or 23 or 24 or 25 participants using their right ear: Pr[19] + Pr[20] + Pr[21] + Pr[22] + Pr[23] + Pr[24] + Pr[25] (plus, for a two-tailed test, the equally extreme outcomes in the other tail), like we did in part (c)! Since we JUST want Pr[19], we can use the binomial formula.
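To illustrate the difference between a single binomial probability and a tail probability without giving away the answer, here is a sketch using made-up numbers: 7 successes in 10 trials with p = 0.5. These values and the object names are hypothetical and are not from the nightclub study. The built-in functions choose() and dbinom() are included only as cross-checks on the factorial() calculation.

```r
# Hypothetical example values (NOT the exercise data)
n <- 10   # number of trials
X <- 7    # number of successes
p <- 0.5  # probability of success under the null hypothesis

# Binomial formula written out with factorial()
n_choose_X <- factorial(n) / (factorial(X) * factorial(n - X))
pr_exact <- n_choose_X * p^X * (1 - p)^(n - X)

# Cross-checks with built-in functions
pr_check <- choose(n, X) * p^X * (1 - p)^(n - X)
pr_dbinom <- dbinom(X, size = n, prob = p)

# Probability of X or more successes (one tail), summed term by term
pr_upper_tail <- sum(dbinom(X:n, size = n, prob = p))

pr_exact       # 0.1171875
pr_upper_tail  # 0.171875
```

With p = 0.5 the two tails are symmetric, so the two-tailed P-value from binom.test(7, 10, 0.5) is twice pr_upper_tail.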

Dr. Z

2026