Let’s apply what we’ve learned in Chapter 10 to R.

Objectives

New Commands:

R Command Notes
pnorm(q =, mean =, sd =, lower.tail =) returns a probability from the normal distribution (the equivalent of calculating a Z-value and looking up its p-value in a statistical table), where q is your sample value, mean is the mean of the normal distribution, sd is the standard deviation of the normal distribution, and lower.tail = F when we want the upper tail (Z > Z-value), and lower.tail = T when we want the lower tail (Z < Z-value). F and T are shorthand for FALSE and TRUE.


Reminder: Save your script for practice at home! :)

Shortcuts:

Symbol or Command         Keyboard Shortcut
<-                        Alt + -
#                         Shift + 3
Run one line in script    Ctrl + Enter
Run entire script         Ctrl + Shift + Enter
Open new script           Ctrl + Shift + N


Exercise 8
1. Open RStudio and prepare a new script.

Open a new script. All of your code for today’s exercises, and your notes and comments will go in this script. Write your filename, title, author, date, and description of the script.

# Filename: (what you will save the script as)
# Title: (give script a title)
# Author: (write your full name here)
# Date: Month Day Year (write the actual date here)

# Description: (describe what this script is for)


2. Normal distributions are continuous probability distributions that describe a number of biological variables. They have a characteristic bell-shaped curve, and can be used to determine the probability of obtaining a certain result from your sample, whether that is a single value, a sample mean, or X successes.

To calculate the probability of an observation if the data are normally distributed, we use the function pnorm( ). The arguments for pnorm( ) are described above. Note well that the argument values depend on your scenario: are you calculating the probability of a given event, of a sample mean, or of X successes from a binomial distribution? Carefully examine your question before inputting your information.
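To see how lower.tail changes the result, here is a minimal sketch using the standard normal distribution (mean = 0, sd = 1). The value q = 1.96 is chosen only for illustration and does not come from the exercises.

```r
# Lower tail: P(Z < 1.96)
lower <- pnorm(q = 1.96, mean = 0, sd = 1, lower.tail = TRUE)

# Upper tail: P(Z > 1.96)
upper <- pnorm(q = 1.96, mean = 0, sd = 1, lower.tail = FALSE)

lower          # about 0.975
upper          # about 0.025
lower + upper  # the two tails always sum to 1
```

Whichever tail you ask for, the other tail is simply 1 minus that value.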

Here is a table that you can use as reference to determine what to do!

Scenario: Argument values
Probability of a given event: q is the observed value, mean is the population mean (µ), and sd is the population standard deviation (σ)
Probability of a sample mean: q is your sample mean, mean is the population mean (µ), and sd is the standard deviation of the sampling distribution (the standard error, σ/sqrt(n))
Probability of X successes: q is your number of successes, mean is equal to np (n is the total number of trials, p is the expected probability of success), and sd is equal to sqrt(np(1-p))
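The three scenarios can be sketched side by side in R. All numbers below (a population mean of 100, a standard deviation of 15, a sample of 9, and 60 successes in 100 trials) are hypothetical values made up for illustration, and the object names are arbitrary.

```r
# Hypothetical population: mean (mu) = 100, standard deviation (sigma) = 15
mu <- 100
sigma <- 15

# Scenario 1: probability of a single observation greater than 110
p_obs <- pnorm(q = 110, mean = mu, sd = sigma, lower.tail = FALSE)

# Scenario 2: probability of a sample mean greater than 110 when n = 9
n <- 9
se <- sigma / sqrt(n)   # standard error of the mean
p_mean <- pnorm(q = 110, mean = mu, sd = se, lower.tail = FALSE)

# Scenario 3: probability of more than 60 successes in 100 trials when p = 0.5
trials <- 100
p <- 0.5
p_success <- pnorm(q = 60, mean = trials * p,
                   sd = sqrt(trials * p * (1 - p)), lower.tail = FALSE)

p_obs
p_mean
p_success
```

Notice that the same q of 110 is far less likely as a sample mean than as a single observation, because the standard error is smaller than the population standard deviation.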


3. Work through these examples. Be sure you understand how to carry out pnorm( ) for each scenario.

Example 1: It was rumoured that Britain’s domestic intelligence service MI5 had a height restriction for spies. Male spies could not be taller than 5 feet 11 inches (180.3 cm). The mean height of males in Britain is 177.0 cm, with standard deviation 7.1 cm. Assuming a normal distribution of male height, what fraction of males meet the height standard for application to MI5?

Step 1: Check what you are being asked. We are being asked to calculate the probability of a given event, assuming a normal distribution. The question asks for the fraction of males who meet the standard; here we calculate the fraction who are excluded from being a spy (taller than 180.3 cm), which is the complement. So, q = 180.3, mean = 177.0, and sd = 7.1.

Step 2: Once you have identified your problem and your arguments, ask what tail you are looking for. We want to know the fraction TALLER than 180.3. That is, we want the probability of values more extreme than 180.3, which are the values to the right of 180.3. Right tail –> lower.tail = F.

Step 3: Solve your problem using pnorm( ).

pnorm(q = 180.3, mean = 177.0, sd = 7.1, lower.tail = F)

Step 4: State your conclusion. Our output gives us a value of 0.321, meaning that 32.1% of male Britons are too tall to become spies. The remaining 1 - 0.321 = 0.679, or 67.9%, meet the height standard.
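Because the question in Example 1 asks for the fraction that meets the standard, that fraction can also be calculated directly from the lower tail. A minimal sketch using the same values as the example (the object names meets and excluded are arbitrary):

```r
# Fraction of males at or below the 180.3 cm limit (lower tail)
meets <- pnorm(q = 180.3, mean = 177.0, sd = 7.1, lower.tail = TRUE)

# Fraction taller than the limit (upper tail), as calculated in Step 3
excluded <- pnorm(q = 180.3, mean = 177.0, sd = 7.1, lower.tail = FALSE)

round(meets, 3)      # 0.679
round(excluded, 3)   # 0.321
meets + excluded     # the two fractions sum to 1
```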

Example 2: Let’s revisit the tumbling toast and Murphy’s Law example, using data from: Matthews, RAJ. (1995). Tumbling toast, Murphy’s Law and fundamental constants. Eur. J. Phys. 16:172-176. Matthews (1995) found that 6101 out of 9821 slices of toast thrown in the air landed butter side down. We expect that toast thrown in the air would land butter side up and down in equal proportion. Test this hypothesis, assuming a normal distribution.

Step 1: Check what you are being asked. We are being asked to calculate the probability of X successes, assuming a normal distribution. In particular, we want to know whether the proportion of toast landing butter side down equals the proportion landing butter side up. So, q = 6101 (our number of successes), mean = 9821 x 0.5 (np; n trials times p, the expected probability), and sd = sqrt(9821 x 0.5 x (1 - 0.5)) (sqrt(np(1-p))).

Step 2: Once you have identified your problem, and your arguments, ask what tail you are looking for. We want to know the probability of values as extreme, or more extreme than 6101 successes…that is, values to the right of 6101. Right tail –> lower.tail = F.

Step 3: Solve your problem using pnorm( ). Note that we first store n and p as objects, then calculate the mean and standard deviation from them.

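n <- 9821                        # total number of trials
p <- 0.5                         # expected probability of landing butter side down
mu <- n * p                      # mean of the normal approximation, np
sigma <- sqrt(n * p * (1 - p))   # standard deviation, sqrt(np(1-p))
pnorm(q = 6101, mean = mu, sd = sigma, lower.tail = F)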

Step 4: State your conclusion. Our output gives us a p-value of 7.44e-128, meaning P < 0.05. We reject our H0, and state that toast lands butter side down more than is expected due to chance.
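As a cross-check on Example 2, the Z-value can be calculated by hand and passed to pnorm( ) with the standard normal distribution (mean = 0, sd = 1). This is only a sketch of the equivalence; the object names successes, z, p_from_z, and p_direct are arbitrary.

```r
n <- 9821
p <- 0.5
successes <- 6101

# Standardize by hand: how many standard deviations above np is 6101?
z <- (successes - n * p) / sqrt(n * p * (1 - p))
z   # about 24

# Upper-tail probability from the standard normal distribution
p_from_z <- pnorm(q = z, mean = 0, sd = 1, lower.tail = FALSE)

# Same calculation, letting pnorm( ) do the standardizing
p_direct <- pnorm(q = successes, mean = n * p,
                  sd = sqrt(n * p * (1 - p)), lower.tail = FALSE)

p_from_z / p_direct   # ratio is 1 (up to rounding): the two approaches agree
```

A Z-value this large explains why the p-value is so tiny: the observed count is roughly 24 standard deviations above what we expect under H0.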

Great! You have used pnorm( ) to solve a problem. Now it’s time to practice on your own.

  1. Data from 1995 to 2008 in the USA show that 531 out of 648 people struck by lightning were men. We expect the proportion of men to be struck to be 50%. Given a normal approximation of the binomial distribution, test this hypothesis. Remember to follow the steps above. Consult the scenario table for help!

  2. The table below lists the means and standard deviations for several normal distributions. For each, a sample of 10 individuals was taken, and the mean calculated. Calculate the probability of obtaining a sample mean as extreme or more extreme than the sample means below from a random sample of 10 individuals. Follow your steps, and consult the scenario table.

mean    standard deviation    sample mean
14      5                     15
15      3                     15.5
72      50                    45

  3. In 1991, birth weights in the United States were recorded. Birth weights followed a normal distribution with a mean of 3432 g and a standard deviation of 482 g. What proportion of babies fall outside of these expectations (weigh less than 2000 g or more than 5000 g)? What proportion weigh between 2950 and 3914 g?
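Questions about a range of values combine two calls to pnorm( ). The sketch below shows the pattern on the standard normal distribution; the limits a = -1 and b = 1 are illustrative values only and are not the exercise data.

```r
# Illustrative limits on the standard normal distribution (not the exercise data)
a <- -1
b <- 1

# P(a < Z < b): lower tail at b minus lower tail at a
p_between <- pnorm(q = b, mean = 0, sd = 1, lower.tail = TRUE) -
  pnorm(q = a, mean = 0, sd = 1, lower.tail = TRUE)

# P(Z < a or Z > b): lower tail at a plus upper tail at b
p_outside <- pnorm(q = a, mean = 0, sd = 1, lower.tail = TRUE) +
  pnorm(q = b, mean = 0, sd = 1, lower.tail = FALSE)

p_between   # about 0.683
p_outside   # about 0.317
```

The same pattern applies to any normal distribution once you supply its mean and sd.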