Let’s apply what we’ve learned in Chapter 4 to R.

Objectives

Reminder: Save your script for practice at home! :)

Shortcuts:

Symbol or Command          Keyboard Shortcut
<-                         Alt + -
#                          Shift + 3
Run one line in script     Ctrl + Enter
Run entire script          Ctrl + Shift + Enter
Open new script            Ctrl + Shift + N


Exercise 4
1. Open RStudio and prepare a new script.

  1. Open a new script. All of your code for today’s exercises, along with your notes and comments, will go in this script. At the top, write the filename, title, author, date, and a description of the script.
# Filename: (what you will save the script as)
# Title: (give script a title)
# Author: (write your full name here)
# Date: Month Day Year (write the actual date here)

# Description: (describe what this script is for)


2. Let’s work with the file “SystolicBP.csv” in the course GitHub repository. This file contains 101 measurements of systolic blood pressure (in mm Hg), taken during preventative health exams from patients in Dallas, Texas.

Pt I: Reading in the file
Read the file into R directly from GitHub using the following URL: https://github.com/lczawadzki/biostats/raw/main/data/SystolicBP.csv. Remember to assign the result to a name so you can use it later.

     systolic <- read.csv("https://github.com/lczawadzki/biostats/raw/main/data/SystolicBP.csv", header = TRUE)
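After reading a file, it is worth checking that it arrived as expected. The sketch below uses a small made-up CSV string (passed through the text argument of read.csv( )) in place of the real file, so it runs without an internet connection. The three values are invented for illustration; only the column name systolicBP is taken from the exercise. The same checks can be run on the systolic dataframe.

```r
# Stand-in for the real file: three made-up values, NOT the course data.
csv_text <- "systolicBP
120
135
110"

# read.csv() can parse a character string through its `text` argument.
demo <- read.csv(text = csv_text, header = TRUE)

head(demo)   # first few rows
str(demo)    # column names and types
nrow(demo)   # number of rows (observations)
```

For the real data, nrow(systolic) should match the number of measurements described above.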



Pt II: Review: summary statistics
Calculate the mean, median, standard deviation, and interquartile range of the systolic blood pressure values. Remember, we need to specify which variable in the dataframe we are looking at using the notation dataframe$variable.

     mean(systolic$systolicBP)
     median(systolic$systolicBP)
     sd(systolic$systolicBP)  #standard deviation
     IQR(systolic$systolicBP) #interquartile range
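The interquartile range is the distance between the first and third quartiles. As a quick check of what IQR( ) returns, the sketch below computes the two quartiles with quantile( ) and subtracts them. The vector bp is made up for illustration; with the real data you would use systolic$systolicBP instead.

```r
# Made-up blood pressure values for illustration only.
bp <- c(110, 118, 122, 125, 130, 138, 150)

# First and third quartiles (25th and 75th percentiles).
q <- quantile(bp, probs = c(0.25, 0.75))

# The difference between them matches IQR().
iqr_by_hand <- unname(q[2] - q[1])
iqr_by_hand
IQR(bp)

# summary() reports the minimum, quartiles, median, mean, and maximum together.
summary(bp)
```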



Pt III: Calculate the standard error of the mean.
Base R has no built-in function for this, so we need to recall the formula: standard error = standard deviation / sqrt(sample size). Calculate the standard error.

     sd(systolic$systolicBP)/sqrt(nrow(systolic))

Notice that we use the function nrow( ) to count the number of rows in our dataframe. Each row is one measurement, so this gives us the sample size!
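Because the standard error comes up often, it can be convenient to wrap the formula in a small function. The sketch below is one possible way to do that. The name std_error is hypothetical (it is not a built-in R function), and the values in bp are made up. Note that it counts only non-missing values, whereas nrow( ) counts every row, including any rows where the measurement is NA.

```r
# Hypothetical helper: standard error of the mean for a numeric vector.
std_error <- function(x) {
  x <- x[!is.na(x)]          # drop missing values before counting
  sd(x) / sqrt(length(x))    # standard deviation / sqrt(sample size)
}

# Made-up values for illustration only.
bp <- c(110, 118, 122, 125, 130, 138, 150)
std_error(bp)

# Same result as the formula written out directly.
sd(bp) / sqrt(length(bp))
```

With the real data, the call would be std_error(systolic$systolicBP).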

Pt IV: Calculate the 95% confidence interval of the mean.
Base R has no dedicated function for this either, but the t.test( ) function reports a confidence interval for the mean as part of its output. We can pull out just that part with $conf.int.

     t.test(systolic$systolicBP)$conf.int

Your output should give you the lower and upper bounds of the confidence interval.

If we want the 99% confidence interval, we can specify this with the conf.level argument of t.test( ). (The default is 0.95.)

     t.test(systolic$systolicBP, conf.level = 0.99)$conf.int

Your output should again give you the lower and upper bounds, but this time for the 99% CI, which is wider than the 95% CI.
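
To see where these bounds come from, the interval can also be built by hand as mean plus or minus (critical value x standard error), where the critical value comes from a t distribution with n - 1 degrees of freedom. The sketch below does this with qt( ) and compares the result with t.test( ). The vector bp and the variable names are made up for illustration.

```r
# Made-up values for illustration only.
bp <- c(110, 118, 122, 125, 130, 138, 150)

conf_level <- 0.95
n  <- length(bp)
se <- sd(bp) / sqrt(n)

# Critical value: leaves (1 - conf_level) / 2 in each tail.
t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)

# Lower and upper bounds of the confidence interval.
ci_by_hand <- mean(bp) + c(-1, 1) * t_star * se
ci_by_hand

# The bounds should agree with those reported by t.test().
t.test(bp, conf.level = conf_level)$conf.int
```

Changing conf_level to 0.99 gives the 99% interval in the same way.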