SOCI832: Lesson 4.6: Descriptive Statistics

1. Run the standard setup code

# Install Packages
if(!require(dplyr)) {install.packages("dplyr", repos='https://cran.csiro.au/', dependencies=TRUE)}
if(!require(sjlabelled)) {install.packages("sjlabelled", repos='https://cran.csiro.au/', dependencies=TRUE)}
if(!require(sjmisc)) {install.packages("sjmisc", repos='https://cran.csiro.au/', dependencies=TRUE)}
if(!require(sjPlot)) {install.packages("sjPlot", repos='https://cran.csiro.au/', dependencies=TRUE)}
if(!require(summarytools)) {install.packages("summarytools", repos='https://cran.csiro.au/', dependencies=TRUE)}

# Load packages into memory
library(dplyr)
library(sjlabelled)
library(sjmisc)
library(sjPlot)
library(summarytools)

# Print up to 5 significant digits and turn off scientific notation
options(digits=5, scipen=15)

# Stop View() from overloading memory with large datasets
RStudioView <- View
View <- function(x) {
  # Show only the first 500 rows of a data frame; head() also copes with shorter data
  if (is.data.frame(x)) { RStudioView(head(x, 500)) } else { RStudioView(x) }
}

2. Piping with %>% from magrittr (used heavily with the dplyr package)

We use the %>% piping tool to make it easier to write and read our code.

The two examples below are the same code written in two ways.

You can read a %>% b() as "put a into function b()", which is the same as writing b(a).

Alternatively, read it as: get a, pass it to b() as the first argument, and then run b().

# a is a variable; k(), z(), x() and f() are functions

# Nested form:
# y <- f(x(z(k(a))))

# Piped form:
# y <- a %>%
#  k() %>%
#  z() %>%
#  x() %>%
#  f()
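
To make the equivalence concrete, here is a minimal runnable sketch. The vector a and the choice of sqrt(), mean() and round() are illustrative assumptions, standing in for the placeholder functions k(), z(), x() and f() above.

```r
library(magrittr)  # provides %>%; loading dplyr also makes it available

# Illustrative data (an assumption, not part of the course dataset)
a <- c(1, 4, 9, 16)

# Nested form: read from the inside out
y_nested <- round(mean(sqrt(a)), 1)

# Piped form: read from top to bottom
y_piped <- a %>%
  sqrt() %>%
  mean() %>%
  round(1)

# Both forms give the same value
identical(y_nested, y_piped)
```

The piped version lists the steps in the order they run, which is why it tends to be easier to read once a chain has more than two or three functions.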

3. select() from dplyr package

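select() keeps only the variables (columns) we name, so later commands run on that subset rather than on the entire dataset. Note that R is case sensitive: the function is select(), not Select(). In a pipe, the dataset arrives from the preceding %>% step, so the call lists only the variable names.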

  select(RESPID,
         SUICIDE2, SUICIDE3, SUICIDE4, SUICIDE9, SUICID11, SUICID12,
         STRESS, STRSSECO, STRSSJOB, STRSSFAM,
         ATTEND, RELIG, RELAFFCT,
         SEX, AGE, MARITAL, EMPLY, EDUC, INCOM0, SATFACE6)
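
As a self-contained illustration of the same pattern, the sketch below uses mtcars, a dataset that ships with R, as a stand-in for the course data. The object name cars_subset and the chosen columns are assumptions made for the example.

```r
library(dplyr)

# mtcars is built into R; it stands in for the course dataset here
cars_subset <- mtcars %>%
  select(mpg, cyl, hp, wt)

# Only the four selected variables remain; all rows are kept
names(cars_subset)
nrow(cars_subset)
```

select() changes which columns are present, not which rows, so the subset still has one row per case in the original data.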

4. summarytools::descr()

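descr() from the summarytools package is a simple command for generating descriptive statistics for numeric variables. The stats argument chooses which statistics to report, and transpose = TRUE puts one variable on each row.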

summarytools manual (.pdf)

  descr(omit.headings = TRUE,
        stats = c("mean", "sd", "min", "max"), 
        transpose = TRUE)
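
The sketch below puts the pieces together on mtcars, a dataset built into R, as a stand-in for the course data. It assumes a recent version of summarytools, in which headings = FALSE is the replacement for the older omit.headings = TRUE; the object name stats_table and the chosen columns are also assumptions.

```r
library(dplyr)
library(summarytools)

# mtcars stands in for the course dataset (an assumption for this example)
stats_table <- mtcars %>%
  select(mpg, hp, wt) %>%
  descr(stats = c("mean", "sd", "min", "max"),
        transpose = TRUE,    # one row per variable
        headings = FALSE)    # assumed newer spelling of omit.headings = TRUE

# Print the table of descriptive statistics
stats_table
```

Selecting the variables first keeps the table short, and transposing it gives the layout most often used in a results section: variables down the side and statistics across the top.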