Chapter 4 Data analysis with the tidyverse

The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.

To learn how to do data analysis, from importing and tidying data to analyzing it and reporting results, we will use the book R for Data Science. You can find most of the exercise solutions there.

4.1 Program

  1. my {R Markdown} presentation (also see https://r4ds.had.co.nz/r-markdown.html)

  2. my {ggplot2} presentation + exercises from data visualization with {ggplot2}

  3. tibbles

  4. data transformation with {dplyr}

  5. tidy data will explain the concept of “tidy” data, the layout that is used throughout the tidyverse and that is easier to work with

  6. relational data will give you tools to join information from several datasets

  7. more if time allows it (see below)
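
To give a rough idea of how items 3 to 6 fit together, here is a minimal sketch. The tables `scores` and `groups`, and all of their column names and values, are made up for illustration and do not come from the book. The sketch builds two tibbles, reshapes one into a tidy layout with {tidyr}, joins the other on a shared key, and summarizes the result with {dplyr}.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per student, one column per exam ("wide" layout)
scores <- tibble(
  student = c("Ana", "Ben", "Cleo"),
  exam1   = c(12, 15, 9),
  exam2   = c(14, 11, 16)
)

# Hypothetical second table, used only to illustrate a join
groups <- tibble(
  student = c("Ana", "Ben", "Cleo"),
  group   = c("A", "B", "A")
)

# Tidy data: one row per (student, exam) observation
scores_long <- scores %>%
  pivot_longer(cols = c(exam1, exam2), names_to = "exam", values_to = "score")

# Relational data: add each student's group by matching on the key `student`
scores_joined <- scores_long %>%
  left_join(groups, by = "student")

# Data transformation: keep scores of at least 10, then summarize per group
summary_by_group <- scores_joined %>%
  filter(score >= 10) %>%
  group_by(group) %>%
  summarize(mean_score = mean(score), n = n())

print(summary_by_group)
```

Once the data is in the long layout, the join and the grouped summary need no special handling of the individual exam columns, which is the main practical benefit of working with tidy data.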

4.2 Other chapters from this book

The other chapters of the R for Data Science book are very interesting and you should read them. Unfortunately, we won’t have time to cover them in class. Here is a brief overview of what you could learn:

  1. data import will give you tools to import data (e.g. as a replacement for read.table)

  2. strings will help you work with strings and regular expressions

  3. factors will help you work with factors

  4. dates and times will help you work with dates and times

  5. many models will introduce the concept of list-columns that enable you to store complex objects in a structured way inside a data frame

  6. databases: packages {DBI} and {dbplyr} + RStudio’s webpage
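
As a small illustration of the list-columns mentioned in item 5, here is a minimal sketch. It uses the built-in `mtcars` dataset, and the choice of model (`mpg ~ wt`) is an arbitrary assumption made for this example, not something taken from the book. Each row of the result holds a whole data frame and a fitted model next to ordinary columns.

```r
library(dplyr)
library(tidyr)
library(purrr)

by_cyl <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%       # `data` becomes a list-column: one data frame per value of cyl
  ungroup() %>%
  mutate(
    # Another list-column: one fitted linear model per row
    model = map(data, ~ lm(mpg ~ wt, data = .x)),
    # Extract one number per model back into a regular numeric column
    slope = map_dbl(model, ~ coef(.x)[["wt"]])
  )

print(by_cyl)
```

Keeping the data, the model, and the extracted coefficient in the same row means that filtering or arranging the data frame keeps the related objects together.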

4.4 Other “tidy” packages