Welcome! Vojtech and I are excited to talk to you about ways that we use R to help with our data science and informatics tasks.
Materials
- PowerPoint Presentation
- Introduction to R
- Common Challenges in Data Science
- Case Study 1
- Case Study 2
Outline
Part 1: Introduction
- R Installation and Introduction to RStudio
- Overview of R
- Common Data/Code Repositories (CRAN, GitHub, etc.)
- Common Challenges in Data Science and R Language Solutions
  - Data Wrangling (dplyr, tidyr, purrr)
  - Data Cleaning (lubridate, stringr)
  - Visualization (ggplot2, ggvis)
  - Interactive Reporting (RMarkdown, Shiny)
-- 5 min Break --
Part 2: Demonstration of Data Science Techniques Using Case Studies
- Case Study 1: The Data Scientist's Workflow
  - Doing Reproducible Research with RMarkdown and knitr
  - Getting Data In and Out of R
  - Principles of Tidy Data and R Data Types
  - Text Processing in R
  - Data Manipulation and Transformation
  - Data Analysis and Visualization
  - Dynamic Web Applications and Visualizations
- Case Study 2: Using Existing Biomedical Resources in R
  - Working with XML Data in R and ClinicalTrials.gov
  - Data Cleaning and Querying with the Drugs@FDA Dataset
  - Terminology Manipulations Using RxNorm
Part 3: Comparing R to Other Platforms
- Comparison to Other Languages
- R Advantages and Disadvantages
- Q&A Discussion