Biological Data Science Materials

Joon An, 2021-09

Preface

The aim of this book is to provide basic programming skills for biology students at the beginner level. As part of biological data science at Korea University, we developed a range of tutorial sets to familiarize students with biostatistics and basic analytic skills. Students will develop an appreciation of modern applications of biostatistics while gaining a detailed understanding of the analytic fundamentals, from data modalities to programming languages.

To make full use of the tutorial sets, you should have some basic knowledge of R programming. If you do not, please work through <Introduction to Data Science> by Prof. Rafael A. Irizarry, an excellent resource for R programming. Please go through at least chapters 2 to 11; you can go further if needed.

Schedules

For BSMS222 students: the schedule may change from year to year, so please check the GitHub link for this year's schedule.

Session 1 - R

  • Introduction to Biostatistics

  • Introduction to RStudio

  • Introduction to R Notebook and GitHub

  • R basics

  • Programming basics

  • Tidyverse

Session 2 - Data visualization

  • ggplot

  • From visualizing data distributions to robust summaries

Session 3 - Data wrangling for human genomics

  • Introduction to UNIX (11/15)

  • UNIX practice (11/17)

  • RNA sequencing: Introduction (11/22)

  • RNA sequencing: Case Study (11/24)

Session 4 - Programming for Statistics

  • Principal component analysis (11/29)

  • Gene set enrichment test (12/1)

  • Regression (12/6)

List of tutorial sets

  1. SCN2A mutations in neurodevelopmental disorders

  2. Human Genome Annotation

  3. Gene expression profiles in cancer patients

  4. A proteogenomic portrait of lung squamous cell carcinoma
