DSAN 6150

Biological and Biomedical Data Science

Georgetown University
Fall 2024

Published: Wednesday, Sep 18, 2024 at 12:27 am

About this course

We are bombarded every day with claims about health risks (doing this will ruin your health), new treatments and cures (just take this for 30 days for a new you), and better lifestyle choices. How are these claims made, evaluated and validated using data science? Data drives our knowledge of biology, disease and effective treatments. This data is diverse, complex, large, and in many respects unique. It drives our understanding of whether risk factors or treatments causally change our health outcomes and whether our genes or our environment affects our health, and it informs decisions about drugs, protocols and public health that affect all of us every day.

In this class we explore this rich, diverse data landscape and the specialized methods needed to make sense of it, leveraging the instructor’s decades-long experience in collaborative epidemiological and biomedical research across academia, government and industry. We will explore how to design good experiments to extract causal relationships, and how we might still make valid decisions even in non-ideal settings. We will explore high-dimensional multivariate data and evaluate the validity of finding a “needle in a haystack” biomarker that can be targeted for treatment. We will see how statistical modeling (survival analysis in particular), machine learning, AI, and explainable AI have helped us understand this world within. We will see how data-driven decision making works.

This journey will take us through real-life applications in bioinformatics (understanding how genes, proteins and other molecular markers affect disease), epidemiology (what might cause diseases and how interventions can prevent them) and clinical research (clinical trials, observational studies, case-control studies).

The focus of the class will be on data science methodologies, with an emphasis on statistical foundations, inference, associations and causality. As such, we will not go deeply into the biology, but will use it to provide real-life context. We will also find that many of the methods we’ll learn have applications outside the life sciences, often in manufacturing, business and finance.

Instructor information

Professor: Abhijit Dasgupta
Email: abhijit.dasgupta@georgetown.edu
Class Time: Wednesdays, 3:30-6:00 PM
Class location: Car Barn 202

Office Hours: By appointment only. All office hours will be held on Zoom. Appointments may be made at this link. You’re also free to talk after class, though I’ll have a hard stop at 7:00 PM.

Teaching Assistants

- Viviana Luccioli: Office Hours Wednesdays 2-3 PM in the DSAN Suite, or by appointment
- Sai Prerana Mandalika: Office Hours Mondays 2-3 PM and Fridays 11-12 in the DSAN Suite, or by appointment


Topics and Class Meetings

Important

This schedule and corresponding topics are subject to change. Check the timestamp above.

| Week | Date       | Topic                                          | Notes    |
|------|------------|------------------------------------------------|----------|
| 1    | 2024-08-28 | Introduction to biological and biomedical data |          |
| 2    | 2024-09-04 | Experimental design and confounding            |          |
| 3    | 2024-09-11 | Causal inference                               |          |
| 4    | 2024-09-18 | Observational studies, outcome sampling        |          |
| 5    | 2024-09-25 | Survival analysis I                            |          |
| 6    | 2024-10-02 | Survival analysis II                           |          |
| 7    | 2024-10-09 | High-dimensional data                          |          |
| 8    | 2024-10-16 | Biomarker discovery                            |          |
| 9    | 2024-10-23 | Applied Bayesian methods                       |          |
| 10   | 2024-10-30 | Planning studies and clinical trials           |          |
| 11   | 2024-11-06 | Applied machine learning                       |          |
| 12   | 2024-11-13 | Explainability                                 |          |
| 13   | 2024-11-20 | Decision-making                                |          |
|      | 2024-11-27 | Thanksgiving break                             | No class |
| 14   | 2024-12-04 | Project presentations                          |          |

Deliverables

  • Labs and assignments are due at 11:59 PM on the Thursday after a class meeting. These are based on material from the previous week, and should be completed as quickly as possible so you can move your attention to the next week’s material. These are not busy work, but ways to reinforce the material you have learned. The time and effort required to do these deliverables will be commensurate with the time provided. Labs will primarily be checked for completion, while assignments will be checked for accuracy and quality.

  • Quizzes will be available 1 hour before class and will be due by the end of class. Quizzes are based on the material and readings for the current week, which will be discussed in class. The quizzes will test whether you have read and understood the week’s material.

  • Refer to the Canvas site for up-to-date details on due dates and deliverables.

Content © 2024 Abhijit Dasgupta
All content licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0)

 
