| Week | Date | Topic | Notes |
|---|---|---|---|
| 1 | 2024-08-28 | Introduction to biological and biomedical data | |
| 2 | 2024-09-04 | Experimental design and confounding | |
| 3 | 2024-09-11 | Causal inference | |
| 4 | 2024-09-18 | Observational studies, outcome sampling | |
| 5 | 2024-09-25 | Survival analysis I | |
| 6 | 2024-10-02 | Survival analysis II | |
| 7 | 2024-10-09 | High-dimensional data | |
| 8 | 2024-10-16 | Biomarker discovery | |
| 9 | 2024-10-23 | Applied Bayesian methods | |
| 10 | 2024-10-30 | Planning studies and clinical trials | |
| 11 | 2024-11-06 | Applied machine learning | |
| 12 | 2024-11-13 | Explainability | |
| 13 | 2024-11-20 | Decision-making | |
| | 2024-11-27 | Thanksgiving break | No class |
| 14 | 2024-12-04 | Project presentations | |
Biological and Biomedical Data Science
Georgetown University
Fall 2024
We are bombarded every day with claims of health risks (doing this will ruin your health), new treatments and cures (just take this for 30 days for a new you), and better lifestyle choices. How are these claims made, evaluated and validated using data science? Data drives our knowledge of biology, disease and effective treatments. This data is diverse, complex, large, and in many respects unique. It drives our understanding of whether risk factors or treatments causally change our health outcomes and whether our genes or our environment affect our health. It also drives decisions about drugs, protocols and public health that affect all of us every day.
In this class we explore this rich, diverse data landscape and the specialized methods needed to make sense of it, leveraging the instructor’s decades-long experience in collaborative epidemiological and biomedical research across academia, government and industry. We will explore designing good experiments to extract causal relationships, and how we might still make valid decisions even in non-ideal settings. We will explore high-dimensional multivariate data and evaluate the validity of finding a “needle in a haystack” biomarker that can be targeted for treatment. We will see how statistical modeling (survival analysis in particular), machine learning, AI, and explainable AI have helped us understand this world within. We will see how data-driven decision making works. This journey will take us through real-life applications in bioinformatics (understanding how genes, proteins and other molecular markers affect disease), epidemiology (what might cause diseases and how interventions can prevent them) and clinical research (clinical trials, observational studies, case-control studies).
The focus of the class will be on data science methodologies, with more of an emphasis on statistical foundations, inference, associations and causality. As such, we will not go deeply into the biology, but use it to provide real-life context. We will also find that many of the methods we’ll learn have applications outside the life sciences, often in manufacturing, business and finance.
Instructor information
Professor: Abhijit Dasgupta
Email: abhijit.dasgupta@georgetown.edu
Class Time: Wednesdays, 3:30-6:00 PM
Class location: Car Barn 202
Office Hours: By appointment only. All office hours will be held on Zoom. Appointments may be made at this link. You’re also free to talk after class, though I’ll have a hard stop at 7:00 PM.
Teaching Assistants
- Viviana Luccioli: Office Hours W 2-3pm in DSAN Suite or by appt
- Sai Prerana Mandalika: Office Hours M 2-3pm & Fr 11-12 in DSAN Suite or by appt
Topics and Class Meetings
This schedule and corresponding topics are subject to change. Check the timestamp above.
Deliverables
Labs and assignments are due at 11:59 PM on the Thursday after a class meeting. These are based on material from the previous week, and should be completed as quickly as possible so you can move your attention to the next week’s material. These are not busy work, but ways to reinforce the material you have learned. The time and effort required to do these deliverables will be commensurate with the time provided. Labs will primarily be checked for completion, while assignments will be checked for accuracy and quality.
Quizzes will be available 1 hour before class and will be due by the end of class. Quizzes are based on the material and readings for the current week, which will be discussed in class. The quizzes will test whether you have read and understood the week’s material.
Refer to the Canvas site for up-to-date details on due dates and deliverables.