DSAN 6000

Schedule

| Section | Instructor | Day and time | Location |
|---|---|---|---|
| 01 | Jeff Jacobs | M 3:30-6:00 | Walsh 394 |
| 02 | Jeff Jacobs | Th 6:30-9:00 | Walsh 394 |
| 03 | Amit Arora | W 9:30-12:00 | Reiss 262 |

| Wednesday (Amit Arora) | Session | Monday/Thursday (Jeff Jacobs) | Session | Notes |
|---|---|---|---|---|
| 8/27/2025 | 1 | 8/28/2025 | 1 | Course overview - All sections combined |
| 9/3/2025 | 2 | 9/2/2025 (Tu) & 9/4/2025 | 2 | Cloud computing introduction (Monday is Labor Day) |
| 9/10/2025 | 3 | 9/8/2025 & 9/11/2025 | 3 | Parallelization concepts |
| 9/17/2025 | 4 | 9/15/2025 & 9/18/2025 | 4 | DuckDB, Polars, file formats |
| 9/24/2025 | 5 | 9/22/2025 & 9/25/2025 | 5 | Data Warehouse (Athena, Presto, Snowflake) |
| 10/1/2025 | 6 | 9/29/2025 & 10/2/2025 | 6 | Introduction to Spark, RDDs |
| 10/8/2025 | 7 | 10/6/2025 & 10/9/2025 | 7 | Spark DataFrames and Spark SQL |
| 10/15/2025 | No class - Fall Break | 10/14/2025 (Tu) & 10/16/2025 (Th) | No class - Fall Break | Fall Break week |
| 10/22/2025 | 8 | 10/20/2025 & 10/23/2025 | 8 | Spark ML and Streaming |
| 10/29/2025 | 9 | 10/27/2025 & 10/30/2025 | 9 | Apache Iceberg & Table Formats |
| 11/5/2025 | 10 | 11/3/2025 & 11/6/2025 | 10 | Data Pipeline Orchestration (Airflow) |
| 11/12/2025 | 11 | 11/10/2025 & 11/13/2025 | 11 | Vector Databases & RAG |
| 11/19/2025 | 12 | 11/17/2025 & 11/20/2025 | 12 | Modern Data Stack & Governance |
| 11/26/2025 | No class - Thanksgiving | 11/24/2025 & No class | 13 | Serverless & Container Orchestration (Th is Thanksgiving) |
| 12/3/2025 | 13 | 12/1/2025 & 12/4/2025 | 14 | Final topics & Review |
| 12/10/2025 | 14 | 12/8/2025 | Projects | Final project presentations |

Content 2025 by Amit Arora, Jeff Jacobs
All content licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0)

 
