Project requirements

Published

September 23, 2024

Important

These requirements are subject to change, but no changes will be made after October 15, 2024.

Repository structure

The repo you will accept has the following structure:

Following the rcompendium package
├── DESCRIPTION                            # Project metadata
├── LICENSE                                # MIT license
├── LICENSE.md                             # Content of the MIT license
├── R                                      # R functions location
│   └── fun-demo.R                         # Example function with roxygen documentation (to remove)
├── README.Rmd                             # R Markdown README. If you use Python, remove this file and use README.md instead
├── README.md
├── analyses                               # R scripts (not functions) to run analyses
│   └── README.md
├── data                                   # Raw data (or add links in READMEs)
│   ├── derived-data                       # Modified analytic datasets
│   │   └── README.md
│   └── raw-data                           # Raw data, read-only
│       └── README.md
├── documents                              # All Quarto deliverables
│   ├── README.md
│   └── project-proposal.qmd               # Quarto project proposal 
├── fall-2024-project.Rproj
├── figures                                # All final figures
│   └── README.md
├── make.R                                 # A script file that can run all analyses and generate documents
├── man
└── outputs                                # Intermediate outputs (RData, RDS, csv, ...files)
    └── README.md
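
One way to confirm that your working copy still matches this layout is to check for the expected paths. The sketch below is a minimal Python example and is not part of the template: the function name missing_paths and the choice of which entries count as required are assumptions made for illustration.

```python
from pathlib import Path

# Paths taken from the tree above. Treating exactly these as required is an
# assumption for illustration; adjust the list to your own project.
EXPECTED_PATHS = [
    "DESCRIPTION",
    "LICENSE",
    "README.md",
    "R",
    "analyses",
    "data/derived-data",
    "data/raw-data",
    "documents",
    "figures",
    "outputs",
    "make.R",
]


def missing_paths(repo_root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing from the repo:", ", ".join(missing))
    else:
        print("All expected paths are present.")
```

Run from the top of the repo, the script lists anything that has been deleted or renamed. Entries you are told to remove (such as R/fun-demo.R) are deliberately left out of the list.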

Final submission

The final submission will be a Quarto document, presentation, or website that lives in your project GitHub repository. The repo should be organized as a research compendium. You will receive a repo template, which you can use and update.

You will also need to submit a presentation, in Quarto/revealjs, PowerPoint, Keynote or Google Slides. Submit either the presentation file or an accessible online link in Canvas before you give the presentation.

At the end of the project, you will submit peer reviews of your teammates, and each team will submit a review of one other team (assignments TBA).

The final project should be written in English, with sufficient explanation to be understandable to a person who has no background knowledge of the project. Assume your target audience is someone who has the equivalent of a Master’s in Data Science:

  • Assume that they have all the knowledge you may have gained in the first year of the DSAN program.
  • Any new concepts must be explained as clearly as possible, with appropriate references. References can include course notes from the class.
  • The problem should be clearly stated in English and translated into technical specifications that can be implemented and tested.
  • Study design specifications should be clear, and the analyses should align with them.
    • In data-based projects, state the study design and be clear about what kinds of (causal) hypotheses might be addressed.
    • In review projects, state which study designs are compatible with the methodology and comment on any causal aspects.
  • Model assumptions should be stated and either verified in the data or implemented in simulations.
  • Any simulations must set a seed for random number generation.
  • The write-up should have minimal grammatical issues and should read as a professional product.
  • You may use GenAI only for grammar and formatting. If you do, state this clearly and include the actual prompts used. You may not use GenAI for any scripts.
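
The seed requirement above can be illustrated with a short sketch. It assumes a Python project that uses only the standard library random module; the function simulate_sample_means and the seed value are hypothetical. In an R project, the equivalent step is calling set.seed() before any random draws.

```python
import random
import statistics

SEED = 2024  # hypothetical value; any fixed integer works, but report it in the write-up


def simulate_sample_means(n_reps, n_obs, seed=SEED):
    """Simulate n_reps sample means, each from n_obs standard normal draws."""
    # A dedicated generator keeps the simulation independent of any other
    # code that touches the global random state.
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n_obs))
        for _ in range(n_reps)
    ]


if __name__ == "__main__":
    first = simulate_sample_means(n_reps=5, n_obs=100)
    second = simulate_sample_means(n_reps=5, n_obs=100)
    print(first == second)  # same seed, same draws
```

Passing the seed as an argument, rather than seeding once at the top of a script, makes each simulation repeatable on its own, regardless of the order in which the analyses run.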

Technical details

  • The submission should be reproducible from material available in the repo.
  • The repo should include the products of using renv (so renv.lock and a renv folder) for R, or a requirements.txt / environment.yml file for Python, so that a virtual environment can be created in which all code can be run.
  • All references should be in a .bib file (this can be generated using software like Zotero).
  • The product should be professional quality, so that it could be proudly included in a work portfolio.
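
Before submitting, it can help to confirm that the environment and reference files described above are actually in the repo. The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, and it assumes the environment file sits at the top level of the repo while the .bib file may be in any subfolder.

```python
from pathlib import Path

# File names come from the requirements above; the helper names are hypothetical.
ENVIRONMENT_FILES = ["renv.lock", "requirements.txt", "environment.yml"]


def found_environment_files(repo_root):
    """Return the environment files present at the top level of the repo."""
    root = Path(repo_root)
    return [name for name in ENVIRONMENT_FILES if (root / name).is_file()]


def found_bib_files(repo_root):
    """Return all .bib files anywhere in the repo, as sorted relative paths."""
    root = Path(repo_root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.bib"))


if __name__ == "__main__":
    print("Environment files:", found_environment_files(".") or "none found")
    print("Bibliography files:", found_bib_files(".") or "none found")
```

An empty result from either helper points to a missing deliverable. The check only confirms that the files exist, not that the environment they describe can actually run the code.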