Code
GitHub Repository
Please visit our GitHub Repository for all code, data, and website references.
File Structure
Code folder
This section covers the files present in the code/ directory.
avgPF-PAPlotlyScriptwColors.ipynb creates the interactive plotly scatterplot of points for/points against by NFL season since 1999. This script uses the avgPFavgPA1999-2021wColors.csv data set.
avgPtsDt.Rmd creates the interactive data table (using the R package DT) from the avgPFavgPA1999-2021wColors.csv data set.
Combine_and_Draft_Cleaning.ipynb includes the initial data cleaning for the raw NFL combine data sourced from the nfl-data-py package.
Combine_and_Draft.ipynb performs further cleaning and processing of the data from Combine_and_Draft_Cleaning.ipynb and creates the linked Altair charts for draft position and combine data.
player-stats.qmd processes and cleans receiving EPA data and creates the heatmap shown on the final project website. The data used in this file were acquired via the nflreadr package, and the resulting heatmap is saved to the visualizations directory in the website folder.
radar-plot.qmd processes and cleans combine data and creates an interactive radar chart using combine_10yr.csv, which is cleaned in a separate file in this repository. This visualization is saved in the visualizations directory in the website folder.
timeseries-cleaning.Rmd covers the processing and cleaning of cumulative offensive efficiency by play over the course of the 2022 NFL season. These data were acquired via the nflreadr package, and the resulting modified csv files are saved in the data/ directory as timeseries_logos.csv and timeseries_epa.csv.
timeseries-vis.ipynb creates a linked Altair time series plot that visualizes cumulative offensive efficiency by team by play over the course of the 2022 NFL season using data cleaned in timeseries-cleaning.Rmd. The resulting visualization is saved in the visualizations directory in the website folder.
win-total-cleaning.qmd covers the importation, processing, cleaning, and visualization of NFL win total data over the past ~20 seasons (2003-present). These data were acquired via the nflreadr package, and the cleaned data are saved as win-totals.csv in the data/ directory. The visualization created in this file is an interactive plotly line plot that changes depending on the NFL division the user selects (AFC East, NFC South, etc.) and is saved in the visualizations directory in the website folder.
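As a rough illustration of the cumulative-efficiency idea behind timeseries-cleaning.Rmd and timeseries-vis.ipynb, the sketch below computes a running EPA total per team in play order. It uses only the Python standard library and made-up rows. The field names (team, play, epa) and the function name cumulative_epa are assumptions for illustration; they are not the actual columns of timeseries_epa.csv or code taken from this repository.

```python
from collections import defaultdict


def cumulative_epa(plays):
    """Return {team: [running EPA total after each play]}.

    `plays` is an iterable of dicts with hypothetical keys
    'team', 'play' (play order within the season), and 'epa'.
    """
    by_team = defaultdict(list)
    for row in plays:
        by_team[row["team"]].append((row["play"], row["epa"]))

    result = {}
    for team, rows in by_team.items():
        rows.sort(key=lambda r: r[0])  # put each team's plays in order
        total = 0.0
        running = []
        for _, epa in rows:
            total += epa  # accumulate EPA play by play
            running.append(round(total, 3))
        result[team] = running
    return result


# Made-up example rows (not real NFL data).
toy_plays = [
    {"team": "AAA", "play": 2, "epa": -0.5},
    {"team": "AAA", "play": 1, "epa": 1.0},
    {"team": "BBB", "play": 1, "epa": 0.25},
    {"team": "AAA", "play": 3, "epa": 2.0},
]

print(cumulative_epa(toy_plays))
```

Each team's list could then be plotted against its play index to get one line per team, which is the general shape of a cumulative time series chart.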
Data folder
This section covers the files present in the data/ directory. Many of these files are cleaned/subsetted versions of data pulled from the nflreadr or nfl-data-py packages.
avgPFavgPA1999-2021wColors.csv contains the cumulative points for and points against totals for each team in each season from 1999 to 2021. These data are a subsetted version of data pulled from the nflverse GitHub data repository.
combine_10yr.csv contains all combine data (e.g., player names, measurables, performance metrics) for any participating players in the past 10 NFL seasons. The raw combine data are cleaned in the Combine_and_Draft_Cleaning.ipynb file; these data are further cleaned in the Combine_and_Draft.ipynb file and saved as combine_clean_10yr.csv.
combine_clean_10yr.csv is the cleaned combine data from combine_10yr.csv, processed in Combine_and_Draft.ipynb.
draft_10yr.csv contains all NFL draft data (e.g., draft position, teams, players) for the past 10 NFL drafts. The raw data sourced from the nfl-data-py package are cleaned in the Combine_and_Draft_Cleaning.ipynb file.
ids_10yr.csv contains identifying information for collegiate players entering the draft for the past 10 NFL drafts. The raw data sourced from the nfl-data-py package are cleaned in the Combine_and_Draft_Cleaning.ipynb file.
snap_10yr.csv contains NFL snap data (i.e., how many snaps a player has in a given game) for the past 10 NFL seasons. The raw data sourced from the nfl-data-py package are cleaned in the Combine_and_Draft_Cleaning.ipynb file.
timeseries_epa.csv contains the EPA added/subtracted on each play for each NFL team over the 2022 regular season. The raw play-by-play (pbp) data were pulled from the nflreadr package, cleaned in the timeseries-epa-cleaning.Rmd file, and saved as timeseries_epa.csv.
timeseries_logos.csv contains the logo information (e.g., team picture URLs, colors, etc.) that is later used in the EPA time series plot. The raw team description (teams) data were pulled from the nflreadr package, cleaned in the timeseries-epa-cleaning.Rmd file, and saved as timeseries_logos.csv.
win-totals.csv contains the win/loss/tie totals as well as playoff outcomes and division rankings for each NFL team since 2003 (when divisions were realigned to their present status). The raw data sourced from the nfl-data-py package are cleaned and plotted in the win-total-cleaning.qmd file, and the cleaned data are additionally used for plots in the win-totals-playoffs.ipynb file.
Image folder
This section covers the files present in the img/ directory.
ANLY-503-Group23-Poster.pdf is our group’s poster, which we presented on 05/01/2023.
Website folder
This section covers the files present in the website/ directory.
The _book/ directory contains the rendered website itself, including the index.html, coding.html, and data.html files, which are described later in this document. The website is too large to have resources embedded within the index.html file, so there are additional resources in this directory.
custom.scss contains some basic stylistic changes that create the theme of the website, mainly using color.
index.ipynb makes up the majority of the website; it covers the actual visual analysis and hosts all of the visualizations of interest. This file is rendered as index.html in the _book directory.
code.ipynb includes links to the project GitHub repository and a copy of this document. This file is rendered as code.html in the _book directory.
data.ipynb includes a brief description of the data used in this project as well as the packages of interest (used to pull our raw data). This file is rendered as data.html in the _book directory.
references.bib contains all citations used in this project.
Visualizations folder
This section covers the files present in the website/visualizations/ directory. This folder contains all visualizations used in this project.
avgPFavgPAPlotlyScriptColors.html is the interactive plotly scatterplot of points for/points against from 1999-2021.
chart1.html is the linked-view altair draft/combine chart.
combine-radar-chart.html contains the interactive plotly radar charts (also known as spider plots) of combine event percentiles by position.
player-stats.png is the static, ggplot2-generated receiving EPA heatmap.
ptsDT.html is the interactive data table for points for/points against.
timeseries-epa-vis.html is the linked altair view of cumulative offensive EPA by offensive plays.
win-total-plot.html is the interactive plotly line chart showcasing NFL win totals by division since 2003.
win-totals-playoffs.html showcases playoff outcomes given a regular season win total (e.g., what % of 14-win regular season teams made the playoffs/won the Super Bowl).
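To make the win-totals-playoffs.html summary concrete, here is a minimal sketch of how a "percent of teams with N wins that made the playoffs" figure could be computed. It is not the code from win-totals-playoffs.ipynb; the field names (wins, made_playoffs), the function name playoff_rate_by_wins, and the rows themselves are invented for illustration and do not reflect the actual layout of win-totals.csv.

```python
from collections import defaultdict


def playoff_rate_by_wins(seasons):
    """Return {wins: percent of team-seasons with that win total that made the playoffs}.

    `seasons` is an iterable of dicts with hypothetical keys
    'wins' (int) and 'made_playoffs' (bool).
    """
    counts = defaultdict(lambda: [0, 0])  # wins -> [playoff seasons, all seasons]
    for row in seasons:
        tally = counts[row["wins"]]
        if row["made_playoffs"]:
            tally[0] += 1
        tally[1] += 1
    return {
        wins: 100.0 * made / total
        for wins, (made, total) in sorted(counts.items())
    }


# Made-up team-seasons (not real NFL results).
toy_seasons = [
    {"wins": 10, "made_playoffs": True},
    {"wins": 10, "made_playoffs": False},
    {"wins": 12, "made_playoffs": True},
    {"wins": 7, "made_playoffs": False},
]

print(playoff_rate_by_wins(toy_seasons))
```

The same grouping could be repeated with a different outcome flag (for example, a hypothetical won_super_bowl field) to get the other percentages the visualization describes.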