Understanding the Data

There were over 33,000 fatal car accidents in the year 2019 alone. A better understanding of the causes and conditions of these events could provide insights that save lives. The following visualizations provide an in-depth look at the conditions and factors involved in fatal car accidents in the United States over a 44-year time span. The goal of visualizing this data is to provide information that automotive engineers could use to improve vehicle safety design and that transportation engineers could use in the design of roads. By considering factors such as road conditions, car shapes, seat restraint use, and more, these visualizations can point engineers in more informed directions for their research into vehicle safety improvements.

26

Variables

Endless insights using this data

44

Years of Data

1975 to 2019

Approximately 1.7 million

Fatal Car Crashes

1975 to 2019

Sourcing the Data

The data for this analysis is sourced from the National Highway Traffic Safety Administration’s Fatality Analysis Reporting System and covers all car accidents resulting in at least one fatality in the United States from 1975 to 2019. For each accident, the data records the vehicles and people involved, the time and location of the event, the weather and road conditions at the site, and more. Visualizing the geographic, temporal, and conditional distributions of these fatal accidents provides actionable insights that transportation engineers, automotive engineers, and all drivers could use to make driving safer overall. By illuminating the trends involved with fatal car accidents, we aim to spur action on how to prevent these accidents in the future from both an engineering and an individual driver perspective.

About the Data

The raw data used for this study is publicly available at the following location: NHTSA-FARS → (Year-Folder) → National → FARS~year~NationalCSV.zip, with each zip file containing .csv files for different aspects of the accidents. This study used the accident.csv, vehicle.csv, and person.csv files for each year to obtain data on each accident and the vehicles and persons involved. The processing pipeline merged the three files for each year into a single dataset via a common “Case ID” variable. The data was then cleaned and organized into a tidy format in which the unit of analysis is each individual person involved in a fatal accident, so that each row contains information on a single person, the vehicle they were in, and the accident they were involved in. The final data contains information on approximately 1.7 million fatal accidents and 26 variables describing each accident's location (State, Longitude, Latitude, Rural vs Urban, Route Signing), time (Month, Day, Hour, Minute), road conditions (Weather, Light Condition, Road Condition, Speed Limit, Number of Lanes), people involved (Age, Sex, Severity of Injury, Seat Position, Seat Belt Usage, Ejection Status, Drinking, Drugs), and vehicles involved (Make, Model, Year, Traveling Speed). In addition to the variables listed above, the total number of fatalities per accident is recorded. Finally, the date of each accident was derived from these raw values for easier visualization of fatalities over time.
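The merge step described above can be illustrated with a minimal sketch in plain Python. This is not the project's actual pipeline: the rows and column names (case_id, veh_no, and so on) are hypothetical, and linking each person to a vehicle through a vehicle number in addition to the Case ID is an assumption made for the example.

```python
# Hypothetical rows standing in for accident.csv, vehicle.csv, and person.csv.
# All column names and values are illustrative assumptions, not real FARS data.
accidents = [
    {"case_id": 1, "state": "OH", "weather": "Clear"},
    {"case_id": 2, "state": "TX", "weather": "Rain"},
]
vehicles = [
    {"case_id": 1, "veh_no": 1, "make": "Ford"},
    {"case_id": 1, "veh_no": 2, "make": "Honda"},
    {"case_id": 2, "veh_no": 1, "make": "Toyota"},
]
persons = [
    {"case_id": 1, "veh_no": 1, "age": 34, "seat_belt": "Used"},
    {"case_id": 1, "veh_no": 2, "age": 52, "seat_belt": "Not used"},
    {"case_id": 2, "veh_no": 1, "age": 19, "seat_belt": "Used"},
    {"case_id": 2, "veh_no": 1, "age": 21, "seat_belt": "Used"},
]


def merge_person_level(accidents, vehicles, persons):
    """Return one row per person with vehicle and accident fields attached."""
    accident_by_case = {a["case_id"]: a for a in accidents}
    # Assumption: a vehicle is identified by (case_id, veh_no).
    vehicle_by_key = {(v["case_id"], v["veh_no"]): v for v in vehicles}

    merged = []
    for person in persons:
        row = dict(accident_by_case[person["case_id"]])
        # A person with no matching vehicle simply gets no vehicle fields.
        row.update(vehicle_by_key.get((person["case_id"], person["veh_no"]), {}))
        row.update(person)
        merged.append(row)
    return merged


tidy = merge_person_level(accidents, vehicles, persons)
```

Each element of `tidy` is a single person-level record, which matches the tidy layout described above: accident fields and vehicle fields are repeated for every person who shares them.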

Temporal Changes and Trends of Fatal Car Accidents

Before diving into the different conditions involved with the fatal car accidents, it's important to understand how big of an issue this is today and how the magnitude of fatal car accidents per year has been changing over the past several decades. Gaining a baseline understanding of the issue is critical to starting an investigation of the topic, and to understanding how an increase in safety guidelines over the years has or has not impacted the magnitude of vehicle fatalities. To establish this baseline understanding, Figure 1 shows the total number of fatal car accidents per year from 1975 to 2019.

Figure 1: The plot above illustrates the changing magnitude of fatal car accidents over a 44-year time period. It is important to note that the plot shows raw, non-normalized values to give more clarity about the distribution of the data. It can easily be understood that normalizing these values by the population of the United States would further emphasize the trend of decreasing fatal car accidents over time.

Figure 1 shows that the number of fatal car accidents per year has generally decreased over the past several decades, with slight spikes in some years. That being said, it is interesting to see that the number of fatal car accidents seemed to increase significantly around 2015. Note that the plot shows the raw, non-normalized total number of fatal car accidents in each year without taking into account the changing population of the United States. This was done deliberately to give more clarity regarding the distribution of the data and the true net changes in fatalities per year. Because the population of the United States grew substantially from 1975 to 2019, normalizing these values by population would further emphasize the trend of decreasing fatal car accidents over time. These trends set the stage for acknowledging that, while the total number of fatal car accidents is decreasing, there is still a lot of work to be done to reduce fatalities on the road.
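Because the tidy dataset has one row per person, a yearly count of accidents needs to count each accident once rather than once per person. The sketch below shows one way this could be done; the field names (year, case_id) are hypothetical, and keying on the pair of year and case ID assumes that case IDs may repeat across years.

```python
from collections import defaultdict


def accidents_per_year(person_rows):
    """Count distinct accidents per year from person-level rows."""
    cases_by_year = defaultdict(set)
    for row in person_rows:
        # A set keeps each case ID once, however many people were involved.
        cases_by_year[row["year"]].add(row["case_id"])
    return {year: len(cases) for year, cases in sorted(cases_by_year.items())}


# Hypothetical person-level rows, for illustration only.
rows = [
    {"year": 1975, "case_id": 1},
    {"year": 1975, "case_id": 1},  # second person in the same accident
    {"year": 1975, "case_id": 2},
    {"year": 2019, "case_id": 1},
]
counts = accidents_per_year(rows)
```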

To gain further understanding of the temporal changes in fatal car accidents, the yearly data shown in Figure 1 can be broken down by month. Figure 2 shows a time series plot of the total number of fatal accidents occurring in the United States in every fifth year from 1975 to 2015, broken down by month.

Figure 2: The above plot shows the number of fatal car accidents occurring within each month of each 5-year increment of time from 1975 to 2015. The values shown for each month-year time are normalized by the days in each month to show the counts for a standard 30-day time period.

Figure 2 illustrates a monthly breakdown of the total number of fatal car accidents occurring every five years from 1975 to 2015. Note that the totals for each month have been normalized for the differing number of days in each month. Again, a general decrease in total fatal car accidents over time is seen. This could be indicative of a few things. For example, it could indicate that overall improvements in car safety have resulted in fewer fatalities during accidents. Or it could suggest that people are simply better drivers due to higher standards for driving exams, and therefore there are fewer accidents in general. Considering these ideas, Figures 1 and 2 provide a starting point for further investigations into whether the decrease in fatal car accidents is due to improved car safety, increased driving abilities of vehicle operators, or something else.
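The 30-day normalization used for Figure 2 amounts to scaling each monthly count by 30 divided by the number of days in that month. A minimal sketch follows; the function name and the counts are hypothetical and are not values from the dataset.

```python
import calendar


def per_30_days(count, year, month):
    """Scale a monthly count to a standard 30-day period."""
    # monthrange returns (weekday of first day, number of days in month).
    days_in_month = calendar.monthrange(year, month)[1]
    return count * 30 / days_in_month


# Hypothetical counts, for illustration only.
january = per_30_days(3100, 2015, 1)   # January has 31 days
february = per_30_days(2800, 2015, 2)  # February 2015 has 28 days
```

With these made-up inputs, both months scale to the same 30-day rate even though their raw counts differ, which is the effect the normalization is meant to achieve. Using the calendar module also handles leap years without special cases.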

A second insight to be examined in Figure 2 is the number of fatal car accidents occurring in each month of the year. Across all years, more fatal accidents occur between July and October than during the rest of the year. This could be due to many different reasons, such as warmer weather leading more people to drive during these months. Further research could focus on why more accidents occur during these months and fewer during the others.

Overall, Figures 1 and 2 allow the viewer to gain an initial understanding of the temporal distribution of fatal car accidents and to see how the number of accidents has been changing over time and between months. Grasping the current state of fatal car accidents in the United States gives direction to further research and data collection. For example, seeing that more fatal car accidents occur in the summer months than in the winter months helps to focus research on whether this is due to weather, more cars on the highway, or something else. Thus, these visualizations give the viewer a good baseline understanding of the problem at hand and ideas for further research and investigation.

An additional way to gain a baseline understanding of fatal car accidents in the United States is to examine their current geographic distribution. This is done by investigating the frequencies of fatal accidents across different counties. Figure 3 shows the total number of fatal car crashes per capita that occurred in each county of the United States in 2019 alone. The total number of fatal crashes in each county is normalized by the US Census Bureau's 2019 estimate of each county's population (reference) in order to find the per capita amount.
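The per capita normalization described above can be sketched as a simple division of each county's crash count by its population. The county names and numbers below are hypothetical, and expressing the result per 100,000 residents is an assumption made here for readability rather than something the study states.

```python
def fatal_crashes_per_capita(crashes_by_county, population_by_county, per=100_000):
    """Return fatal crashes per `per` residents for each county."""
    rates = {}
    for county, crashes in crashes_by_county.items():
        population = population_by_county.get(county)
        if not population:
            # Skip counties with a missing or zero population estimate.
            continue
        rates[county] = crashes * per / population
    return rates


# Hypothetical inputs, for illustration only (not real county data).
crashes = {"County A": 12, "County B": 150, "County C": 4}
population = {"County A": 60_000, "County B": 3_000_000}

rates = fatal_crashes_per_capita(crashes, population)
```

In this made-up example, the smaller county has far fewer crashes in total but a higher rate once population is taken into account, which is why the choice between raw and per capita counts matters when reading a county-level map.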

Figure 3: The plot above shows the number of fatal car accidents per capita that occurred within each individual United States county in the year 2019. The shading of each county indicates the magnitude of fatal car accidents per capita of each county, where darker shades correlate with more fatal accidents.

Figure 3 shows the distribution of fatal car crashes within each county across the United States. There appears to be a general trend of more densely populated areas having more fatal car accidents. For example, most major cities such as Los Angeles, New York, Chicago, Atlanta, and others have very high amounts of fatal car accidents. In the northeast of the United States, from southeastern Pennsylvania to Maine, there is a major thread of large quantities of fatal car accidents that can likely be explained by the existence of I-95 connecting these areas. Identifying these areas of high fatality could help transportation engineers focus their research on changes to road design that could decrease the number of deaths. Researchers could also collect more data specifically from these counties of high fatalities per capita to extend the study further.

Explore Variables Involved in Fatal Car Accidents

Investigations into different variables in the dataset and their impact on fatal car crashes.

Methods

Click here to learn more about our methods and reflections on this project.