NHTSA FARS: The Federal Database Behind Every US Traffic Fatality Since 1975
Since 1975, every motor vehicle crash occurring on a United States public roadway that results in a death within 30 days has been recorded in the Fatality Analysis Reporting System — a national census maintained by the National Highway Traffic Safety Administration. FARS is not a sample. It contains every fatal crash. With roughly 1.1 million fatality records spanning five decades and approximately 4,000 data elements per crash, it is among the most comprehensive public-safety longitudinal datasets the federal government maintains. Almost no one outside traffic engineering and academic injury epidemiology uses it directly.
This article covers the history and scope of FARS, the structure of its core data files (ACCIDENT, VEHICLE, PERSON, and associated supplementary files), the major crash categories that define the American traffic fatality landscape, rural and urban mortality patterns, pedestrian and cyclist trend data, the NHTSA FARS API and bulk download formats, and a Python script that queries the NHTSA API to produce annual fatality trend tables and state-level fatality rates per 100,000 population.
History and scope
FARS was created in 1975 by the National Highway Traffic Safety Administration. NHTSA itself was established by the Highway Safety Act of 1970 as the successor to agencies created under the National Traffic and Motor Vehicle Safety Act and the Highway Safety Act of 1966, legislation driven in large part by Ralph Nader's 1965 book Unsafe at Any Speed and the congressional hearings that followed it. Before FARS, national traffic fatality data was collected through voluntary state reporting to the National Safety Council and lacked the granularity and completeness necessary for systematic safety research. FARS was designed from the outset to be a census: every fatal crash on every public roadway in every state, with standardized data elements that would allow meaningful comparisons across jurisdictions and over time.
Coverage extends to all 50 states, the District of Columbia, and Puerto Rico. The definition of a fatal crash is precise: any motor vehicle traffic crash occurring on a public roadway (including interstate highways, US routes, state routes, county roads, and city streets) in which at least one person — occupant or non-occupant — dies within 30 days of the crash as a result of injuries sustained in the crash. The 30-day window is specified in the ANSI D16.1 standard for classifying motor vehicle traffic accidents. Private roadways, parking lots, and off-highway crashes are excluded. The standard has been consistent since 1975, making FARS time-series comparisons methodologically sound across the full five-decade span.
Annual fatality counts have declined substantially from the 1972 peak of 54,589 deaths (on a much smaller vehicle fleet and road network), driven by mandatory seat belt laws, airbag requirements, drunk driving enforcement, graduated licensing, improved trauma medicine, and vehicle crashworthiness standards. The recent trend, however, has been upward. After reaching a post-war low of 32,479 in 2011, fatalities rose through the mid-2010s, fell modestly before 2020, and then surged: 38,824 in 2020 (despite a dramatic COVID-related drop in vehicle miles traveled, implying a sharp increase in crash risk per mile), 42,939 in 2021, 42,795 in 2022, and 40,990 in 2023. The 2021 and 2022 totals were the highest since 2005. NHTSA attributes the surge to increased speeding, impaired driving, and reduced seat belt use documented in behavioral surveys taken during and after the pandemic period.
Data collection process
FARS data is collected at the state level by NHTSA contractors (state data collection contractors, or SDCs) working from multiple source documents for each fatal crash. The primary source is the police accident report (PAR) — the officer-completed crash report filed by the responding law enforcement agency. PARs vary in format by state but are standardized within each state and contain the core crash geometry (location, road type, direction of travel), vehicle information (plate, VIN, make, model, year), driver information (license data, apparent sobriety), and initial injury assessment. PARs alone are insufficient for FARS coding.
SDCs supplement PARs with death certificates (which provide official cause of death and confirm the death occurred within the 30-day window), emergency medical service records (pre-hospital care, transport times, trauma center designation), hospital records (in-hospital treatment, laboratory test results including blood alcohol concentration where available), and driver licensing and vehicle registration records (to verify identity, license class, prior violations). The integration of these sources allows FARS to code data elements that no single source document contains: blood alcohol concentration from hospital toxicology results, airbag deployment from vehicle inspection, driver distraction type from witness statements, and drug involvement from toxicology screens. The result is approximately 4,000 data elements organized across multiple relational data files for each fatal crash.
Core data files
FARS data is distributed as a set of relational files linked by a case number (ST_CASE: the state code followed by a sequential case number, unique within a data year). The primary files and their contents are as follows:
| File | Level | Key variables |
|---|---|---|
| ACCIDENT | Crash | Date, time, state, county, route type (interstate/US/state/local), land use (urban/rural), junction type, light condition, weather, first harmful event, manner of collision, school bus related flag, fatality count |
| VEHICLE | Vehicle | Vehicle type, make, model, year, body type, estimated travel speed, driver vision obstruction, driver distraction, vehicle contributing factor, driver condition at time of crash, airbag deployed, seat belt use, rollover flag |
| PERSON | Person | Age, sex, person type (driver/passenger/pedestrian/cyclist/other non-motorist), injury severity, seating position, restraint use, ejection status, alcohol test type, blood alcohol concentration (BAC), drug involvement |
| DISTRACT | Vehicle | Driver distraction coded by type: cell phone handheld, cell phone hands-free, eating, outside vehicle, inattention/daydreaming, other electronic device, passenger, reaching for object |
| MANEUVER | Vehicle | Pre-crash maneuver: going straight, turning, changing lanes, overtaking/passing, entering/leaving traffic, negotiating curve, backing, stopped |
| CEVENT | Event | Crash event sequence: each harmful event in order (first, second, third) with event type and area of vehicle contacted |
| DRIMPAIR | Driver | Driver impairment type: alcohol, drugs (illicit, prescription, over-the-counter), fatigue, illness, asleep, emotional disturbance |
| WEATHER | Crash | Weather condition at time of crash: clear, cloudy, rain, snow, fog/smog/smoke, sleet/hail, freezing rain, blowing snow, severe crosswinds |
The PERSON file is the most analytically productive for fatality analysis because it contains the individual-level records that allow demographic breakdowns, injury type classification, and BAC data. Person types are coded as: DR (driver), PS (passenger), PN (pedalcyclist), PD (pedestrian), and a set of other non-motorist codes covering persons on in-line skates, electric scooters, motorized mobility devices, and other conveyances. The approximate distribution of fatalities by person type in recent years is: drivers approximately 38%, passengers approximately 22%, pedestrians approximately 17%, motorcycle riders approximately 14%, and pedalcyclists approximately 3%.
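The person-level tally described above can be sketched in a few lines. This is a minimal illustration, not NHTSA's method: the column names PER_TYP and INJ_SEV, the fatal-injury code "4", and the inline sample rows are assumptions, and the two-letter person-type labels simply mirror the ones used in this article. The actual encodings for a given data year should be checked against the FARS Analytical User's Manual.

```python
import csv
import io
from collections import Counter

# Hypothetical PERSON-style extract; real files carry many more columns.
SAMPLE_PERSON_CSV = """ST_CASE,PER_TYP,INJ_SEV
10001,DR,4
10001,PS,2
10002,PD,4
10003,DR,4
10003,DR,0
10004,PN,4
"""

FATAL = "4"  # assumed code for a fatal injury


def fatality_shares(csv_text):
    """Return {person_type: share of fatalities} over fatally injured rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row["PER_TYP"] for row in reader if row["INJ_SEV"] == FATAL)
    total = sum(counts.values())
    return {ptype: n / total for ptype, n in counts.items()} if total else {}


shares = fatality_shares(SAMPLE_PERSON_CSV)
for ptype, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{ptype}: {share:.0%}")
```

Filtering on injury severity first matters: the PERSON file lists everyone involved in a fatal crash, including survivors, so an unfiltered count by person type would describe crash involvement rather than deaths.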
Alcohol-involved and speed-related crashes
Alcohol-involved crashes are the most extensively studied category in FARS. NHTSA defines an alcohol-involved crash as one in which any driver, pedestrian, or cyclist had a blood alcohol concentration of 0.01 g/dL or higher. The legal per se limit is 0.08 g/dL in 49 states and DC; Utah lowered its limit to 0.05 g/dL at the end of 2018. NHTSA separately tracks crashes at the 0.08 threshold. In recent years, approximately 10,000 to 11,000 traffic fatalities annually involve a driver or non-occupant with BAC ≥ 0.08, roughly 25 to 28 percent of all traffic fatalities. The figure has been roughly stable for a decade after large declines from the 1980s peak, when drunk driving accounted for approximately 50 percent of all traffic deaths.
BAC data in FARS is complicated by the fact that not all fatally injured persons receive blood alcohol tests. Testing rates vary by state, person type, and circumstances. NHTSA uses a multiple imputation methodology to estimate BAC for cases where testing was not performed, producing an imputed BAC variable alongside the observed BAC variable. The imputed BAC is based on a model trained on cases where testing did occur, incorporating driver age, sex, time of day, day of week, vehicle type, and crash characteristics. Researchers using FARS for alcohol analysis should work with the imputed BAC variable to avoid systematic undercounting of alcohol involvement.
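The usual way to work with multiply imputed values is to compute the statistic of interest within each imputation and then average across imputations. The sketch below shows that pattern for the share of persons at or above a BAC threshold. The data layout (one list of imputed BAC values per person, in g/dL) is an assumption for illustration and is not the documented FARS file layout; real analyses should follow NHTSA's imputation documentation.

```python
from statistics import mean


def share_at_or_above(imputations_per_person, threshold=0.08):
    """Average, across imputations, of the share of persons at or above threshold.

    imputations_per_person: one inner list per person, each holding that
    person's imputed BAC values (g/dL); all inner lists have the same length.
    """
    if not imputations_per_person:
        raise ValueError("no persons supplied")
    n_imputations = len(imputations_per_person[0])
    per_imputation = []
    for k in range(n_imputations):
        hits = sum(1 for person in imputations_per_person if person[k] >= threshold)
        per_imputation.append(hits / len(imputations_per_person))
    return mean(per_imputation)


# Three hypothetical persons with four imputed values each
people = [
    [0.00, 0.00, 0.00, 0.00],  # consistently sober
    [0.12, 0.15, 0.09, 0.11],  # consistently above 0.08
    [0.00, 0.10, 0.00, 0.09],  # uncertain: above 0.08 in two of four draws
]
estimate = share_at_or_above(people)
print(f"Estimated share at or above 0.08: {estimate:.2f}")
```

The third person illustrates why a single observed-or-missing flag undercounts: the uncertainty about an untested person is carried into the estimate as a partial contribution rather than being dropped.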
Speed-related crashes are defined in FARS as crashes in which speeding was coded as a contributing factor in the VEHICLE file — which includes not only exceeding the posted speed limit but also driving too fast for conditions (wet road, reduced visibility, congestion). Approximately 12,000 fatalities annually, roughly 29 percent of the total, are speed-related by this definition. Speed involvement is coded by the investigating officer, making it dependent on police report quality and officer training. NHTSA research suggests that speed contribution to crashes is systematically underreported in police accident reports relative to reconstruction-based estimates, so the 29 percent figure is a floor.
Distraction-involved crashes represent the most contested FARS category. Official FARS coding shows approximately 3,000 distraction-involved fatalities annually, roughly 7 to 8 percent of the total. Researchers and safety advocates widely regard this as a severe undercount, for two reasons: first, drivers rarely self-report distraction to responding officers; second, the cell phone use flag in FARS depends on officer notation and phone records are not routinely subpoenaed for non-criminal investigations. Studies using naturalistic driving data (forward-facing cameras that capture driver behavior in the seconds before a crash) suggest that distraction involvement may be two to three times higher than official FARS coding indicates.
Crash type categories
FARS supports analysis across a range of crash type categories that reflect the manner of collision and road geometry. Key categories and their approximate annual volumes in recent years:
| Crash category | Annual fatalities (approx.) | Share of total |
|---|---|---|
| Run-off-road | ~15,000 | ~36% |
| Intersection-related | ~11,000 | ~27% |
| Speeding-related | ~12,000 | ~29% |
| Rollover | ~6,000 | ~14% |
| Pedestrian fatalities | ~7,500 | ~18% |
| Motorcycle fatalities | ~5,900 | ~14% |
| Large truck-involved | ~5,800 | ~14% |
| Alcohol-involved (BAC ≥0.08) | ~10,000 | ~25% |
| Pedalcyclist fatalities | ~1,000 | ~3% |
Categories overlap: a single fatal crash can be simultaneously speed-related, alcohol-involved, a run-off-road crash, and involve a rollover. NHTSA publishes annual Traffic Safety Facts reports that tabulate each category separately, so the percentages above sum to well above 100 percent. The run-off-road category deserves special attention: departing the travel lane — whether the driver then strikes a fixed object, enters a ditch, or rolls the vehicle — is the most common sequence of events in fatal rural crashes and is closely associated with both speed and alcohol involvement.
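A quick check of the overlap, using only the approximate shares from the table above:

```python
# Approximate shares of all traffic fatalities, in percent, from the table above
CATEGORY_SHARES = {
    "Run-off-road": 36,
    "Intersection-related": 27,
    "Speeding-related": 29,
    "Rollover": 14,
    "Pedestrian fatalities": 18,
    "Motorcycle fatalities": 14,
    "Large truck-involved": 14,
    "Alcohol-involved (BAC >= 0.08)": 25,
    "Pedalcyclist fatalities": 3,
}

total_share = sum(CATEGORY_SHARES.values())
print(f"Sum of category shares: {total_share}%")  # 180%
```

A sum of 180 percent means the average fatality falls into nearly two of these categories at once, which is why the rows cannot be stacked into a single pie chart or added to recover the national total.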
Pedestrian and cyclist trends
Pedestrian fatalities have risen sharply over the past fifteen years, reversing decades of prior improvement. In 2009, 4,109 pedestrians were killed in traffic crashes. By 2022, that figure had reached 7,522 — the highest total since 1981 — representing an 83 percent increase over thirteen years even as overall traffic fatalities were rising far more slowly. As a share of all traffic deaths, pedestrians went from approximately 12 percent in 2009 to approximately 18 percent in 2022. The 2023 figure was approximately 7,314, modestly lower but still near historic highs.
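The 83 percent figure follows directly from the two counts quoted above:

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


ped_2009, ped_2022 = 4109, 7522
increase = pct_change(ped_2009, ped_2022)
print(f"{increase:.1f}%")  # 83.1%
```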
NHTSA research and independent academic studies identify several contributing factors. The shift in the US vehicle fleet toward SUVs, crossovers, and pickup trucks is implicated: these vehicles have higher front-end profiles than sedans, which changes the strike geometry in pedestrian impacts — a sedan hood typically contacts a struck pedestrian at the legs and pelvis, while an SUV hood contacts at the thorax and head. Studies using FARS data linked to vehicle registration and VIN-based body type coding show significantly higher pedestrian mortality rates in collisions involving SUVs and trucks compared to passenger cars at equivalent speeds. The trend toward increased pedestrian exposure (more walking trips, e-scooters, delivery on foot) has also been documented. Smartphone distraction among both drivers and pedestrians is widely cited but difficult to isolate in FARS data.
Pedalcyclist fatalities have followed a similar trend: from 630 in 2009 to approximately 966 in 2021 and 1,105 in 2022. The increase in cycling infrastructure in urban areas has been accompanied by increased cycling exposure, particularly in cities and among older cyclists (FARS data shows a notable shift in cyclist fatality age distribution toward riders over 50, associated with e-bike adoption). Urban areas account for the majority of cyclist fatalities; the highest absolute cyclist death counts are in Florida, California, and Texas, which also have the largest populations and significant year-round cycling.
Rural and urban mortality patterns
The ACCIDENT file's land use variable (urban/rural, coded using Census Urban Area boundaries) reveals one of the most consistent findings in traffic safety research: per vehicle mile traveled, rural roads kill at roughly 1.5 to 2 times the rate of urban roads. In absolute terms, roughly 43 percent of traffic fatalities occur in rural areas, although those areas hold only about 19 percent of the US population and roughly 30 percent of total vehicle miles traveled. Rural crash characteristics differ systematically from urban: higher travel speeds (posted limits commonly 55–70 mph on two-lane rural roads), longer emergency medical response times (a key determinant of survival after severe trauma), lower seat belt use rates (surveys consistently show rural belt use 5–10 percentage points below urban rates), and a higher proportion of single-vehicle run-off-road crashes.
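The relationship between a share of deaths and a share of exposure can be made explicit. The sketch below computes how much higher a group's fatality rate is than everyone else's, given only those two shares; it applies equally whether exposure is measured in population or in vehicle miles. The function name is hypothetical, and the inputs are the 43 and 19 percent shares quoted above taken at face value.

```python
def relative_rate(death_share, exposure_share):
    """Ratio of a group's death rate per unit exposure to everyone else's.

    Both arguments are fractions of the national total (0 < share < 1).
    """
    if not (0 < death_share < 1 and 0 < exposure_share < 1):
        raise ValueError("shares must be strictly between 0 and 1")
    group_rate = death_share / exposure_share
    rest_rate = (1 - death_share) / (1 - exposure_share)
    return group_rate / rest_rate


ratio = relative_rate(0.43, 0.19)
print(f"{ratio:.1f}x")  # 3.2x
```

The result is sensitive to the exposure denominator: with a hypothetical 30 percent exposure share instead of 19 percent, the same function returns about 1.8. Rural-to-urban comparisons should therefore always state which exposure measure they use.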
Emergency medical response time is particularly important. FARS does not directly capture EMS response time, but the time gap between crash time and death time (coded in the PERSON file) combined with the urban/rural variable permits indirect analysis. Rural fatalities are more likely to show deaths that occur at the scene or in transit, while urban fatalities more often occur in-hospital — reflecting the difference in trauma center proximity and EMS transport time. The rural EMS access disparity is the subject of ongoing federal safety research and is a factor in the Federal Highway Administration's rural safety initiative programs.
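The indirect analysis described above can be sketched as follows: convert each fatality's crash-to-death lag into minutes, then compare the share of early deaths between rural and urban crashes. The field names (land_use, lag_hours, lag_minutes) and the sample rows are hypothetical; the corresponding FARS variables and their missing-value codes would need to be looked up and cleaned before running this on real data.

```python
from collections import defaultdict


def share_dying_within(rows, minutes=60):
    """Share of fatalities, by land use, with a crash-to-death lag <= minutes."""
    totals = defaultdict(int)
    early = defaultdict(int)
    for row in rows:
        lag = row["lag_hours"] * 60 + row["lag_minutes"]
        totals[row["land_use"]] += 1
        if lag <= minutes:
            early[row["land_use"]] += 1
    return {land_use: early[land_use] / totals[land_use] for land_use in totals}


# Hypothetical fatality records, one dict per person killed
rows = [
    {"land_use": "rural", "lag_hours": 0, "lag_minutes": 0},
    {"land_use": "rural", "lag_hours": 0, "lag_minutes": 45},
    {"land_use": "rural", "lag_hours": 5, "lag_minutes": 10},
    {"land_use": "urban", "lag_hours": 0, "lag_minutes": 20},
    {"land_use": "urban", "lag_hours": 30, "lag_minutes": 0},
    {"land_use": "urban", "lag_hours": 72, "lag_minutes": 15},
    {"land_use": "urban", "lag_hours": 2, "lag_minutes": 0},
]

early_shares = share_dying_within(rows)
for land_use, share in sorted(early_shares.items()):
    print(f"{land_use}: {share:.0%} of deaths within 60 minutes")
```

A short lag is only a proxy: it cannot distinguish slow EMS arrival from injuries that were unsurvivable regardless of response time, so differences between rural and urban shares should be read as suggestive.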
Urban fatalities, by contrast, are disproportionately pedestrians, cyclists, and motorcycle riders. Urban intersections are the highest-risk crash locations; left-turn crashes (in which a turning driver fails to yield to an oncoming vehicle, pedestrian, or cyclist) are a major urban fatality pattern. High-fatality urban corridors (specific arterial roadways with elevated crash densities) are a focus of the US Department of Transportation's Safe Streets and Roads for All (SS4A) grant program, which funds regional, local, and Tribal safety improvement projects targeted at high-risk locations identified in part from FARS data.
Motorcycle fatalities
Motorcycle fatalities represent approximately 14 percent of all traffic deaths, although motorcycles account for roughly 3 percent of all registered vehicles and a far smaller share of vehicle miles traveled; NHTSA has estimated the per-VMT fatality rate for motorcyclists at up to roughly 28 times that of passenger car occupants. In 2021, 5,932 motorcyclists were killed; in 2022, 6,218. The motorcycle fatality rate has proven resistant to the safety improvements that drove down overall traffic fatalities in the 1980s and 1990s. Helmet use is the best-documented protective factor: NHTSA estimates helmets are 37 percent effective in preventing fatal injuries to motorcycle riders. States with universal helmet laws consistently show lower motorcycle fatality rates than states with partial laws (applying only to riders under a specified age) or no helmet law.
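An effectiveness figure such as 37 percent translates into lives-saved estimates through simple arithmetic. If helmets prevent a fraction e of the deaths that would otherwise occur, then H helmeted deaths imply H / (1 - e) deaths without helmets, so helmets saved H × e / (1 - e) riders; among U unhelmeted deaths, about U × e could have been prevented. The sketch below applies that reasoning; the function names and the input counts are hypothetical and are not FARS figures.

```python
HELMET_EFFECTIVENESS = 0.37  # effectiveness estimate quoted above


def lives_saved_by_helmets(helmeted_deaths, e=HELMET_EFFECTIVENESS):
    """Estimated riders who survived because they wore a helmet."""
    return helmeted_deaths * e / (1 - e)


def additional_savable(unhelmeted_deaths, e=HELMET_EFFECTIVENESS):
    """Estimated unhelmeted deaths that a helmet would have prevented."""
    return unhelmeted_deaths * e


# Hypothetical annual counts, for illustration only
saved = lives_saved_by_helmets(3000)
savable = additional_savable(2000)
print(f"Saved by helmets: {saved:.0f}; additional savable: {savable:.0f}")
```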
FARS codes helmet use, helmet type (compliant with FMVSS No. 218 or non-compliant), and helmet condition for each motorcyclist. The data has been central to decades of debate over state helmet law policy: motorcycle industry groups and rider advocacy organizations oppose universal helmet mandates as a liberty restriction, while NHTSA's analysis of FARS data consistently shows elevated fatality rates in states without universal coverage. Alcohol involvement in motorcycle fatalities is higher than in any other vehicle category; approximately 28 percent of fatally injured motorcycle riders have a BAC at or above 0.08.
Large truck-involved crashes
Large trucks (gross vehicle weight rating over 10,000 pounds) were involved in approximately 5,837 fatal crashes in 2022. Because of the mass disparity between large trucks and the passenger vehicles and non-motorists they frequently collide with, large truck crashes are disproportionately fatal to the other party: roughly 72 percent of fatalities in large truck crashes are occupants of the other vehicle or non-motorists, not the truck driver or truck occupants. The Federal Motor Carrier Safety Administration (FMCSA) maintains a parallel dataset — the Motor Carrier Management Information System (MCMIS) — that links large truck crashes to carrier safety records, enabling analysis of carrier compliance history as a predictor of crash involvement. FMCSA's SAFER database (Safety and Fitness Electronic Records) is the public-facing interface for carrier lookup.
Electric vehicles and emerging data in FARS
FARS has incorporated electric vehicle identification since approximately 2019 via VIN decoding. NHTSA uses its own VIN decoder (also accessible via public API at vpic.nhtsa.dot.gov/api) to classify vehicle make, model, model year, and body type for each vehicle in a FARS crash. Fuel type (including battery electric, plug-in hybrid, hydrogen fuel cell) is derivable from VIN decoding and has been added as a coded variable in recent FARS annual files. The number of EVs in FARS fatality data remains small in absolute terms, which complicates statistical analysis; NHTSA has urged methodological caution about drawing crash rate conclusions from early EV FARS data given the geographic concentration of EV ownership (urban, coastal states) and the younger age of the EV fleet, both of which affect crash exposure patterns.
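For readers who want to see what VIN-based fuel type classification looks like in practice, the sketch below queries the public vPIC decoder and pulls out a few fields. The DecodeVinValues path and the response field names (Make, ModelYear, FuelTypePrimary, ElectrificationLevel) are assumed from the vPIC flat-format response and should be verified against the current vPIC documentation; the sample payload is invented for illustration, and this is not a description of NHTSA's internal FARS coding pipeline.

```python
import json
import urllib.request

# Assumed vPIC endpoint pattern; verify against the vPIC API documentation.
VPIC_URL = "https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVinValues/{vin}?format=json"


def extract_fuel_info(payload):
    """Pick fuel-related fields out of a decoded-VIN payload (empty -> None)."""
    results = payload.get("Results") or [{}]
    first = results[0]
    return {
        "make": first.get("Make") or None,
        "model_year": first.get("ModelYear") or None,
        "fuel_type": first.get("FuelTypePrimary") or None,
        "electrification": first.get("ElectrificationLevel") or None,
    }


def decode_vin(vin, timeout=30):
    """Fetch and summarize one VIN (performs a network request)."""
    with urllib.request.urlopen(VPIC_URL.format(vin=vin), timeout=timeout) as resp:
        return extract_fuel_info(json.load(resp))


# Invented payload in the assumed response shape, so the parser can be
# exercised without network access.
sample_payload = {
    "Results": [
        {
            "Make": "EXAMPLE",
            "ModelYear": "2022",
            "FuelTypePrimary": "Electric",
            "ElectrificationLevel": "",
        }
    ]
}
info = extract_fuel_info(sample_payload)
print(info)
```

Separating the parser from the network call keeps the field-name assumptions in one place, which makes them easy to correct if the live response differs.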
The NHTSA FARS API and bulk data access
NHTSA provides a public REST API for FARS summary data at api.nhtsa.dot.gov/FARS requiring no API key. The API supports three primary query patterns: retrieving available years (/FARS/years), retrieving national summary statistics for a given year (/FARS/{year}/summary), and retrieving state-level fatality counts for a given year and FIPS state code (/FARS/{year}/{state}/fatality). The API returns JSON with a Results array. The summary endpoint includes total fatal crashes, total fatalities, and breakdowns by vehicle type. The fatality endpoint returns a record per county within the state, which can be summed to a state total.
For ad-hoc query-based analysis without downloading raw data files, NHTSA operates the FARS Query System (NQARS) at crashstats.nhtsa.dot.gov/Api/Public/ViewFrequency. NQARS allows users to select variables from ACCIDENT, VEHICLE, and PERSON files, apply filters, and download tabular results without handling raw SAS files. The NHTSA CrashStats portal at crashstats.nhtsa.dot.gov is the primary interface for browsing NQARS and accessing annual FARS data files.
Bulk annual data files are distributed at crashstats.nhtsa.dot.gov in two formats: SAS dataset files (.sas7bdat) and CSV exports. The CSV exports have been available for recent years and are the most accessible format for Python analysis. Each annual file set contains all of the relational files described above (ACCIDENT, VEHICLE, PERSON, DISTRACT, MANEUVER, CEVENT, DRIMPAIR, WEATHER, and several additional supplementary files) as separate CSV files joined by the case number. File sizes are modest: the annual ACCIDENT file for a recent year contains approximately 38,000–43,000 rows (one per fatal crash), VEHICLE approximately 55,000–65,000 rows (more vehicles than crashes due to multi-vehicle crashes), and PERSON approximately 50,000–60,000 rows.
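Joining the relational files comes down to indexing the crash-level file by case number and attaching its columns to each lower-level row. The sketch below does this with the standard library only, using tiny inline extracts in place of the downloaded CSVs. Apart from ST_CASE, the column names and values shown are illustrative assumptions.

```python
import csv
import io

# Tiny stand-ins for the ACCIDENT and PERSON CSV files
ACCIDENT_CSV = """ST_CASE,STATE,RUR_URB
10001,1,1
10002,1,2
"""

PERSON_CSV = """ST_CASE,PER_NO,AGE
10001,1,34
10001,2,29
10002,1,61
"""


def join_person_to_crash(accident_text, person_text, key="ST_CASE"):
    """Attach crash-level columns to every person row sharing the case number."""
    crashes = {row[key]: row for row in csv.DictReader(io.StringIO(accident_text))}
    joined = []
    for person in csv.DictReader(io.StringIO(person_text)):
        crash = crashes.get(person[key])
        if crash is None:
            continue  # person row with no matching crash record
        joined.append({**crash, **person})
    return joined


joined_rows = join_person_to_crash(ACCIDENT_CSV, PERSON_CSV)
for row in joined_rows:
    print(row)
```

With pandas, the equivalent is a many-to-one merge of the PERSON frame onto the ACCIDENT frame on the case number. When stacking several years, the data year has to be part of the key as well, since case numbers restart each year.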
Python: querying the NHTSA FARS API for trend and state-level data
The following script queries the NHTSA FARS API to retrieve annual fatality totals for 2015–2023, computes year-over-year changes, then fetches state-level fatality counts for the most recent year and ranks states by fatality rate per 100,000 population using 2023 Census estimates. The only third-party dependency is the requests package; no API key is needed.
```python
import requests

# ---------------------------------------------------------------------------
# Part 1: NHTSA FARS API: annual fatality totals 2015-2023
# ---------------------------------------------------------------------------
# The NHTSA FARS API requires no API key.
# Base URL: https://api.nhtsa.dot.gov/FARS
# Endpoint: /FARS/{year}/summary -> returns national total fatalities for year
BASE_URL = "https://api.nhtsa.dot.gov/FARS"
YEARS = list(range(2015, 2024))  # 2015 through 2023

# Candidate key names for a fatality count, in order of preference.
# TotalFatalCrashes is deliberately not used: it counts crashes, not deaths.
FATALITY_KEYS = ("TotalFatalities", "Fatalities")


def fatality_count(record):
    """Return the fatality count from one API record, or None if absent."""
    for key in FATALITY_KEYS:
        value = record.get(key)
        if value not in (None, ""):
            return int(value)
    return None


print("Fetching FARS annual fatality totals from NHTSA API...")
print()
annual_totals = {}
for year in YEARS:
    url = f"{BASE_URL}/{year}/summary"
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    data = resp.json()
    # NHTSA API returns {"Count": N, "Message": "...", "Results": [...]}
    results = data.get("Results", [])
    annual_totals[year] = fatality_count(results[0]) if results else None

# Print year-over-year table
print(f"{'Year':<6} {'Fatalities':>12} {'Change':>8} {'Pct Change':>11}")
print("-" * 42)
prev = None
for year in YEARS:
    total = annual_totals[year]
    if total is None:
        print(f"{year:<6} {'N/A':>12}")
        prev = None  # do not compare the next year against a stale value
        continue
    if prev:
        change = total - prev
        pct = (change / prev) * 100
        # The "+" flag keeps the sign attached to the number inside the padding
        print(f"{year:<6} {total:>12,} {change:>+8,} {pct:>+10.1f}%")
    else:
        print(f"{year:<6} {total:>12,} {'---':>8} {'---':>11}")
    prev = total

# ---------------------------------------------------------------------------
# Part 2: State-level fatalities for most recent available year (2023)
# ---------------------------------------------------------------------------
# Endpoint: /FARS/{year}/{state}/fatality
# {state} is the FIPS state code (integer 1-56, omit territories for simplicity)
RECENT_YEAR = 2023

# FIPS state codes and names (48 contiguous + DC + AK + HI = 51 entries)
STATES = {
    1: "Alabama", 2: "Alaska", 4: "Arizona", 5: "Arkansas", 6: "California",
    8: "Colorado", 9: "Connecticut", 10: "Delaware", 11: "District of Columbia",
    12: "Florida", 13: "Georgia", 15: "Hawaii", 16: "Idaho", 17: "Illinois",
    18: "Indiana", 19: "Iowa", 20: "Kansas", 21: "Kentucky", 22: "Louisiana",
    23: "Maine", 24: "Maryland", 25: "Massachusetts", 26: "Michigan",
    27: "Minnesota", 28: "Mississippi", 29: "Missouri", 30: "Montana",
    31: "Nebraska", 32: "Nevada", 33: "New Hampshire", 34: "New Jersey",
    35: "New Mexico", 36: "New York", 37: "North Carolina", 38: "North Dakota",
    39: "Ohio", 40: "Oklahoma", 41: "Oregon", 42: "Pennsylvania",
    44: "Rhode Island", 45: "South Carolina", 46: "South Dakota",
    47: "Tennessee", 48: "Texas", 49: "Utah", 50: "Vermont", 51: "Virginia",
    53: "Washington", 54: "West Virginia", 55: "Wisconsin", 56: "Wyoming",
}

# 2023 Census resident population estimates (raw counts) for rate calculation
# Source: US Census Bureau, July 1 2023 estimates
STATE_POP_2023 = {
    1: 5108468, 2: 733583, 4: 7431344, 5: 3045637, 6: 38965193,
    8: 5877610, 9: 3617176, 10: 1031890, 11: 678972, 12: 22610726,
    13: 11029227, 15: 1435138, 16: 1964726, 17: 12549689, 18: 6833037,
    19: 3207004, 20: 2940865, 21: 4526154, 22: 4573749, 23: 1395722,
    24: 6180253, 25: 7001399, 26: 10037261, 27: 5737915, 28: 2939690,
    29: 6196156, 30: 1132812, 31: 1978379, 32: 3194176, 33: 1402054,
    34: 9290841, 35: 2114371, 36: 19571216, 37: 10698973, 38: 779261,
    39: 11785935, 40: 4053824, 41: 4233358, 42: 12961683, 44: 1095962,
    45: 5373555, 46: 919318, 47: 7126489, 48: 30503301, 49: 3417734,
    50: 647464, 51: 8715698, 53: 7812880, 54: 1775156, 55: 5910955,
    56: 584057,
}

print()
print(f"Fetching FARS {RECENT_YEAR} state-level fatalities...")
state_fatalities = {}
for fips, name in STATES.items():
    url = f"{BASE_URL}/{RECENT_YEAR}/{fips}/fatality"
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        results = r.json().get("Results", [])
        # One record per county; rows without a usable count contribute 0
        counts = [fatality_count(row) or 0 for row in results]
        # An empty response means "no data", not zero deaths
        state_fatalities[fips] = sum(counts) if results else None
    except (requests.RequestException, ValueError, TypeError) as e:
        print(f"  Warning: skipping {name}: {e}")
        state_fatalities[fips] = None

# Compute fatality rate per 100,000 population and rank
ranked = []
for fips, name in STATES.items():
    count = state_fatalities.get(fips)
    pop = STATE_POP_2023.get(fips)
    if count is not None and pop:
        rate = (count / pop) * 100000
        ranked.append((name, count, pop, rate))
ranked.sort(key=lambda x: -x[3])  # descending by rate

print()
print(f"{'State':<25} {'Fatalities':>10} {'Population':>12} {'Rate/100k':>10}")
print("-" * 63)
for name, count, pop, rate in ranked:
    print(f"{name:<25} {count:>10,} {pop:>12,} {rate:>10.2f}")

# Summary: highest and lowest rate states
print()
if ranked:
    print(f"Highest fatality rate: {ranked[0][0]} ({ranked[0][3]:.2f} per 100k)")
    print(f"Lowest fatality rate: {ranked[-1][0]} ({ranked[-1][3]:.2f} per 100k)")
    print(f"National total {RECENT_YEAR}: {sum(r[1] for r in ranked):,}")
else:
    print("No state-level data returned.")
```
The NHTSA API response structure has varied slightly across years; the script tries several candidate key names (such as Fatalities and TotalFatalities) with fallback logic. For production use, validate the response structure against the current API documentation at api.nhtsa.dot.gov before deploying. For deeper analysis requiring the full ACCIDENT–VEHICLE–PERSON relational structure, download the annual CSV files from crashstats.nhtsa.dot.gov and load them with pandas.read_csv(). The case number field (ST_CASE) serves as the join key across all files within a data year; vehicle-level and person-level joins also need the vehicle and person numbers (VEH_NO, PER_NO).
State fatality rate variation and the rural-urban factor
State-level fatality rates per 100,000 population vary by a factor of roughly five between the safest and most dangerous states. States with high rural road share, high posted speed limits, lower seat belt use rates, and lower enforcement density consistently rank at the top of per-capita fatality rates. Mississippi, Wyoming, South Carolina, Montana, and Arkansas have historically led the national fatality rate rankings. Massachusetts, New York, New Jersey, Minnesota, and Hawaii consistently rank among the lowest-rate states.
Per-VMT rates tell a somewhat different story than per-capita rates. States with large populations concentrated in urban areas (New York, California) have high absolute fatality counts but low per-VMT rates because urban driving accumulates many vehicle miles at lower speeds. States with sparse populations and long driving distances (Montana, Wyoming, North Dakota) have elevated per-VMT rates reflecting their high-speed rural road network. NHTSA publishes both per-capita and per-VMT fatality rates in its annual state Traffic Safety Facts reports, and the FHWA's Highway Statistics series provides state VMT estimates for rate calculation.
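The two rate definitions differ only in their denominator, which is what lets a state rank very differently on each. The sketch below computes both for two invented states; the function names and all input figures are hypothetical and chosen only to show the calculation.

```python
def rate_per_100k_population(fatalities, population):
    """Fatalities per 100,000 residents."""
    return fatalities / population * 100_000


def rate_per_100m_vmt(fatalities, vehicle_miles):
    """Fatalities per 100 million vehicle miles traveled."""
    return fatalities / vehicle_miles * 100_000_000


# Hypothetical states: (fatalities, population, annual vehicle miles traveled)
EXAMPLE_STATES = {
    "Large urbanized state": (500, 10_000_000, 60_000_000_000),
    "Small rural state": (150, 1_000_000, 10_000_000_000),
}

for name, (deaths, pop, vmt) in EXAMPLE_STATES.items():
    per_capita = rate_per_100k_population(deaths, pop)
    per_vmt = rate_per_100m_vmt(deaths, vmt)
    print(f"{name:<22} {per_capita:>6.1f} per 100k   {per_vmt:>5.2f} per 100M VMT")
```

In this invented example the rural state's per-capita rate is three times the urban state's, while its per-VMT rate is less than twice as high, because its residents drive more miles per person.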
Data limitations and research notes
FARS is a census of fatal crashes, not all crashes. Injury-only crashes, property-damage-only crashes, and unreported crashes are outside its scope. For non-fatal crash analysis, the General Estimates System (GES) — now superseded by the Crash Report Sampling System (CRSS) — provides a nationally representative probability sample of all police-reported crashes regardless of injury severity. The National Automotive Sampling System (NASS) Crashworthiness Data System (CDS) provides detailed biomechanical data on a subsample of crashes. Together, FARS, CRSS, and the legacy NASS CDS form the core of NHTSA's national crash data infrastructure.
Data release lag is a known limitation. FARS annual files are typically released approximately 12–18 months after the reference year end. The 2023 annual file, for example, was released in late 2024. NHTSA releases early estimates of annual fatalities — based on preliminary state data — within 4–6 months of year-end, which appear in NHTSA's traffic safety early estimates publications before the final FARS file is available. Early estimates are subsequently revised when the final FARS file is released, and researchers should use final FARS data for publishable analyses rather than early estimates.
Geocoding quality in FARS has improved substantially since GPS-based reporting became standard, but older years (particularly pre-2000) have significant geographic missingness. For spatial analysis — mapping crash density, identifying high-fatality corridors, linking crashes to roadway characteristics from FHWA's Highway Performance Monitoring System — recent years (2010–present) provide much higher-quality latitude/longitude coordinates than the pre-GPS era. NHTSA also provides a linked FARS–HPMS file for recent years that joins fatal crash locations to roadway-level characteristics (speed limit, lane count, roadway function class, median type) from the FHWA infrastructure database — enabling crash rate analysis that controls for roadway design characteristics rather than just geography.