Technical writing

NOAA Storm Events Database: The Federal Record Behind 50 Years of US Weather Disasters

15 min read · AI Analytics

The NOAA National Centers for Environmental Information Storm Events Database is the official federal record of severe weather in the United States — 48 recognized event types, records reaching back to 1950, and detailed entries since 1996 covering property damage, crop losses, direct and indirect fatalities, injuries, and narrative descriptions written by local forecasters at every National Weather Service office in the country.

What the Storm Events database contains

The Storm Events Database is maintained by the NOAA National Centers for Environmental Information (NCEI), headquartered in Asheville, North Carolina. NCEI is the world's largest provider of atmospheric, coastal, geophysical, and oceanic data; the Storm Events Database is one of its most widely accessed public data products. The database aggregates storm reports submitted by the 122 local offices of the National Weather Service (NWS) across the continental United States, Alaska, Hawaii, Puerto Rico, Guam, and the Virgin Islands.

Each record in the database corresponds to a single weather event in a single county (or county-equivalent jurisdiction). A tornado that crosses four counties generates four separate event records. A hurricane that produces storm surge, flooding, and tornadoes simultaneously will generate separate records for each phenomenon in each affected county. This geographic disaggregation is a defining structural choice: it allows county-level damage attribution and enables joining storm records to socioeconomic data via Federal Information Processing Standard (FIPS) county codes, which appear as distinct fields in every record.
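A minimal sketch of the county-level join described above. It assumes the STATE_FIPS, CZ_FIPS, and CZ_TYPE columns of the details file and a hypothetical socioeconomic table keyed by five-digit county FIPS; the `population` column and every value are made up for illustration. Only rows with CZ_TYPE "C" are kept, on the assumption that zone-based records do not map cleanly to county codes.

```python
import pandas as pd

# Toy rows shaped like the details file; values are illustrative only.
events = pd.DataFrame({
    "EVENT_ID": ["1", "2", "3"],
    "STATE_FIPS": ["29", "29", "1"],
    "CZ_FIPS": ["97", "145", "125"],
    "CZ_TYPE": ["C", "C", "Z"],
})

# Hypothetical socioeconomic table keyed by 5-digit county FIPS.
counties = pd.DataFrame({
    "county_fips": ["29097", "29145"],
    "population": [120000, 58000],
})


def add_county_fips(df: pd.DataFrame) -> pd.DataFrame:
    """Keep county-type records and build a zero-padded 5-digit FIPS key."""
    out = df[df["CZ_TYPE"] == "C"].copy()
    out["county_fips"] = (
        out["STATE_FIPS"].astype(str).str.zfill(2)
        + out["CZ_FIPS"].astype(str).str.zfill(3)
    )
    return out


joined = add_county_fips(events).merge(counties, on="county_fips", how="left")
print(joined[["EVENT_ID", "county_fips", "population"]])
```

Zero-padding matters because the FIPS fields are often read without leading zeros; a left merge keeps storm records even when the socioeconomic table has no matching county.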

The scope of the database across time is uneven. Records from 1950 through 1954 cover only tornadoes. Thunderstorm wind and hail records were added in 1955. The full suite of 48 event types was not standardized until January 1996, when NCEI introduced a new event classification directive that expanded coverage to all significant weather events including heat waves, dense fog, rip currents, astronomical low tides, and heavy snow. The post-1996 data is therefore qualitatively richer and more consistently classified than the historical record, though the earlier decades remain valuable for climatological trend analysis of the event types that were tracked.
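The coverage eras above can be encoded as a small guard for trend analysis, so that a time series is never started before the event type was actually being recorded. This is a sketch based only on the dates given in this article; the function names are hypothetical.

```python
from typing import Optional

ALL_TYPES = None  # sentinel meaning "every official event type"


def covered_event_types(year: int) -> Optional[set]:
    """Return the event types recorded in a given year, or ALL_TYPES from 1996 on."""
    if year < 1950:
        raise ValueError("the database starts in 1950")
    if year <= 1954:
        return {"Tornado"}
    if year <= 1995:
        return {"Tornado", "Thunderstorm Wind", "Hail"}
    return ALL_TYPES


def first_comparable_year(event_type: str) -> int:
    """Earliest year from which a trend for this event type is meaningful."""
    if event_type == "Tornado":
        return 1950
    if event_type in {"Thunderstorm Wind", "Hail"}:
        return 1955
    return 1996


print(first_comparable_year("Flash Flood"))
```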

In recent years the database has recorded between 60,000 and 80,000 weather events annually. The 2023 edition contains approximately 75,000 distinct event records. The total database, spanning 1950 through the present, contains more than 1.8 million event records, making it one of the most comprehensive long-run severe weather archives in the world.

Event type taxonomy

NCEI recognizes exactly 48 official storm event types, a classification scheme that has been stable since 1996. The types span the full range of significant US weather phenomena from violent convective events to prolonged hydrometeorological extremes. The 48 types are organized loosely by atmospheric mechanism:

Convective events include Tornado, Thunderstorm Wind, Hail, Lightning, Funnel Cloud, Waterspout, Dust Devil, and Dust Storm. Thunderstorm Wind is the single most frequently recorded event type in the modern database, with more than 20,000 entries per year nationally. Hail is the second most common. Tornadoes, despite their cultural prominence, account for roughly 1,200 to 1,800 recorded events per year — less than 3% of annual event volume but a disproportionate share of fatalities and insured losses.

Tropical events include Hurricane (Typhoon), Tropical Storm, Tropical Depression, and Storm Surge/Tide. Storm Surge records are among the most economically significant in the database: a single Gulf Coast hurricane can produce storm surge records with property damage tallies in the billions of dollars. The damage amounts in these records represent NCEI estimates and are not insurance payouts or FEMA obligation figures; they are derived from post-event surveys, news reports, NWS damage assessments, and state emergency management agency reports.

Winter events include Winter Storm, Winter Weather, Ice Storm, Sleet, Blizzard, Lake-Effect Snow, Cold/Wind Chill, Extreme Cold/Wind Chill, Freezing Fog, Frost/Freeze, and Heavy Snow. The 2021 Texas winter storm (Winter Storm Uri, February 2021) appears across multiple event types in the Storm Events Database: Winter Storm records for the freezing precipitation, Extreme Cold/Wind Chill records for the temperature anomaly, and Winter Weather records for affected counties that did not meet the Winter Storm threshold. Total estimated damage from Uri exceeds $200 billion in the records, though NCEI's estimate methodology carries wide uncertainty at that scale.

Hydrological events include Flash Flood, Flood, Coastal Flood, Lakeshore Flood, Debris Flow, Seiche, Tsunami, and Rip Current. Flash Flood is among the deadliest event types in the database on a per-event basis. The distinction between Flash Flood and Flood in the database follows NWS issuance criteria: Flash Flood records correspond to events for which a Flash Flood Warning was issued (rapid-onset flooding within 6 hours of causative rainfall); Flood records correspond to slower-onset riverine flooding associated with Flood Warnings or River Flood Statements.

Heat and drought events include Heat, Excessive Heat, Drought, and Wildfire. Excessive Heat is the historically deadliest weather event type in the continental United States by total direct fatalities over the full database period, exceeding tornado and flood deaths combined in most decades. The 1995 Chicago heat wave, which killed an estimated 739 people in five days, is captured across multiple Cook County records in the database.

Other event types include Dense Fog, Dense Smoke, Volcanic Ash, High Surf, Marine Dense Fog, Marine Strong Wind, Marine Thunderstorm Wind, Marine High Wind, Marine Hail, Marine Lightning, Sneaker Wave, and Astronomical Low Tide. Marine event types are recorded for events affecting coastal and inland waterways and typically carry lower property damage estimates because the exposed population and infrastructure are smaller than on land.
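For aggregation it can be convenient to map each EVENT_TYPE value to one of the loose groups used in this section. The sketch below uses this article's informal grouping, which is not an official NCEI field, and falls back to "Other" for anything not listed.

```python
from collections import Counter

# Informal grouping taken from this article, not an NCEI classification.
GROUPS = {
    "Convective": ["Tornado", "Thunderstorm Wind", "Hail", "Lightning",
                   "Funnel Cloud", "Waterspout", "Dust Devil", "Dust Storm"],
    "Tropical": ["Hurricane (Typhoon)", "Tropical Storm",
                 "Tropical Depression", "Storm Surge/Tide"],
    "Winter": ["Winter Storm", "Winter Weather", "Ice Storm", "Sleet",
               "Blizzard", "Lake-Effect Snow", "Cold/Wind Chill",
               "Extreme Cold/Wind Chill", "Freezing Fog", "Frost/Freeze",
               "Heavy Snow"],
    "Hydrological": ["Flash Flood", "Flood", "Coastal Flood", "Lakeshore Flood",
                     "Debris Flow", "Seiche", "Tsunami", "Rip Current"],
    "Heat and drought": ["Heat", "Excessive Heat", "Drought", "Wildfire"],
}

# Case-insensitive lookup table from event type to group name.
TYPE_TO_GROUP = {t.upper(): g for g, types in GROUPS.items() for t in types}


def group_of(event_type: str) -> str:
    """Return the article's group for an event type, or 'Other' if unlisted."""
    return TYPE_TO_GROUP.get(event_type.strip().upper(), "Other")


sample = ["Tornado", "Hail", "Flash Flood", "Excessive Heat", "Dense Fog"]
counts = Counter(group_of(t) for t in sample)
print(counts)
```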

Selected Storm Event types with approximate national averages (2010–2024)

Event Type          Avg Events/yr   Avg Deaths/yr   Avg Prop Dmg/yr
Thunderstorm Wind   ~22,000         ~55             ~$2.5B
Hail                ~10,000         ~1              ~$4.0B
Flash Flood         ~4,500          ~80             ~$1.8B
Tornado             ~1,400          ~70             ~$2.2B
Excessive Heat      ~700            ~130            ~$0.1B
Winter Storm        ~1,200          ~30             ~$0.8B
Hurricane           ~50             ~30             ~$12.0B
Rip Current         ~800            ~65             ~$0.0B
Lightning           ~3,500          ~25             ~$0.2B
High Wind           ~4,000          ~25             ~$0.4B
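Raw counts hide how lethal an event type is per occurrence. As a worked example, the approximate averages in the table above can be turned into deaths per 1,000 recorded events; the figures are the table's rounded values, so the results are only indicative.

```python
# Approximate annual averages copied from the table above: (events, deaths).
AVERAGES = {
    "Thunderstorm Wind": (22_000, 55),
    "Hail": (10_000, 1),
    "Flash Flood": (4_500, 80),
    "Tornado": (1_400, 70),
    "Excessive Heat": (700, 130),
    "Winter Storm": (1_200, 30),
    "Hurricane": (50, 30),
    "Rip Current": (800, 65),
    "Lightning": (3_500, 25),
    "High Wind": (4_000, 25),
}


def deaths_per_1000_events(events: int, deaths: int) -> float:
    """Deaths per 1,000 recorded events."""
    return 1000.0 * deaths / events


rates = {name: deaths_per_1000_events(e, d) for name, (e, d) in AVERAGES.items()}
ranked = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
for name, rate in ranked:
    print(f"{name:<18} {rate:7.1f} deaths per 1,000 events")
```

On these numbers a Thunderstorm Wind record is far less deadly per event than a Rip Current or Excessive Heat record, even though it is by far the most common type.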

Data collection and quality

The pipeline from weather event to published database record runs through the National Weather Service local office network. The United States is divided into 122 NWS Weather Forecast Offices (WFOs), each responsible for issuing watches, warnings, and advisories for its geographic area of responsibility (CWA — County Warning Area). After a significant weather event, the WFO meteorologist responsible for the affected area enters a storm report into NCEI's Storm Data preparation system. The meteorologist fills in structured fields for event type, location (county, latitude/longitude begin and end points for linear events like tornadoes), event magnitude (wind speed in knots, hail diameter in inches, tornado F/EF rating), dates, times, and damage estimates, plus a free-text narrative describing the event in detail.

The free-text narrative field is one of the most distinctive aspects of the Storm Events Database. Unlike purely machine-generated sensor data, each record contains a paragraph or more of prose written by a trained meteorologist who reviewed damage survey photos, newspaper accounts, emergency management reports, and insurance adjusters' estimates, and who in some cases personally conducted post-event damage surveys. For major tornado events, narratives often extend to several hundred words. The Joplin, Missouri tornado of May 22, 2011, the deadliest single tornado in the US since modern record-keeping began, has narratives spanning multiple records with extensive documentation of the EF5 damage path, the 161 confirmed deaths, and the mile-wide track through the city.

WFOs submit storm reports monthly, and NCEI publishes updated data files with a lag of approximately 60 to 90 days. The most recent month's data is sometimes preliminary and subject to revision; NCEI periodically issues corrected files when damage estimates are revised after insurance settlements or FEMA preliminary damage assessments are completed. Users performing trend analysis should anchor on the annual CSV files rather than monthly updates, so that an original record and its revision are not both counted.
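If several versions of the same year's file end up in one table, revised records appear more than once. One way to guard against that, sketched here with a hypothetical `_created` column holding the file's creation date, is to keep only the newest row per EVENT_ID.

```python
import pandas as pd

# Toy rows from two releases of the same year; `_created` is a hypothetical
# column holding the YYYYMMDD creation date taken from each filename.
rows = pd.DataFrame({
    "EVENT_ID": ["100", "100", "101"],
    "DAMAGE_PROPERTY": ["1M", "1.5M", "20K"],
    "_created": ["20230415", "20230716", "20230415"],
})

# YYYYMMDD strings sort chronologically, so the last row per event is the newest.
latest = (
    rows.sort_values("_created")
    .drop_duplicates("EVENT_ID", keep="last")
    .sort_values("EVENT_ID")
    .reset_index(drop=True)
)
print(latest)
```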

Data quality varies systematically across event types and geographic regions. Events in densely populated areas with active local emergency management offices tend to have more complete records than events in sparsely populated rural areas. Damage estimates in particular carry wide uncertainty: the NWS damage estimation guidance instructs meteorologists to distinguish property damage from content loss and to use visible structural damage as the primary basis for estimates, but the precision of these estimates is limited. NCEI and NOAA explicitly state that Storm Events damage figures are order-of-magnitude estimates, not actuarial values. The billion-dollar disaster estimates published separately by NOAA's NCEI team use a distinct methodology (described below) that is not directly derived from the per-event Storm Events records.

Major tornado records

The tornado record in the Storm Events Database constitutes the most authoritative long-run inventory of US tornado occurrence, intensity, and impacts. Combined with the NOAA Storm Prediction Center's complementary tornado database (which extends back to 1950 with path-level geographic data), the Storm Events tornado records support climatological research, engineering loss modeling, and emergency preparedness planning across the tornado-prone regions of the central and southeastern United States.

The Enhanced Fujita Scale (EF Scale) replaced the original Fujita Scale in February 2007 after a multi-year review led by the Wind Science and Engineering Center at Texas Tech University. The revised scale recalibrated damage indicators and degree-of-damage thresholds to better reflect the relationship between wind speed and structural failure. The Storm Events Database records EF ratings for all tornadoes since 2007 and F ratings for tornadoes before 2007; both appear in the TOR_F_SCALE field of the CSV files. EF0 and EF1 tornadoes (65–110 mph estimated winds) account for roughly 80% of all recorded events; EF4 and EF5 tornadoes (166+ mph) account for less than 1% of events but a substantial majority of fatalities.
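Because TOR_F_SCALE mixes F and EF ratings, it helps to normalize the field before filtering by intensity. The sketch below assumes values shaped like "F3" or "EF3" and treats anything else (blank cells, or an unrated marker such as "EFU" if present) as unknown; the helper names are hypothetical.

```python
import re


def parse_tor_scale(value):
    """Return (scale, rating) such as ('EF', 3), or None if unrated or malformed."""
    if value is None:
        return None
    m = re.fullmatch(r"(EF|F)([0-5])", str(value).strip().upper())
    if not m:
        return None
    return m.group(1), int(m.group(2))


def is_violent(value) -> bool:
    """True for F4/F5 and EF4/EF5 ratings."""
    parsed = parse_tor_scale(value)
    return parsed is not None and parsed[1] >= 4


for raw in ["EF3", "F5", "ef4", "EFU", ""]:
    print(repr(raw), parse_tor_scale(raw), is_violent(raw))
```

F and EF ratings are not strictly equivalent in wind speed terms, so the scale label is kept alongside the number rather than discarded.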

Several tornado events in the Storm Events record stand as benchmarks for catastrophic loss:

The April 2011 Super Outbreak (April 25–28, 2011) is the largest tornado outbreak in recorded US history by event count, confirmed fatalities, and number of significant (EF2+) tornadoes. Over the four-day period, NOAA confirmed 362 tornadoes across 21 states, including 11 rated EF4 or EF5. The outbreak killed 324 people and caused an estimated $11 billion in property damage. Alabama bore the greatest losses: Tuscaloosa and Birmingham were struck by multiple long-track tornadoes on April 27, including one EF4 that killed 64 people along a path of more than 80 miles. The Storm Events records for the April 2011 outbreak span hundreds of individual county-level entries across Alabama, Mississippi, Tennessee, Georgia, Arkansas, Virginia, and six other states.

Joplin, Missouri (May 22, 2011) produced a single EF5 tornado rated as the deadliest US tornado since the 1947 Glazier–Higgins–Woodward outbreak. The Joplin tornado tracked 22.1 miles through Jasper and Newton Counties, reached a maximum width of approximately one mile, and killed 161 people directly. More than 7,000 homes and 500 businesses were destroyed or severely damaged; total estimated damage exceeded $2.8 billion. The storm's path through a densely built residential and commercial area of approximately 50,000 people, combined with its EF5 intensity, produced a per-unit-area damage concentration that exceeded virtually all other documented US tornado events.

Moore, Oklahoma (May 20, 2013) was struck by a violent tornado for the third time since 1999, following the F5 Bridge Creek–Moore tornado of 1999 and the F4 Moore tornado of 2003. The 2013 tornado, rated EF5, tracked approximately 14 miles through Moore and south Oklahoma City, was up to 1.3 miles wide, and killed 24 people including 7 children at Plaza Towers Elementary School. Total estimated damage was approximately $2 billion. The recurrent catastrophic tornado strikes on the same Oklahoma City suburban community have made Moore a paradigmatic case study in tornado-resilient building code policy; Oklahoma subsequently enacted storm shelter requirements for new school construction.

The December 2021 Midwest outbreak (December 10–11, 2021) was the most significant December tornado outbreak in recorded US history, producing tornadoes across eight states during the typically quiet winter severe weather minimum. The Quad States tornado, tracking approximately 165 to 200 miles across Missouri, Tennessee, Arkansas, and Kentucky, was one of the longest-track tornadoes in US history. The outbreak killed at least 89 people, 57 of them along the path of this one tornado, which destroyed the Mayfield Consumer Products candle factory in Mayfield, Kentucky. The December timing of a violent, long-track tornado challenged established mental models of US tornado risk geography and seasonality.

Hurricane and flood records

Atlantic and Gulf Coast hurricanes generate some of the largest property damage entries in the Storm Events Database, though the full economic cost of major hurricanes is generally captured across multiple event types: the Hurricane or Tropical Storm record covers wind damage; separate Storm Surge/Tide records cover inundation damage in coastal counties; Flash Flood or Flood records capture inland rainfall flooding; and Tornado records document the embedded mesovortices that most landfalling hurricanes produce.
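Since one hurricane is spread across several event types, a storm-level total has to be rebuilt by grouping records. A minimal sketch, assuming the details file's EPISODE_ID column and a numeric damage column named `prop_dmg_usd` (a hypothetical name for the parsed DAMAGE_PROPERTY value); the figures are invented.

```python
import pandas as pd

# Toy records: one tropical episode split across three event types, plus hail.
events = pd.DataFrame({
    "EPISODE_ID": ["900", "900", "900", "901"],
    "EVENT_TYPE": ["Hurricane", "Storm Surge/Tide", "Flash Flood", "Hail"],
    "prop_dmg_usd": [2.0e9, 5.0e9, 1.0e9, 3.0e6],
})

by_episode = events.groupby("EPISODE_ID").agg(
    event_types=("EVENT_TYPE", lambda s: ", ".join(sorted(set(s)))),
    prop_dmg_usd=("prop_dmg_usd", "sum"),
)
print(by_episode)
```

A single storm may still span more than one episode, for example when several forecast offices each enter their own, so episode totals are best treated as a lower-level building block rather than a final storm total.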

Hurricane Katrina (August 2005) is represented in the Storm Events Database primarily through the storm surge records for Louisiana, Mississippi, and Alabama, which carry property damage estimates totaling more than $80 billion across the affected counties. The Storm Events figures for Katrina are necessarily approximations: the catastrophic failure of the New Orleans levee system flooded approximately 80% of the city, destroyed or severely damaged more than 200,000 homes, and produced losses that insurance adjusters and FEMA were still assessing years after the event. The database records 1,833 direct deaths attributed to Katrina, making it the deadliest US weather event in the Storm Events record for the post-1996 period.

Hurricane Harvey (August 2017) holds the US record for rainfall from a tropical cyclone: a gauge near Nederland, Texas recorded 60.58 inches of rainfall as Harvey stalled over southeast Texas for four days. Storm Events records for Harvey span hundreds of entries across Texas and Louisiana, with Flash Flood and Flood records in Harris, Fort Bend, Brazoria, and surrounding counties carrying aggregate property damage estimates exceeding $125 billion. The rainfall totals and damage extents in the Harvey records are extraordinary even by the standards of major US hurricane events; Harvey alone accounted for more flood insurance claims through the National Flood Insurance Program than any event since Katrina.

Hurricane Ian (September 2022) made landfall near Fort Myers, Florida as a Category 4 hurricane with maximum sustained winds of 150 mph on September 28, 2022. The storm surge along the Lee County coastline reached 12 to 18 feet in some locations, inundating barrier islands and the Fort Myers Beach community. Storm Events records for Ian show property damage estimates exceeding $100 billion across Florida and South Carolina (where Ian made a second landfall). Ian killed at least 156 people directly, making it the deadliest Florida hurricane since the 1935 Labor Day Hurricane. The Storm Events damage figures for Ian reflect the challenge of post-event estimation: early NCEI estimates were subsequently revised significantly as insurance claims data became available.

Inland flooding events — not associated with named tropical storms — also generate some of the largest per-event damage records. The 1993 Great Flood of the Mississippi and Missouri Rivers, the 2010 Nashville flood, the 2013 Colorado Front Range floods, the 2016 Louisiana floods (which killed 13 people and flooded 60,000 homes in a rain event that did not receive a name or a FEMA Major Disaster Declaration for several weeks), and the 2022 Eastern Kentucky floods all appear in the Storm Events record with property damage estimates in the billions of dollars and significant direct fatality counts.

Damage estimation methodology

The damage figures in the Storm Events Database are among the most cited and most frequently misunderstood numbers in federal weather data. Understanding their provenance and limitations is essential for any analysis that uses them.

NCEI instructs NWS WFO meteorologists to estimate property damage as the total replacement or repair cost of structures damaged or destroyed by the event, exclusive of contents. Crop damage is estimated separately as the total loss of crop value, based on reports from state departments of agriculture, USDA Farm Service Agency loss assessments, or agricultural extension services. Both figures are supposed to represent gross physical replacement cost at the time of the event, not insurance payments, not FEMA obligations, and not secondary economic losses from business interruption.

In practice, meteorologists rely on whatever damage information is available at the time of report submission — typically 30 to 90 days after the event. Sources include news media reports, county emergency management damage assessments, Red Cross reports, FEMA preliminary damage assessments, and in some cases direct contact with local governments. For small to moderate events, where the total damage is under $1 million, the estimates are often rough approximations based on visible structural damage and local construction cost benchmarks. For major events, where total damage exceeds $1 billion, the uncertainty in the NCEI estimates can be substantial — early post-event figures are frequently revised by 20% to 50% in subsequent monthly updates as more complete assessments become available.

The Storm Events CSV files encode property and crop damage as string values using an abbreviated notation: amounts less than $1,000 are given in full; amounts from $1,000 to $999,999 are given as a number followed by “K” (thousands); amounts from $1 million to $999 million are given as a number followed by “M”; amounts of $1 billion or more are given as a number followed by “B”. Because of this encoding, the DAMAGE_PROPERTY and DAMAGE_CROPS fields must be parsed with a custom function before any quantitative damage analysis; the Python workflow below handles this transformation.

Billion-dollar disasters

The NOAA NCEI Billion-Dollar Weather and Climate Disasters database is a related but distinct product from the per-event Storm Events records. Published annually since 1980, the Billion-Dollar database tracks US weather and climate events that each caused at least $1 billion in total losses (property damage plus crop damage, adjusted to the current year using the Consumer Price Index). As of 2024, the database contains more than 380 separate billion-dollar events since 1980.
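The inflation adjustment mentioned above is a simple ratio of price index values. The sketch below shows the arithmetic only; the index numbers are hypothetical placeholders, not published CPI figures, and the function name is illustrative.

```python
def adjust_to_target_year(nominal_usd: float,
                          cpi_event_year: float,
                          cpi_target_year: float) -> float:
    """Scale a nominal loss by the ratio of target-year to event-year CPI."""
    return nominal_usd * (cpi_target_year / cpi_event_year)


# Hypothetical index values chosen for easy arithmetic, NOT real CPI data.
CPI = {2005: 195.0, 2024: 312.0}

adjusted = adjust_to_target_year(100e9, CPI[2005], CPI[2024])
print(f"${adjusted / 1e9:.0f}B in target-year dollars")
```

With these placeholder values the ratio is 1.6, so a $100 billion nominal loss becomes $160 billion in target-year dollars.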

The methodology for the Billion-Dollar database is more rigorous and more consistent than the per-county Storm Events damage estimates. NCEI analysts compile damage estimates from a defined set of authoritative sources: Property Claim Services (PCS) industry-insured loss data, the reinsurance firm Munich Re's NatCatSERVICE database, Swiss Re's sigma database, FEMA Public Assistance obligation data, USDA crop loss data from the Risk Management Agency and FSA, and state emergency management agency reports. The insured loss is then converted to a total economic loss estimate using historical insured-to-total-loss multipliers derived from post-event industry studies.

The Billion-Dollar database distinguishes seven event categories: Tropical Cyclone, Flooding, Severe Storm (tornado, hail, and non-tropical wind), Winter Storm, Wildfire, Drought, and Freeze. Across the 1980–2024 period, the cumulative adjusted losses exceed $3 trillion. The annual frequency of billion-dollar events has increased from an average of approximately 3 events per year in the 1980s to more than 20 events per year in the 2020s, reflecting a combination of increasing event intensity, greater exposure in high-risk areas due to population and asset growth, and rising construction costs that inflate nominal damage totals.

The costliest US weather events in the Billion-Dollar database by total estimated damage (2024 dollars) are dominated by Gulf and Atlantic Coast hurricanes. The table below lists ten high-cost events, sorted by estimated damage; it is a selection rather than a strict ranking:

Ten high-cost US weather events (NOAA Billion-Dollar Database, 2024 dollars)

Event                          Year   Type               Est. Damage
Hurricane Katrina              2005   Tropical Cyclone   ~$200B
Hurricane Harvey               2017   Tropical Cyclone   ~$148B
Hurricane Ian                  2022   Tropical Cyclone   ~$115B
Hurricane Maria                2017   Tropical Cyclone   ~$107B
Hurricane Sandy                2012   Tropical Cyclone   ~$80B
Hurricane Irma                 2017   Tropical Cyclone   ~$59B
Winter Storm Uri (Texas)       2021   Winter Storm       ~$24B
Hurricane Florence             2018   Tropical Cyclone   ~$24B
April 2011 Tornado Outbreak    2011   Severe Storm       ~$14B
Midwest Derecho                2020   Severe Storm       ~$12B

The Billion-Dollar database is published at https://www.ncei.noaa.gov/access/billions/ and is available as CSV, JSON, and interactive map formats. It is updated at least annually and is the primary reference for policy discussions about whether US extreme weather losses are increasing over time.

Data access and file format

The Storm Events Database is available as compressed CSV files through the NCEI FTP mirror at:

https://www.ncei.noaa.gov/pub/data/swdi/stormevents/csvfiles/

Three file types are available for each year, each as a separate gzip-compressed CSV:

  • StormEvents_details-ftp_v1.0_dYYYY_cYYYYMMDD.csv.gz — the primary event detail file. One row per event-county occurrence. Contains all structured fields plus the full narrative text. This is the file used for the Python analysis below.
  • StormEvents_fatalities-ftp_v1.0_dYYYY_cYYYYMMDD.csv.gz — the fatalities file. One row per individual fatality, linked to the detail file via EVENT_ID. Contains age, sex, fatality type (direct or indirect), fatality location (in a vehicle, in water, permanent home, mobile home, outdoors, etc.), and a fatality narrative. Critical for demographic analysis of storm deaths.
  • StormEvents_locations-ftp_v1.0_dYYYY_cYYYYMMDD.csv.gz — the locations file. One row per geographic point associated with an event. For tornadoes, this includes the start and end latitude/longitude of each segment of the tornado path. For point events, it includes a single coordinate. Useful for GIS mapping.
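The three files are designed to be joined on EVENT_ID. A minimal sketch of linking fatalities to event details, using toy rows; the column names FATALITY_TYPE and FATALITY_LOCATION are assumptions about the fatalities file header and should be checked against a downloaded file.

```python
import pandas as pd

# Toy rows standing in for the details and fatalities files.
details = pd.DataFrame({
    "EVENT_ID": ["1", "2"],
    "EVENT_TYPE": ["Tornado", "Flash Flood"],
})
fatalities = pd.DataFrame({
    "EVENT_ID": ["1", "1", "2"],
    "FATALITY_TYPE": ["D", "D", "I"],  # assumed coding: direct / indirect
    "FATALITY_LOCATION": ["Mobile Home", "Permanent Home", "Vehicle"],
})

# Many fatality rows map to one event row; validate guards against duplicates.
merged = fatalities.merge(details, on="EVENT_ID", how="left",
                          validate="many_to_one")

by_type_location = (
    merged.groupby(["EVENT_TYPE", "FATALITY_LOCATION"])
    .size()
    .rename("deaths")
    .reset_index()
)
print(by_type_location)
```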

The filename suffix _cYYYYMMDD is the creation date of the file, not the event date. NCEI occasionally re-releases corrected files with updated creation dates; when downloading programmatically, the latest available file for each year should be selected. The index page lists all available files in chronological order and can be scraped with a simple regex to identify the current file for each year.

Key fields in the details CSV:

  • BEGIN_YEARMONTH, BEGIN_DAY, BEGIN_TIME — event start date and local time (HHMM format, local standard time)
  • END_YEARMONTH, END_DAY, END_TIME — event end date and time
  • STATE, STATE_FIPS, CZ_NAME, CZ_FIPS — state name, state FIPS code, county or zone name, county/zone FIPS
  • CZ_TYPE — “C” for county, “Z” for NWS forecast zone, “M” for marine zone
  • EVENT_TYPE — one of the 48 official event types
  • SOURCE — the source of the storm report (trained spotter, law enforcement, NWS damage survey, broadcast media, etc.)
  • MAGNITUDE, MAGNITUDE_TYPE — numeric magnitude (wind speed in knots or hail diameter in inches) and, for wind events, how it was obtained (EG = estimated gust, MG = measured gust, ES = estimated sustained, MS = measured sustained)
  • DEATHS_DIRECT, DEATHS_INDIRECT — direct fatalities caused by the event mechanism; indirect fatalities caused by unsafe conditions resulting from the event
  • INJURIES_DIRECT, INJURIES_INDIRECT — injuries categorized similarly
  • DAMAGE_PROPERTY, DAMAGE_CROPS — damage strings in abbreviated notation (“5K”, “1.2M”, “500B”)
  • TOR_F_SCALE, TOR_LENGTH, TOR_WIDTH — tornado-specific fields (EF rating, path length in miles, path width in yards)
  • BEGIN_LAT, BEGIN_LON, END_LAT, END_LON — coordinates for the event or its start and end points
  • EPISODE_NARRATIVE — narrative for the weather episode (may span multiple events); EVENT_NARRATIVE — narrative specific to this event-county record
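The begin and end times are split across three fields, so a single timestamp has to be assembled by hand. A small sketch, assuming BEGIN_YEARMONTH is formatted YYYYMM and BEGIN_TIME is HHMM with leading zeros possibly lost when read; the function name is hypothetical.

```python
from datetime import datetime


def parse_begin(yearmonth: str, day: str, hhmm: str) -> datetime:
    """Combine BEGIN_YEARMONTH, BEGIN_DAY, and BEGIN_TIME into a naive datetime."""
    ym = str(yearmonth).strip()
    t = str(hhmm).strip().zfill(4)  # restore leading zeros, e.g. '30' -> '0030'
    return datetime(int(ym[:4]), int(ym[4:6]), int(day), int(t[:2]), int(t[2:]))


ts = parse_begin("201105", "22", "1734")
print(ts.isoformat())
```

The result is deliberately timezone-naive: it represents the local time as recorded, and attaching a zone would need the record's own time zone field.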

Python: downloading and analyzing storm events

The following script downloads annual Storm Events detail CSV files for a range of years, parses the abbreviated damage notation, and produces five analyses: annual event volume and damage trends, the 15 deadliest event types, property damage by state, EF2+ tornado counts by state, and event types with cumulative property damage exceeding $1 billion. Requirements: requests and pandas. Each file is fetched in chunks, then decompressed and parsed entirely in memory, so memory use grows with the number of years loaded.

import requests
import pandas as pd
import gzip
import io
import re
from pathlib import Path

# ---------------------------------------------------------------------------
# NOAA Storm Events Database — CSV Download and Analysis
# Base URL: https://www.ncei.noaa.gov/pub/data/swdi/stormevents/csvfiles/
# Files: StormEvents_details-ftp_v1.0_dYYYY_cYYYYMMDD.csv.gz
# No API key required. Files are updated monthly.
# ---------------------------------------------------------------------------

BASE_URL = "https://www.ncei.noaa.gov/pub/data/swdi/stormevents/csvfiles/"
ANALYSIS_YEARS = list(range(2010, 2025))  # 2010 through 2024


def list_available_files() -> list[dict]:
    """Scrape the index page and keep the newest details file for each year."""
    resp = requests.get(BASE_URL, timeout=30)
    resp.raise_for_status()
    # Capture the filename, the data year (dYYYY), and the creation date (cYYYYMMDD)
    pattern = r'(StormEvents_details-ftp_v1\.0_d(\d{4})_c(\d{8})\.csv\.gz)'
    latest: dict[int, tuple[str, str]] = {}
    for filename, year, created in re.findall(pattern, resp.text):
        # YYYYMMDD strings sort chronologically, so a plain comparison works
        if int(year) not in latest or created > latest[int(year)][0]:
            latest[int(year)] = (created, filename)
    return [
        {"filename": filename, "year": year}
        for year, (_, filename) in sorted(latest.items())
    ]


def fetch_year(filename: str) -> pd.DataFrame:
    """Download and parse a single year's storm events CSV.gz file."""
    url = BASE_URL + filename
    resp = requests.get(url, timeout=120, stream=True)
    resp.raise_for_status()
    content = b"".join(resp.iter_content(chunk_size=65536))
    with gzip.open(io.BytesIO(content)) as f:
        df = pd.read_csv(f, dtype=str, low_memory=False)
    return df


# ---------------------------------------------------------------------------
# Step 1: Discover files and download the requested years
# ---------------------------------------------------------------------------
print("Listing available Storm Events files...")
available = {f["year"]: f["filename"] for f in list_available_files()}

frames = []
for year in ANALYSIS_YEARS:
    if year not in available:
        print(f"  {year}: not available, skipping")
        continue
    print(f"  Downloading {available[year]} ...")
    try:
        df_year = fetch_year(available[year])
        df_year["_year"] = year
        frames.append(df_year)
        print(f"    -> {len(df_year):,} events")
    except Exception as e:
        print(f"    ERROR: {e}")

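# pd.concat raises on an empty list, so stop early if nothing was downloaded
if not frames:
    raise SystemExit("No Storm Events files could be downloaded; nothing to analyze.")
df = pd.concat(frames, ignore_index=True)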
print(f"\nTotal events loaded: {len(df):,} across {len(frames)} years\n")

# ---------------------------------------------------------------------------
# Step 2: Standardize damage columns
# ---------------------------------------------------------------------------
# DAMAGE_PROPERTY and DAMAGE_CROPS are strings like "5K", "1.2M", "0"
def parse_damage(s: str) -> float:
    """Convert NOAA damage strings ('5K', '2.5M', '1B') to float dollars."""
    if pd.isna(s) or str(s).strip() in ("", "0"):
        return 0.0
    s = str(s).strip().upper()
    multipliers = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
    if s[-1] in multipliers:
        try:
            return float(s[:-1]) * multipliers[s[-1]]
        except ValueError:
            return 0.0
    try:
        return float(s)
    except ValueError:
        return 0.0

df["prop_dmg_usd"] = df["DAMAGE_PROPERTY"].apply(parse_damage)
df["crop_dmg_usd"] = df["DAMAGE_CROPS"].apply(parse_damage)
df["total_dmg_usd"] = df["prop_dmg_usd"] + df["crop_dmg_usd"]

# Numeric deaths and injuries
df["DEATHS_DIRECT"] = pd.to_numeric(df.get("DEATHS_DIRECT", pd.Series(dtype=float)), errors="coerce").fillna(0)
df["DEATHS_INDIRECT"] = pd.to_numeric(df.get("DEATHS_INDIRECT", pd.Series(dtype=float)), errors="coerce").fillna(0)
df["INJURIES_DIRECT"] = pd.to_numeric(df.get("INJURIES_DIRECT", pd.Series(dtype=float)), errors="coerce").fillna(0)
df["total_deaths"] = df["DEATHS_DIRECT"] + df["DEATHS_INDIRECT"]

# ---------------------------------------------------------------------------
# Step 3: Annual event counts
# ---------------------------------------------------------------------------
annual = (
    df.groupby("_year")
    .agg(
        events=("EVENT_ID", "count"),
        deaths=("total_deaths", "sum"),
        injuries=("INJURIES_DIRECT", "sum"),
        prop_dmg=("prop_dmg_usd", "sum"),
    )
    .reset_index()
    .sort_values("_year")
)

print("=== Annual Storm Events Summary (2010-2024) ===")
print(f"  {'Year':<6}  {'Events':>8}  {'Deaths':>8}  {'Injuries':>10}  {'Prop Dmg (B)':>14}")
print("  " + "-" * 54)
for _, row in annual.iterrows():
    print(
        f"  {int(row['_year']):<6}  {int(row['events']):>8,}  "
        f"{int(row['deaths']):>8,}  {int(row['injuries']):>10,}  "
        f"${row['prop_dmg'] / 1e9:>13.2f}"
    )

# ---------------------------------------------------------------------------
# Step 4: Top 15 deadliest event types (2010-2024)
# ---------------------------------------------------------------------------
event_type_summary = (
    df.groupby("EVENT_TYPE")
    .agg(
        events=("EVENT_ID", "count"),
        deaths=("total_deaths", "sum"),
        injuries=("INJURIES_DIRECT", "sum"),
        prop_dmg=("prop_dmg_usd", "sum"),
    )
    .reset_index()
    .sort_values("deaths", ascending=False)
    .head(15)
)

print("\n=== Top 15 Deadliest Event Types (2010-2024) ===")
print(f"  {'Event Type':<30}  {'Deaths':>8}  {'Injuries':>10}  {'Events':>8}  {'Prop Dmg (M)':>14}")
print("  " + "-" * 80)
for _, row in event_type_summary.iterrows():
    print(
        f"  {row['EVENT_TYPE'][:30]:<30}  {int(row['deaths']):>8,}  "
        f"{int(row['injuries']):>10,}  {int(row['events']):>8,}  "
        f"${row['prop_dmg'] / 1e6:>13.1f}"
    )

# ---------------------------------------------------------------------------
# Step 5: Total property damage by state (top 15)
# ---------------------------------------------------------------------------
state_dmg = (
    df.groupby("STATE")
    .agg(
        events=("EVENT_ID", "count"),
        prop_dmg=("prop_dmg_usd", "sum"),
        deaths=("total_deaths", "sum"),
    )
    .reset_index()
    .sort_values("prop_dmg", ascending=False)
    .head(15)
)

print("\n=== Top 15 States by Property Damage (2010-2024) ===")
print(f"  {'State':<22}  {'Prop Dmg (B)':>14}  {'Events':>8}  {'Deaths':>8}")
print("  " + "-" * 60)
for _, row in state_dmg.iterrows():
    print(
        f"  {row['STATE'][:22]:<22}  "
        f"${row['prop_dmg'] / 1e9:>13.2f}  "
        f"{int(row['events']):>8,}  "
        f"{int(row['deaths']):>8,}"
    )

# ---------------------------------------------------------------------------
# Step 6: Tornado analysis (all EF2+ tornadoes)
# ---------------------------------------------------------------------------
tornadoes = df[df["EVENT_TYPE"].str.upper() == "TORNADO"].copy()
tornadoes["TOR_F_SCALE"] = tornadoes.get("TOR_F_SCALE", pd.Series(dtype=str))

ef2_plus = tornadoes[
    tornadoes["TOR_F_SCALE"].str.match(r"^(EF|F)[2-5]", na=False)
].copy()

ef2_plus_by_state = (
    ef2_plus.groupby("STATE")
    .agg(
        count=("EVENT_ID", "count"),
        deaths=("total_deaths", "sum"),
        prop_dmg=("prop_dmg_usd", "sum"),
    )
    .reset_index()
    .sort_values("count", ascending=False)
    .head(10)
)

print("\n=== EF2+ Tornadoes by State (2010-2024) ===")
print(f"  {'State':<22}  {'Count':>8}  {'Deaths':>8}  {'Prop Dmg (M)':>14}")
print("  " + "-" * 58)
for _, row in ef2_plus_by_state.iterrows():
    print(
        f"  {row['STATE'][:22]:<22}  {int(row['count']):>8,}  "
        f"{int(row['deaths']):>8,}  ${row['prop_dmg']/1e6:>13.1f}"
    )

# ---------------------------------------------------------------------------
# Step 7: Billion-dollar event types (aggregate damage >= $1B over period)
# ---------------------------------------------------------------------------
billion_dollar_types = (
    df.groupby("EVENT_TYPE")["prop_dmg_usd"]
    .sum()
    .reset_index()
    .query("prop_dmg_usd >= 1e9")
    .sort_values("prop_dmg_usd", ascending=False)
)

print("\n=== Event Types with Cumulative Property Damage >= $1B (2010-2024) ===")
print(f"  {'Event Type':<30}  {'Total Prop Dmg (B)':>20}")
print("  " + "-" * 54)
for _, row in billion_dollar_types.iterrows():
    print(f"  {row['EVENT_TYPE'][:30]:<30}  ${row['prop_dmg_usd']/1e9:>19.2f}")

print("\nAnalysis complete.")

The parse_damage function handles the three-multiplier notation (“K”, “M”, “B”) that NCEI uses throughout the raw CSV files. Because NCEI occasionally re-releases corrected files under the same year with an updated creation date suffix, the list_available_files function deduplicates by year and always selects the most recently published file. For production pipelines, persist the downloaded files locally and check the _cYYYYMMDD suffix against the index page to detect updates without re-downloading unchanged years.
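The suffix check for a local cache can be sketched roughly as follows. This is a minimal illustration, not the script's own logic: the helper names latest_by_year and years_needing_download are hypothetical, and the filename pattern is an assumption based on the usual NCEI naming (for example StormEvents_details-ftp_v1.0_d2010_c20220425.csv.gz).

```python
import re

# Assumed NCEI naming pattern, e.g.
# StormEvents_details-ftp_v1.0_d2010_c20220425.csv.gz
FILE_RE = re.compile(r"StormEvents_details-ftp_v1\.0_d(\d{4})_c(\d{8})\.csv\.gz$")


def latest_by_year(filenames):
    """Map each data year to (creation_date, filename) for its newest file."""
    latest = {}
    for name in filenames:
        m = FILE_RE.search(name)
        if not m:
            continue  # skip names that do not follow the assumed pattern
        year, created = int(m.group(1)), m.group(2)
        # YYYYMMDD strings sort chronologically, so string comparison works
        if year not in latest or created > latest[year][0]:
            latest[year] = (created, name)
    return latest


def years_needing_download(remote_files, local_files):
    """Return years that are missing locally or have a newer remote file."""
    remote = latest_by_year(remote_files)
    local = latest_by_year(local_files)
    return sorted(
        year
        for year, (created, _) in remote.items()
        if year not in local or local[year][0] < created
    )


# Illustrative filenames only; the dates are made up.
remote_files = [
    "StormEvents_details-ftp_v1.0_d2010_c20210101.csv.gz",
    "StormEvents_details-ftp_v1.0_d2010_c20220425.csv.gz",
    "StormEvents_details-ftp_v1.0_d2011_c20230417.csv.gz",
    "index.html",
]
local_files = [
    "StormEvents_details-ftp_v1.0_d2010_c20220425.csv.gz",
    "StormEvents_details-ftp_v1.0_d2011_c20220101.csv.gz",
]
stale_years = years_needing_download(remote_files, local_files)
print(stale_years)
```

Comparing the creation dates rather than full filenames means a year is re-downloaded only when the index page lists a strictly newer file than the one already on disk.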

Research applications and limitations

The Storm Events Database is used across a wide range of research and applied contexts. Academic climate scientists use the long-run tornado and hail records to investigate whether the frequency or intensity of US convective events has changed over the observational period. The consensus finding in this literature is nuanced: reported counts of weak tornadoes (EF0 in particular) have risen since the 1970s while the number of EF2+ tornadoes has been roughly stable, and the increase largely reflects improved detection (Doppler radar introduced in the early 1990s, proliferating storm spotter networks) rather than true climatological change in tornado occurrence. Separating real trend signals from observational artifacts is a central methodological challenge.

The insurance and reinsurance industry uses Storm Events data, particularly the tornado path data from the locations file and the hail magnitude data from the details file, as one input to catastrophe models. Firms including AIR Worldwide, RMS, and Karen Clark & Company draw on the Storm Events data as a foundation for calibrating their wind and hail loss models. The industry also subscribes to commercial enhancements of the public data: vendors such as CoreLogic and Verisk supplement the NWS reports with radar-derived hail size estimates, photogrammetric wind damage analysis, and claims-linked damage validation to produce higher-resolution loss data for actuarial pricing.

FEMA cross-references Storm Events records in its disaster declaration preliminary damage assessments. When NWS WFOs submit storm reports for events that subsequently receive presidential disaster declaration requests, FEMA field assessors use the Storm Events narratives as one reference point for understanding event severity and geographic scope. The Storm Events FIPS codes facilitate joins to the FEMA disaster declarations database: both use the same county FIPS code standard, enabling analysts to determine which Storm Events records generated FEMA declarations and which significant events did not meet the declaration threshold.
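A county-level join of this kind might look like the sketch below. The two tables are toy data, and the declarations column names (county_fips, disaster_number) are hypothetical rather than the actual FEMA schema. The match here is on county only; a real analysis would also need to align event dates with declaration incident periods.

```python
import pandas as pd

# Toy storm records keyed by a 5-digit county FIPS string (illustrative values).
storms = pd.DataFrame({
    "EVENT_ID": [1, 2, 3],
    "county_fips": ["48201", "48201", "20173"],
    "prop_dmg_usd": [5.0e8, 1.0e6, 2.0e7],
})

# Toy declarations table; column names are assumptions for illustration.
declarations = pd.DataFrame({
    "county_fips": ["48201"],
    "disaster_number": [9999],
})

# A left join keeps every storm record; indicator=True adds a _merge column
# recording whether a matching declaration row was found.
merged = storms.merge(declarations, on="county_fips", how="left", indicator=True)
merged["declared"] = merged["_merge"] == "both"

undeclared = merged.loc[~merged["declared"], "EVENT_ID"].tolist()
print(undeclared)
```

Keeping the FIPS key as a zero-padded string on both sides avoids silent mismatches for states whose codes begin with zero.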

Academic social science research uses the Storm Events Database to study the political economy of disaster policy. Because the database provides event-level damage and fatality data with county FIPS identifiers and dates, it can be merged with electoral data, income data, and racial composition data to test hypotheses about which communities receive disaster declarations, which communities receive large Public Assistance (PA) grants, and whether disaster relief is distributed equitably across income and demographic lines. A substantial literature finds that lower-income counties and counties with higher minority populations are less likely to receive Individual Assistance declarations conditional on observed storm damage, though causal identification in this literature is contested.

Several important limitations constrain the Storm Events data for certain analyses. The pre-1996 record is incomplete and inconsistently classified; any trend analysis spanning the 1950–1995 period should account for the expansion of event type coverage and the introduction of Doppler radar (NEXRAD operational nationwide by 1997) as confounders. The damage figures are estimates with wide uncertainty intervals, not independently verified loss measurements. Marine zone events (coastal and offshore) use NWS marine zone designations rather than county FIPS codes, which complicates joining marine events to county-level socioeconomic data. The CZ_TYPE field (“C” for county, “Z” for forecast zone, “M” for marine zone) must be checked before joining on CZ_FIPS, as the same numeric FIPS value can refer to different geographies across zone types.
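One way to apply the CZ_TYPE check is to build a county key only for county-type rows, as in this minimal sketch. It assumes STATE_FIPS and CZ_FIPS are present and numeric in the details file; the helper name add_county_fips and the county_fips column are hypothetical, and the rows are toy data.

```python
import pandas as pd

# Toy rows: two share the same STATE_FIPS/CZ_FIPS numbers but differ in CZ_TYPE.
details = pd.DataFrame({
    "EVENT_ID": [10, 11, 12],
    "STATE_FIPS": [48, 48, 6],
    "CZ_TYPE": ["C", "Z", "C"],
    "CZ_FIPS": [201, 201, 37],
})


def add_county_fips(df):
    """Add a 5-digit county FIPS key, left missing for non-county rows."""
    out = df.copy()
    is_county = out["CZ_TYPE"] == "C"
    # 2-digit state code + 3-digit county code, both zero-padded
    key = (
        out["STATE_FIPS"].astype(int).astype(str).str.zfill(2)
        + out["CZ_FIPS"].astype(int).astype(str).str.zfill(3)
    )
    # Zone and marine rows get a missing key so they cannot match a county
    out["county_fips"] = key.where(is_county)
    return out


keyed = add_county_fips(details)
print(keyed[["EVENT_ID", "CZ_TYPE", "county_fips"]])
```

Leaving the key missing for zone and marine rows means they simply drop out of an inner join on county_fips instead of attaching to the wrong county.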

Finally, the Storm Events Database records reported events, not all events. Small tornadoes that touch down briefly in uninhabited areas and dissipate without causing observed damage may not be reported if no spotter, sensor, or structure registers the event. Conversely, minor events in densely populated urban areas with active spotter networks tend to be well-documented. This reporting completeness bias affects both the event count trends and the geographic distribution of recorded events, and is particularly acute for low-intensity events (EF0 tornadoes, sub-severe hail, marginal thunderstorm wind gusts) that are near the NWS issuance threshold.

