
Who won, who lost: five years of union elections in NLRB data

8 min read · AI Analytics
Labor data · NLRB · Union elections · Regulatory data

The National Labor Relations Board publishes every union election it supervises as a public record. Case type, employer, union, eligible voters, votes cast, result — all of it is in the Advanced Data Search at nlrb.gov, exportable as CSV. The records go back decades. What they show for 2019 through 2024 is one of the more significant shifts in US labor relations since the 1950s: a surge in organizing activity, a jump in union win rates, and a clear industry concentration that the raw numbers make legible.

This post covers what's in the NLRB election dataset, where to get the full historical archive, what the 2021–2024 data actually shows, and how to cross-reference elections against OSHA injury data and CFPB complaint records.

NLRB jurisdiction

The National Labor Relations Board was created in 1935 by the National Labor Relations Act. Its two primary functions are supervising union elections and investigating unfair labor practice (ULP) charges filed by workers or employers.

The election data covers two case types:

  • RC (Representation Certification): workers petition to certify a union. The union wins if it receives a majority of the valid votes cast, not a majority of all eligible voters. This is the “workers want a union” election type.
  • RD (Representation Decertification): workers petition to remove an existing union. The union is decertified unless it receives a majority of the valid votes cast, so a tie removes the union. This is the “workers want to remove a union” election type.

A third type, RM (employer-filed), exists when an employer has a good-faith doubt about a union's majority support and files its own petition. RM elections are rarer and use the same ballot format as RD elections. All three appear in the Advanced Data Search export.

The data

The NLRB Advanced Data Search at nlrb.gov/advanced-search generates CSV exports of up to 100,000 records per pull. The election export contains these fields:

case_name         # NLRB docket number, e.g. "01-RC-123456"
employer          # employer legal name as filed
union_name        # union name as filed
election_date     # date of the vote
eligible_voters   # count of workers in the bargaining unit
votes_for         # votes for union representation
votes_against     # votes against union representation
votes_challenged  # challenged ballots (can affect outcome if determinative)
election_type     # RC / RD / RM
election_result   # Won / Lost / Runoff / Void
city              # employer city
state             # employer state
industry_code     # SIC code
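
The note on votes_challenged can be made concrete. The following is a minimal sketch, assuming the union needs a strict majority of the valid votes cast and that each challenged ballot could end up counted for either side or excluded. The helper classify_tally and its return labels are hypothetical, not part of any NLRB tool.

```python
def classify_tally(votes_for: int, votes_against: int, votes_challenged: int) -> str:
    """Classify an RC tally before challenged ballots are resolved."""
    # The union keeps a strict majority even if every challenged ballot goes against it.
    if votes_for > votes_against + votes_challenged:
        return "union_leads"
    # The union cannot reach a strict majority even if every challenged ballot
    # goes its way (a tie is not a majority).
    if votes_for + votes_challenged <= votes_against:
        return "union_trails"
    # Otherwise the challenged ballots could flip the result.
    return "challenges_determinative"


print(classify_tally(25, 24, 3))
```

Rows classified as challenges_determinative are the ones where the recorded election_result may depend on how the challenges were resolved, so they are worth inspecting separately.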

For historical coverage before the Advanced Data Search tool: legacy Data.gov XML files cover 1999–2009. The DataLumos archive (datalumos.org) covers 2001–February 2025 and is the most complete public version of the full dataset — approximately 2 million records across all NLRB case types, not just elections.

The 2021–2024 surge

NLRB filings hit a 10-year high in fiscal year 2022. The organizing wave was concentrated in a handful of high-profile employers: Starbucks (388 petitions filed in a single fiscal year), Amazon (the JFK8 Staten Island warehouse, the first Amazon facility to successfully certify a union), REI, Apple retail, Trader Joe's, and Barnes & Noble. These names are directly visible in the data: filter by union_name LIKE '%Workers United%' or union_name LIKE '%Amazon Labor Union%', then sort by election_date.
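
The LIKE filter above translates to pandas roughly as follows. This is a sketch only: the rows are made up for illustration, and only the column names follow the export described earlier.

```python
import pandas as pd

# Tiny illustrative frame; the rows are invented, not real NLRB records.
elections = pd.DataFrame({
    "union_name": ["Workers United", "Amazon Labor Union", "Example Local 1"],
    "election_date": pd.to_datetime(["2022-03-01", "2022-04-01", "2022-05-01"]),
})

# Equivalent of:
#   union_name LIKE '%Workers United%' OR union_name LIKE '%Amazon Labor Union%'
pattern = "Workers United|Amazon Labor Union"
mask = elections["union_name"].str.contains(pattern, case=False, na=False)

# Keep the matching rows in chronological order.
surge = elections[mask].sort_values("election_date")
print(surge)
```

Passing na=False keeps rows with a missing union_name out of the result instead of raising an error when the mask is applied.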

The win rate moved with the volume. In fiscal year 2023, unions won approximately 72% of elections that reached a vote — the highest win rate recorded since the 1950s. This is not a projection or an estimate from a survey; it is directly computable from the election_result field in the NLRB export.

# Example: compute win rate by calendar year from the CSV export
import pandas as pd

# Parse the election date while loading so .dt accessors work directly
df = pd.read_csv('nlrb_elections.csv', parse_dates=['election_date'])
df['year'] = df['election_date'].dt.year

# Filter to RC elections only (certification, not decertification)
rc = df[df['election_type'] == 'RC']

# Exclude runoffs and voids: only count decided elections
decided = rc[rc['election_result'].isin(['Won', 'Lost'])]

# Share of decided elections won by the union, per calendar year
win_rate = decided['election_result'].eq('Won').groupby(decided['year']).mean()
print(win_rate)
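
The 72% figure is stated per fiscal year, while the snippet above groups by calendar year. Here is a hedged sketch of fiscal-year bucketing, assuming the federal fiscal year (October 1 through September 30, labeled by the year in which it ends). The sample rows are made up.

```python
import pandas as pd

# Invented rows that straddle a fiscal-year boundary.
sample = pd.DataFrame({
    "election_date": pd.to_datetime(
        ["2022-09-30", "2022-10-01", "2023-03-15", "2023-09-30"]
    ),
    "election_result": ["Lost", "Won", "Won", "Lost"],
})

# October through December belong to the fiscal year ending the next September.
dates = sample["election_date"]
sample["fiscal_year"] = dates.dt.year + (dates.dt.month >= 10).astype(int)

# Share of decided elections won, per fiscal year.
fy_win_rate = sample["election_result"].eq("Won").groupby(sample["fiscal_year"]).mean()
print(fy_win_rate)
```

Applied to the real export, the same two lines that build fiscal_year can replace the calendar-year column before grouping.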

The employer response pattern

NLRB election data can be cross-referenced with NLRB ULP (unfair labor practice) case data from the same Advanced Data Search tool. The two datasets share the employer name field and, for cases filed close in time, can be linked by docket number prefix (the regional office and case type encoded in the case number).
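
To work with the docket number prefix, the case number has to be split into its parts. The sketch below assumes every case number follows the region-type-sequence pattern of the "01-RC-123456" example; parse_case_number and CaseNumber are hypothetical names, and real exports may contain variants this pattern rejects.

```python
import re
from typing import NamedTuple, Optional


class CaseNumber(NamedTuple):
    region: str      # two-digit regional office code
    case_type: str   # two-letter case type, e.g. RC, RD, RM
    sequence: str    # running case sequence number


# Assumed layout: two digits, two letters, then digits, separated by hyphens.
CASE_RE = re.compile(r"^(\d{2})-([A-Z]{2})-(\d+)$")


def parse_case_number(raw: str) -> Optional[CaseNumber]:
    """Return the parsed parts, or None if the string does not fit the pattern."""
    match = CASE_RE.match(raw.strip().upper())
    return CaseNumber(*match.groups()) if match else None


print(parse_case_number("01-RC-123456"))
```

Returning None for unparseable values makes it easy to count how many records fall outside the assumed format before relying on the prefix for linking.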

What this cross-reference shows: employers who lose union elections file more post-election objections than employers who win. Employers with active ULP charges filed against them during the election period win the election itself (that is, the union loses) more often than employers with no concurrent ULP activity. The enforcement gap, meaning the time between a ULP charge being filed and a remedial order being issued, routinely exceeds the election timeline, so the conduct is still unremedied at the time of the vote.

Neither of these is a causal finding. The data does not let you prove that ULP conduct caused the election outcome. But the correlation is measurable in the public records without any additional data source. The NLRB publishes both the election results and the ULP case outcomes; linking them requires only a join on employer name and case date proximity.
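
One way to express a join on employer name and case date proximity is pandas.merge_asof. This is a sketch under several assumptions: employer names are already normalized to identical strings, the ULP export has a filing date column (called ulp_filed_date here, a hypothetical name), and a 90-day window counts as "close in time". The rows are invented.

```python
import pandas as pd

elections = pd.DataFrame({
    "employer": ["ACME CO", "ACME CO", "BETA LLC"],
    "election_date": pd.to_datetime(["2022-06-01", "2023-06-01", "2022-08-15"]),
})
ulp = pd.DataFrame({
    "employer": ["ACME CO", "BETA LLC"],
    "ulp_filed_date": pd.to_datetime(["2022-05-10", "2021-01-05"]),
})

# merge_asof requires both frames to be sorted on their date keys.
linked = pd.merge_asof(
    elections.sort_values("election_date"),
    ulp.sort_values("ulp_filed_date"),
    left_on="election_date",
    right_on="ulp_filed_date",
    by="employer",                    # only match rows for the same employer
    direction="nearest",              # charge may precede or follow the election
    tolerance=pd.Timedelta(days=90),  # assumed proximity window
)

# Elections with no charge inside the window get a missing ulp_filed_date.
linked["has_nearby_ulp"] = linked["ulp_filed_date"].notna()
print(linked)
```

The window length is the main judgment call: a wider tolerance links more elections to charges but weakens the claim that the two were concurrent.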

The 100k cap problem

The NLRB Advanced Data Search maxes out at 100,000 records per export. The full historical election dataset — going back to the 1990s — exceeds that limit. Getting the complete dataset requires multiple pulls segmented by date range, by state, or by industry code, then deduplicating on case_name.

A workable segmentation strategy:

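# Pull by year to stay under the 100k cap
# NLRB Advanced Data Search: filter election_date range, export CSV
import glob
import pandas as pd

years = range(1990, 2026)
for year in years:
    # Manual step: run this query in the Advanced Data Search,
    # download the CSV, and save it as nlrb_elections_{year}.csv
    print(f"{year}: election_date >= {year}-01-01 AND election_date <= {year}-12-31")

# Combine and deduplicate
# Match only the four-digit yearly files so a previous nlrb_elections_full.csv is not re-read
files = sorted(glob.glob('nlrb_elections_[0-9][0-9][0-9][0-9].csv'))
dfs = [pd.read_csv(f) for f in files]
combined = pd.concat(dfs, ignore_index=True).drop_duplicates(subset=['case_name'])
combined.to_csv('nlrb_elections_full.csv', index=False)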

For most analytical purposes the DataLumos archive is simpler: it has already done this segmentation and provides the full 2001–February 2025 dataset as a single download. The tradeoff is that the DataLumos version lags the live NLRB tool by several months.

Industry breakdown

The industry_code field is a SIC (Standard Industrial Classification) code. This makes it directly joinable to BLS, OSHA, and Census industry data that is also keyed on SIC; sources published on NAICS codes need a SIC-to-NAICS crosswalk first. Key findings by industry group:

  • Healthcare (SIC 80xx): the highest volume of RC elections across the entire dataset by raw count. Nursing homes, hospitals, and home health agencies have been the most consistently active organizing targets for decades.
  • Retail (SIC 52xx–59xx): the largest percentage increase in RC petitions from 2020 to 2023. Starbucks, REI, Apple, Trader Joe's, and Barnes & Noble are all retail SIC codes. This is the sector that drove the 2021–2024 surge.
  • Manufacturing (SIC 20xx–39xx): the highest union win rate in elections that actually reach a vote. Contributing factors include smaller bargaining units, more established union infrastructure, and lower employer resistance than in retail or logistics.

# Industry win rate analysis
# Coerce SIC codes to numbers so the range test works even if the CSV stores them as text
sic = pd.to_numeric(df['industry_code'], errors='coerce')

sic_groups = {
    'Healthcare':     (8000, 8099),
    'Retail':         (5200, 5999),
    'Manufacturing':  (2000, 3999),
}

for name, (sic_low, sic_hi) in sic_groups.items():
    # RC elections in this SIC range that ended in a decided result
    mask = (
        sic.between(sic_low, sic_hi) &
        df['election_type'].eq('RC') &
        df['election_result'].isin(['Won', 'Lost'])
    )
    group = df[mask]
    win_rate = (group['election_result'] == 'Won').mean()
    print(f"{name}: {win_rate:.1%} win rate, n={len(group)}")

Cross-reference: OSHA injury data and CFPB complaints

Two cross-dataset correlations are computable from public sources without any non-public data:

OSHA Form 300A injury rates

OSHA's injury and illness records (Form 300A summaries, available via the OSHA Injury Tracking Application API) report establishment-level injury and illness rates by year. Joining NLRB election records to OSHA 300A records on employer name and state — fuzzy-matched on the normalized name string — shows that establishments with worse total recordable incident rates (TRIR) are more likely to see union organizing petitions in the following 12–24 months.
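
The fuzzy match on a normalized name string could look something like the sketch below. It uses difflib from the standard library as a stand-in matcher; the suffix list, the 0.9 threshold, and the helper names are all assumptions chosen for illustration, not the method behind the figures in this post.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to drop; extend it for real data.
LEGAL_SUFFIXES = {"inc", "llc", "corp", "corporation", "co", "company", "ltd"}


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def is_match(nlrb: tuple, osha: tuple, threshold: float = 0.9) -> bool:
    """Each argument is (employer_name, state); states must agree exactly."""
    if nlrb[1].upper() != osha[1].upper():
        return False
    a, b = normalize_name(nlrb[0]), normalize_name(osha[0])
    if not a or not b:
        return False  # nothing left to compare after normalization
    return SequenceMatcher(None, a, b).ratio() >= threshold


print(is_match(("Acme Widgets, Inc.", "NY"), ("ACME WIDGETS LLC", "ny")))
```

Requiring the state to match before comparing names cuts down false positives between unrelated employers with similar names, at the cost of missing multi-state filers recorded under a headquarters address.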

This is not a causal claim. Workplace safety grievances are one of many factors workers cite in organizing drives. But the directional correlation is consistent across industries and year ranges, and it is measurable in the public data.

CFPB complaint data for financial employers

The CFPB Consumer Complaint Database (api.consumerfinance.gov/data/complaints) reports complaints against financial institutions by company name. Banks and financial services firms that appear in the NLRB election dataset can be cross-referenced: institutions with higher complaint volumes in the CFPB database show modestly elevated rates of organizing activity in the NLRB election records.

Again, correlation not causation. Consumer-facing dysfunction and worker-facing conditions at the same employer may share common drivers (management practices, cost pressure, regulatory stress) without one causing the other. The value is that both datasets are public, joinable on company name, and the combined signal is richer than either source alone.
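
A minimal sketch of the company-name join, assuming both sources have already been reduced to one row per complaint and one row per election with matching normalized names. The column name company and the rows themselves are invented for illustration.

```python
import pandas as pd

# Invented rows: one per complaint and one per election.
complaints = pd.DataFrame(
    {"company": ["BANK A", "BANK A", "BANK A", "BANK B", "BANK C", "BANK C"]}
)
elections = pd.DataFrame({"employer": ["BANK A", "BANK A", "BANK C"]})

# Count records per institution in each source.
complaint_counts = complaints["company"].value_counts().rename("complaints")
election_counts = elections["employer"].value_counts().rename("elections")

# Outer-align on the name; institutions missing from one source get a count of 0.
combined = pd.concat([complaint_counts, election_counts], axis=1).fillna(0).astype(int)
print(combined.sort_values("complaints", ascending=False))
```

Raw counts scale with company size, so any real comparison would need a size denominator such as headcount before reading anything into the pattern.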

Accessing the data via the hub

The Federal Regulatory Data Hub indexes the NLRB election dataset alongside 196 other federal datasets. The endpoint:

# Election results by employer name
# (URLs are quoted so the shell does not interpret ? and &)
curl "https://api.ai-analytics.org/datasets/nlrb-elections?employer=Starbucks"

# All elections in a state, sorted by date
curl "https://api.ai-analytics.org/datasets/nlrb-elections?state=CA&sort=election_date:desc"

# RC elections with union win, 2022–2024
curl "https://api.ai-analytics.org/datasets/nlrb-elections?election_type=RC&election_result=Won&date_from=2022-01-01&date_to=2024-12-31"

# SIC-filtered query: retail elections
curl "https://api.ai-analytics.org/datasets/nlrb-elections?sic_from=5200&sic_to=5999&election_type=RC"
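
The same queries can be assembled in Python. This sketch only builds the URL and makes no request; it assumes the endpoint and parameter names shown in the curl examples, and build_query_url is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.ai-analytics.org/datasets/nlrb-elections"


def build_query_url(**params: str) -> str:
    """Return the endpoint URL with params encoded as a query string."""
    # urlencode percent-escapes reserved characters, e.g. ':' becomes %3A.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_query_url(state="CA", sort="election_date:desc")
print(url)
```

Building the query string with urlencode avoids the shell quoting pitfalls of hand-written URLs and handles employer names that contain spaces or ampersands.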

The hub normalizes employer names using the same entity resolution pipeline described in the compliance screening post — phonetic normalization and token-sorted cosine similarity — so queries for “Starbucks” return records filed as “Starbucks Corporation,” “Starbucks Coffee Company,” and subsidiary variants without requiring exact string matching.
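
To give a feel for token-sorted cosine similarity, here is a rough sketch. It is an illustration of the general idea only: the phonetic step is omitted, and the choice of character trigram counts as the vector representation is an assumption, not a description of the hub's pipeline.

```python
import math
from collections import Counter


def token_sort(name: str) -> str:
    """Lowercase and sort tokens so word order does not affect the score."""
    return " ".join(sorted(name.lower().split()))


def trigrams(text: str) -> Counter:
    """Count character trigrams, padding so short names still yield some."""
    padded = f"  {text} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the trigram count vectors of two names."""
    va, vb = trigrams(token_sort(a)), trigrams(token_sort(b))
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


print(cosine_similarity("Starbucks Coffee Company", "Company Coffee Starbucks"))
```

Sorting tokens first means reordered names score as identical, while the trigram vectors let partial overlaps such as a shared brand name still produce a nonzero score.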


For how foreign-registered organizations interact with federal registration and enforcement databases: FARA disclosures as structured data: foreign agent registrations in federal records →

For how the hub scores entities across 30+ federal enforcement lists including OSHA and DOL actions: Compliance screening across 30+ federal enforcement lists: how the risk score works →