BLS QCEW: The Federal Database Behind US Payroll Data for Every Industry and County

16 min read · AI Analytics
Tags: BLS, QCEW, Payroll Data, Employment, Federal Data

Every quarter, the Bureau of Labor Statistics publishes the most comprehensive payroll dataset in the world: roughly eleven million establishment records covering approximately 95% of all civilian employment in the United States, down to the county and six-digit industry level, with average weekly wages for every cell. That dataset—the Quarterly Census of Employment and Wages—is the administrative bedrock on which the entire BLS employment measurement system rests. Almost no one outside economics and workforce research has heard of it.

This article covers the institutional structure of the QCEW program, the data fields and file formats the BLS publishes, and the geographic and industry granularity that makes QCEW uniquely useful. It then explains how QCEW relates to the monthly Current Employment Statistics survey that produces the Jobs Friday headline number, the disclosure suppression rules that limit small-area data, and the BLS API and flat-file download infrastructure. It closes with Location Quotient analysis for identifying industrial specialization and a Python script that queries QCEW data by 2-digit NAICS sector and compares year-over-year wage growth.

What the QCEW Is

The Quarterly Census of Employment and Wages is a joint federal–state program administered by the BLS in cooperation with State Workforce Agencies (SWAs). It is not a survey. It is an administrative census drawn directly from the unemployment insurance (UI) tax records that employers are required to file with state workforce agencies every quarter. Every employer subject to a state's UI tax law must report, for each quarter, the number of covered workers employed during each of the three months in that quarter, the total wages paid to those workers, and certain taxable wage amounts. The BLS collects, standardizes, and publishes those records as the QCEW.

Because the source is an administrative tax filing rather than a voluntary survey, QCEW coverage is effectively universal for the employer universe subject to UI law. Approximately 11 million establishment records appear each quarter, representing roughly 95% of all US civilian employment and generating in excess of 40 million quarterly records across the full historical database. The program's long history—quarterly data is available back to 1990 at the county level, with national series extending further—makes it a uniquely stable longitudinal source for studying structural shifts in employment composition, wage growth, and geographic concentration of economic activity.

Publication lag is the QCEW's primary limitation. Because the BLS must collect UI filings from 53 state and territorial agencies, reconcile them, apply disclosure suppression, and publish the results, a reference quarter's data appears approximately five months after the quarter ends. The first quarter of a calendar year (January–March) typically publishes in August. This lag contrasts sharply with the Current Employment Statistics survey, which publishes national employment estimates roughly 30 days after a reference month closes.

Coverage and Exclusions

QCEW covers employment subject to state and federal UI laws. That universe includes virtually all private-sector wage and salary employment plus most government employment. Workers are covered if their employer pays UI taxes or is a government entity filing UI wage records. Several categories fall outside the covered universe:

Excluded category | Reason / governing law
Self-employed, sole proprietors, independent contractors | Not subject to UI tax; no employer-reported wages
Active-duty military personnel | Federal military payroll not in UI system
Railroad workers | Covered under Railroad Retirement Board (RRB), not state UI
Elected officials (in some states) | State law variation; some states exclude elected officials from UI
Some agricultural and domestic workers | Small-farm and household employer UI exemptions vary by state
Student workers at their own institution | Exempt from FUTA under IRC §3306(c)(10)

The excluded categories collectively account for approximately 5% of civilian employment. The largest excluded segment is self-employment, which the BLS measures separately through the Current Population Survey and the Bureau of Economic Analysis's proprietors' income accounts. For most industry-geography analyses, the 95% coverage is sufficient; exceptions arise in agricultural counties and in industries with high independent-contractor rates (real estate agents, owner-operator truckers, gig-economy workers).

Data Fields and Structure

QCEW flat files are distributed as CSV records. Each record represents one area–industry–ownership combination for one quarter or one year. The core fields are:

Field | Values / format | Notes
area_fips | 5-character string | “US000” = national; state FIPS + “000” = state (e.g., “48000”); 5-digit FIPS = county; “C” + 4 digits = MSA
own_code | 0, 1, 2, 3, 5 | 0=total all ownership; 1=federal; 2=state; 3=local; 5=private
industry_code | NAICS string, 2–6 digits | “10” = all industries total (“00” in some legacy formats); multi-prefix sectors use range codes such as “31-33”
agglvl_code | 2-digit integer | Encodes area type × industry detail level; used to filter records (e.g., 14 = national by NAICS sector, 74 = county by NAICS sector)
year | 4-digit integer | Reference year
qtr | 1, 2, 3, 4, or A | “A” in annual files; 1–4 in quarterly files
disclosure_code | blank or “N” | “N” = suppressed; blank = disclosed
month1_emplvl / month2_emplvl / month3_emplvl | integer | Employment count for each month of the quarter (pay period including the 12th)
total_qtrly_wages | integer (dollars) | Total wages paid to covered workers during the quarter
taxable_qtrly_wages | integer (dollars) | Wages subject to UI tax (capped at state taxable wage base)
avg_wkly_wage | integer (dollars) | Derived: total_qtrly_wages / (13 × average monthly employment)
annual_avg_emplvl | integer | Annual files only; average of the twelve monthly employment levels
annual_avg_wkly_wage | integer (dollars) | Annual files only; annual wages / (52 × annual avg employment)

Employment counts in QCEW reflect the number of workers who worked during, or received pay for, the pay period that includes the 12th day of each month, the same reference period used in the Current Employment Statistics survey. This is not a headcount of all workers who worked at any point in the month; it is a point-in-time snapshot. Seasonal industries such as agriculture, construction, and accommodation and food services show large month-to-month variation within a quarter that is fully visible in the month1/month2/month3 fields.

The average weekly wage is a derived field computed by the BLS, not reported directly by employers. The formula divides total quarterly wages by 13 times the average of the three monthly employment levels. Because it is a wage-bill-weighted average rather than a median, high-wage establishments with small employment counts (investment banks, hedge funds, law firms) pull the average upward in their respective area-industry cells. Median wages by occupation are available from the BLS Occupational Employment and Wage Statistics (OEWS) survey, which uses QCEW as its sampling frame.
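The derivation described above can be reproduced in a few lines. This is a minimal sketch of the stated formula, not BLS production code; the function name and the sample figures are hypothetical.

```python
def avg_weekly_wage(total_qtrly_wages, month1, month2, month3):
    """Quarterly wages divided by 13 weeks times the mean monthly employment."""
    avg_empl = (month1 + month2 + month3) / 3
    if avg_empl == 0:
        return 0  # no covered employment in the cell, nothing to average
    return round(total_qtrly_wages / (13 * avg_empl))


# Hypothetical cell: $15.6M in quarterly wages, about 1,000 workers per month.
print(avg_weekly_wage(15_600_000, 980, 1_000, 1_020))
```

Because the numerator is the whole wage bill, a handful of very highly paid workers raises the result for the entire cell, which is the skew the paragraph above describes.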

Geographic Granularity

QCEW publishes employment and wage data at six geographic levels. Each level has a corresponding set of agglvl_code values that control which records appear in API queries and flat-file downloads.

Geographic level | Units | area_fips format
National | 1 | “US000”
State (and DC) | 51 | 2-digit state FIPS + “000” (e.g., “48000” = Texas)
County | 3,200+ | 5-digit FIPS (e.g., “48113” = Dallas County TX)
Metropolitan Statistical Area | 380+ | “C” + first four digits of the OMB CBSA code (e.g., “C3562” = NY-Newark)
Workforce Investment Area | ~600 | State-defined WIA boundaries; used by workforce development agencies
Congressional District | 435+ | “D” + state FIPS + district number

County-level QCEW data is the primary source for local labor market analysis. Economic developers, site selectors, and regional planners use county-level employment and wage data to characterize the industrial structure and wage competitiveness of specific labor markets. At the county level, however, the intersection of narrow industry codes and low establishment counts produces frequent disclosure suppression—a limitation addressed in a later section.

Metropolitan Statistical Area (MSA) data is aggregated from constituent counties by the BLS and follows OMB's annual delineation of metro area boundaries. MSA boundaries change periodically as the OMB revises county-to-metro assignments; researchers constructing long time series must account for boundary changes, which the BLS documents in its geographic area concordances. MSA-level data has significantly less disclosure suppression than county data because the larger establishment pool in metro areas reduces the frequency of cells with fewer than three employers.

Industry Classification (NAICS)

QCEW uses the North American Industry Classification System (NAICS) to classify establishment-level employment and wages by economic activity. NAICS is a hierarchical system with five levels of specificity, from 2-digit sectors to 6-digit national industries. In QCEW, establishments are assigned a NAICS code based on their primary economic activity as reported to state agencies at the time of UI account registration, with periodic reclassification when the primary activity changes.

NAICS level | Digit count | Example
Sector | 2 digits | 52 (Finance and Insurance)
Subsector | 3 digits | 522 (Credit Intermediation and Related Activities)
Industry group | 4 digits | 5221 (Depository Credit Intermediation)
NAICS industry | 5 digits | 52211 (Commercial Banking)
National industry | 6 digits | 522110 (Commercial Banking, US detail)

For total covered employment across all industries, QCEW uses industry_code “10” at most geographic and ownership levels (and “00” in some legacy file formats). QCEW also publishes BLS supersector groupings, each of which combines one or more 2-digit NAICS sectors and matches the supersectors used in CES reporting, enabling consistent comparisons between QCEW and CES at that level of aggregation. Sectors that span several 2-digit prefixes appear under range codes such as “31-33” (Manufacturing). At finer NAICS levels (5- and 6-digit), QCEW is substantially more granular than any other national employment dataset.

NAICS was revised most recently in 2022. BLS crosswalks between NAICS editions are available and necessary when constructing time series that span revision years. The most significant NAICS 2022 changes affected the Information sector (51) and the Professional, Scientific, and Technical Services sector (54), where digital economy activities were reclassified to better reflect current industry boundaries.
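The digit hierarchy described in this section means a detailed code can be rolled up to its parents by truncating digits. The sketch below illustrates that idea only; the function name is hypothetical, and it does not handle the range codes (such as “31-33”) used for sectors that span several 2-digit prefixes.

```python
NAICS_LEVELS = {
    2: "sector",
    3: "subsector",
    4: "industry group",
    5: "NAICS industry",
    6: "national industry",
}


def naics_ancestors(code):
    """Return (level name, code prefix) pairs from the sector down to the code itself."""
    code = code.strip()
    if not code.isdigit() or not 2 <= len(code) <= 6:
        raise ValueError(f"expected a 2- to 6-digit NAICS code, got {code!r}")
    return [(NAICS_LEVELS[n], code[:n]) for n in range(2, len(code) + 1)]


for level, prefix in naics_ancestors("522110"):
    print(f"{level:<18} {prefix}")
```

The same truncation is what makes upward aggregation possible when a detailed cell is not published.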

QCEW vs. CES: Two-Track Employment System

The BLS operates two parallel employment measurement programs that cover overlapping but distinct universes and serve complementary analytical purposes. Understanding the relationship between them is essential for interpreting any BLS employment statistic.

The Current Employment Statistics (CES) program produces the monthly “Jobs Report” headline figure—the nonfarm payroll employment number released on the first Friday of each month for the prior reference month. CES is a sample survey of approximately 119,000 businesses and government agencies representing roughly 629,000 individual worksites. The sample is drawn from the QCEW universe; QCEW is literally the sampling frame for CES. CES is designed for timeliness: the survey collects data by phone and web within days of the reference period and publishes results approximately 30 days after the reference month closes. It provides national, state, and major-metro employment and average hourly earnings estimates by broad industry sector.

Dimension | QCEW | CES
Data source | Administrative (UI tax records) | Sample survey (~119,000 businesses)
Coverage | ~95% of civilian employment | ~629,000 worksites; weighted to universe
Publication lag | ~5 months after reference quarter | ~30 days after reference month
Geographic detail | County, MSA, congressional district | National, state, selected metros
Industry detail | 2–6 digit NAICS | 2–4 digit NAICS (limited 5-digit)
Wage measure | Average weekly wage (total wages / employment) | Average hourly earnings, average weekly hours
Benchmark role | Is the benchmark universe | Benchmarked annually to QCEW (March)

The March benchmark revision is the annual reconciliation between the two systems. Each February, the BLS publishes revised CES estimates that adjust the survey-based monthly employment levels to align with the QCEW universe as of the prior March reference period. These benchmark revisions can be substantial, sometimes several hundred thousand jobs nationally, when the CES sample has drifted from the true employment level because business births and deaths are hard to capture in a sample drawn from an earlier frame. The March 2024 benchmark is an example: the preliminary estimate published in August 2024 lowered CES payroll employment by approximately 818,000 jobs, the largest preliminary downward revision since the financial crisis, and the final revision published in February 2025 was smaller, at roughly 600,000. That revision originated in QCEW data showing fewer workers than CES had estimated, particularly in professional and business services and in leisure and hospitality.

For most real-time economic analysis, CES is the appropriate source: it is current, frequently revised, and covers the economy at a level of timeliness that QCEW cannot match. For any analysis requiring county-level detail, narrow industry specificity, or wage data at the intersection of geography and industry, QCEW is the only adequate source in the federal statistical system.

Disclosure Suppression

Because QCEW is built from establishment-level administrative records, the BLS applies disclosure suppression rules to prevent the publication of data that could identify individual employers or allow reconstruction of a specific employer's payroll. A cell—one area–industry–ownership combination for one period—is suppressed when either of two conditions holds:

First, the cell has fewer than three establishments. A county-industry cell with one or two employers cannot be published without effectively disclosing those employers' payrolls. Second, one employer accounts for 80% or more of the total wages in the cell. Even when three or more establishments are present, the published cell total would closely approximate the dominant employer's own payroll, so the BLS applies the concentration threshold as an additional guard.
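The two conditions above can be written as a single predicate. This is a sketch of the rule as this article states it, not the BLS's production disclosure logic; the function name, parameter names, and sample wage lists are hypothetical.

```python
def is_suppressed(establishment_wages, min_estabs=3, max_share=0.80):
    """Apply the two tests described in the text to one area-industry cell.

    establishment_wages: one total-wage figure per establishment in the cell.
    """
    if len(establishment_wages) < min_estabs:
        return True  # fewer than three establishments
    total = sum(establishment_wages)
    if total > 0 and max(establishment_wages) / total >= max_share:
        return True  # one employer dominates the cell's wages
    return False


print(is_suppressed([400_000, 350_000]))           # two establishments
print(is_suppressed([900_000, 50_000, 50_000]))    # one employer holds 90%
print(is_suppressed([300_000, 350_000, 350_000]))  # passes both tests
```

The thresholds are parameters so the same check can be reused if a different cutoff is assumed.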

Suppressed cells are marked with disclosure_code = “N” and the employment and wage fields are set to zero, so a zero in a suppressed cell means withheld, not absent. In rural counties and narrow industry codes, suppression rates are high: a 6-digit NAICS code such as petroleum extraction may be suppressed in every county of a rural state in every quarter simply because no county has three establishments in that industry. Researchers address suppression by aggregating upward (to 4-digit NAICS or to the state level) until cells become disclosed.
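Aggregating upward along the industry dimension can be sketched as a walk from a detailed NAICS code to shorter prefixes until a disclosed record turns up. The helper name and the sample records below are hypothetical; the only field assumed from the files is disclosure_code, as described above.

```python
def first_disclosed(records, naics_code):
    """Return (code, record) for the most detailed disclosed level, or None.

    records: dict mapping industry_code -> record dict for a single area.
    """
    for n in range(len(naics_code), 1, -1):
        prefix = naics_code[:n]
        rec = records.get(prefix)
        if rec is not None and rec.get("disclosure_code", "").strip() != "N":
            return prefix, rec
    return None


# Hypothetical county: the 6- and 4-digit cells are suppressed, the 3-digit is not.
county = {
    "211120": {"disclosure_code": "N", "annual_avg_emplvl": 0},
    "2111": {"disclosure_code": "N", "annual_avg_emplvl": 0},
    "211": {"disclosure_code": "", "annual_avg_emplvl": 450},
}
print(first_disclosed(county, "211120"))
```

A geographic roll-up (county to state) follows the same pattern with area codes in place of industry prefixes.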

Suppression creates systematic gaps that are not random. Industries that are geographically concentrated (mining, oil and gas, certain manufacturing subsectors) have more suppression in the counties where they are actually present, precisely because a small number of large employers dominates local employment. This means that naive cell-level analysis underestimates concentration in exactly the markets where concentration is highest. The BLS publishes disclosure rates by state and industry as a metadata supplement to the quarterly files.

API Access and Data Download

BLS provides three access pathways for QCEW data, suited to different use cases.

The BLS public API at api.bls.gov/publicAPI/v2/timeseries/data/ supports time-series queries using QCEW series IDs. A QCEW series ID has the format EN + seasonal adjustment code (U) + area code (5 characters) + data type (1 digit) + size code (1 digit) + ownership code (1 digit) + industry code (up to 6 characters). For example, ENUUS00010010 is the national all-ownership, all-industries employment series. Data type codes are 1 (all employees), 2 (number of establishments), 3 (total wages), 4 (average weekly wage), and 5 (average annual pay). Without a registration key the API is limited to 25 series per query and 25 queries per day; registered users receive 50 series per query and 500 queries per day. The public API is best suited for pulling specific series over time rather than broad cross-sectional downloads.
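Because the public API caps the number of series per request, a long list of series IDs has to be split into batches. The sketch below only builds the JSON request bodies and sends nothing; the function name and the placeholder series IDs are hypothetical, and the batch size is a parameter rather than a hard-coded limit.

```python
import json


def build_payloads(series_ids, start_year, end_year, batch_size, registration_key=None):
    """Split series IDs into batches and return one JSON request body per batch."""
    payloads = []
    for i in range(0, len(series_ids), batch_size):
        body = {
            "seriesid": series_ids[i:i + batch_size],
            "startyear": str(start_year),
            "endyear": str(end_year),
        }
        if registration_key:
            body["registrationkey"] = registration_key
        payloads.append(json.dumps(body))
    return payloads


# Placeholder IDs stand in for real QCEW series IDs.
ids = [f"SERIES{i:03d}" for i in range(60)]
bodies = build_payloads(ids, 2022, 2023, batch_size=25)
print(len(bodies), "request bodies")
```

Each body would then be sent as an HTTP POST with a JSON content type to the timeseries endpoint named above, with a pause between requests to stay inside the daily query allowance.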

The QCEW open data service at data.bls.gov/cew/data/api/ publishes CSV slices by area, by industry, or by establishment size class, addressed as /{year}/{quarter}/area/{area_code}.csv with analogous industry and size paths. It requires no registration key and is the most flexible programmatic access method for QCEW tabulations. Each slice contains every ownership, industry, and aggregation-level row for the requested key, so filtering by own_code and agglvl_code happens on the client. A request covers one year and one quarter (or the annual average); for multi-year comparisons, callers must issue one request per year. The Python script in this article uses this service.

For bulk downloads, the BLS distributes QCEW time-series flat files at download.bls.gov/pub/time.series/en/ and complete quarterly and annual CSV archives on the QCEW data files pages of bls.gov. The archives are organized by year, compressed, and typically range from 200 to 500 MB per quarter. The annual average files are smaller. Each archive contains all geographic levels and all industry codes for a given quarter or year. Bulk download is the appropriate approach for loading the full QCEW universe into a database for multi-year, multi-geography analysis, because the API becomes slow and unwieldy at that scale.

Location Quotient Analysis

The Location Quotient (LQ) is the standard economic geography tool for identifying industrial specialization using QCEW data. It measures how concentrated a given industry is in a local area relative to the national economy, using employment shares as the basis for comparison.

The formula is: LQ = (local industry employment / local total employment) / (national industry employment / national total employment). An LQ of 1.0 means the industry's local employment share exactly equals its national share. An LQ greater than 1.0 indicates above-average local concentration—the area specializes in that industry relative to the rest of the economy. An LQ below 1.0 indicates below-average concentration.
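The formula translates directly into code. This is a minimal sketch with hypothetical employment figures; the function name is illustrative.

```python
def location_quotient(local_industry, local_total, national_industry, national_total):
    """Local employment share of an industry divided by its national share."""
    if local_total <= 0 or national_total <= 0 or national_industry <= 0:
        raise ValueError("totals and national industry employment must be positive")
    local_share = local_industry / local_total
    national_share = national_industry / national_total
    return local_share / national_share


# Hypothetical area: 6% local share against a 2% national share.
lq = location_quotient(6_000, 100_000, 3_000_000, 150_000_000)
print(f"LQ = {lq:.2f}")
```

In practice the four inputs would be the employment levels from the matching local and national records for the same period and ownership code.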

Examples from QCEW data illustrate the LQ's interpretive value. The Midland–Odessa, Texas MSA has a Location Quotient for oil and gas extraction (NAICS 211) typically exceeding 30, meaning that the industry's share of local employment is more than thirty times its share of national employment. The New York–Newark–Jersey City MSA has an LQ for securities and commodity contracts intermediation (NAICS 5231) that exceeds 4.0. Las Vegas–Henderson–Paradise, Nevada has an LQ for accommodation (NAICS 721) above 5.0. The Detroit–Warren–Dearborn MSA has an LQ for motor vehicle manufacturing (NAICS 3361) that has historically exceeded 8.0.

Location Quotients computed from QCEW data are used by regional economists to identify export-base industries (industries with LQ > 1.0 that are presumed to export goods and services outside the local economy), to benchmark clusters, and to assess the diversification or concentration risk of a local economy. State economic development agencies publish LQ reports using QCEW as the standard input. The Brookings Institution, the Economic Policy Institute, and regional Federal Reserve Banks all publish analyses using QCEW-derived LQs.

LQ analysis has known limitations. It treats employment as a proxy for economic activity, which understates capital-intensive industries where wage income is high relative to headcount (e.g., petroleum extraction, finance). It also conflates local-serving and export-oriented activity within an industry code—not all employment above LQ 1.0 represents genuine export-base activity. For nuanced cluster analysis, LQs are typically supplemented with shift-share decompositions and establishment-count comparisons, all available from QCEW.
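A shift-share decomposition splits a local industry's employment change into three parts: growth the area would have seen at the national all-industry rate, the extra growth (or drag) from the industry's own national rate, and a residual attributed to local conditions. The sketch below uses the classic three-component form with hypothetical figures; the function and key names are illustrative.

```python
def shift_share(local_start, local_end,
                nat_industry_start, nat_industry_end,
                nat_total_start, nat_total_end):
    """Decompose local industry employment change into three components."""
    national_rate = nat_total_end / nat_total_start - 1        # all industries
    industry_rate = nat_industry_end / nat_industry_start - 1  # this industry, nationally
    local_rate = local_end / local_start - 1                   # this industry, locally
    return {
        "national_share": local_start * national_rate,
        "industry_mix": local_start * (industry_rate - national_rate),
        "regional_shift": local_start * (local_rate - industry_rate),
    }


# Hypothetical: local industry grows 15%, the industry 10% nationally, the economy 4%.
parts = shift_share(1_000, 1_150, 100_000, 110_000, 1_000_000, 1_040_000)
for name, jobs in parts.items():
    print(f"{name:<15} {jobs:8.1f}")
```

The three components always sum to the actual local change, which is a useful check when running the calculation over many area-industry cells.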

Python Code

The following script queries the BLS QCEW API for private-sector (own_code=5) employment and average weekly wages at the 2-digit NAICS sector level, prints a table ranked by employment size, and computes year-over-year wage growth. It then displays illustrative Location Quotient benchmarks for the Finance and Insurance sector using the national private-sector share as the denominator. The only third-party dependency is the requests package; no API key is needed for the QCEW data endpoint.

import csv
import io

import requests

# ---------------------------------------------------------------------------
# BLS QCEW: Download national private-sector employment and wages
# by 2-digit NAICS sector, plus year-over-year wage growth
# ---------------------------------------------------------------------------
# The QCEW open data service publishes CSV "slices" without requiring a
# registration key. The area slice URL has the form:
#   https://data.bls.gov/cew/data/api/{year}/{qtr}/area/{area}.csv
#   year -> four-digit reference year
#   qtr  -> 1-4 for a quarter, "a" for annual averages
#   area -> five-character area code ("US000" = national)
# Each slice holds every ownership and industry row for that area, so the
# ownership (own_code) and aggregation-level (agglvl_code) filters are
# applied client-side:
#   agglvl_code 10 / 11 -> national total (all ownerships / by ownership)
#   agglvl_code 14      -> national, NAICS sector, by ownership

BASE_URL = "https://data.bls.gov/cew/data/api/{year}/{qtr}/area/{area}.csv"

# Keys are QCEW industry codes at the NAICS sector level. Sectors spanning
# several 2-digit prefixes are published under range codes ("31-33", etc.).
SUPERSECTORS = {
    "10": "Total, all industries",
    "11": "Agriculture, forestry, fishing, hunting",
    "21": "Mining, quarrying, oil and gas extraction",
    "22": "Utilities",
    "23": "Construction",
    "31-33": "Manufacturing",
    "42": "Wholesale trade",
    "44-45": "Retail trade",
    "48-49": "Transportation and warehousing",
    "51": "Information",
    "52": "Finance and insurance",
    "53": "Real estate, rental, leasing",
    "54": "Professional, scientific, technical services",
    "55": "Management of companies",
    "56": "Administrative and waste services",
    "61": "Educational services",
    "62": "Health care and social assistance",
    "71": "Arts, entertainment, recreation",
    "72": "Accommodation and food services",
    "81": "Other services (except public administration)",
    "92": "Public administration",
}

def fetch_qcew_annual(year: int, own_code: str = "5") -> dict:
    """
    Fetch the QCEW annual-average slice for the US (area US000).
    own_code: 5 = private, 0 = all ownership, 1 = federal, 2 = state, 3 = local
    Returns dict keyed by industry_code -> record dict, restricted to the
    requested ownership and to total (10/11) and NAICS-sector (14) rows.
    """
    url = BASE_URL.format(year=year, qtr="a", area="US000")
    resp = requests.get(url, timeout=60)
    resp.raise_for_status()

    keep_levels = {"10", "11", "14"}
    records = {}
    # The response body is CSV with a header row; DictReader maps each
    # row to {column name: value}.
    for row in csv.DictReader(io.StringIO(resp.text)):
        if (row.get("own_code") or "").strip() != own_code:
            continue
        if (row.get("agglvl_code") or "").strip() not in keep_levels:
            continue
        ind = (row.get("industry_code") or "").strip()
        records[ind] = row

    return records


def safe_int(val) -> int:
    try:
        return int(str(val).replace(",", "").strip())
    except (ValueError, TypeError):
        return 0


def safe_float(val) -> float:
    try:
        return float(str(val).replace(",", "").strip())
    except (ValueError, TypeError):
        return 0.0


def print_supersector_table(year: int, records: dict) -> None:
    print(f"\n=== QCEW {year} Annual: Private Sector Employment & Wages by 2-Digit NAICS ===")
    print(f"  {'NAICS':<5}  {'Supersector':<48}  {'Annual Avg Empl':>16}  {'Avg Weekly Wage':>16}")
    print("  " + "-" * 92)

    rows = []
    for code, label in SUPERSECTORS.items():
        if code == "10":
            continue  # skip total row for sorted list
        rec = records.get(code, {})
        empl = safe_int(rec.get("annual_avg_emplvl", 0))
        wage = safe_int(rec.get("annual_avg_wkly_wage", 0))  # field name used in QCEW annual files
        if empl > 0:
            rows.append((code, label, empl, wage))

    rows.sort(key=lambda x: -x[2])  # sort by employment descending

    for code, label, empl, wage in rows:
        print(f"  {code:<5}  {label:<48}  {empl:>16,}  ${wage:>15,}")

    # Print total row at bottom
    total_rec = records.get("10", {})
    total_empl = safe_int(total_rec.get("annual_avg_emplvl", 0))
    total_wage = safe_int(total_rec.get("annual_avg_wkly_wage", 0))
    if total_empl:
        print("  " + "-" * 92)
        print(f"  {'10':<5}  {'TOTAL, all private':<48}  {total_empl:>16,}  ${total_wage:>15,}")


def compute_yoy_wage_growth(records_curr: dict, records_prev: dict) -> None:
    print("\n=== Year-Over-Year Average Weekly Wage Growth by Supersector ===")
    print(f"  {'NAICS':<5}  {'Supersector':<48}  {'Prior Yr Wage':>14}  {'Curr Yr Wage':>14}  {'YoY %':>8}")
    print("  " + "-" * 96)

    rows = []
    for code, label in SUPERSECTORS.items():
        if code == "10":
            continue
        curr_rec = records_curr.get(code, {})
        prev_rec = records_prev.get(code, {})
        curr_wage = safe_float(curr_rec.get("annual_avg_wkly_wage", 0))
        prev_wage = safe_float(prev_rec.get("annual_avg_wkly_wage", 0))
        if prev_wage > 0 and curr_wage > 0:
            pct = (curr_wage - prev_wage) / prev_wage * 100
            rows.append((code, label, prev_wage, curr_wage, pct))

    rows.sort(key=lambda x: -x[4])  # sort by growth rate descending

    for code, label, prev_w, curr_w, pct in rows:
        direction = "+" if pct >= 0 else ""
        print(f"  {code:<5}  {label:<48}  ${prev_w:>13,.0f}  ${curr_w:>13,.0f}  {direction}{pct:>7.1f}%")


# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

CURRENT_YEAR = 2023
PRIOR_YEAR   = 2022

print(f"Fetching QCEW {CURRENT_YEAR} annual private-sector data (own_code=5)...")
records_curr = fetch_qcew_annual(CURRENT_YEAR, own_code="5")

print(f"Fetching QCEW {PRIOR_YEAR} annual private-sector data for YoY comparison...")
records_prev = fetch_qcew_annual(PRIOR_YEAR, own_code="5")

print_supersector_table(CURRENT_YEAR, records_curr)
compute_yoy_wage_growth(records_curr, records_prev)

# ---------------------------------------------------------------------------
# Location Quotient demo: Finance (NAICS 52) in selected metro areas
# ---------------------------------------------------------------------------
# For county LQs, query the county's 5-digit FIPS area code and use
# agglvl_code 74 (county, NAICS sector); MSA areas use agglvl_code 44.
# Here we show the national private-sector share as the denominator benchmark.

fin_rec    = records_curr.get("52", {})
total_rec  = records_curr.get("10", {})
nat_fin    = safe_float(fin_rec.get("annual_avg_emplvl", 0))
nat_total  = safe_float(total_rec.get("annual_avg_emplvl", 0))
nat_share  = nat_fin / nat_total if nat_total else 0

print(f"\n=== Location Quotient Benchmark: Finance & Insurance (NAICS 52) ===")
print(f"  National private-sector finance employment share: {nat_share:.4f} ({nat_share*100:.2f}%)")
print("  LQ = (local finance share) / (national finance share)")
print("  LQ > 1.0 => above-average local concentration")
print("\n  Example expected LQs (illustrative; query MSA-level data for actuals):")
EXAMPLE_LQS = {
    "New York-Newark MSA (35620)":     3.1,
    "Hartford-East Hartford MSA (25540)": 2.4,
    "Des Moines-West Des Moines (19780)": 2.2,
    "Charlotte-Concord MSA (16740)":   1.9,
    "San Francisco-Oakland MSA (41860)": 1.6,
    "United States (benchmark)":        1.0,
    "Las Vegas-Henderson MSA (29820)": 0.6,
    "Midland TX MSA (33260)":           0.4,
}
for area, lq in EXAMPLE_LQS.items():
    bar = "#" * int(lq * 10)
    print(f"  {area:<42}  LQ={lq:.1f}  {bar}")

For county-level LQ analysis, change the area to the 5-digit county FIPS code and the aggregation level to 74 (county, NAICS sector). For MSA-level analysis, use the “C”-prefixed MSA codes and aggregation level 44 (MSA, NAICS sector). Bulk county downloads for all US counties are available as compressed flat files from the BLS; loading the full annual county file into SQLite or DuckDB enables LQ computation across all 3,200+ counties simultaneously without API rate limits.
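Once records are in a database, the LQ for every county becomes a single self-join. The sketch below uses an in-memory SQLite table with a handful of hypothetical rows; the table name, the simplified three-column layout, and the “99xxx” area codes are illustrative and not the layout of the BLS files.

```python
import sqlite3


def county_lqs(rows, industry_code):
    """Return [(area_fips, lq)] for one industry, highest LQ first.

    rows: (area_fips, industry_code, employment) tuples that include the
    national ("US000") rows and each area's all-industry total (code "10").
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE qcew (area_fips TEXT, industry_code TEXT, emplvl INTEGER)")
    conn.executemany("INSERT INTO qcew VALUES (?, ?, ?)", rows)
    query = """
        SELECT c.area_fips,
               (1.0 * c.emplvl / ct.emplvl) / (1.0 * n.emplvl / nt.emplvl) AS lq
        FROM qcew c
        JOIN qcew ct ON ct.area_fips = c.area_fips AND ct.industry_code = '10'
        JOIN qcew n  ON n.area_fips = 'US000' AND n.industry_code = c.industry_code
        JOIN qcew nt ON nt.area_fips = 'US000' AND nt.industry_code = '10'
        WHERE c.industry_code = ?
          AND c.area_fips != 'US000'
          AND c.emplvl > 0      -- skips suppressed cells, which are stored as 0
          AND ct.emplvl > 0
        ORDER BY lq DESC
    """
    result = conn.execute(query, (industry_code,)).fetchall()
    conn.close()
    return result


sample = [
    ("US000", "10", 1_000_000), ("US000", "52", 50_000),
    ("99001", "10", 10_000), ("99001", "52", 1_500),
    ("99002", "10", 20_000), ("99002", "52", 500),
    ("99003", "10", 5_000), ("99003", "52", 0),  # suppressed cell
]
for area, lq in county_lqs(sample, "52"):
    print(area, round(lq, 2))
```

With a full annual county file loaded, the same query would run over every county at once; the filter on zero employment keeps suppressed cells from appearing as LQs of zero.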
