Census SAIPE: The Federal Database Behind County-Level Poverty and Income Estimates
Every December the Census Bureau releases a set of estimates that most Americans have never heard of but that directly determine how billions of federal dollars flow to their local schools, housing agencies, and Medicaid programs. The Small Area Income and Poverty Estimates (SAIPE) provide annually updated poverty counts and median household income figures for every U.S. state, county, and school district. They are the statutory data source for Title I-A education funding, a principal input to Community Development Block Grant allocations, and a benchmark for Medicaid and CHIP matching rate calculations. For Title I-A, no substitute is legally permissible. The program's technical design, data sources, and geographic reach make it one of the most consequential and least understood products of the federal statistical system.
This article covers the origins and purpose of SAIPE, the statistical methodology that makes small-area estimation possible, the federal formula programs that depend on SAIPE data and the dollar volumes they govern, the principal data fields published for states, counties, and school districts, the geographic patterns of child poverty that SAIPE data reveals, the relationship between SAIPE and the American Community Survey, how to access SAIPE through the Census API and bulk downloads, and a Python script that downloads county-level child poverty rates for a state, computes year-over-year changes, and identifies the highest-poverty counties.
Origins and purpose
SAIPE was created to solve a specific and consequential gap in federal statistical infrastructure. The Decennial Census produces reliable poverty data for small geographies but only every ten years — a lag too long for formula funding that is recalculated annually. The American Community Survey, which replaced the Decennial Census long form after 2000, conducts approximately 3.5 million household interviews per year and can produce 1-year poverty estimates for geographies with populations of 65,000 or more. For smaller counties and school districts, the ACS 1-year sample is too thin to produce reliable direct estimates: margins of error exceed the estimates themselves for hundreds of small counties. The ACS 5-year product pools five annual samples to cover smaller geographies, but a 5-year pooled estimate is temporally centered at the midpoint of the pooling window and cannot reflect conditions in the current year. A county that experienced a plant closure or natural disaster two years ago may look substantially different on 5-year ACS data than it does today.
SAIPE addresses this by combining survey data with administrative records in a model-based framework that produces annually updated single-year estimates for all 3,144 counties and 13,000+ school districts, regardless of population size. The program was developed in the 1990s in direct response to Congressional requirements for annual small-area poverty data to support the Title I-A school funding formula under the Elementary and Secondary Education Act. State-level estimates are available back to 1989; county-level estimates back to 1993. Annual release occurs each December, covering data from approximately two years prior (2023 estimates released December 2025, reflecting income and poverty conditions in calendar year 2023).
Model-based statistical methodology
The core statistical challenge of small-area estimation is that direct survey estimates for small populations are too noisy to be useful, but ignoring survey data entirely and relying only on administrative records introduces potential bias. SAIPE resolves this tension through hierarchical Bayes shrinkage models that combine multiple data sources with formal uncertainty quantification.
The primary data inputs to SAIPE county models are: (1) ACS direct estimates of poverty rates and counts, which are unbiased but have high variance for small counties; (2) IRS tax return data, specifically the number of exemptions claimed on individual returns and the number claimed on returns with income below the poverty threshold (Form 1040 filings cover most of the population and track income closely); (3) SNAP (Supplemental Nutrition Assistance Program) administrative enrollment counts from USDA Food and Nutrition Service, which serve as a direct poverty indicator; (4) Census Bureau population estimates by age group, which provide control totals; and (5) the previous year's SAIPE estimates as a temporal anchor. For school districts, an additional input is the count of children participating in the National School Lunch Program.
The shrinkage model works by weighting each county's direct ACS estimate against a model-predicted value derived from the administrative record inputs. For large counties with substantial ACS samples, the estimate stays close to the ACS direct estimate because the survey data is precise enough to be trusted. For small counties where the ACS sample is thin and the direct estimate is unreliable, the model shrinks the estimate toward the value predicted by the administrative records. The degree of shrinkage is determined by the relative precision of the two sources, estimated from the data itself using Bayesian methods. The result is a set of estimates that are more stable and reliable than the direct ACS would produce for small counties, while remaining responsive to year-over-year changes in underlying economic conditions reflected in IRS and SNAP records. The Census Bureau publishes 90% confidence intervals for all SAIPE estimates, making uncertainty explicit and allowing users to assess the precision of individual county estimates.
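The weighting logic described above can be sketched in a few lines. This is a simplified precision-weighted blend in the spirit of small-area shrinkage estimators, not the Census Bureau's production model; the function name `shrinkage_estimate`, the variance inputs, and the example numbers are all hypothetical.

```python
def shrinkage_estimate(direct, direct_var, model_pred, model_var):
    """Blend a direct survey estimate with a model prediction.

    The weight on the direct estimate is model_var / (model_var + direct_var):
    a precise survey estimate (small direct_var) keeps most of the weight,
    while a noisy one is pulled toward the model prediction.
    Returns (blended_estimate, weight_on_direct).
    """
    if direct_var < 0 or model_var < 0:
        raise ValueError("variances must be non-negative")
    total = direct_var + model_var
    if total == 0:
        # Both sources claim perfect precision; keep the direct estimate.
        return direct, 1.0
    weight_on_direct = model_var / total
    blended = weight_on_direct * direct + (1 - weight_on_direct) * model_pred
    return blended, weight_on_direct


# Hypothetical large county: the survey estimate is precise, so it dominates.
large, w_large = shrinkage_estimate(20.0, 1.0, 16.0, 9.0)
# Hypothetical small county: the survey estimate is noisy, so the model dominates.
small, w_small = shrinkage_estimate(30.0, 81.0, 18.0, 9.0)

print(f"large county: estimate {large:.1f}, weight on survey {w_large:.2f}")
print(f"small county: estimate {small:.1f}, weight on survey {w_small:.2f}")
```

In the actual SAIPE models the variances are themselves estimated from the data, which is the part this sketch leaves out.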
School district estimates add a further complication: school district boundaries do not align with county boundaries, and many school districts cross county lines or consist of multiple non-contiguous geographic areas. SAIPE uses a tabulation of school-age children (ages 5–17 in related families, the key Title I-A metric) allocated from county estimates to school districts based on population distribution from the Decennial Census and the National School Lunch Program counts. This allocation step introduces additional uncertainty in the school district estimates that is reflected in wider confidence intervals compared to county estimates.
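The allocation step can be illustrated with a simple proportional split. This is only a sketch of the general idea, assuming each district piece within a county has some known share weight; the actual SAIPE procedure is more involved, and the function name, district labels, and numbers below are hypothetical.

```python
def allocate_to_districts(county_poor_children, district_weights):
    """Split a county poverty count across districts in proportion to weights.

    district_weights maps a district label to a non-negative weight, for
    example that district's share of the county's school-age population.
    """
    total_weight = sum(district_weights.values())
    if total_weight <= 0:
        raise ValueError("district weights must sum to a positive number")
    return {
        district: county_poor_children * weight / total_weight
        for district, weight in district_weights.items()
    }


# Hypothetical county with 1,000 poor school-age children and two districts.
allocation = allocate_to_districts(1000, {"District A": 3, "District B": 1})
for district, count in allocation.items():
    print(f"{district}: {count:.0f}")
```

Because the weights are themselves estimates, any error in them carries through to the district counts, which is one reason district intervals are wider than county intervals.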
Federal formula programs that depend on SAIPE
SAIPE is not merely a statistical product — it is the legally mandated data source for several major federal formula programs. The dollar volumes involved make the program's accuracy a matter of significant fiscal consequence for states and localities.
| Program | Annual Volume | SAIPE metric used |
|---|---|---|
| Title I-A (ESEA) | ~$17B | School-district count of children 5–17 in related families in poverty |
| Community Development Block Grants (CDBG) | ~$3.3B | County and city poverty rates and counts |
| Medicaid Federal Medical Assistance Percentage (FMAP) | $500B+ total; FMAP varies by state | State per capita income (3-year average), cross-referenced with SAIPE |
| Children's Health Insurance Program (CHIP) | ~$19B | State poverty rates for matching rate formula |
Title I-A is the clearest example of SAIPE's policy centrality. Part A of Title I of the Elementary and Secondary Education Act directs federal education funding to local educational agencies (LEAs), that is, school districts, based on the number of school-age children in poverty within each district. Congress specified SAIPE as the statutory data source for this formula in the Improving America's Schools Act of 1994 and has retained that specification through subsequent ESEA reauthorizations, most recently the Every Student Succeeds Act of 2015. No other data source is legally substitutable. Each year, the Department of Education uses the most recently available SAIPE school district file to calculate Title I-A allocations for approximately 13,000 school districts nationwide. A school district with a miscounted or misallocated poverty population in SAIPE receives a correspondingly incorrect Title I-A allocation, more or less than it would receive if the estimate were accurate. Given the roughly $17B annual program size, even modest percentage errors in school district estimates can translate into large misallocations, potentially millions of dollars for a large district.
The Community Development Block Grant program, administered by HUD, allocates funds to states and entitlement communities (cities over 50,000 population and urban counties) using a formula that weights poverty, housing overcrowding, and population age. SAIPE county and sub-county poverty data feeds directly into this calculation. Non-entitlement communities receive CDBG funding through state CDBG programs, which use their own sub-state allocation formulas — many of which incorporate SAIPE county poverty data as the small-area poverty indicator.
Data fields and geographic coverage
SAIPE publishes the following core variables for states and counties. Each estimate is accompanied by a 90% confidence interval expressed as upper and lower bounds.
| Variable | Description |
|---|---|
| SAEPOVALL_PT | Estimated number of people of all ages in poverty |
| SAEPOVRT0_17_PT | Poverty rate for people under age 18 |
| SAEPOV0_17_PT | Estimated number of people under age 18 in poverty |
| SAEPOV5_17R_PT | Estimated number of children ages 5–17 in related families in poverty (the Title I-A metric) |
| SAEMHI_PT | Median household income estimate |
| SAEPOVRTALL_PT | Overall poverty rate for all ages |
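When requesting these fields it is often convenient to pull the interval bounds alongside each point estimate. The sketch below assumes the bounds follow a `_LB90` / `_UB90` suffix convention that mirrors the `_PT` suffix in the table; that naming is an assumption here and should be checked against the API's variable list before use. The helper names are hypothetical.

```python
def saipe_fields(stem):
    """Return point-estimate and assumed 90% bound variable names for a stem.

    Example stem: "SAEPOVRT0_17". The _LB90 / _UB90 suffixes are an
    assumption to verify against the SAIPE variable list.
    """
    return [f"{stem}_PT", f"{stem}_LB90", f"{stem}_UB90"]


def build_get_clause(stems):
    """Build the comma-separated variable list for a request, NAME first."""
    fields = ["NAME"]
    for stem in stems:
        fields.extend(saipe_fields(stem))
    return ",".join(fields)


print(build_get_clause(["SAEPOVRT0_17", "SAEMHI"]))
```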
The school district file adds poverty counts specifically for children ages 5–17 in related families — a narrower definition than the general under-18 population that excludes unrelated children (foster children living with non-relatives, for example) on the grounds that the Title I-A formula is intended to target children whose families' economic circumstances affect their academic preparation. The school district file covers approximately 13,000 districts, identified by the National Center for Education Statistics (NCES) school district geographic boundary files and LEAID identifiers.
Geographic identifiers use standard FIPS codes: two-digit state FIPS, three-digit county FIPS (concatenated to form the five-digit county GEOID), and NCES district identifiers for school districts. The SAIPE time series for counties runs from 1993 through the present; the state series from 1989. Both series are available through the Census API and as downloadable CSV files organized by year.
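A common stumbling block is losing leading zeros when FIPS codes are read as integers. The following minimal sketch builds the five-digit county GEOID from its two parts; the function name `county_geoid` is hypothetical.

```python
def county_geoid(state_fips, county_fips):
    """Concatenate state and county FIPS codes into a five-digit GEOID.

    Accepts strings or integers and restores leading zeros, so that
    state 6 and county 37 become "06037".
    """
    state = str(state_fips).zfill(2)
    county = str(county_fips).zfill(3)
    if not (state.isdigit() and county.isdigit()):
        raise ValueError("FIPS codes must contain only digits")
    if len(state) != 2 or len(county) != 3:
        raise ValueError("expected a 2-digit state and a 3-digit county code")
    return state + county


print(county_geoid("28", "051"))
print(county_geoid(6, 37))
```

When loading the bulk CSV files with pandas, reading the FIPS columns as strings (for example with `dtype=str`) avoids the problem at the source.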
Geographic patterns of child poverty
SAIPE data reveals child poverty rates that vary by more than tenfold across U.S. counties. Suburban counties in the outer ring of major metropolitan areas — Loudoun County in Northern Virginia, Hunterdon County in New Jersey, Douglas County in Colorado — typically show child poverty rates below 5%. Counties in the Mississippi Delta, the Rio Grande Valley of South Texas, the Appalachian coalfields, and the rural Deep South regularly exceed 30% and in many cases exceed 40%.
The highest child poverty rates in the continental United States are concentrated in a set of geographically and historically distinct regions. In the Mississippi Delta, counties including Holmes, Humphreys, Quitman, and Tunica have recorded child poverty rates above 40% in multiple SAIPE release years. Counties along and near the Texas-Mexico border, including the Rio Grande Valley, produce similarly elevated rates; Starr, Zavala, and Hudspeth counties are examples. In Appalachia, Owsley County and Wolfe County in eastern Kentucky have been among the highest-poverty counties in the nation for every year of the SAIPE series, with child poverty rates persistently above 40%. McDowell County in West Virginia, Lee County in Virginia, and Hancock County in Tennessee similarly show persistent deep poverty across decades of SAIPE data.
The USDA Economic Research Service defines “persistent poverty counties” as counties with poverty rates of 20% or more in each of the last four Decennial Censuses — a designation that covers approximately 354 counties as of the 2020 Census. Persistent poverty counties are disproportionately located in the rural South and Appalachia, are disproportionately populated by Black, Native American, and Hispanic residents, and show consistently lower median household incomes, higher unemployment, lower educational attainment, and higher rates of health problems than non-persistent-poverty counties. SAIPE data documents these patterns annually and allows researchers to track whether specific counties enter or exit high-poverty status over time.
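The persistence rule as stated above (20% or more in each of the last four measurement points) is easy to express as a check. This is a sketch of that stated rule only; the function name and the county figures are hypothetical, not real data.

```python
def is_persistent_poverty(rates, threshold=20.0, required=4):
    """Return True if the last `required` rates are all at or above threshold.

    `rates` is a chronological sequence of poverty rates in percent.
    A county with fewer than `required` observations cannot qualify.
    """
    recent = list(rates)[-required:]
    return len(recent) == required and all(r >= threshold for r in recent)


# Hypothetical counties, oldest measurement first.
examples = {
    "County A": [31.2, 28.5, 26.0, 24.1],  # at or above 20% every time
    "County B": [22.0, 19.4, 21.3, 23.8],  # dipped below 20% once
    "County C": [25.0, 27.1, 24.6],        # only three observations
}
for name, rates in examples.items():
    print(name, is_persistent_poverty(rates))
```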
Native American reservations and trust lands represent a particularly acute concentration of poverty that SAIPE county data captures incompletely because reservation populations are often small relative to the surrounding county population. Shannon County (now Oglala Lakota County), South Dakota — which contains the Pine Ridge Indian Reservation — has been the poorest county in the United States by median household income and child poverty rate in multiple SAIPE years, with child poverty rates exceeding 50% in some release years. Apache County, Arizona, which contains portions of the Navajo Nation, shows similarly extreme poverty. Ziebach County, South Dakota, and Buffalo County, South Dakota, are among the other counties where the poverty burden of reservation populations produces some of the highest SAIPE estimates in the nation.
SAIPE versus ACS: when to use each
A common source of confusion is the relationship between SAIPE and the American Community Survey, which also produces poverty and income estimates. The two programs serve different purposes and should not be treated as interchangeable.
ACS provides a far richer set of variables: poverty by race and ethnicity, by age group, by household type, by income-to-poverty ratio, by educational attainment, by employment status, and by a dozen other dimensions. SAIPE provides a much narrower set of variables — total poverty, child poverty, children in related families in poverty, and median household income — but covers all geographies with greater temporal precision. ACS 1-year estimates are available only for areas with 65,000 or more people; ACS 5-year estimates cover all geographies but pool five years of data and are temporally centered at the midpoint of the pooling window. SAIPE provides annually updated single-year estimates for all 3,144 counties and 13,000+ school districts.
The practical rule is: use SAIPE when you need a current, annually updated poverty or income estimate for a small county or school district, particularly for purposes connected to federal formula programs. Use ACS when you need demographic breakdowns, income distribution, or poverty by subgroup. For research that requires understanding how poverty varies across demographic groups within a county, ACS is the only source; SAIPE does not provide race-specific poverty rates. For research tracking annual changes in county-level poverty or comparing school districts on a common, annually updated baseline, SAIPE is the appropriate tool.
A specific caution applies to trend analysis. Because SAIPE estimates are produced by a model that combines multiple data sources, year-over-year changes in SAIPE estimates reflect both true changes in poverty conditions and changes in the administrative record inputs (IRS filing patterns, SNAP enrollment rule changes, survey redesigns). When SAIPE is revised for a prior year — which happens when new ACS data, updated administrative records, or methodology revisions become available — the entire back series may shift. Researchers conducting longitudinal analysis should always use the most current available vintage of the SAIPE series rather than splicing estimates from different release years.
Census API access
The Census Bureau exposes SAIPE data through its standard API at api.census.gov/data/timeseries/poverty/saipe. The API returns JSON and requires no API key. A basic request specifies a get parameter listing the variables to retrieve, a for parameter specifying the geographic level (state, county, or school district via schdist), an in parameter for the parent geography (e.g., a specific state FIPS code), and a time parameter for the year. The timeseries endpoint also accepts a time range for downloading multiple years in a single call.
Example request for all counties in Mississippi for 2023:
https://api.census.gov/data/timeseries/poverty/saipe?get=NAME,SAEPOVRT0_17_PT,SAEPOVALL_PT,SAEMHI_PT&for=county:*&in=state:28&time=2023
Bulk CSV downloads are also available at census.gov/programs-surveys/saipe/data/datasets.html, organized by year with separate files for states, counties, and school districts. The county files include all 3,144 counties with their FIPS codes, estimates, and 90% confidence interval bounds. The school district files include NCES district identifiers that link to the NCES Common Core of Data for joining to enrollment, staffing, and fiscal data. The SAIPE API documentation at api.census.gov/data/timeseries/poverty/saipe/variables.json lists all available variables with definitions.
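The county request pattern above can be assembled programmatically rather than by hand. The sketch below only builds URLs and makes no network calls; the function name `saipe_county_url` is hypothetical, and it generates one URL per year so that it does not depend on any particular time-range syntax.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data/timeseries/poverty/saipe"


def saipe_county_url(year, state_fips, variables):
    """Build a SAIPE request URL for all counties in one state and year."""
    params = {
        "get": ",".join(variables),
        "for": "county:*",
        "in": f"state:{state_fips}",
        "time": str(year),
    }
    # Keep commas, colons, and asterisks unescaped so the URL stays readable.
    return f"{BASE}?{urlencode(params, safe=',:*')}"


variables = ["NAME", "SAEPOVRT0_17_PT", "SAEMHI_PT"]
for year in (2021, 2022, 2023):
    print(saipe_county_url(year, "28", variables))
```

Each URL can then be passed to any HTTP client; the response is a JSON array whose first row holds the column names.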
Python: downloading and analyzing SAIPE county poverty data
The following script uses the Census SAIPE API to download county-level child poverty rates for Mississippi for two consecutive years, computes year-over-year changes, ranks counties by child poverty rate, and identifies counties above the 30% threshold. It requires only requests and pandas. Change STATE_FIPS to any two-digit state FIPS code to run the same analysis for another state; the API endpoint and variable names are identical for all states.
import requests
import pandas as pd

# ---------------------------------------------------------------------------
# Census SAIPE API -- County-Level Child Poverty Rates
# ---------------------------------------------------------------------------
# Endpoint: api.census.gov/data/timeseries/poverty/saipe
# Key variables:
#   SAEPOVRT0_17_PT -- poverty rate for people under 18
#   SAEPOVALL_PT    -- estimated number of people of all ages in poverty (a count)
#   SAEMHI_PT       -- median household income
#   NAME            -- geography name
# Geographic level: county (use "for=county:*&in=state:28" for all MS counties)
# No API key required.

BASE = "https://api.census.gov/data/timeseries/poverty/saipe"
STATE_FIPS = "28"  # Mississippi
CURRENT_YEAR = 2023
PRIOR_YEAR = 2022


def fetch_saipe(year: int, state_fips: str) -> pd.DataFrame:
    """Fetch SAIPE county estimates for a given state and year."""
    params = {
        "get": "NAME,SAEPOVRT0_17_PT,SAEPOVALL_PT,SAEMHI_PT",
        "for": "county:*",
        "in": f"state:{state_fips}",
        "time": str(year),
    }
    resp = requests.get(BASE, params=params, timeout=30)
    resp.raise_for_status()
    data = resp.json()
    # The first row of the JSON array holds the column names.
    headers = data[0]
    rows = data[1:]
    df = pd.DataFrame(rows, columns=headers)
    df["year"] = year
    # Cast numeric columns; unparseable values become NaN
    for col in ["SAEPOVRT0_17_PT", "SAEPOVALL_PT", "SAEMHI_PT"]:
        df[col] = pd.to_numeric(df[col], errors="coerce")
    return df


# ---------------------------------------------------------------------------
# Download current and prior year
# ---------------------------------------------------------------------------
print(f"Fetching SAIPE county data for state FIPS {STATE_FIPS}, "
      f"years {PRIOR_YEAR} and {CURRENT_YEAR}...")
df_current = fetch_saipe(CURRENT_YEAR, STATE_FIPS)
df_prior = fetch_saipe(PRIOR_YEAR, STATE_FIPS)

# Merge on county FIPS to compute year-over-year change
merged = df_current.merge(
    df_prior[["county", "SAEPOVRT0_17_PT"]],
    on="county",
    suffixes=("", "_prior"),
)
merged["yoy_change"] = (
    merged["SAEPOVRT0_17_PT"] - merged["SAEPOVRT0_17_PT_prior"]
).round(1)

# Rank by child poverty rate (descending)
merged = merged.sort_values("SAEPOVRT0_17_PT", ascending=False).reset_index(drop=True)
merged["rank"] = merged.index + 1

# ---------------------------------------------------------------------------
# Print county ranking table
# ---------------------------------------------------------------------------
print(f"\n{'Rank':<5} {'County':<30} {'Child Pov Rate':<16} "
      f"{'YoY Change':<12} {'Median HHI'}")
print("-" * 80)
for _, row in merged.iterrows():
    # Flag counties at or above the 30% threshold used later in the script
    flag = "  <<< >=30%" if row["SAEPOVRT0_17_PT"] >= 30.0 else ""
    hhi = ("$" + f"{int(row['SAEMHI_PT']):,}") if pd.notna(row["SAEMHI_PT"]) else "N/A"
    chg = f"{row['yoy_change']:+.1f}pp" if pd.notna(row["yoy_change"]) else "N/A"
    print(
        f"{int(row['rank']):<5} {row['NAME']:<30} "
        f"{row['SAEPOVRT0_17_PT']:>6.1f}%{'':<8} "
        f"{chg:<12} {hhi}{flag}"
    )

# ---------------------------------------------------------------------------
# Identify counties above 30% child poverty
# ---------------------------------------------------------------------------
high_poverty = merged[merged["SAEPOVRT0_17_PT"] >= 30.0]
print(f"\n=== Counties with Child Poverty Rate >= 30% ({CURRENT_YEAR} estimates) ===")
print(f" Count: {len(high_poverty)} of {len(merged)} counties")
for _, row in high_poverty.iterrows():
    print(f" {row['NAME']:<35} {row['SAEPOVRT0_17_PT']:.1f}% "
          f"(rank {int(row['rank'])} of {len(merged)})")

# ---------------------------------------------------------------------------
# Summary statistics for the state
# ---------------------------------------------------------------------------
print(f"\n=== SAIPE Summary for state FIPS {STATE_FIPS} ({CURRENT_YEAR}) ===")
print(f" Counties in dataset: {len(merged)}")
print(f" Highest child poverty rate: {merged['SAEPOVRT0_17_PT'].max():.1f}% "
      f"({merged.loc[merged['SAEPOVRT0_17_PT'].idxmax(), 'NAME']})")
print(f" Lowest child poverty rate: {merged['SAEPOVRT0_17_PT'].min():.1f}% "
      f"({merged.loc[merged['SAEPOVRT0_17_PT'].idxmin(), 'NAME']})")
print(f" Median child poverty rate: {merged['SAEPOVRT0_17_PT'].median():.1f}%")
print(f" Mean child poverty rate: {merged['SAEPOVRT0_17_PT'].mean():.1f}%")
print(f" Counties >= 30% child pov: {len(high_poverty)}")
# This is the median of the county medians, not the statewide median income.
county_hhi_median = merged["SAEMHI_PT"].median()
hhi_median = "$" + f"{int(county_hhi_median):,}" if pd.notna(county_hhi_median) else "N/A"
print(f" Median of county median HHIs: {hhi_median}")
The script demonstrates the core SAIPE API pattern: a single GET request returns county estimates as a JSON array, with the first row as column headers. Casting the numeric variables with pd.to_numeric with errors="coerce" handles the occasional null value returned for geographies with suppressed estimates. To download school district data instead, change for=county:* to for=school+district+(unified):* and the geographic level variable to schdist; the poverty variable for the Title I-A formula is SAEPOV5_17R_PT (children ages 5–17 in related families in poverty, not a rate but a count).
Data limitations and research notes
SAIPE estimates carry significant uncertainty for small counties. A county with 2,000 people may have a point estimate of 18% child poverty with a 90% confidence interval spanning 9% to 27% — meaning the true rate could be half or one-and-a-half times the point estimate and the data cannot distinguish. The Census Bureau publishes these confidence intervals precisely because users need to understand the precision of each estimate before drawing policy or analytical conclusions. For ranking or comparing counties with very small populations, the confidence intervals should be inspected before treating the rank as meaningful.
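One quick way to inspect intervals before trusting a ranking is to check whether two counties' 90% intervals overlap and how wide each interval is relative to its point estimate. The sketch below uses the small-county figures from the paragraph above plus two hypothetical comparison counties; overlap is a rough screening heuristic, not a formal significance test, and the function names are hypothetical.

```python
def intervals_overlap(low_a, high_a, low_b, high_b):
    """Return True if two closed intervals share at least one point."""
    return low_a <= high_b and low_b <= high_a


def relative_half_width(point, low, high):
    """Half the interval width as a fraction of the point estimate."""
    return (high - low) / 2 / point


# Small county from the text: 18% with a 90% interval of 9% to 27%.
small = (18.0, 9.0, 27.0)
# Hypothetical comparison counties: (point, lower bound, upper bound).
county_x = (24.0, 21.0, 27.0)
county_y = (30.0, 28.0, 32.0)

print("relative half-width:", relative_half_width(*small))
print("overlaps county X:", intervals_overlap(small[1], small[2], county_x[1], county_x[2]))
print("overlaps county Y:", intervals_overlap(small[1], small[2], county_y[1], county_y[2]))
```

When intervals overlap, a difference in rank between the two counties should be read with caution.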
SAIPE measures income poverty using the official federal poverty measure — the same threshold used in the Current Population Survey and ACS poverty measures. The official poverty measure has well-documented limitations: it does not account for in-kind government benefits (SNAP, housing vouchers, Medicaid), it does not adjust for geographic cost-of-living variation, and its thresholds were set in the 1960s using food-budget-based methodologies that many poverty researchers consider outdated. The Census Bureau also publishes the Supplemental Poverty Measure (SPM) at the national and state level, which addresses some of these limitations, but the SPM is not available at the county level and is not used in any federal formula program. SAIPE county estimates will therefore diverge from an accurate picture of material hardship to the extent that in-kind benefit receipt is geographically concentrated.
The two-year data lag is a structural feature of SAIPE that all users must account for. The December 2025 release contains 2023 data. For policy purposes tied to current conditions — emergency declarations, disaster-related funding, real-time economic monitoring — SAIPE is not the appropriate source. For the formula programs SAIPE serves, the lag is accepted as a feature of methodological rigor rather than a defect: the model requires stable administrative record inputs that are not available in near-real-time. Analysts using SAIPE for planning purposes should note the vintage of the estimates they are using and communicate it clearly in any public-facing analysis.