Census PL 94-171: The Redistricting Data Behind Every Congressional Map
Every ten years the United States counts its residents, and within months of that count a single federal data product reshapes political power for the next decade. Public Law 94-171 — a 1975 statute requiring the Census Bureau to deliver block-level population and race data to each state by April 1 of the year following the decennial Census — is the statutory foundation beneath every congressional district, every state legislative map, and every Voting Rights Act lawsuit filed in the United States.
What Public Law 94-171 Requires
Congress enacted Public Law 94-171 in 1975 after states complained that the Census Bureau's redistricting data arrived too late and in formats too unwieldy to use for drawing district lines before legislative sessions began. The law imposed a statutory deadline: the Bureau must deliver tabulated population data to each governor by April 1 of the year following the decennial count. For the 2020 Census that deadline was April 1, 2021, but the COVID-19 pandemic delayed field operations and processing, and the Bureau did not release the data until August 12, 2021, in its legacy file format.
The delivery is not a raw microdata file. PL 94-171 specifies exactly which population tables the Bureau must produce, at which geographic levels, and in what format. States receive the data for their territory only. The Bureau then publishes the full national file publicly, giving researchers, journalists, and advocacy organizations access to the same underlying numbers that state redistricting offices use.
The PL 94-171 file is the canonical data product for redistricting — not the American Community Survey, not the Decennial Census Demographic and Housing Characteristics file, not any administrative record. When a court orders a state to redraw its maps, the remedy is measured against PL 94-171 population counts. When a legislative staffer draws a proposed district boundary, the population total shown on their screen traces back to this file.
Data Structure: Five Tables and a Geographic Header
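The PL 94-171 file is deliberately lean. Where the full decennial Census releases hundreds of detailed demographic tables, PL 94-171 publishes only five core tables, four on population and one on housing, plus geographic header records. (The 2020 release added a sixth table, P5, covering the group quarters population.)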
P1 — Race provides counts for the six race categories used in the decennial Census, which are the five minimum categories defined under the Office of Management and Budget's 1997 standards plus Some Other Race: White alone; Black or African American alone; American Indian and Alaska Native alone; Asian alone; Native Hawaiian and Other Pacific Islander alone; and Some Other Race alone, followed by the two-or-more-race combinations. Because the Census race question allows respondents to select any combination of the six categories, there are 63 distinct race groups; with the total and subtotal rows, the P1 table contains 71 cells.
P2 — Hispanic or Latino, and Not Hispanic or Latino by Race is the most operationally important table for redistricting and Voting Rights Act analysis. It crosses the Hispanic/Latino ethnicity question, which is separate from the race question on the Census form, with the race categories: it reports the Hispanic or Latino population as a single count and then breaks the non-Hispanic population down by the full set of race groups. The result is 73 cells per geography. Because Hispanic origin is an ethnicity rather than a race in the OMB framework, a respondent who identifies as Hispanic and White is counted as Hispanic, separately from non-Hispanic White respondents, so the Hispanic population can be measured independently of racial self-identification.
P3 — Race for the Population 18 Years and Over repeats the P1 race breakdown but restricts the universe to the adult voting-age population. This matters because redistricting law focuses on the citizen voting-age population for Voting Rights Act analysis, and P3 provides the first step: the voting-age universe regardless of citizenship.
P4 — Hispanic or Latino, and Not Hispanic or Latino by Race for the Population 18 Years and Over is the voting-age equivalent of P2, again crossing Hispanic ethnicity with full race categories for adults only.
H1 — Occupancy Status counts housing units as occupied or vacant. This table is used less often in pure redistricting work but is useful for sanity-checking population counts against the housing stock and for housing policy analysis.
Geographic header records accompany each table record and carry the full geography identifiers: state FIPS code, county FIPS, tract code, block group, and census block number. These identifiers allow every record to be linked to a shapefile and placed on a map.
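As a rough illustration of how those identifiers combine, the sketch below concatenates the header fields into a 15-digit block GEOID of the kind used as a join key against block shapefiles. The function names and the sample tract and block values are hypothetical, chosen only for illustration.

```python
def block_geoid(state: str, county: str, tract: str, block: str) -> str:
    """Concatenate header fields into a 15-digit census block GEOID.

    Widths: state (2) + county (3) + tract (6) + block (4).
    """
    geoid = state.zfill(2) + county.zfill(3) + tract.zfill(6) + block.zfill(4)
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError(f"unexpected GEOID: {geoid!r}")
    return geoid


def block_group_of(geoid: str) -> str:
    """Return the 12-digit block group GEOID.

    The block group number is the first digit of the 4-digit block number.
    """
    return geoid[:12]


# Hypothetical identifiers: state 48, county 201, tract 100000, block 1003.
example = block_geoid("48", "201", "100000", "1003")
print(example)
print(block_group_of(example))
```

Keeping the identifiers as zero-padded strings rather than integers matters in practice, because leading zeros (for example in state FIPS code 06) are part of the key.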
The Geographic Hierarchy: Down to the Census Block
What distinguishes PL 94-171 from most other federal statistical products is its geographic resolution. The file descends the full Census geographic hierarchy: nation, state, county, county subdivision, place, census tract, block group, and finally the census block — the smallest unit the Census Bureau tabulates.
Census blocks are the atoms of American geography. Bounded by streets, waterways, railroads, and administrative boundaries, a typical block contains roughly 40 people, though this varies enormously — urban blocks may hold hundreds in a single apartment tower while rural blocks may be entirely uninhabited. The 2020 Census delineated approximately 8.1 million census blocks nationwide. Only about 5.9 million contained any population; the remaining 2.2 million were uninhabited at the time of enumeration.
Redistricting software ingests PL 94-171 at the block level and aggregates blocks into proposed districts. A staffer drawing a district boundary can move a single block from one district to another and immediately see the population change reflected in the district's total. Without block-level data, achieving the precision that federal equal-population requirements demand would be impossible.
One Person, One Vote: The Constitutional Mandate
The legal architecture that makes PL 94-171 essential traces to two Supreme Court decisions in 1964. Wesberry v. Sanders held that congressional districts within a state must contain substantially equal populations, derived from Article I, Section 2 of the Constitution's requirement that Representatives be chosen “by the People.” Reynolds v. Sims extended the equal-population requirement to state legislative districts under the Equal Protection Clause of the Fourteenth Amendment.
For congressional districts, the standard is close to absolute: the Supreme Court has held that any avoidable deviation from mathematical equality must be justified by a legitimate state objective, and in practice most states draw congressional districts that differ from one another by no more than one person. Achieving this precision requires block-level population data because the average congressional district contains roughly 760,000 people (total U.S. resident population divided by 435 seats), and coarser units such as tracts or block groups are too large to balance districts that finely.
For state legislative districts the standard is more forgiving. The Court has generally treated plans with an overall deviation under 10 percent as presumptively valid, where overall deviation is the gap between the most and least populous districts expressed as a percentage of the ideal district population. Plans with deviations beyond 10 percent require a showing that the deviation arises from a rational state policy. Even within this more flexible standard, block-level data is the practical tool for minimizing deviation and insulating maps from legal challenge.
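The two standards can be made concrete with a little arithmetic. The sketch below computes the ideal district population and the overall deviation for a plan; the function name and the four district populations are made up for illustration.

```python
def plan_deviation(district_pops):
    """Return (ideal population, per-district deviations, overall deviation).

    Deviations are fractions of the ideal district population. The overall
    deviation is the largest positive deviation minus the largest negative one.
    """
    ideal = sum(district_pops) / len(district_pops)
    deviations = [(pop - ideal) / ideal for pop in district_pops]
    overall = max(deviations) - min(deviations)
    return ideal, deviations, overall


# Hypothetical four-district legislative plan.
pops = [10_300, 9_900, 9_600, 10_200]
ideal, deviations, overall = plan_deviation(pops)

print(f"ideal district population: {ideal:,.0f}")
print(f"overall deviation: {overall:.1%}")
print("under the 10 percent threshold:", overall < 0.10)
```

A congressional plan held to the one-person standard would show an overall deviation of essentially zero on the same calculation.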
2020 Census Apportionment Results
The 2020 decennial Census counted 331,449,281 residents of the 50 states and the District of Columbia as of April 1, 2020. The allocation of the 435 House seats among the states uses a slightly different figure, the apportionment population of 331,108,434, which excludes the District of Columbia and adds overseas federal employees and military personnel, with the dependents living with them, allocated to their home states.
The apportionment calculation, performed using the method of equal proportions, shifted seven seats among 13 states relative to the prior decade. Texas gained two seats, reflecting rapid population growth in the Dallas–Fort Worth and Houston metropolitan areas. Colorado, Florida, Montana, North Carolina, and Oregon each gained one seat. On the losing side, California, Illinois, Michigan, New York, Ohio, Pennsylvania, and West Virginia each lost one seat.
New York's loss attracted particular attention because the state missed retaining its seat by 89 people out of a state population exceeding 20 million, a margin of roughly 0.0004 percent. Whether that margin reflects true population differences or differential undercounting between New York and gaining states became a point of political and methodological dispute. The dispute concerns coverage error rather than the Census Bureau's differential privacy implementation, which left state population totals exact.
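The method of equal proportions can be sketched in a few lines: every state starts with one seat, and each remaining seat goes to the state with the highest priority value, computed as population divided by the square root of n(n+1), where n is the state's current seat count. The three state populations below are invented, and `apportion` is an illustrative name, not a Census Bureau routine.

```python
import heapq
from math import sqrt


def apportion(populations, seats):
    """Allocate seats by the method of equal proportions (Huntington-Hill)."""
    if seats < len(populations):
        raise ValueError("need at least one seat per state")
    allocation = {state: 1 for state in populations}
    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        allocation[state] += 1
        n = allocation[state]
        # Priority for this state's next seat, given it now holds n seats.
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))
    return allocation


# Hypothetical three-state example with seven seats.
toy = {"A": 1000, "B": 500, "C": 250}
result = apportion(toy, 7)
print(result)
```

With the real inputs, the same loop would run over 50 states and 435 seats using the apportionment populations.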
Differential Privacy and the 2020 File
The 2020 PL 94-171 file was the first decennial Census data product to apply formal differential privacy protections to individual responses. The Census Bureau implemented a framework called the TopDown Algorithm, which introduces carefully calibrated statistical noise into tabulated counts to prevent statistical reconstruction of individual responses from the published data.
The noise injection works roughly as follows: the algorithm begins with exact enumeration counts, adds random noise drawn from a mathematical distribution governed by a privacy budget parameter (epsilon), and then applies a post-processing step that constrains noisy counts to be non-negative integers that sum consistently across geographic levels. The result is a file where each count is accurate on average but may differ from the true count by a small amount at any specific geography.
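The shape of those steps, noise followed by post-processing, can be illustrated with a toy example. This is not the TopDown Algorithm and is not a differentially private mechanism: it uses rounded Gaussian noise from Python's standard library and a simple proportional rescaling as stand-ins, it holds a parent total fixed purely to show the consistency constraint, and every name and number in it is an assumption made for illustration.

```python
import random


def add_noise(true_counts, sigma, rng):
    """Add rounded Gaussian noise to each count (illustrative only)."""
    return [c + round(rng.gauss(0, sigma)) for c in true_counts]


def post_process(noisy_counts, parent_total):
    """Force counts to be non-negative integers that sum to parent_total."""
    clipped = [max(0, c) for c in noisy_counts]
    clipped_sum = sum(clipped)
    if clipped_sum == 0:
        # Nothing to scale: spread the parent total as evenly as possible.
        base, extra = divmod(parent_total, len(clipped))
        return [base + (1 if i < extra else 0) for i in range(len(clipped))]
    scaled = [c * parent_total / clipped_sum for c in clipped]
    result = [int(x) for x in scaled]  # floor, since values are non-negative
    shortfall = parent_total - sum(result)
    # Hand leftover units to the cells with the largest fractional parts.
    by_fraction = sorted(
        range(len(scaled)), key=lambda i: scaled[i] - result[i], reverse=True
    )
    for i in by_fraction[:shortfall]:
        result[i] += 1
    return result


rng = random.Random(2020)
true_blocks = [12, 0, 45, 2, 130]   # hypothetical block populations
parent_total = sum(true_blocks)     # held fixed in this toy example
noisy = add_noise(true_blocks, sigma=3.0, rng=rng)
published = post_process(noisy, parent_total)

print("true:     ", true_blocks)
print("noisy:    ", noisy)
print("published:", published)
```

Even in this simplified form, the small blocks move proportionally far more than the large one, and the clipping step shows why post-processed counts can be biased for cells near zero.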
A critical design constraint was that total state-level populations must remain exact. The Constitution requires apportionment to be based on actual enumeration, and any artificial alteration of state population totals would create an apportionment that does not reflect the true count. The Census Bureau therefore applied noise at sub-state geographies only, constraining the totals to match exact enumeration at the state level.
The practical consequences are most severe at the block level. A block with a true population of 12 might appear in the published file with a population of 9 or 15. For small racial or ethnic group counts at the block level, the noise can be proportionally large enough to render specific cell values unreliable. A block with 2 Black residents might appear with 0 or 5. Redistricting practitioners generally treat block-level racial detail with appropriate skepticism and aggregate to larger geographies before drawing conclusions about racial composition.
After the redistricting release, the Census Bureau published a second, more detailed data product, the Demographic and Housing Characteristics (DHC) file, which was protected with its own application of the disclosure avoidance system. Redistricting practitioners and researchers now have access to both files, though PL 94-171 remains the legally specified product for redistricting deadlines.
Race Categories and Their Complexity
The race question on the 2020 Census form was revised from prior decades to encourage more detailed write-in responses and to more prominently reflect the “Some Other Race” category. The question asks respondents to select all races that apply from a list of checkboxes and write-in fields, and the Census Bureau codes responses into the six primary OMB categories plus combination categories.
Because respondents can select any combination of the six primary race categories, the full race enumeration contains 63 possible cells: 6 alone categories, 15 two-race combinations, 20 three-race combinations, 15 four-race combinations, 6 five-race combinations, and 1 six-race combination. The P1 table contains all 63, though most multi-race combinations contain very small counts at any given geography.
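The count of 63 can be checked by enumerating every non-empty combination of the six categories. The snippet below is a small verification sketch using only the standard library.

```python
from itertools import combinations

RACES = [
    "White",
    "Black or African American",
    "American Indian and Alaska Native",
    "Asian",
    "Native Hawaiian and Other Pacific Islander",
    "Some Other Race",
]

# Number of distinct groups for each count of races selected (1 through 6).
groups_by_size = {
    k: len(list(combinations(RACES, k))) for k in range(1, len(RACES) + 1)
}
total_groups = sum(groups_by_size.values())

print(groups_by_size)
print(total_groups)
```

The total equals 2 to the sixth power minus 1, since each category is either selected or not and at least one must be selected.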
The Hispanic or Latino ethnicity question is asked separately, and the Census form explicitly states that Hispanic origin is not a race. Respondents who identify as Hispanic may also identify as White, Black, American Indian, or any combination. The P2 table reports the Hispanic or Latino population as a single count and breaks the non-Hispanic population down by all 63 race combinations, for a total of 73 cells: 1 total, 1 Hispanic or Latino, 1 not Hispanic or Latino, 63 race-specific cells for the non-Hispanic population, and 7 subtotal cells (one race, two or more races, and the two-race through six-race groupings). For redistricting and Voting Rights Act purposes, the “Hispanic or Latino” total from P2_002N and the non-Hispanic group counts from P2_003N through P2_073N are the working variables.
Analysts typically use a simplified crosswalk: P2_002N for the total Hispanic or Latino population, P2_005N for the non-Hispanic White alone population, P2_006N for the non-Hispanic Black alone population, P2_007N through P2_009N for non-Hispanic AIAN, Asian, and NHPI alone populations, and the sum of P2_010N (Some Other Race alone) and P2_011N (two or more races) for the remaining non-Hispanic population. This simplification, like P2 itself, treats Hispanics of every race as a single group, but it is standard practice in redistricting litigation and academic research.
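That crosswalk can be written as a small function. The sketch below collapses one P2 record into the simplified groups and checks that they add back up to the total; the function name, output labels, and sample counts are hypothetical.

```python
def collapse_p2(record):
    """Collapse a P2 record into the simplified redistricting groups."""
    groups = {
        "hispanic": record["P2_002N"],
        "nh_white": record["P2_005N"],
        "nh_black": record["P2_006N"],
        "nh_aian": record["P2_007N"],
        "nh_asian": record["P2_008N"],
        "nh_nhpi": record["P2_009N"],
        # Some Other Race alone plus two or more races, both non-Hispanic.
        "nh_other_or_multi": record["P2_010N"] + record["P2_011N"],
    }
    if sum(groups.values()) != record["P2_001N"]:
        raise ValueError("groups do not sum to the P2_001N total")
    return groups


# Hypothetical counts for a single geography.
sample = {
    "P2_001N": 1000, "P2_002N": 300, "P2_005N": 400, "P2_006N": 150,
    "P2_007N": 10, "P2_008N": 80, "P2_009N": 5, "P2_010N": 15, "P2_011N": 40,
}
collapsed = collapse_p2(sample)
print(collapsed)
```

The sum check is a cheap guard against pulling the wrong variable, since the seven groups partition the total population.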
Census API Access
The 2020 PL 94-171 data is available through the Census Bureau's public API at the endpoint api.census.gov/data/2020/dec/pl. The variable naming convention uses the table identifier, an underscore, the sequential cell number, and the suffix N for numeric count. Thus P1_001N is total population from table P1, P2_002N is total Hispanic or Latino from table P2, P2_005N is non-Hispanic White alone, and P2_006N is non-Hispanic Black alone.
Geographic filtering follows the Census API's standard pattern of for and in parameters. To retrieve data for all census tracts in a state, use for=tract:*&in=state:XX where XX is the two-digit state FIPS code. To retrieve block-level data, which requires specifying the containing county and tract, use for=block:*&in=state:XX county:YYY tract:ZZZZZZ. Block-level queries must be made one county or tract at a time because the API limits response sizes.
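The two query shapes can be assembled with the standard library, as in the sketch below. It only builds URL strings and makes no network request; the helper names and the sample county and tract codes are hypothetical.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data/2020/dec/pl"


def build_url(variables, for_clause, in_clause):
    """Build a PL 94-171 API URL, leaving ':', ',' and '*' unescaped."""
    query = urlencode(
        {"get": ",".join(variables), "for": for_clause, "in": in_clause},
        safe=":,*",
    )
    return BASE + "?" + query


def tract_url(state, variables):
    """All tracts in one state."""
    return build_url(variables, "tract:*", f"state:{state}")


def block_url(state, county, tract, variables):
    """All blocks in one tract; spaces in the 'in' clause are encoded as '+'."""
    return build_url(
        variables, "block:*", f"state:{state} county:{county} tract:{tract}"
    )


tracts = tract_url("48", ["P1_001N", "P2_002N"])
blocks = block_url("48", "201", "100000", ["P1_001N"])
print(tracts)
print(blocks)
```

An API key, if you have one, would be appended as a further `key` parameter in the same dictionary.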
The Census Bureau provides free API keys through a registration page at api.census.gov/data/key_signup.html. Unauthenticated requests are rate-limited to 500 per day per IP address; a key raises the limit substantially. For bulk downloads of the full national block-level file, the Bureau also distributes flat files by state through data.census.gov, which are faster than the API for complete-state extracts.
Voting Rights Act Section 2 and Majority-Minority Districts
The Voting Rights Act of 1965 prohibits election practices that discriminate on the basis of race. Section 2 of the Act provides a private right of action against voting practices — including redistricting plans — that result in the denial or abridgement of the right to vote on account of race or color. Unlike Section 5, which required certain jurisdictions to obtain preclearance before changing election laws (a requirement the Supreme Court effectively suspended in Shelby County v. Holder in 2013), Section 2 applies nationwide and remains in full effect.
The Supreme Court established the operative framework for Section 2 redistricting claims in Thornburg v. Gingles (1986), which articulated a three-part precondition test. A minority group claiming a Section 2 violation must show: first, that the group is sufficiently large and geographically compact to constitute a majority in a reasonably configured single-member district; second, that the group is politically cohesive; and third, that the white majority votes sufficiently as a bloc to defeat the minority group's preferred candidate. If all three preconditions are met, the court proceeds to a totality-of-circumstances analysis.
PL 94-171 racial data is directly implicated in the first Gingles precondition. To demonstrate that a minority group is large enough to constitute a majority in a reasonably configured district, plaintiffs and defendants alike draw proposed districts using block-level PL 94-171 counts and compute the voting-age population shares from P3 and P4 tables. Courts have grappled with whether the relevant population is total population, voting-age population, or citizen voting-age population — and because PL 94-171 does not include citizenship data, the latter requires supplementing with ACS citizenship estimates.
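A minimal sketch of that first-precondition arithmetic follows: sum block-level P4 counts across the blocks assigned to a proposed district and compute the group's share of the voting-age population. The block records are invented, and using P4_006N (non-Hispanic Black alone, 18 and over) is only one of several group definitions that analysts use.

```python
def vap_share(blocks, group_var="P4_006N", total_var="P4_001N"):
    """Share of a district's voting-age population belonging to one group.

    blocks is an iterable of dicts holding block-level P4 counts.
    """
    blocks = list(blocks)
    total = sum(b[total_var] for b in blocks)
    group = sum(b[group_var] for b in blocks)
    if total == 0:
        raise ValueError("district has no voting-age population")
    return group / total


# Hypothetical blocks assigned to one proposed district.
district_blocks = [
    {"P4_001N": 120, "P4_006N": 70},
    {"P4_001N": 200, "P4_006N": 90},
    {"P4_001N": 80, "P4_006N": 50},
]
share = vap_share(district_blocks)
print(f"group share of VAP: {share:.1%}")
print("majority of voting-age population:", share > 0.5)
```

A citizen voting-age figure would need ACS citizenship estimates layered on top, since nothing in the P3 or P4 counts distinguishes citizens from non-citizens.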
In Alabama Legislative Black Caucus v. Alabama (2015), the Supreme Court vacated a lower-court decision upholding Alabama's legislative maps, faulting the state's mechanical reliance on maintaining fixed percentages of Black population in majority-minority districts, which challengers argued packed Black voters. In Allen v. Milligan (2023), the Court affirmed a lower-court finding that Alabama's congressional map likely violated Section 2 because the state drew only one majority-Black district even though plaintiffs showed, under Gingles, that a second reasonably configured one could be drawn. Both cases turned on careful analysis of PL 94-171 racial distributions at the block level.
Partisan Gerrymandering and Its Federal Limits
In Rucho v. Common Cause (2019), the Supreme Court held that federal courts lack authority to adjudicate partisan gerrymandering claims. The majority opinion concluded that there are no judicially manageable standards for distinguishing permissible partisan considerations from unconstitutional partisan gerrymanders, and therefore the federal courts must leave such claims to the political process. This holding leaves equal-population claims, racial gerrymandering claims, and Voting Rights Act claims as the principal federal avenues for challenging a map.
State courts have sometimes filled the gap. The Pennsylvania Supreme Court struck down that state's congressional map in 2018 under the state constitution's free and equal elections clause. North Carolina's supreme court similarly struck down partisan maps under state law in 2022, though after a change in the court's composition it reversed that ruling in 2023. The variability of state court outcomes reflects the fragmented landscape that Rucho created.
Political scientists have developed quantitative measures of partisan gerrymandering, most notably the efficiency gap — a metric that computes the difference in wasted votes between parties across all districts in a plan. Computing efficiency gaps requires both election results and district populations, and researchers typically anchor population data to PL 94-171 to ensure denominators reflect the same universe used for drawing the maps.
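The efficiency gap itself is simple to compute once district-level vote totals are in hand. The sketch below uses the common formulation in which a losing party wastes all of its votes and a winning party wastes its votes beyond half of those cast in the district; the function name, the sign convention, and the four-district vote totals are assumptions made for illustration.

```python
def efficiency_gap(districts):
    """Efficiency gap for a two-party plan.

    districts is a list of (votes_a, votes_b) tuples. A positive result means
    party A wasted more votes than party B; a negative result means the plan
    favors party A.
    """
    wasted_a = wasted_b = total_votes = 0.0
    for votes_a, votes_b in districts:
        cast = votes_a + votes_b
        if votes_a == votes_b:
            raise ValueError("tied districts are not handled in this sketch")
        if votes_a > votes_b:
            wasted_a += votes_a - cast / 2   # winner's surplus votes
            wasted_b += votes_b              # all of the loser's votes
        else:
            wasted_b += votes_b - cast / 2
            wasted_a += votes_a
        total_votes += cast
    return (wasted_a - wasted_b) / total_votes


# Hypothetical plan: party A wins three districts narrowly, B wins one heavily.
plan = [(55, 45), (55, 45), (55, 45), (20, 80)]
gap = efficiency_gap(plan)
print(f"efficiency gap: {gap:+.3f}")
```

In this made-up plan, party B receives more votes overall yet wins only one of four seats, which is the packing-and-cracking pattern the metric is designed to surface.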
Independent Redistricting Commissions
Dissatisfaction with legislative self-interest in map drawing has driven a wave of reform since the early 2000s. Arizona voters adopted an independent redistricting commission by ballot initiative in 2000; California voters followed in 2008 and 2010; Michigan and Colorado adopted commissions in 2018. Several other states have created bipartisan or advisory commissions with varying degrees of independence from the legislature.
Independent commissions use the same PL 94-171 data as legislative bodies; the statutory requirement does not change based on who draws the lines. What changes is the process: commissions typically hold public hearings, take public testimony on communities of interest, and apply criteria specified in the authorizing law or state constitution. Common criteria include equal population, compliance with the Voting Rights Act, geographic compactness, preservation of political subdivisions and communities of interest, and, where state law permits it, partisan fairness.
Commission processes have not eliminated litigation. Arizona's commission faced a challenge that reached the Supreme Court in Arizona State Legislature v. Arizona Independent Redistricting Commission (2015), where the Court held 5–4 that the Elections Clause of the Constitution permits states to vest redistricting authority in a commission rather than the legislature. The decision preserved the commission model but did not resolve the underlying political conflicts that redistricting inevitably produces.
Python: County-Level P2 Summary from the Census API
The script below queries the Census Bureau's PL 94-171 API for all census tracts in a specified state, computes population shares for Hispanic, non-Hispanic White, non-Hispanic Black, and all other groups at the tract level, then aggregates to a county-level summary table. Adjust STATE_FIPS to the two-digit FIPS code for your target state.
import requests
import pandas as pd
# Download the PL 94-171 P2 table (Hispanic/Latino by race) for all
# census tracts in a given state and compute group population shares.
# Then summarize at the county level.
#
# Register for a free Census API key at: api.census.gov/data/key_signup.html
# State FIPS codes: https://www.census.gov/library/reference/code-lists/ansi/ansi-codes-for-states.html
API_KEY = "YOUR_CENSUS_API_KEY"
STATE_FIPS = "48" # Texas; change to your target state
BASE = "https://api.census.gov/data/2020/dec/pl"
# P2 variables (selected)
# P2_001N Total population
# P2_002N Hispanic or Latino (of any race)
# P2_005N Not Hispanic or Latino: White alone
# P2_006N Not Hispanic or Latino: Black or African American alone
# P2_007N Not Hispanic or Latino: American Indian and Alaska Native alone
# P2_008N Not Hispanic or Latino: Asian alone
# P2_009N Not Hispanic or Latino: Native Hawaiian and Other Pacific Islander alone
# P2_010N Not Hispanic or Latino: Some Other Race alone
# P2_011N Not Hispanic or Latino: Two or more races
params = {
    "get": "P2_001N,P2_002N,P2_005N,P2_006N,P2_007N,P2_008N,P2_009N,P2_010N,P2_011N,NAME",
    "for": "tract:*",
    "in": "state:" + STATE_FIPS,
    "key": API_KEY,
}
resp = requests.get(BASE, params=params, timeout=120)
resp.raise_for_status()
data = resp.json()
columns = data[0]
rows = data[1:]
df = pd.DataFrame(rows, columns=columns)
rename_map = {
    "P2_001N": "total",
    "P2_002N": "hispanic",
    "P2_005N": "nh_white",
    "P2_006N": "nh_black",
    "P2_007N": "nh_aian",
    "P2_008N": "nh_asian",
    "P2_009N": "nh_nhpi",
    "P2_010N": "nh_other",
    "P2_011N": "nh_multirace",
}
df = df.rename(columns=rename_map)
numeric_cols = list(rename_map.values())
for col in numeric_cols:
    df[col] = pd.to_numeric(df[col], errors="coerce")
# Drop tracts with zero or missing total population (uninhabited tracts)
df = df[df["total"] > 0].copy()
# Compute "all other" as everything not Hispanic, NH White, or NH Black
df["other"] = (
    df["nh_aian"] + df["nh_asian"] + df["nh_nhpi"] + df["nh_other"] + df["nh_multirace"]
)
# Population shares for each tract
df["hispanic_share"] = round(df["hispanic"] / df["total"], 4)
df["nh_white_share"] = round(df["nh_white"] / df["total"], 4)
df["nh_black_share"] = round(df["nh_black"] / df["total"], 4)
df["other_share"] = round(df["other"] / df["total"], 4)
# ---- County-level summary ----
county_grp = df.groupby("county")[["total", "hispanic", "nh_white", "nh_black", "other"]].sum()
county_grp["hispanic_share"] = round(county_grp["hispanic"] / county_grp["total"], 4)
county_grp["nh_white_share"] = round(county_grp["nh_white"] / county_grp["total"], 4)
county_grp["nh_black_share"] = round(county_grp["nh_black"] / county_grp["total"], 4)
county_grp["other_share"] = round(county_grp["other"] / county_grp["total"], 4)
county_summary = county_grp[
    ["total", "hispanic_share", "nh_white_share", "nh_black_share", "other_share"]
].sort_values("total", ascending=False)
print("County-level PL 94-171 P2 summary for state FIPS " + STATE_FIPS)
print("Tracts analyzed: " + str(len(df)))
print("")
print(county_summary.to_string())
The output is a county-level table sorted by total population, showing the share of each major group. Counties with high Hispanic shares and moderate non-Hispanic Black shares, combined with the Gingles preconditions analysis, form the starting point for VRA Section 2 opportunity district analysis. The tract-level data underlying the county aggregation can be exported to GIS software by joining on the state, county, and tract FIPS fields against a Census TIGER shapefile.