Congressional Voting Records: The Federal Database Behind Every House and Senate Roll Call Vote
Congressional roll call vote data, maintained through VoteView, Congress.gov, and GovInfo, covers every recorded vote in the House and Senate dating back to the First Congress in 1789. It enables researchers to calculate legislator ideology scores, track party loyalty, analyze bipartisan coalitions, and build comprehensive political science datasets spanning more than 230 years of American legislative history.
Sources of roll call vote data
Five primary sources distribute congressional roll call vote records. Each covers a different scope, update cadence, and level of analytical enrichment.
VoteView (voteview.com) is the gold standard for legislative research. Maintained by political scientists at UCLA, VoteView distributes the Poole–Rosenthal dataset: every roll call vote cast in the House and Senate from the 1st Congress (1789) through the current Congress, linked to DW-NOMINATE ideology scores for every legislator. Data is available as plain CSV files, updated within days of new votes. No API key is required—files are direct HTTP downloads. VoteView is the primary dataset for peer-reviewed work in legislative behavior.
Congress.gov API (api.congress.gov) is the official congressional data service, operated by the Library of Congress. It provides real-time structured JSON for current-Congress votes, member biographical data, committee assignments, and sponsored legislation. The API is free with registration at api.data.gov. It is the authoritative source for live vote data as floor votes occur.
GovInfo (govinfo.gov) is the federal depository for official government publications and bulk data. The Congressional Record, House Journal, and Senate Journal—the official legislative minutes—are available as MODS XML and plain text. GovInfo's bulk data API allows programmatic download of vote data in structured formats for redistribution and archival purposes.
ProPublica Congress API provides member-level vote data with pre-calculated party unity scores, vote comparison tools, and bill tracking. It is particularly useful for accountability journalism workflows because it surfaces votes-with-party-pct for each member without requiring researchers to compute it from raw cast data. Registration is free at ProPublica's data store.
The Clerk of the House and the Secretary of the Senate maintain the authoritative official records. The House Clerk publishes XML vote files at clerk.house.gov within minutes of a vote's conclusion. The Senate's official votes are published at senate.gov. These are the upstream sources that Congress.gov and VoteView ultimately derive from.
VoteView and DW-NOMINATE
Keith Poole and Howard Rosenthal developed the NOMINATE (Nominal Three-Step Estimation) scaling algorithm in the early 1980s to extract latent ideological positions from observed voting patterns. The core intuition is that if two legislators vote together consistently, they are ideologically proximate; if they vote against each other consistently, they are distant. NOMINATE fits a spatial model of voting to the complete roll call matrix to recover legislator positions in an ideological space.
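The co-voting intuition can be made concrete with a pairwise agreement rate. The sketch below is not NOMINATE itself, only the raw signal a spatial model starts from; the legislator labels and votes are invented for illustration.

```python
from itertools import combinations

# Toy roll call matrix: legislator -> votes (1 = Yea, 0 = Nay, None = did not vote).
# Labels and votes are invented for illustration only.
votes = {
    "A": [1, 1, 0, 1, 0],
    "B": [1, 1, 0, 1, 1],
    "C": [0, 0, 1, None, 1],
}


def agreement_rate(x, y):
    """Share of roll calls on which both members voted and cast the same vote."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return float("nan")
    return sum(a == b for a, b in shared) / len(shared)


# Agreement rate for every unordered pair of legislators.
pairwise = {
    (m1, m2): agreement_rate(votes[m1], votes[m2])
    for m1, m2 in combinations(sorted(votes), 2)
}

for pair, rate in pairwise.items():
    print(pair, round(rate, 2))
```

Pairs with high agreement would sit close together in the recovered space, and pairs with low agreement far apart; the scaling algorithm turns these pairwise patterns into coordinates.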
The standard release distributed through VoteView is DW-NOMINATE (Dynamic Weighted NOMINATE), which extends the original model to allow legislator positions to evolve gradually over a career while constraining movement to be smooth. DW-NOMINATE produces two-dimensional ideology scores for every legislator who ever served in either chamber from the 1st Congress through the present day—currently covering over 11,000 unique legislators.
Dimension 1 runs from −1.0 (most liberal) to +1.0 (most conservative) on the economic left-right axis. It captures the dominant cleavage in American legislative politics: government intervention in the economy, taxation, social insurance, and regulatory policy. Dimension 1 alone explains over 90 percent of observed roll call voting behavior in most Congresses. It is the number that political scientists use when they refer to a legislator's ideological position.
Dimension 2 captures cross-cutting issues that do not map onto the economic axis. During the 19th century it reflected geographic and sectional alignments. During the mid-20th century it captured the civil rights cleavage—southern Democrats and northern Democrats voted together on economic questions but diverged sharply on civil rights legislation. After the realignment of the 1960s and 1970s, Dimension 2 collapsed in explanatory power because the parties sorted geographically and racially. In modern Congresses, Dimension 2 explains only 2–5 percent of variance and is rarely used in empirical research.
The DW-NOMINATE ideology file is available at voteview.com/data as a CSV. Key columns include:
- icpsr: unique legislator identifier (ICPSR political science standard)
- congress: Congress number (1–118+)
- state_abbrev: two-letter state abbreviation
- party_code: ICPSR party code (100 = Democrat, 200 = Republican, 328 = Independent)
- bioname: legislator name in Last, First format
- nominate_dim1: first-dimension DW-NOMINATE score
- nominate_dim2: second-dimension score
- nominate_geo_mean_probability: model fit statistic; lower values indicate more unpredictable voting
- nominate_number_of_votes: count of votes used to estimate the score
Party mean DW-NOMINATE scores over time trace the course of polarization. In the 88th Congress (1963–1964), the mean Dimension 1 score for House Democrats was approximately −0.28 and for Republicans approximately +0.26, a gap of 0.54 units, with substantial ideological overlap between moderate members of both parties. By the 118th Congress (2023–2024), the mean House Democrat score was approximately −0.38 and the mean Republican score approximately +0.51, a gap of 0.89 units, and the two distributions no longer overlap at all. Keith Poole documented this polarization trend extensively; the DW-NOMINATE data is the primary empirical basis for the academic consensus that Congress is more polarized today than at any point since Reconstruction.
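The party gap and the overlap check can be computed from member rows in a few lines. This is a minimal sketch that assumes each row is a dict carrying the party_code and nominate_dim1 columns described above; the sample scores are invented and are not real legislators.

```python
from statistics import mean


def party_gap(rows):
    """Summarize first-dimension scores for the two major parties.

    rows: iterable of dicts with 'party_code' (100 = Democrat, 200 = Republican)
    and 'nominate_dim1' (float, or None when no score was estimated).
    """
    dem = [r["nominate_dim1"] for r in rows
           if r["party_code"] == 100 and r["nominate_dim1"] is not None]
    rep = [r["nominate_dim1"] for r in rows
           if r["party_code"] == 200 and r["nominate_dim1"] is not None]
    return {
        "dem_mean": mean(dem),
        "rep_mean": mean(rep),
        "gap": mean(rep) - mean(dem),
        # Overlap exists if any Democrat sits to the right of the leftmost Republican.
        "overlap": max(dem) > min(rep),
    }


# Invented scores for illustration; party 328 (Independent) is ignored.
sample = [
    {"party_code": 100, "nominate_dim1": -0.5},
    {"party_code": 100, "nominate_dim1": -0.3},
    {"party_code": 200, "nominate_dim1": 0.4},
    {"party_code": 200, "nominate_dim1": 0.6},
    {"party_code": 328, "nominate_dim1": -0.1},
]
summary = party_gap(sample)
print(summary)
```

Running the same function once per Congress on the real member file would reproduce the kind of time series discussed above.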
Vote data structure
Every roll call vote in the House or Senate generates a structured record with several layers of data. Understanding the structure is essential before working with any of the upstream sources.
The vote header record contains:
- Vote identifier: a unique reference number, e.g., “H RCS 117-450” (House Roll Call Series, 117th Congress, vote 450). Senate votes use “S Vote” nomenclature.
- Congress and session: each Congress spans two years and is divided into two regular sessions, one per calendar year.
- Date and time: precise timestamp, typically Eastern time, recorded by the presiding officer.
- Question: the formal parliamentary question before the chamber, e.g., “On Passage”, “On the Amendment”, “On the Motion to Table”.
- Description: the bill number, amendment number, or motion being voted on, e.g., “H.R. 3684, Infrastructure Investment and Jobs Act”.
- Result: the outcome, such as “Passed”, “Agreed to”, “Rejected”, or “Failed”. The exact language varies by chamber.
- Aggregate totals: counts of Yea, Nay, Present, and Not Voting. With every seat filled and every member voting, a majority is 218 in the House and 51 in the Senate; invoking cloture on most Senate business requires 60.
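One way to hold a header record in code is a small immutable dataclass. The field names below are illustrative choices, not the schema of any particular source, and the example values are placeholders rather than a real vote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoteHeader:
    """Header fields for one roll call vote (field names are hypothetical)."""
    vote_id: str
    congress: int
    session: int
    date: str          # ISO 8601 timestamp as published by the source
    question: str
    description: str
    result: str
    yeas: int
    nays: int
    present: int
    not_voting: int

    @property
    def total_members(self) -> int:
        """All members accounted for in the aggregate totals."""
        return self.yeas + self.nays + self.present + self.not_voting

    @property
    def yeas_exceed_nays(self) -> bool:
        """Simple-majority test; Present and Not Voting do not count either way."""
        return self.yeas > self.nays


# Placeholder values for illustration only.
example = VoteHeader(
    vote_id="EXAMPLE-1",
    congress=117,
    session=1,
    date="2021-01-01T12:00:00-05:00",
    question="On Passage",
    description="Placeholder bill",
    result="Passed",
    yeas=220,
    nays=210,
    present=1,
    not_voting=4,
)
print(example.total_members, example.yeas_exceed_nays)
```

The simple-majority property deliberately ignores supermajority questions such as cloture; those would need the question type as an input.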
Each member's individual vote is coded as one of four values:
- Y / Yea: affirmative vote
- N / Nay: negative vote
- P / Present: member answered present but declined to vote yes or no; counts toward quorum but not toward passage
- NV / Not Voting: member did not answer the roll; most common for illness, travel, or deliberate abstention
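VoteView's cast-vote records use numeric codes: 1 = Yea, 6 = Nay, 7 and 8 = Present, 9 = Not Voting, and 0 = not a member of the chamber when the vote was taken; codes 2–5 mark paired and announced positions. The data is distributed in long format, one row per member per roll call, which pivots directly into a member×vote matrix of these codes.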
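Numeric cast codes are usually collapsed into a handful of categories before analysis. The sketch below assumes the ten-value cast_code scheme in VoteView's data documentation (0 = not a member, 1–3 = Yea variants, 4–6 = Nay variants, 7–8 = Present, 9 = Not Voting). Folding paired and announced positions into Yea and Nay is an analytical choice, not something the data requires.

```python
from collections import Counter

# Assumed mapping from VoteView cast_code values to collapsed categories.
CAST_CODE_LABELS = {
    0: "not_member",
    1: "yea", 2: "yea", 3: "yea",      # Yea, Paired Yea, Announced Yea
    4: "nay", 5: "nay", 6: "nay",      # Announced Nay, Paired Nay, Nay
    7: "present", 8: "present",
    9: "not_voting",
}


def collapse(code):
    """Map one numeric cast code to its category, or 'unknown' if unrecognized."""
    return CAST_CODE_LABELS.get(int(code), "unknown")


def tally(codes):
    """Count collapsed categories across an iterable of cast codes."""
    return Counter(collapse(c) for c in codes)


counts = tally([1, 1, 6, 9, 0, 7])
print(dict(counts))
```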
Vote categories matter significantly for analysis. The House takes approximately 1,000 roll call votes per year; the Senate takes approximately 300–500. The majority of House votes are procedural: the Previous Question motion (which ends debate and brings the underlying question to a vote), motions to recommit, rule adoption votes (approving the structured rule that governs floor debate on a bill), and quorum calls. Substantive votes—final passage of legislation, amendments, conference reports—represent perhaps 30–40 percent of the total. DW-NOMINATE uses all votes including procedural ones, which is part of why Dimension 1 has such high explanatory power: procedural votes are highly party-line.
Party loyalty and key votes
Congressional Quarterly—now CQ Roll Call—has since 1945 calculated party unity scores for every member of Congress. A party unity vote is defined as any vote where a majority of Democrats voted against a majority of Republicans (or vice versa). A member's party unity score is the percentage of party unity votes on which they voted with their party's majority. In the 118th Congress, average party unity scores exceeded 95 percent for both chambers—meaning members almost always voted with their party when the parties were divided. In the 1960s, average party unity scores were closer to 60 percent, reflecting the cross-cutting alliances of that era.
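The party unity definition translates directly into code. This is a minimal sketch, not CQ Roll Call's implementation: it assumes long-format records of (roll_call, member, party, position), counts only Yea and Nay positions, and leaves absences out of the denominator. The toy records are invented.

```python
from collections import defaultdict


def majority_position(positions):
    """Return 'yea' or 'nay' for the side most members took, or None on a tie."""
    yeas = positions.count("yea")
    nays = positions.count("nay")
    if yeas == nays:
        return None
    return "yea" if yeas > nays else "nay"


def party_unity_scores(records):
    """Percent of party unity votes on which each member sided with their party.

    records: iterable of (roll_call, member, party, position) with party in
    {"D", "R"}; positions other than 'yea'/'nay' are ignored.
    """
    by_vote = defaultdict(list)
    for roll_call, member, party, position in records:
        if party in ("D", "R") and position in ("yea", "nay"):
            by_vote[roll_call].append((member, party, position))

    with_party = defaultdict(int)
    eligible = defaultdict(int)
    for casts in by_vote.values():
        dem = majority_position([p for _, party, p in casts if party == "D"])
        rep = majority_position([p for _, party, p in casts if party == "R"])
        if dem is None or rep is None or dem == rep:
            continue  # party majorities did not oppose each other
        for member, party, position in casts:
            own = dem if party == "D" else rep
            eligible[member] += 1
            with_party[member] += position == own
    return {m: 100.0 * with_party[m] / eligible[m] for m in eligible}


# Invented toy records: v1 and v3 are party unity votes, v2 is unanimous.
records = [
    ("v1", "D1", "D", "yea"), ("v1", "D2", "D", "yea"), ("v1", "D3", "D", "nay"),
    ("v1", "R1", "R", "nay"), ("v1", "R2", "R", "nay"),
    ("v2", "D1", "D", "yea"), ("v2", "D2", "D", "yea"), ("v2", "D3", "D", "yea"),
    ("v2", "R1", "R", "yea"), ("v2", "R2", "R", "yea"),
    ("v3", "D1", "D", "nay"), ("v3", "D2", "D", "nay"), ("v3", "D3", "D", "nay"),
    ("v3", "R1", "R", "yea"), ("v3", "R2", "R", "not_voting"),
]
scores = party_unity_scores(records)
print(scores)
```

Whether absences should lower a member's score is a design choice; excluding them, as here, measures loyalty only on votes actually cast.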
Each year CQ Roll Call designates approximately 20–30 votes as Key Votes: the most significant and contested votes of the session, chosen based on their legislative importance, the closeness of the outcome, and the degree to which they defined the political positions of members. Key Vote designations are used extensively by political scientists studying position-taking behavior, because the selection is made independently of the outcome and provides a standardized cross-Congress comparison series going back to 1945.
Interest groups maintain their own scoring systems. The Americans for Democratic Action (ADA) liberal score has been published since 1947: the ADA selects 20 votes per year and calculates the percentage on which each member voted the liberal position as defined by ADA. The ADA score is one of the longest-running and most widely cited interest group ratings. On the conservative side, the American Conservative Union (ACU) publishes an equivalent rating, as does Heritage Action. These scores correlate strongly with DW-NOMINATE Dimension 1 but are not identical—they reflect normative choices about which votes are ideologically diagnostic.
Several landmark votes illustrate the range of what roll call data captures. The Authorization for Use of Military Force of 2001, passed three days after the September 11 attacks, cleared the House 420–1 and the Senate 98–0, one of the most lopsided substantive votes in modern history. Representative Barbara Lee of California cast the single dissenting House vote. The Affordable Care Act passed the House in March 2010 on a 219–212 vote without bipartisan support: no Republican voted in favor, and 34 Democrats joined the Republicans in opposition. The Tax Cuts and Jobs Act of 2017 cleared the House 227–203 and the Senate 51–48, again with no votes in favor from the minority party. These votes are in VoteView and Congress.gov with complete individual member cast records.
Vote trading, or logrolling (exchanging votes across unrelated bills to build coalitions), is well documented in the legislative politics literature. While individual instances of logrolling are difficult to prove from vote data alone, statistical patterns of unexpected vote pairings across bills can surface candidate instances. The VoteView cast-vote matrix combined with bill topic coding from Congress.gov enables this analysis.
Congress.gov API
The Congress.gov API is the Library of Congress's official programmatic interface to congressional data. As of the 118th Congress, the API covers votes, members, bills, amendments, committees, and congressional records. A free API key is obtained at api.data.gov/signup/ and included as either a query parameter (api_key=KEY) or an X-Api-Key HTTP header.
The primary vote endpoints are:
- GET /v3/vote/{congress}/{chamber}: list of votes for a given Congress and chamber. Chamber values: senate or house.
- GET /v3/vote/{congress}/{chamber}/{session}/{rollNumber}: a specific vote with complete member cast data.
- GET /v3/member/{bioguideId}/votes: all votes cast by a specific member, identified by their BioGuide ID (a Library of Congress identifier such as L000174 for Senator Patrick Leahy).
Useful query parameters for the list endpoint include:
- limit (max 250), offset: pagination
- startDate / endDate: ISO 8601 date filtering
- category: filter by vote category; common values include amendment, cloture, passage, nomination
- congress: filter to specific Congress number
- format: json or xml
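Because each request returns at most 250 items, retrieving a full list means stepping the offset until a short page comes back. The sketch below shows that loop in isolation. fetch_page is a hypothetical callable that would wrap the HTTP request with limit and offset parameters; an in-memory stand-in replaces it here so the example runs without a network connection or API key.

```python
def paginate(fetch_page, limit=250):
    """Yield items from fetch_page(limit=..., offset=...) until a short page.

    fetch_page is a hypothetical callable returning a list of at most
    `limit` items starting at `offset`.
    """
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page
        if len(page) < limit:
            break  # a short (or empty) page means there is nothing left
        offset += limit


# In-memory stand-in for the API: 600 fake vote records.
ALL_ITEMS = list(range(600))
offsets_requested = []


def fake_fetch(limit, offset):
    offsets_requested.append(offset)
    return ALL_ITEMS[offset:offset + limit]


collected = list(paginate(fake_fetch, limit=250))
print(len(collected), offsets_requested)
```

Against a rate-limited key, a real fetch_page would also want a short pause between calls.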
The member endpoint at GET /v3/member returns biographical data for every current and historical member: name, state, district, party, dates of service, committee assignments, and links to their sponsored and cosponsored legislation. The BioGuide ID is the stable identifier that should be used for longitudinal member tracking.
ProPublica Congress API
ProPublica's Congress API (propublica.org/datastore/api/propublica-congress-api) provides a curated layer on top of official congressional data with pre-calculated analytical fields not available directly from Congress.gov. The API requires a free API key and returns JSON.
The most analytically useful fields in ProPublica's member data include:
- votes_with_party_pct: percentage of votes where the member voted with their party's majority
- votes_against_party_pct: percentage of votes against party majority
- missed_votes_pct: percentage of votes where member did not vote (Not Voting)
- bills_sponsored, bills_cosponsored: legislative activity counts
- ideal_point: DW-NOMINATE Dimension 1 score, pass-through from VoteView
ProPublica's vote comparison endpoint (GET /congress/v1/{congress}/{chamber}/sessions/{session}/votes/{roll-call-number}.json) returns the individual position of every member on a specific vote alongside the party breakdown. The democratic_majority_position and republican_majority_position fields make it straightforward to determine whether each member voted with or against their party on that specific vote without computing it from the cast data.
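With the two majority positions in hand, the with-or-against-party check reduces to a string comparison. The helper below is a hypothetical sketch: it assumes positions are reported as "Yes" and "No" and that party is a one-letter code, neither of which is confirmed here, and it returns None where the comparison is undefined.

```python
def voted_with_party(member_position, member_party,
                     dem_majority_position, rep_majority_position):
    """Compare one member's position with their own party's majority position.

    Returns True or False when comparable, and None for members of neither
    major party or for members who did not vote Yes or No.
    Position strings ("Yes"/"No") and party codes ("D"/"R") are assumptions.
    """
    majority = {"D": dem_majority_position, "R": rep_majority_position}.get(member_party)
    if majority is None or member_position not in ("Yes", "No"):
        return None
    return member_position == majority


print(voted_with_party("Yes", "D", "Yes", "No"))         # Democrat voting with party
print(voted_with_party("Yes", "R", "Yes", "No"))         # Republican crossing over
print(voted_with_party("Not Voting", "D", "Yes", "No"))  # no comparable position
```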
For accountability journalism, ProPublica's bill tracking endpoint is useful for monitoring active legislation: it flags which members co-sponsored a bill, which committee it was referred to, whether it passed committee, and the full vote history if it reached the floor.
Research applications
The political science literature built on roll call vote data is extensive. Several well-established research programs illustrate the analytical possibilities.
Polarization measurement is the most prominent application. The collapse of ideological overlap between House Democrats and Republicans, documented through DW-NOMINATE, is one of the most robust findings in modern political science. As recently as the 1970s, the distributions of Democratic and Republican Dimension 1 scores overlapped substantially—conservative Democrats from the South and liberal Republicans from the Northeast occupied shared ideological space. After approximately 1980, the overlap shrank steadily. By the 110th Congress (2007), it had effectively disappeared. The DW-NOMINATE time series is the standard evidence base for this claim.
Position-taking behavior, developed by David Mayhew in Congress: The Electoral Connection (1974), holds that legislators are primarily motivated by reelection and use roll call votes partly as position-taking signals to constituents rather than purely as policy decisions. This framework generates testable predictions: members in competitive districts should vote differently from members in safe districts, and members should vote differently in election years than off-years. Roll call data makes these hypotheses directly testable.
Constituency influence research uses roll call data to test whether and how district characteristics—median income, racial composition, urban-rural balance—predict member voting. Warren Miller and Donald Stokes's 1963 “Constituency Influence in Congress” established the foundational methodology. The availability of ACS demographic data at the congressional district level makes this analysis much easier now than in the 1960s.
Legislative productivity measurement uses roll call data to track the volume and success rate of legislation: bills introduced, bills passed committee, bills reaching the floor, bills passed, bills becoming law. These counts, stratified by policy area using CRS subject coding from Congress.gov, document the decline in legislative output that scholars associate with increased polarization and the filibuster's expansion.
Interest group score correlations across ADA, ACU, AFL-CIO, Chamber of Commerce, and other rating systems produce a rich portrait of the ideological structure of Congress. When these scores are factor-analyzed alongside DW-NOMINATE, the first factor almost always maps onto Dimension 1, which is consistent with the view that voting in the current Congress is dominated by a single ideological dimension.
Python workflow
The following script demonstrates two approaches in parallel: the Congress.gov API for recent Senate votes (requires a free API key), and the VoteView CSV download for historical member ideology data (no key required).
import requests

# Congress.gov API: free key at api.data.gov (DEMO_KEY is heavily rate-limited)
API_KEY = "DEMO_KEY"
base = "https://api.congress.gov/v3"
headers = {"X-Api-Key": API_KEY}

# Get recent Senate roll call votes (118th Congress)
congress = 118
chamber = "senate"
resp = requests.get(
    f"{base}/vote/{congress}/{chamber}",
    headers=headers,
    params={"limit": 20, "offset": 0, "format": "json"},
    timeout=20,
)
resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
data = resp.json()
votes = data.get("votes", {}).get("vote", [])

print("Recent 118th Congress Senate votes (showing up to 20):")
for v in votes[:20]:
    date = v.get("date", "")
    question = v.get("question", "")[:50]  # truncate long questions for display
    result = v.get("result", "")
    totals = v.get("totals", {})
    yeas = totals.get("yeas", 0)
    nays = totals.get("nays", 0)
    print(f"  {date}  {question}")
    print(f"    Result: {result} ({yeas}Y-{nays}N)")

# VoteView CSV downloads (no API key needed; direct HTTP downloads)
# Cast votes and member ideology scores for the House, 118th Congress
votes_url = "https://voteview.com/static/data/out/votes/H118_votes.csv"
members_url = "https://voteview.com/static/data/out/members/H118_members.csv"
print(f"\nVoteView cast votes at: {votes_url}")
print(f"VoteView member ideology file at: {members_url}")
print("Columns in VoteView member ideology file (NOMINATE):")
print("  icpsr, state_icpsr, district_code, state_abbrev, party_code,")
print("  occupancy, last_means, bioname, nominate_dim1, nominate_dim2,")
print("  nominate_log_likelihood, nominate_geo_mean_probability,")
print("  nominate_number_of_votes, nominate_number_of_errors")
The DEMO_KEY API key is rate-limited to 30 requests per hour and 50 per day across all api.data.gov services. For production use, register a named key at api.data.gov to receive 1,000 requests per hour. The VoteView URLs follow a predictable pattern: replace H118 with the chamber (H for House, S for Senate) and Congress number. Full documentation is at voteview.com/data.
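The URL pattern described above is easy to wrap in a helper. This sketch assumes the Congress number is zero-padded to three digits (H088 rather than H88) and that the votes and members directories follow the same naming scheme; both should be checked against voteview.com/data before relying on them.

```python
def voteview_csv_url(chamber: str, congress: int, kind: str = "votes") -> str:
    """Build a VoteView per-Congress CSV URL.

    chamber: "house" or "senate"
    kind: "votes" for cast votes, "members" for ideology scores
    Zero-padding to three digits is an assumption about the file naming.
    """
    prefix = {"house": "H", "senate": "S"}[chamber.lower()]
    return (
        "https://voteview.com/static/data/out/"
        f"{kind}/{prefix}{congress:03d}_{kind}.csv"
    )


print(voteview_csv_url("house", 118))
print(voteview_csv_url("senate", 88, kind="members"))
```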
For joining VoteView member scores against Congress.gov vote records, the key bridge is the ICPSR identifier. VoteView uses ICPSR codes as its primary member identifier; Congress.gov uses BioGuide IDs. A crosswalk file maintained by VoteView at voteview.com/static/data/out/members/HSall_members.csv includes both identifiers for all members where a BioGuide ID exists (generally from the 95th Congress onward). For older Congresses, the ICPSR code is the only reliable identifier.
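A crosswalk between the two identifiers can be built by reading the members file once and keeping only rows that carry both. The sketch below works on CSV text so it runs offline. It assumes the file exposes icpsr and bioguide_id columns, and the sample rows use invented names and identifiers.

```python
import csv
import io


def build_crosswalk(csv_text):
    """Map ICPSR code -> BioGuide ID, skipping rows without a BioGuide ID.

    Column names 'icpsr' and 'bioguide_id' are assumptions about the file.
    The members file has one row per member per Congress, so repeated rows
    for the same member collapse into a single entry.
    """
    crosswalk = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        bioguide = (row.get("bioguide_id") or "").strip()
        if bioguide:
            crosswalk[int(float(row["icpsr"]))] = bioguide
    return crosswalk


# Invented sample rows; the last one has no BioGuide ID and is skipped.
sample_csv = "\n".join([
    "congress,chamber,icpsr,bioname,bioguide_id",
    '117,Senate,90001,"DOE, Jane",X000001',
    '118,Senate,90001,"DOE, Jane",X000001',
    '1,House,90002,"ROE, Richard",',
])
crosswalk = build_crosswalk(sample_csv)
# Reverse lookup for joining Congress.gov records back to VoteView scores.
bioguide_to_icpsr = {bioguide: icpsr for icpsr, bioguide in crosswalk.items()}
print(crosswalk, bioguide_to_icpsr)
```

A member can carry more than one ICPSR code over a career, so the reverse lookup keeps only one of them; a fuller version would map each BioGuide ID to a list of codes.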