
USCIS Immigration Data: The Federal Database Behind Visas, Green Cards, and Naturalizations

18 min read · AI Analytics
Tags: Federal Data, USCIS, Immigration, Demographics

US Citizenship and Immigration Services adjudicates roughly 8 million petitions and applications each year — family green cards, employment visas, naturalizations, asylum, DACA renewals, refugee admissions — and publishes unusually detailed outcome data on all of them. The USCIS Data Hub, the DHS Yearbook of Immigration Statistics, and the State Department's monthly visa reports together form the most comprehensive public record of legal immigration in any country. Understanding the data means understanding the program structure: what USCIS does, what it does not do, and where the backlogs that define millions of lives actually come from.

What USCIS is and is not

USCIS is a benefits agency. It adjudicates applications for immigration benefits — permission to be here legally in one status or another. It is not an enforcement agency. Arrests of undocumented immigrants are conducted by Immigration and Customs Enforcement (ICE) and Customs and Border Protection (CBP). Deportations are executed by ICE Enforcement and Removal Operations. The immigration court system that orders removals is administered by the Justice Department's Executive Office for Immigration Review (EOIR). USCIS touches none of that; it handles the front end of the legal immigration system.

This distinction matters for data analysis. The USCIS dataset is a record of applications filed and decisions made: approvals, denials, requests for evidence, administrative closures. An asylum applicant whose case is pending before USCIS has not been ordered removed; they are affirmatively applying for protection. An H-1B worker whose petition was approved by USCIS may or may not have actually entered the country — the State Department issues the actual visa stamp at a consulate, and CBP admits the worker at the port of entry. USCIS records the petition outcome; State records the visa issuance; CBP records the admission. The three are related but distinct, and conflating them produces analytical errors.

USCIS is also unusual among federal agencies in being almost entirely self-funded. It receives no annual appropriation from Congress for most of its operations; it runs on application fees, which totaled approximately $3.6 billion in FY2023. This means its budget fluctuates with application volume, which creates periodic fiscal pressure when application rates drop (as they did sharply in 2020). The fee schedule is set by regulation and has been a recurring source of litigation and rulemaking.

Naturalization: the data and the process

USCIS publishes annual naturalization statistics broken down by country of birth, country of prior citizenship, state of residence, and age group. The numbers have been running between 800,000 and 900,000 per year in recent years, with the exact count sensitive to processing backlogs and application incentives (such as the threat of fee increases prompting application surges).

The statutory requirements for naturalization are worth knowing because they shape who appears in the data. An applicant must:

  • Have been a lawful permanent resident (LPR, or green card holder) for at least five years — or three years if married to and living with a US citizen throughout that period.
  • Have maintained continuous residence in the United States during the five-year period (an absence of one year or more breaks continuity, and an absence of more than six months is presumed to break it unless the applicant shows otherwise).
  • Have been physically present in the United States for at least 30 months out of the five years immediately preceding the application date.
  • Demonstrate good moral character during the statutory period.
  • Demonstrate an ability to read, write, and speak English (with some age and disability exceptions).
  • Pass a civics test: an officer asks 10 questions drawn from a pool of 100 standardized questions, and the applicant must answer at least 6 correctly.
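
The numeric thresholds in the list above can be sketched as a simple check. This is a minimal illustration of the general five-year path only, not an eligibility tool: the function and constant names are hypothetical, and good moral character, English ability, and the continuity rules are deliberately not modeled.

```python
from datetime import date

REQUIRED_LPR_YEARS = 5          # general path; the text notes 3 for qualifying spouses
REQUIRED_PRESENCE_MONTHS = 30   # physical presence within the five-year window
CIVICS_PASS = 6                 # correct answers needed out of 10 asked


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def meets_numeric_thresholds(lpr_since: date, as_of: date,
                             months_present: int, civics_correct: int) -> bool:
    """Check only the countable requirements from the list above."""
    served_lpr_period = as_of >= add_years(lpr_since, REQUIRED_LPR_YEARS)
    enough_presence = months_present >= REQUIRED_PRESENCE_MONTHS
    passed_civics = civics_correct >= CIVICS_PASS
    return served_lpr_period and enough_presence and passed_civics


# Example: LPR since mid-2018, checked at the start of 2024
print(meets_numeric_thresholds(date(2018, 6, 1), date(2024, 1, 1), 40, 7))
```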

The 5-year LPR requirement means that naturalization data lags green card issuance by at least five years. A spike in green card admissions in one year shows up in naturalization data roughly five to seven years later, after the LPR period is served and applications are filed and processed. This lag is important for interpreting naturalization trends: the country-of-birth breakdown for current naturalizations reflects immigration patterns from half a decade ago or more.

Mexico consistently produces the largest single-country naturalization count, reflecting the long-established and large Mexican-born LPR population in the United States. India, the Philippines, Cuba, the Dominican Republic, Vietnam, and China typically follow. The country-of-birth ranking in naturalization data is substantially different from the country-of-birth ranking in current immigration flows, because it reflects the historical composition of the LPR population rather than recent arrivals.

Green cards: how the admission categories work

USCIS tracks lawful permanent resident admissions by the category under which the green card was granted. The category determines both the annual numerical limit (if any) and the wait time. The major categories are:

  • Immediate relatives of US citizens — spouses, unmarried children under 21, and parents of adult US citizens. This category is numerically unlimited; there is no annual cap and no waiting list. It consistently accounts for the largest share of green card admissions, typically around 40 percent of the annual total.
  • Family preference categories (F-1 through F-4)— other family members of US citizens and LPRs: unmarried adult children of US citizens (F-1), spouses and children of LPRs (F-2A/F-2B), married children of US citizens (F-3), and siblings of US citizens (F-4). These categories have annual caps and per-country limits, producing waiting lists of years to decades for some countries and preference categories.
  • Employment-based preference categories (EB-1 through EB-5)— 140,000 total annual green cards divided across five preference tiers, from priority workers with extraordinary ability (EB-1) to immigrant investors (EB-5). The per-country annual limit of 7 percent of total employment-based numbers is the source of the India and China backlogs described below.
  • Diversity Visa Lottery (DV) — 50,000 green cards annually distributed by computer lottery to nationals of countries that have sent fewer than 50,000 immigrants to the United States in the previous five years. High-sending countries including Mexico, China, India, the Philippines, Canada, and several others are ineligible. DV green cards are among the few available to people without a family or employer sponsor in the United States.
  • Refugees and asylees — individuals granted refugee status or asylum who are adjusting to permanent resident status, typically one year after their initial grant.

The DHS Yearbook of Immigration Statistics, published annually, provides the definitive count of green card admissions by category, country of birth, state of intended residence, and several other dimensions. It covers the full DHS immigration portfolio, not just USCIS, and is the standard citation for total annual immigration numbers.

The India and China employment-based backlog

The per-country annual limit for employment-based green cards — no more than 7 percent of the annual EB total to nationals of any single country — interacts catastrophically with the concentration of EB-2 and EB-3 applicants from India. Approximately 70 percent of H-1B visa holders, who are the primary pipeline into the EB-2 and EB-3 categories, are born in India. When 70 percent of the demand is concentrated in one country that is capped at 7 percent of supply, the resulting backlog is not a policy failure that can be fixed with administrative efficiency — it is a structural arithmetic consequence of the statutory caps.
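
The arithmetic behind that mismatch can be worked through in a few lines. This is a simplified sketch: it uses the 140,000 total and the 7 percent limit from the text, assumes (hypothetically) that annual demand equals the annual total, and ignores the provisions that reallocate unused numbers between categories and countries.

```python
EB_ANNUAL_TOTAL = 140_000   # annual employment-based green cards (from the text)
PER_COUNTRY_PCT = 7         # per-country limit, in percent (from the text)
DEMAND_SHARE_PCT = 70       # share of demand from one country (from the text)

# Integer arithmetic avoids floating-point rounding noise in the results.
per_country_ceiling = EB_ANNUAL_TOTAL * PER_COUNTRY_PCT // 100

# Hypothetical assumption: total annual demand equals total annual supply.
single_country_demand = EB_ANNUAL_TOTAL * DEMAND_SHARE_PCT // 100

# Whatever demand exceeds the ceiling is added to the queue each year.
annual_backlog_growth = single_country_demand - per_country_ceiling

print(f"Ceiling per country:   {per_country_ceiling:,}")
print(f"Demand from one country: {single_country_demand:,}")
print(f"Added to backlog yearly: {annual_backlog_growth:,}")
```

Under these assumptions the queue grows every year regardless of how quickly individual cases are processed, which is the sense in which the backlog is arithmetic rather than administrative.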

The Visa Bulletin, published monthly by the State Department, shows the current cutoff date for each preference category and country of birth. An applicant's “priority date” is the date on which the underlying petition or labor certification was filed; it becomes “current” when it falls before the cutoff date in the bulletin, at which point the green card application can move forward. As of 2024, the EB-2 India cutoff date was approximately January 2012. This means that an Indian-born professional who filed an EB-2 petition in 2024 will not have their green card application adjudicated until roughly 2057 or later, a wait of more than three decades, assuming no change to the statutory caps and no significant change in filing volumes or annual allocations.
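
The comparison an applicant makes against the bulletin can be sketched as below. The function names are hypothetical, the January 2012 date is the approximate 2024 figure cited above, and the gap calculation only measures how many years of earlier filings sit ahead in the queue; it is not a wait-time forecast.

```python
from datetime import date


def is_current(priority_date: date, bulletin_date: date) -> bool:
    """A priority date is current when it falls before the date in the bulletin."""
    return priority_date < bulletin_date


def queue_gap_years(priority_date: date, bulletin_date: date) -> float:
    """Years of earlier filings ahead of this applicant (0 if already current)."""
    gap_days = (priority_date - bulletin_date).days
    return max(0.0, gap_days / 365.25)


eb2_india_2024 = date(2012, 1, 1)  # approximate value from the text

print(is_current(date(2011, 6, 15), eb2_india_2024))              # filed before the date
print(is_current(date(2024, 3, 1), eb2_india_2024))               # filed in 2024
print(round(queue_gap_years(date(2024, 1, 1), eb2_india_2024), 1))
```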

The Cato Institute estimated in 2020 that approximately 1 million Indian-born workers and their family members were in the employment-based green card backlog. Many of these individuals have lived in the United States on H-1B status for 10, 15, or 20 years. They pay taxes, own homes, and have US-born children who are citizens, while they themselves remain on temporary nonimmigrant status that is legally tied to their current employer. H-1B status can be extended indefinitely while a green card petition is pending, but workers in this situation cannot freely change jobs without potentially losing their priority date.

The China EB-2 and EB-3 backlogs are similarly severe, though smaller in absolute numbers because the share of H-1B holders from China is smaller than the Indian share. Bills to eliminate or relax the per-country employment-based cap — most prominently the Fairness for High-Skilled Immigrants Act, which has passed the House of Representatives multiple times — have consistently stalled in the Senate, where concerns about the effect on other sending countries have blocked floor votes.

H-1B specialty occupation visa

The H-1B program allows US employers to temporarily employ foreign nationals in specialty occupations — positions requiring a bachelor's degree or equivalent in a specific field. Before filing an H-1B petition with USCIS, the employer must obtain a certified Labor Condition Application from the Department of Labor, attesting that the offered wage meets the prevailing wage for the occupation and area and that the employment will not adversely affect US workers' working conditions.

The H-1B program operates under two annual caps: 65,000 regular-cap slots and an additional 20,000 for workers holding a US master's degree or higher. Cap-exempt employers (universities, affiliated nonprofit research organizations, and government research entities) may hire H-1B workers without entering the lottery and without consuming cap numbers. When cap-subject petitions exceed available slots (which has occurred every year since 2008), USCIS conducts a computerized lottery. Since the FY2021 cap season, whose registration window opened in March 2020, employers submit electronic registrations each March; only registrants selected in the lottery proceed to file full petitions.

For FY2025, USCIS received 470,342 eligible registrations for 85,000 available slots, or roughly 18 slots for every 100 registrations. USCIS selects more registrations than there are slots, because some selected registrants never file or are denied, so the share of registrations actually selected is somewhat higher than that ratio. The total number of H-1B holders in the United States at any given time (including those on extensions, those in cap-exempt positions, and those whose status is maintained while a green card is pending) is approximately 580,000 to 600,000, far exceeding the annual cap numbers.
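
The ratio quoted above (about 18 percent) is simply cap slots divided by eligible registrations, using the figures given in the text:

```python
REGULAR_CAP = 65_000
MASTERS_CAP = 20_000
ELIGIBLE_REGISTRATIONS_FY2025 = 470_342  # figure cited in the text

slots = REGULAR_CAP + MASTERS_CAP
slots_per_registration = slots / ELIGIBLE_REGISTRATIONS_FY2025

print(f"Slots: {slots:,}")
print(f"Slots per registration: {slots_per_registration:.1%}")
```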

The USCIS H-1B Employer Data Hub publishes annual approval and denial counts by employer, fiscal year, and initial versus continuing status. IT staffing and outsourcing firms rank consistently among the top employers by approval volume: the Indian-headquartered Infosys, Tata Consultancy Services, Wipro, and HCL, along with US-headquartered Cognizant. This cuts against the public perception that the program is dominated by large technology product companies. The wage gaming issue that arises from the prevailing wage level system, where employers can legally offer entry-level wages to workers performing experienced-level work by classifying the position as Level I, is analyzed in detail in the companion article on H-1B data.
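
A ranking of employers by approval volume can be derived from an export of this kind with a short aggregation. The sketch below uses only the standard library and made-up sample rows; the column names are assumptions for illustration and should be checked against the header of the actual download.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; column names are assumed, not the real file header.
SAMPLE = """employer,fiscal_year,initial_approvals,initial_denials,continuing_approvals,continuing_denials
Employer A,2023,120,10,300,5
Employer A,2024,100,20,280,10
Employer B,2024,40,10,50,0
"""


def summarize(csv_text: str) -> dict:
    """Sum approvals and denials per employer across years and petition types."""
    totals = defaultdict(lambda: {"approvals": 0, "denials": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        t = totals[row["employer"]]
        t["approvals"] += int(row["initial_approvals"]) + int(row["continuing_approvals"])
        t["denials"] += int(row["initial_denials"]) + int(row["continuing_denials"])
    for t in totals.values():
        decided = t["approvals"] + t["denials"]
        t["approval_rate"] = t["approvals"] / decided if decided else None
    return dict(totals)


summary = summarize(SAMPLE)
ranked = sorted(summary, key=lambda name: summary[name]["approvals"], reverse=True)
for name in ranked:
    print(name, summary[name])
```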

Other nonimmigrant employment visa categories

USCIS adjudicates a range of nonimmigrant employment visa categories beyond H-1B. The major ones with published approval data include:

  • L-1 intracompany transferee — allows multinational companies to transfer employees who have worked abroad for at least one year to a US affiliate, parent, or subsidiary. L-1A is for managers and executives; L-1B is for workers with specialized knowledge. The L-1 has no annual cap and no lottery. Top employers by L-1 approval counts are similar to H-1B: large multinational technology and consulting companies and Indian IT services firms.
  • O-1 extraordinary ability — for individuals with “extraordinary ability” in science, education, business, athletics, or the arts, or “extraordinary achievement” in motion pictures or television. O-1 has no cap. Approval data is published in USCIS annual reports. The O-1 is used by a wide range of workers including elite researchers, professional athletes, senior executives, and entertainers.
  • TN (Trade NAFTA/USMCA) — available exclusively to Canadian and Mexican nationals in specific professional categories listed in the USMCA agreement. TN requires no USCIS petition for most Canadian applicants (who apply directly at the port of entry), making the approval data less complete than for employer-petitioned categories.

The State Department publishes monthly nonimmigrant visa statistics at travel.state.gov, covering all visa categories issued at US consulates abroad. These are distinct from USCIS data: State records visa issuances; USCIS records petition approvals. For categories where consular issuance is the primary bottleneck (such as TN for Mexicans or the O visa for artists), the State data is more complete than the USCIS data.

The asylum system and its backlogs

USCIS administers affirmative asylum — cases where the applicant voluntarily files for asylum protection, not in response to removal proceedings. This is distinct from defensive asylum, which is raised as a defense before an immigration judge in EOIR immigration court. An asylum seeker who enters the country and is not detained or placed in removal proceedings may file an affirmative asylum application with USCIS within one year of arrival.

The USCIS affirmative asylum backlog stood at more than 1.7 million pending cases as of 2024, a number that has grown by roughly 300,000 to 400,000 cases per year as new applications outpace adjudication capacity. Cases filed today face waits of many years before an interview is scheduled. If USCIS does not grant an affirmative application and the applicant has no other lawful status, the case is referred to immigration court, where it enters the EOIR defensive backlog as an additional pending case.
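
The backlog dynamic is a stock-and-flow calculation: next year's pending count is this year's plus filings minus completions. The sketch below starts from the 1.7 million figure above; the filing and completion volumes are hypothetical values chosen only so that net growth lands at 350,000 per year, the midpoint of the range in the text.

```python
def project_backlog(pending: int, annual_filings: int,
                    annual_completions: int, years: int) -> list[int]:
    """Project the pending caseload forward, one value per year."""
    projection = []
    for _ in range(years):
        pending = max(0, pending + annual_filings - annual_completions)
        projection.append(pending)
    return projection


# Hypothetical flows: 500,000 filed and 150,000 completed per year.
projection = project_backlog(1_700_000, 500_000, 150_000, years=3)
print(projection)
```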

The EOIR immigration court backlog exceeded 3.3 million pending cases as of mid-2024. EOIR's approximately 700 immigration judges hear cases ranging from asylum petitions to removal proceedings for visa overstays to cancellation of removal applications. The outcome data is published in EOIR annual reports and at a much more granular level through the TRAC immigration database at Syracuse University, which provides judge-level asylum grant rates, court-level completion statistics, and nationality-level outcome data.

Asylum grant rates vary dramatically by immigration judge — some judges grant more than 90 percent of asylum claims before them; others grant fewer than 5 percent. This “judge lottery” is among the most thoroughly documented disparities in the federal administrative system. TRAC's judge-level data makes it possible to quantify the disparity for any court and any nationality of applicants.
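
Quantifying the disparity amounts to computing a grant rate per judge and comparing the extremes. The sketch below uses invented decision records for two placeholder judges; it does not reproduce TRAC's data format, and a real comparison would also need to control for court and applicant nationality.

```python
from collections import Counter

# Invented records: (judge, outcome). Not real data.
decisions = (
    [("Judge A", "grant")] * 9 + [("Judge A", "deny")] * 1
    + [("Judge B", "grant")] * 1 + [("Judge B", "deny")] * 19
)


def grant_rates(records):
    """Return each judge's share of decided cases that were grants."""
    decided = Counter(judge for judge, _ in records)
    granted = Counter(judge for judge, outcome in records if outcome == "grant")
    return {judge: granted[judge] / n for judge, n in decided.items()}


rates = grant_rates(decisions)
spread = max(rates.values()) - min(rates.values())

for judge, rate in sorted(rates.items()):
    print(f"{judge}: {rate:.0%}")
print(f"Spread between highest and lowest: {spread:.0%}")
```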

The top nationalities for asylum applications in recent years have included Venezuelan, Guatemalan, Honduran, Salvadoran, Cuban, and Haitian nationals. Syrian nationals, whose claims were elevated following the Syrian civil war, remain a significant presence in the backlog. Grant rates vary substantially by nationality; USCIS and EOIR publish nationality-level approval data annually.

DACA: Deferred Action for Childhood Arrivals

DACA is an executive action announced by the Obama administration in June 2012 that grants two-year renewable deferrals of removal action to qualifying individuals who arrived in the United States as children, meet educational or military service criteria, and have no disqualifying criminal history. DACA recipients receive work authorization and a Social Security number, but no path to lawful permanent residence or citizenship; they remain technically removable but are protected from enforcement action by prosecutorial discretion.

USCIS publishes quarterly DACA population data showing active recipients by state of residence and country of birth. As of 2024, approximately 530,000 individuals held active DACA status. More than 80 percent were born in Mexico; the next largest birth-country groups are El Salvador, Guatemala, Honduras, South Korea, Peru, Brazil, Ecuador, and Colombia. Recipients tend to be concentrated in California, Texas, Illinois, New York, Florida, and Arizona — states with large pre-existing unauthorized immigrant populations.

DACA's legal status has been under continuous litigation since 2017. In 2021, a Texas federal district court ruled in Texas v. United States that DACA was unlawfully established and enjoined the program as to new applicants, though current recipients were allowed to continue renewing. The Fifth Circuit affirmed the unlawfulness holding in 2022 while remanding for further review of the DHS final DACA rule published in 2022. The case has been in ongoing Fifth Circuit and potential Supreme Court litigation since. The practical effect has been that no new initial DACA grants have been approved since mid-2021; only renewals proceed.

Immigration courts and EOIR

The immigration court system is administered by the Department of Justice's Executive Office for Immigration Review, not by the federal judiciary. Immigration courts are administrative tribunals within the executive branch; immigration judges are employees of the Justice Department, not Article III judges with lifetime tenure. Appeals from immigration court go to the Board of Immigration Appeals (also within EOIR) and from there to the US Courts of Appeals, which are Article III courts.

EOIR publishes annual statistical reports covering case completions, pending caseload, case outcomes by relief sought, and court-level data. The data shows completion rates by court, median days to completion, and the share of cases ending in an in absentia removal order (where the respondent did not appear for their hearing). TRAC's EOIR database goes further, providing judge-level grant rates for asylum and other forms of relief, allowing direct comparison of outcomes across judges hearing similar cases.

The statutory framework for immigration courts — their existence within the executive branch rather than the judicial branch, their dependence on prosecutorial priorities for which cases are placed in removal proceedings, and the lack of a right to appointed counsel for respondents who cannot afford their own — produces the systemic variation in outcomes that the TRAC data makes visible.

Where to access the data

The primary sources for USCIS and immigration system data:

  • USCIS Data Hub at data.uscis.gov — downloadable datasets on naturalization, asylum, DACA, green card applications, and various benefit categories. The hub provides both Excel workbooks and CSV downloads for most datasets.
  • USCIS Immigration Data and Statistics at uscis.gov/tools/reports-and-studies/immigration-data-and-statistics — annual reports on all benefit categories, H-1B Employer Data Hub, quarterly DACA population data, naturalization statistics, and the Policy Manual.
  • DHS Yearbook of Immigration Statistics — published annually at dhs.gov/immigration-statistics, covering all immigration benefits across DHS agencies (USCIS, CBP, ICE) in a single comprehensive volume. The definitive source for total annual legal immigration numbers.
  • State Department Visa Statistics at travel.state.gov — monthly nonimmigrant and immigrant visa issuance statistics by nationality and visa category, plus the monthly Visa Bulletin showing current priority dates for all preference categories and countries.
  • TRAC Immigration at trac.syr.edu/immigration — court-level, judge-level, and nationality-level immigration enforcement and court data derived from FOIA requests and data agreements with EOIR and ICE. The most granular publicly available source for immigration court outcomes.
  • EOIR Statistical Reports at justice.gov/eoir — annual reports on immigration court completions, pending caseload, and outcomes.
  • FOIA requests — individual-level USCIS records are not routinely published. Journalists and researchers seeking individual case data (such as H-1B worker records linked to specific employers and wages) must file FOIA requests with USCIS, which has been inconsistent in scope and timeliness of response.

Python workflow: naturalization statistics by country

The following script downloads the USCIS annual naturalization statistics Excel workbook, extracts the country-of-birth table, computes each country's share of total naturalizations, and divides by foreign-born population estimates (a hardcoded lookup of approximate Census ACS figures, used here for illustration) to compute a rough naturalization rate proxy:

import pandas as pd
import requests
from pathlib import Path

# USCIS publishes annual Naturalization Statistics tables at:
# https://www.uscis.gov/tools/reports-and-studies/immigration-data-and-statistics/naturalizations
#
# The Excel workbook contains multiple sheets; the country-of-birth table
# is typically named "Table 1" or "Country of Birth". The URL pattern
# changes each fiscal year -- check the USCIS data page for the current link.

NATURALIZATION_URL = (
    "https://www.uscis.gov/sites/default/files/document/data/"
    "Naturalizations_FY2023.xlsx"
)

dest = Path("Naturalizations_FY2023.xlsx")
if not dest.exists():
    print(f"Downloading {NATURALIZATION_URL}")
    r = requests.get(NATURALIZATION_URL, timeout=120)
    r.raise_for_status()
    dest.write_bytes(r.content)

# Read the workbook; inspect sheet names first
xl = pd.ExcelFile(dest)
print("Sheet names:", xl.sheet_names)

# Load the country-of-birth sheet (name varies by year; adjust as needed)
df = xl.parse(xl.sheet_names[0], header=0, dtype=str)
df.columns = [str(c).strip() for c in df.columns]
print("Columns:", df.columns.tolist())

# Typical columns: 'Country of Birth', 'Number'  (or 'Naturalizations')
# Filter out header/footer rows (non-country rows often contain 'Total', blanks, or footnotes)
country_col = [c for c in df.columns if 'country' in c.lower() or 'birth' in c.lower()][0]
count_col   = [c for c in df.columns if 'number' in c.lower() or 'natural' in c.lower()][0]

df = df[[country_col, count_col]].copy()
df.columns = ['country', 'count_raw']
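df = df.dropna(subset=['country'])
# Strip whitespace once so the filter and the later lookup both see clean names
df['country'] = df['country'].str.strip()
df = df[~df['country'].str.upper().isin(['TOTAL', 'ALL COUNTRIES', ''])]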

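# Convert count to numeric: drop commas, then keep the leading run of digits so
# trailing footnote markers are ignored; cells with no digits become NaN
df['count'] = pd.to_numeric(
    df['count_raw'].str.replace(',', '', regex=False)
                   .str.extract(r'(\d+)', expand=False),
    errors='coerce'
)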
df = df.dropna(subset=['count'])
df['count'] = df['count'].astype(int)

total = df['count'].sum()
df['share_pct'] = (df['count'] / total * 100).round(2)

top20 = df.sort_values('count', ascending=False).head(20).reset_index(drop=True)
top20.index += 1  # 1-based rank

print(f"\nTotal naturalizations in file: {total:,}")
print("\nTop 20 countries of birth by naturalizations:")
print(top20[['country', 'count', 'share_pct']].to_string())

# Cross-reference: ACS foreign-born population by country
# Census ACS Table B05006 (place of birth for the foreign-born population)
# Published at data.census.gov; download the national-level CSV.
# Here we use a simplified hardcoded lookup for illustration.
#
# Foreign-born population (approximate head counts, rounded) from ACS 2022 1-year estimates:
fb_pop = {
    'Mexico':                     10_700_000,
    'India':                       2_900_000,
    'China':                       2_400_000,
    'Philippines':                 2_000_000,
    'El Salvador':                 1_400_000,
    'Vietnam':                     1_400_000,
    'Cuba':                        1_300_000,
    'Dominican Republic':          1_200_000,
    'Guatemala':                   1_100_000,
    'Korea':                       1_000_000,
    'Colombia':                      800_000,
    'Honduras':                      750_000,
    'Jamaica':                       740_000,
    'Ecuador':                       720_000,
    'Haiti':                         700_000,
    'Brazil':                        650_000,
    'Canada':                        640_000,
    'United Kingdom':                640_000,
    'Germany':                       440_000,
    'Peru':                          430_000,
}

top20['fb_pop_est'] = top20['country'].map(fb_pop)

# Naturalization rate proxy: naturalizations / foreign-born population
# This is a rough proxy; the ACS foreign-born count includes residents who are
# not yet eligible (5 years as an LPR, 3 if married to a citizen) and those who
# have already naturalized, so the denominator overstates the at-risk pool.
top20['nat_rate_per_1k'] = (
    (top20['count'] / top20['fb_pop_est'] * 1000)
    .where(top20['fb_pop_est'].notna())
    .round(1)
)

print("\nNaturalization rate proxy (naturalizations per 1,000 foreign-born):")
print(
    top20[['country', 'count', 'share_pct', 'fb_pop_est', 'nat_rate_per_1k']]
    .to_string()
)

A few notes on working with this data: the Excel workbooks change structure modestly from year to year, so sheet names and column headers may require adjustment, and country names in the workbook must match the lookup keys exactly for the cross-reference to work. The foreign-born population denominator from ACS is imprecise for this purpose because it includes many individuals who cannot currently naturalize: recent arrivals who have not served the five-year LPR period, residents without LPR status, and people who have already naturalized. The resulting “naturalization rate” is therefore a rough population-level signal rather than a precise eligibility-adjusted rate. Countries with high rates in this proxy tend to be those with large, well-established LPR populations (Vietnam, Philippines, Cuba) whose eligible members have had more time to naturalize relative to recent arrivals.
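
Because sheet names shift between fiscal years, one option is to choose the sheet by keyword rather than by position. The helper below is a hypothetical sketch: the keywords are guesses based on the names mentioned in the script's comments, and the result should still be checked against the printed sheet list.

```python
def pick_sheet(sheet_names, keywords=("country of birth", "table 1")):
    """Return the first sheet whose name contains a keyword, trying keywords in order.

    Falls back to the first sheet when nothing matches.
    """
    if not sheet_names:
        raise ValueError("workbook has no sheets")
    for keyword in keywords:
        for name in sheet_names:
            if keyword in name.strip().lower():
                return name
    return sheet_names[0]


# Usage with a list such as the one returned by pd.ExcelFile(...).sheet_names
print(pick_sheet(["Notes", "Table 1", "Table 2"]))
```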

Related writing

For the H-1B employer-level wage and approval data in detail — the LCA disclosure file structure, prevailing wage level gaming, and the Level I concentration among IT staffing companies: USCIS H-1B Visa Data: Mapping the 600,000-Worker Skilled Immigration Pipeline →

For the H-2A and H-2B temporary worker programs — the agricultural and non-agricultural guest worker certifications that feed US farms and hospitality industries: DOL H-2 Visa Disclosures: Mapping the Guest Worker Programs Feeding US Agriculture and Hospitality →

For the complementary enforcement dataset — ICE arrests, detentions, removals, and the interior enforcement data that sits on the other side of the USCIS benefits record: ICE Enforcement and Removal Operations: Reading the Federal Dataset Behind Immigration Enforcement →