USCIS H-1B Visa Data: Mapping the 600,000-Worker Skilled Immigration Pipeline

14 min read · AI Analytics
Tags: Federal Data, USCIS, Immigration, Labor Markets

Two complementary federal datasets expose the mechanics of the H-1B skilled worker visa program in unusual detail: the Department of Labor's Labor Condition Application disclosure file, which records every certified employer wage offer and worksite location regardless of whether a visa was ultimately issued, and the USCIS H-1B Employer Data Hub, which publishes annual approval and denial counts by employer. Together they cover roughly 600,000 LCA certifications per year and reveal patterns that bear little resemblance to the narrative that surrounds the program in the press.

How the H-1B program works

The H-1B nonimmigrant visa allows US employers to temporarily employ foreign nationals in “specialty occupations” — positions that nominally require a bachelor's degree or equivalent in a specific field. The visa is employer-sponsored: the employer petitions on behalf of the worker, not the worker independently. The worker cannot simply show up with an H-1B and seek employment; the visa is tied to a specific employer and job classification.

The program operates under two annual numerical caps. The general cap is 65,000 visas per fiscal year. An additional 20,000 visas are reserved for workers who hold a master's degree or higher from a US institution, commonly called the master's cap or advanced-degree exemption. Employers in higher education, affiliated nonprofit research organizations, and government research organizations are entirely exempt from the cap: their H-1B petitions are not counted against either limit and are adjudicated without entering the lottery. This cap-exempt category is substantial. Universities, teaching hospitals, and research institutions employ tens of thousands of H-1B workers who never go through the lottery at all.

When demand exceeds the available cap numbers, as it has in every cap season since FY2014, USCIS conducts a random lottery to select which cases will be adjudicated. Through the FY2020 season, employers filed complete petitions during the first five business days of April and the lottery was run on those petitions. Since the FY2021 season, USCIS has used a pre-registration system: employers file an electronic registration in March, USCIS selects registrations by lottery, and only selected employers proceed to file full petitions. The lottery is conducted in two stages: first from the entire eligible pool, to fill the 65,000 general cap, and then from the remaining advanced-degree registrations, to fill the 20,000 master's cap. A holder of a US master's degree therefore gets two chances: one in the general pool and, if not selected there, one in the master's pool.
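The two-stage selection can be sketched as a small simulation. This is an illustration only, not USCIS's actual procedure: the registration counts are hypothetical, and the draw order (general cap first from everyone, then the master's cap from advanced-degree registrations not yet selected) is the assumption being modeled.

```python
import random

GENERAL_CAP = 65_000
MASTERS_CAP = 20_000

def run_lottery(n_regular, n_masters, seed=0):
    """Two-stage draw: general cap from the whole pool, then master's cap
    from advanced-degree registrations not picked in the first stage."""
    rng = random.Random(seed)
    # Each registration is (id, has_us_masters). Counts are hypothetical.
    pool = [(i, False) for i in range(n_regular)]
    pool += [(n_regular + i, True) for i in range(n_masters)]

    general = rng.sample(pool, min(GENERAL_CAP, len(pool)))
    picked_ids = {reg_id for reg_id, _ in general}

    leftover_masters = [r for r in pool if r[1] and r[0] not in picked_ids]
    masters = rng.sample(leftover_masters, min(MASTERS_CAP, len(leftover_masters)))

    selected = general + masters
    n_masters_selected = sum(1 for _, is_masters in selected if is_masters)
    return {
        "selected": len(selected),
        "regular_rate": (len(selected) - n_masters_selected) / n_regular,
        "masters_rate": n_masters_selected / n_masters,
    }

result = run_lottery(n_regular=200_000, n_masters=100_000)
print(result)
```

With these made-up pool sizes the advanced-degree registrations end up with a noticeably higher selection rate than the rest, which is the practical effect of getting two draws.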

The H-1B program has two lesser-known variants. The H-1B1 is a specialty-occupation visa available exclusively to nationals of Chile and Singapore under the respective US free trade agreements with those countries. It carries separate annual caps (1,400 for Chile, 5,400 for Singapore) and does not require a lottery, but otherwise functions similarly to the H-1B. The E-3 is a specialty-occupation visa exclusively for Australian nationals, with an annual cap of 10,500. E-3 holders must also go through the LCA process at DOL, so their filings appear in the DOL disclosure data alongside H-1B filings.

The Labor Condition Application: what it is and why it matters

Before an employer can petition USCIS for an H-1B worker, it must first obtain a certified Labor Condition Application from the Department of Labor's Office of Foreign Labor Certification. The LCA is a public attestation by the employer on four key conditions: that the employer will pay the worker at least the required wage, that employment of the H-1B worker will not adversely affect the working conditions of similarly employed US workers, that there is no strike or lockout at the worksite in the relevant occupation, and that the employer has provided notice of the LCA filing to workers in the affected occupation.

The LCA process is administrative, not discretionary: DOL does not evaluate whether the employer actually needs the worker or whether a qualified US worker is available. It certifies only that the employer has attested to the required conditions and that the wage offered meets the prevailing wage requirement. This is important because it means the LCA dataset captures employer intent and wage structure, not DOL judgment about the merits of any particular hire.

Critically, the LCA is filed and certified before USCIS adjudicates the petition and before any visa is issued. An employer that files 1,000 LCAs and wins only 200 lottery slots will have 1,000 certified LCAs in the DOL disclosure data but only 200 approvals in the USCIS data. The LCA dataset therefore substantially overcounts actual visa issuances, but it also gives a more comprehensive view of employer demand and wage practices than the USCIS approval data alone.

What the LCA disclosure file contains

The DOL Office of Foreign Labor Certification publishes quarterly LCA disclosure files on its performance data page (dol.gov/agencies/eta/foreign-labor/performance). Each quarterly Excel file contains one row per LCA case and includes the following fields, among others:

  • EMPLOYER_NAME: the company name as filed. This is not normalized; the same employer may appear under different name variants across quarters. Entity resolution is required for accurate employer-level aggregation.
  • SOC_CODE and SOC_TITLE: the Standard Occupational Classification code and associated job title. SOC codes are the primary tool for analyzing what types of work are covered by H-1B filings. The two-digit major group (15-xxxx for computer and mathematical occupations) is the most useful aggregation level for broad industry analysis.
  • FULL_TIME_POSITION: Y or N. Part-time H-1B positions are permitted in certain circumstances and carry different wage-hour calculations.
  • WAGE_RATE_OF_PAY_FROM and WAGE_UNIT_OF_PAY: the employer's offered wage, expressed in the unit the employer chose to file (annual, monthly, weekly, or hourly). Annual normalization is essential for comparison.
  • PREVAILING_WAGE and PW_UNIT_OF_PAY: the prevailing wage for the occupation and geographic area, as reported on the LCA. Employers must pay at least this amount. The ratio of WAGE_RATE_OF_PAY_FROM to PREVAILING_WAGE, the wage ratio, is the core metric for wage suppression analysis.
  • PW_WAGE_LEVEL: the prevailing wage level used to set the prevailing wage floor: Level I (entry), Level II (qualified), Level III (experienced), or Level IV (fully competent). This field is the key to understanding the IT staffing wage strategy.
  • WORKSITE_CITY, WORKSITE_STATE, WORKSITE_POSTAL_CODE: the location of actual employment. For IT staffing and consulting firms that place workers at client sites, the worksite is typically the client location, not the staffing company's headquarters.
  • CASE_STATUS: Certified, Denied, Withdrawn, or Certified - Withdrawn. Analysts typically filter to Certified cases as the relevant population.
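Before running any analysis it can help to confirm that a given quarterly file actually carries these fields. The sketch below is a hypothetical helper, not part of any DOL tooling; it assumes only that header spelling and casing may differ slightly from file to file.

```python
import pandas as pd

# Field names as listed above.
EXPECTED_COLUMNS = [
    "CASE_STATUS", "EMPLOYER_NAME", "SOC_CODE", "SOC_TITLE",
    "FULL_TIME_POSITION", "WAGE_RATE_OF_PAY_FROM", "WAGE_UNIT_OF_PAY",
    "PREVAILING_WAGE", "PW_UNIT_OF_PAY", "PW_WAGE_LEVEL",
    "WORKSITE_CITY", "WORKSITE_STATE", "WORKSITE_POSTAL_CODE",
]

def missing_columns(df: pd.DataFrame) -> list[str]:
    """Return expected fields absent from df, ignoring case and stray spaces."""
    present = {str(c).strip().upper() for c in df.columns}
    return [c for c in EXPECTED_COLUMNS if c not in present]

# Toy header standing in for a real disclosure file.
sample = pd.DataFrame(columns=["Case_Status ", "EMPLOYER_NAME", "SOC_CODE"])
missing = missing_columns(sample)
print(missing)
```

An empty list means the downstream column references should resolve; anything else is a cue to check that file's record layout before computing wage ratios.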

The prevailing wage level system

The prevailing wage requirement is the central wage protection in the H-1B program, and the level system is where its greatest weakness lives. Under DOL rules, prevailing wages are set using the Occupational Employment and Wage Statistics (OEWS) survey, and they are broken into four levels corresponding to different experience and skill tiers within an occupation:

  • Level I — entry-level workers with a basic understanding of the job. Corresponds to the 17th percentile of the OEWS wage distribution for the occupation and area.
  • Level II — qualified workers with experience. Corresponds to the 34th percentile.
  • Level III — experienced workers who exercise independent judgment. Corresponds to the 50th percentile (median).
  • Level IV — fully competent workers who use advanced skills and serve as mentors. Corresponds to the 67th percentile.

The employer selects the wage level that applies to the position as filed. This creates a structural opportunity for wage suppression: an employer can file an LCA for a software developer at Level I, set the prevailing wage floor at the 17th percentile for that occupation, and legally offer wages that are far below what the median worker in that occupation earns. An H-1B worker employed under a Level I LCA may be performing identical work to a Level III US employee sitting at the next desk but is protected only by the Level I floor.
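To make the size of that gap concrete, here is a minimal sketch that takes the percentile description above at face value and applies it to a synthetic wage sample. The wage distribution is invented for illustration, and DOL's actual procedure for deriving level wages from OEWS data may differ from a plain percentile lookup.

```python
import numpy as np

# Percentiles as described in the text for each prevailing wage level.
LEVEL_PERCENTILES = {"I": 17, "II": 34, "III": 50, "IV": 67}

def level_floors(wages):
    """Wage floor for each level, read off the sample's percentiles."""
    return {lvl: float(np.percentile(wages, p)) for lvl, p in LEVEL_PERCENTILES.items()}

# Synthetic annual wages for one occupation and area (made-up parameters).
rng = np.random.default_rng(0)
wages = rng.lognormal(mean=np.log(110_000), sigma=0.3, size=10_000)

floors = level_floors(wages)
for lvl, floor in floors.items():
    print(f"Level {lvl}: ${floor:,.0f}")

# How far the Level I floor sits below the median (Level III) floor.
gap = 1 - floors["I"] / floors["III"]
print(f"Level I floor is {gap:.0%} below the Level III floor")
```

The point of the exercise is the last line: the choice of level alone moves the legal minimum by a large fraction of the median wage, before any question of what the employer actually pays.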

DOL has periodically attempted to tighten the wage level rules. In October 2020 the Trump administration published an interim final rule that substantially raised the prevailing wage floors; it took effect immediately but was set aside in litigation within about two months. A revised final rule published in January 2021 was delayed under the Biden administration and then vacated by a federal court before its higher wage levels were applied. As of mid-2026, the Level I–IV structure from the OEWS percentile framework remains in place.

IT staffing companies dominate the filing data

The most consistent finding across years of H-1B data analysis is that the top employers by LCA volume are not only the large technology firms that dominate public perception of the program. IT staffing and outsourcing companies, most of them Indian-headquartered, rank at or near the top: Infosys, Tata Consultancy Services, Cognizant Technology Solutions, Wipro, and HCL America routinely appear among the largest employers by certified LCA count. Accenture Federal Services and other large IT services firms also appear consistently in the top tier. Apple, Google, Microsoft, and Amazon, the firms most prominently associated with H-1B in political debate, have in many years filed fewer LCAs than the largest staffing companies.

The distinction matters for wage analysis. Large technology employers typically file H-1B petitions for workers they intend to retain long-term, and they tend to file at Level III or Level IV with wages well above the prevailing wage floor. IT staffing companies, by contrast, employ a business model based on placing workers at client sites for the duration of specific projects. Their competitive advantage depends partly on labor cost, which creates an incentive to file at Level I and offer wages at or just above the prevailing wage floor. LCA data analyses consistently find that Level I filings are concentrated among staffing companies, with median wage ratios (offered wage / prevailing wage) close to 1.00 — meaning employers are offering exactly what the law requires, no more.

This Level I concentration has been documented by the Economic Policy Institute, the Government Accountability Office, and investigative reporting by Bloomberg, Reuters, and ProPublica. The pattern is not ambiguous in the data: filter the LCA file to employers with more than 500 annual certifications, compute the percentage of their filings at Level I, and the IT staffing tier separates cleanly from the technology product tier.
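The filter-and-share computation described above can be sketched in a few lines of pandas. The employer names and rows are toy data, and the function name is hypothetical. The level normalization assumes the field may appear either as "I" or as "Level I".

```python
import pandas as pd

def level_i_share(lca: pd.DataFrame, min_filings: int = 500) -> pd.DataFrame:
    """Share of each employer's filings at Level I, for employers above a volume cutoff."""
    level = (
        lca["PW_WAGE_LEVEL"].str.upper().str.replace("LEVEL", "", regex=False).str.strip()
    )
    grouped = (
        lca.assign(is_level_i=level.eq("I"))
        .groupby("EMPLOYER_NAME")
        .agg(filings=("is_level_i", "size"), pct_level_i=("is_level_i", "mean"))
    )
    return grouped[grouped["filings"] > min_filings].sort_values(
        "pct_level_i", ascending=False
    )

# Toy rows; a real run would use certified rows from the disclosure file.
toy = pd.DataFrame({
    "EMPLOYER_NAME": ["STAFFING CO"] * 4 + ["PRODUCT CO"] * 3,
    "PW_WAGE_LEVEL": ["I", "Level I", "I", "II", "III", "IV", None],
})
shares = level_i_share(toy, min_filings=2)
print(shares)
```

Filings with no wage level (for example, those using a non-OEWS wage source) count as non-Level I here, so the share is taken over all of an employer's filings.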

India-born workers and the backlog

More than 70 percent of H-1B holders are India-born, a figure that has been remarkably stable across administrations and years. The concentration reflects several factors: the scale of India's English-speaking engineering graduate population, the dominance of Indian-headquartered IT staffing companies as H-1B filers, and the historical pattern of Indian-origin professionals entering US technology firms in the 1990s and 2000s and then sponsoring workers through their professional networks.

The India-born concentration has produced a severe employment-based green card backlog that is analytically distinct from the H-1B program itself but functionally inseparable from it. Under the per-country limit that applies to the employment-based preference categories, no more than 7 percent of green cards may be issued to nationals of any single country per fiscal year. Because India-born workers make up the overwhelming majority of the EB-2 and EB-3 petition queues, the per-country cap limits India-born applicants to a small fraction of the annual green card numbers. The resulting wait is estimated at decades for current applicants in the EB-3 India queue and, in some analyses, a century or more at current issuance rates. H-1B status can be extended beyond the normal six-year limit, in one-year or three-year increments depending on how far the green card case has progressed, for as long as the worker is waiting on an employment-based green card. A substantial share of H-1B holders are therefore not recently arrived workers but long-term US residents who have been waiting for permanent residence for years or decades.
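The backlog arithmetic is simple division, which is why the estimates run so long. The sketch below uses hypothetical round numbers for the category size and the queue, and it ignores dependents, abandoned cases, and any unused visas reallocated from other countries, so it illustrates the mechanism rather than estimating the real wait.

```python
PER_COUNTRY_PCT = 7  # per-country limit cited in the text

def annual_country_limit(category_visas: int) -> float:
    """Visas available to one country per year in a category under the 7% limit."""
    return category_visas * PER_COUNTRY_PCT / 100

def years_to_clear(queue: int, visas_per_year: float) -> float:
    """Years needed to work through a queue at a constant annual issuance rate."""
    return queue / visas_per_year

category_visas = 40_000  # hypothetical annual visas in one preference category
india_queue = 280_000    # hypothetical number of people waiting in that category

limit = annual_country_limit(category_visas)
wait = years_to_clear(india_queue, limit)
print(f"{limit:,.0f} visas per year -> about {wait:.0f} years to clear the queue")
```

Any queue that is large relative to a few thousand visas a year produces a multi-decade figure, which is the structural point of the paragraph above.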

The USCIS H-1B Employer Data Hub

USCIS publishes the H-1B Employer Data Hub at uscis.gov/tools/reports-and-studies/h-1b-employer-data-hub. The hub provides annual employer-level counts of initial and continuing H-1B approvals and denials, organized by fiscal year. Unlike the LCA data, which covers certifications made before USCIS reviews anything, the USCIS data covers petitions that were actually adjudicated: a subset of the LCA universe limited to petitions that won the lottery (for cap-subject employers) or were cap-exempt.

The USCIS data is valuable for computing employer-level denial rates, which differ substantially across employers and across policy periods. Between 2017 and 2020, USCIS substantially increased its denial rates through more aggressive issuance of Requests for Evidence (RFEs) and more frequent denials on the grounds that the position did not qualify as a specialty occupation. Denial rates for IT staffing companies were notably higher than for technology product companies during this period. USCIS denial rate trends are among the clearest signals of administrative H-1B policy changes between administrations, since they reflect enforcement policy without requiring a change in statute.
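A denial rate is denials divided by all decided petitions. The sketch below assumes the Data Hub export uses column names along the lines of "Fiscal Year", "Employer", "Initial Approval", and "Initial Denial"; check them against the actual download, since they are assumptions here. The employers and counts are toy data.

```python
import pandas as pd

COUNT_COLS = ["Initial Approval", "Initial Denial", "Continuing Approval", "Continuing Denial"]

def denial_rates(hub: pd.DataFrame) -> pd.DataFrame:
    """Sum counts per employer and fiscal year, then compute denial rates."""
    totals = hub.groupby(["Fiscal Year", "Employer"], as_index=False)[COUNT_COLS].sum()
    totals["initial_denial_rate"] = totals["Initial Denial"] / (
        totals["Initial Approval"] + totals["Initial Denial"]
    )
    totals["continuing_denial_rate"] = totals["Continuing Denial"] / (
        totals["Continuing Approval"] + totals["Continuing Denial"]
    )
    return totals

# Toy rows; an employer can appear on more than one row, so sum before dividing.
hub = pd.DataFrame({
    "Fiscal Year": [2018, 2018, 2018],
    "Employer": ["STAFFING CO", "STAFFING CO", "PRODUCT CO"],
    "Initial Approval": [20, 10, 95],
    "Initial Denial": [6, 4, 5],
    "Continuing Approval": [50, 40, 190],
    "Continuing Denial": [5, 5, 10],
})
rates = denial_rates(hub)
print(rates[["Fiscal Year", "Employer", "initial_denial_rate", "continuing_denial_rate"]])
```

Keeping initial and continuing petitions separate matters because the two populations were treated differently during the 2017 to 2020 period described above.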

Accessing the LCA data

The DOL OFLC publishes quarterly LCA disclosure files in Excel format at dol.gov/agencies/eta/foreign-labor/performance. Each file covers one fiscal quarter and contains all H-1B, H-1B1, and E-3 LCA applications filed in that quarter. The files are large — a typical quarter has 150,000 to 200,000 rows — and are released with approximately a 90-day lag after the end of the quarter.

The disclosure files are the basis for virtually all journalistic and academic H-1B wage analysis. Because they are public and machine-readable, they do not require a FOIA request. The main practical challenges are inconsistent employer name formatting across quarters (requiring entity resolution before employer-level aggregation) and the need to normalize wages to a common unit (annual) before computing ratios.
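A first pass at entity resolution is often just string normalization. The sketch below is deliberately crude: the suffix list and the example names are illustrative assumptions, and real employer matching usually needs manual review of the largest filers on top of this.

```python
import re

# Trailing legal-form tokens to drop (illustrative, not exhaustive).
LEGAL_SUFFIXES = {
    "INC", "INCORPORATED", "LLC", "LLP", "LTD", "LIMITED",
    "CORP", "CORPORATION", "CO", "COMPANY", "PVT",
}

def normalize_employer(name: str) -> str:
    """Uppercase, strip punctuation, and drop trailing legal suffixes."""
    cleaned = re.sub(r"[^A-Z0-9 ]+", " ", str(name).upper())
    tokens = cleaned.split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

# Hypothetical name variants of the kind that appear across quarters.
variants = ["Acme Widgets, Inc.", "ACME WIDGETS INC", "Acme Widgets Co., Ltd."]
print({v: normalize_employer(v) for v in variants})
```

Grouping on the normalized key instead of the raw EMPLOYER_NAME collapses the most common variants. It will not merge subsidiaries or renamed entities, which need a lookup table.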

Python workflow: downloading LCA data and computing wage ratios

The following script downloads a quarterly LCA disclosure file and computes the employer-level wage ratio statistics and Level I concentration that are the standard starting point for wage suppression investigation:

import pandas as pd
import requests
from pathlib import Path

# DOL OFLC publishes quarterly LCA disclosure files at:
# https://www.dol.gov/agencies/eta/foreign-labor/performance
# Files are Excel (.xlsx). Filenames follow the pattern:
#   LCA_Disclosure_Data_FY<YYYY>_Q<N>.xlsx

# Download a specific quarterly file
def download_lca_file(fiscal_year: int, quarter: int, dest_dir: str = ".") -> Path:
    base = "https://www.dol.gov/sites/dolgov/files/ETA/oflc/pdfs"
    filename = f"LCA_Disclosure_Data_FY{fiscal_year}_Q{quarter}.xlsx"
    url = f"{base}/{filename}"
    dest = Path(dest_dir) / filename
    if not dest.exists():
        print(f"Downloading {url}")
        r = requests.get(url, timeout=120)
        r.raise_for_status()
        dest.write_bytes(r.content)
    return dest

path = download_lca_file(2024, 4)

# Key columns in the LCA disclosure file
# CASE_STATUS          - "Certified", "Denied", "Withdrawn", "Certified - Withdrawn"
# EMPLOYER_NAME        - company name as filed
# SOC_CODE             - Standard Occupational Classification code
# SOC_TITLE            - job title tied to the SOC code
# FULL_TIME_POSITION   - Y/N
# WAGE_RATE_OF_PAY_FROM - employer's offered wage (lower bound if a range)
# WAGE_UNIT_OF_PAY     - "Year", "Month", "Week", "Hour"
# PREVAILING_WAGE      - prevailing wage as determined by the survey source
# PW_UNIT_OF_PAY       - unit for prevailing wage
# PW_WAGE_LEVEL        - Level I, II, III, or IV
# WORKSITE_CITY        - city of employment
# WORKSITE_STATE       - state of employment
# WORKSITE_POSTAL_CODE - ZIP code

df = pd.read_excel(path, dtype=str)
df.columns = [c.strip().upper() for c in df.columns]

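# Keep only cases whose status is exactly "Certified" (the actionable subset).
# This drops "Certified - Withdrawn" rows; switch to str.startswith("Certified")
# if withdrawn certifications should be counted as well.
certified = df[df["CASE_STATUS"].str.strip().str.casefold() == "certified"].copy()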

# Normalize wages to annual figures. Missing or unrecognized units yield NaN
# instead of being silently treated as annual. Hourly wages assume 2,080 hours.
UNIT_MULTIPLIERS = {"year": 1, "month": 12, "bi-weekly": 26, "week": 52, "hour": 2080}

def to_annual(wage: pd.Series, unit: pd.Series) -> pd.Series:
    # Strip "$" and thousands separators, then coerce bad values to NaN
    amount = pd.to_numeric(
        wage.astype(str).str.replace(r"[$,]", "", regex=True), errors="coerce"
    )
    factor = unit.astype(str).str.strip().str.lower().map(UNIT_MULTIPLIERS)
    return amount * factor

certified["WAGE_ANNUAL"] = to_annual(
    certified["WAGE_RATE_OF_PAY_FROM"], certified["WAGE_UNIT_OF_PAY"]
)
certified["PW_ANNUAL"] = to_annual(
    certified["PREVAILING_WAGE"], certified["PW_UNIT_OF_PAY"]
)

# Wage ratio: offered wage vs. prevailing wage.
# A ratio below 1.0 means the offer is under the prevailing wage, which is prohibited,
# but rounding and unit-conversion differences can produce values just below 1.0.
# A ratio of exactly 1.000 means the employer is offering the legal minimum.
certified["WAGE_RATIO"] = certified["WAGE_ANNUAL"] / certified["PW_ANNUAL"]

# Normalize the wage level so that "Level I" and "I" are both read as "I"
certified["PW_LEVEL"] = (
    certified["PW_WAGE_LEVEL"].str.upper().str.replace("LEVEL", "", regex=False).str.strip()
)

# Employer-level summary for every employer (names are not entity-resolved here)
employer_summary = (
    certified.groupby("EMPLOYER_NAME")
    .agg(
        lca_count=("CASE_STATUS", "count"),
        median_wage=("WAGE_ANNUAL", "median"),
        median_pw=("PW_ANNUAL", "median"),
        median_ratio=("WAGE_RATIO", "median"),
        # Share of all filings at Level I; filings with no level count as non-Level I
        pct_level_i=("PW_LEVEL", lambda x: (x == "I").mean()),
    )
    .sort_values("lca_count", ascending=False)
)

print("Top 20 employers by LCA count:")
print(employer_summary.head(20).to_string())

# SOC code breakdown: share of LCAs by occupation family
certified["SOC_MAJOR"] = certified["SOC_CODE"].str[:2]
soc_counts = certified["SOC_MAJOR"].value_counts()
print("\nLCAs by 2-digit SOC major group:")
print(soc_counts.head(10).to_string())

# Level I concentration by employer: flag employers using Level I on >80% of filings.
# This runs on the full summary, not just the top 20 printed above.
level_i_flag = employer_summary[
    (employer_summary["lca_count"] >= 100) & (employer_summary["pct_level_i"] > 0.8)
]
print("\nEmployers with 100+ LCAs using Level I wage on >80% of filings:")
print(level_i_flag[["lca_count", "median_ratio", "pct_level_i"]].to_string())

A few methodological notes on working with this data. The wage ratio will cluster tightly at 1.00 for IT staffing employers, which is itself informative: it indicates that those employers are offering the legal minimum and nothing more. Ratios slightly below 1.00 usually reflect rounding or wage-unit conversion differences rather than actual violations. True violations, where the employer paid less than the certified LCA wage, appear in enforcement data from DOL's Wage and Hour Division (WHD) rather than in LCA filings themselves. The LCA is a prospective attestation; actual wage compliance is monitored separately through WHD investigations.
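One way to act on these notes is to bucket ratios with a small tolerance, so that rounding noise around 1.00 is not mistaken for a violation. The 0.5 percent tolerance and the bucket labels below are arbitrary assumptions chosen for illustration.

```python
import pandas as pd

def classify_ratio(ratio: pd.Series, tol: float = 0.005) -> pd.Series:
    """Label each wage ratio relative to the prevailing wage floor."""
    labels = pd.Series("above floor", index=ratio.index)
    labels[ratio.sub(1).abs() <= tol] = "at floor"
    labels[ratio < 1 - tol] = "below floor (check units)"
    labels[ratio.isna()] = "missing"
    return labels

# Toy ratios covering each bucket.
ratios = pd.Series([0.90, 0.998, 1.0, 1.004, 1.25, float("nan")])
buckets = classify_ratio(ratios)
print(buckets.value_counts())
```

Rows landing in the "below floor" bucket are worth inspecting by hand for mismatched wage units before drawing any conclusion.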

How journalists use H-1B data

The LCA and USCIS datasets have driven a sustained body of investigative journalism over the past decade. The reporting patterns fall into several categories:

  • Employer wage suppression. ProPublica, Reuters, and Bloomberg have each published analyses showing that IT staffing companies file the majority of their LCAs at Level I with wages at or just above the prevailing wage floor, while simultaneously placing those workers at client sites where they perform work that the client firm would otherwise staff at substantially higher wages. The LCA data makes this pattern quantifiable at the employer level.
  • Displacement of US workers. Several high-profile cases — including Southern California Edison, Disney, and Abbott Laboratories — involved employers who laid off US workers and required them to train their H-1B replacements as a condition of receiving severance. The LCA data, cross-referenced with layoff notices filed under the WARN Act, was central to identifying these cases.
  • Fraud and misuse. DOL WHD and USCIS publish enforcement actions against employers who violated LCA conditions — paying workers less than the certified wage, using workers at unlisted worksites, or filing LCAs for phantom positions. The enforcement actions can be cross-referenced against the LCA filing data to identify patterns in which types of employers are cited.
  • Visa mill operations. In several documented cases, employers obtained large numbers of LCA certifications and USCIS approvals with no genuine end client placement, in effect selling H-1B status. The USCIS denial rate surge between 2017 and 2020 was partly targeted at these operations.
  • Sector and occupation trend analysis. The SOC code field makes it straightforward to track which occupations are seeing H-1B demand growth. Computer occupations (SOC 15-xxxx) consistently account for more than 60 percent of certified LCAs, with software developers and computer systems analysts as the top individual SOC codes. Healthcare occupations, architecture and engineering, and business operations make up most of the remainder.
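For the trend analysis in the last item, the usual approach is to stack several years of files and compute each SOC major group's share per year. In the sketch below, FISCAL_YEAR is a hypothetical column the analyst adds when concatenating files, and the rows are toy data.

```python
import pandas as pd

def soc_major_share(lca: pd.DataFrame) -> pd.DataFrame:
    """Share of filings per two-digit SOC major group, by fiscal year."""
    major = lca["SOC_CODE"].str.strip().str[:2]
    counts = lca.assign(SOC_MAJOR=major).groupby(["FISCAL_YEAR", "SOC_MAJOR"]).size()
    year_totals = counts.groupby(level="FISCAL_YEAR").transform("sum")
    return counts.div(year_totals).unstack(fill_value=0.0)

# Toy rows standing in for certified LCAs from two fiscal years.
toy = pd.DataFrame({
    "FISCAL_YEAR": [2023, 2023, 2023, 2023, 2024, 2024, 2024, 2024],
    "SOC_CODE": ["15-1252", "15-1252", "29-1141", "17-2141",
                 "15-1252", "15-1252", "15-1211", "13-1111"],
})
shares = soc_major_share(toy)
print(shares)
```

Reading across a row gives the occupational mix for one year; reading down the "15" column shows how the computer-occupation share moves over time.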

What the data does not cover

The LCA disclosure data covers what employers attest to at the time of filing. It does not capture actual wages paid, actual worksites used, or whether employment materialized at all after an LCA was certified. An LCA can be certified and then withdrawn before a USCIS petition is even filed. An H-1B worker can be benched (paid nothing or below the certified wage while not placed with a client) in violation of the LCA wage requirement; these violations appear in WHD enforcement records rather than in the LCA filing data.

The USCIS Employer Data Hub captures approvals and denials at the petition level but does not include individual-level data — there is no worker-level record linking specific individuals to employers, wages, or case outcomes. Individual-level H-1B records are available only through FOIA requests, and USCIS has been inconsistent in its release of such data. The most granular individual-level H-1B analysis in the public record has come from FOIA litigation rather than routine disclosure.

Neither dataset captures what happens to H-1B workers after their visa status ends or after they change employers. H-1B portability, the ability to start working for a new employer once that employer has filed a new H-1B petition rather than waiting for its approval, was established by the American Competitiveness in the Twenty-First Century Act of 2000, but the administrative record of portability events is not systematically published. Workers who change employers remain on H-1B but disappear from the original employer's filing history and reappear under the new employer, making career-level tracking impossible from published data alone.

Related writing

For the ICE Enforcement and Removal Operations dataset — the complementary federal immigration dataset covering arrests, detentions, and removals of undocumented workers who appear in the same labor markets: ICE Enforcement and Removal Operations: Reading the Federal Dataset Behind Immigration Enforcement →

For the DOL Wage and Hour Division enforcement database — where actual LCA wage violations appear after WHD investigation, including H-1B prevailing wage cases: Wage theft by employer: using DOL Wage and Hour Division enforcement data to find labor violations →

For the BLS Job Openings and Labor Turnover Survey — the monthly dataset that tracks labor demand across the same occupational sectors that H-1B filings concentrate in: BLS JOLTS: The Federal Dataset That Measures Why Workers Quit →