EPA Enforcement Defendants: The Federal Database Behind 200,000 Environmental Cases
Behind every EPA enforcement action (every consent decree against a refinery, every administrative penalty against a municipal sewer authority, every criminal indictment for falsifying discharge reports) there is a list of names: the companies, municipalities, and individuals the United States actually pursued. EPA keeps that list in the Integrated Compliance Information System, and surfaced through the ECHO platform it amounts to 199,682 defendant records, each tying a named party to a case number and recording whether that party appears in the complaint, in the settlement, or in both.
This article covers what the defendant table is and how it relates to the broader EPA enforcement case database; the ICIS data model and how a case links outward to facilities, statutes, penalties, and compliance schedules; the environmental statutes the cases are brought under; and the distinction between complaint-named and settlement-named defendants, including what the gap between them reveals. It then turns to the three enforcement tracks (civil judicial, administrative, and criminal), the recurring case archetypes that dominate the docket, and real-world uses ranging from corporate environmental risk screening to repeat-defendant and parent-subsidiary mapping. It closes with a Python workflow that pulls defendants and cases from the ECHO case API, ranks repeat defendants, and computes complaint-to-settlement conversion, followed by the caveats every analyst must hold in mind when working with a table that is, at bottom, a list of names.
What the dataset is
EPA enforces the nation's environmental statutes through formal actions against regulated parties. Each action—whether it is a lawsuit filed in federal court, an administrative penalty proceeding inside the agency, or a criminal prosecution referred to the Department of Justice—is brought against one or more named parties. Those named parties are the defendants (in judicial and criminal matters) or respondents (in administrative matters). The dataset described here is the roster of those parties: the answer to the deceptively simple question, who did EPA actually name in each enforcement action?
In our database this roster is stored as the table epa_enforcement_defendants, with 199,682 rows. It is the companion to the EPA enforcement case database: where the case table has one row per enforcement action (with the law, the lead facility, the penalty, the filing and settlement dates, the compliance schedule), the defendant table has one row per (case × named party). A single Clean Water Act case naming a parent company, two operating subsidiaries, and a plant manager contributes four rows to the defendant table and one row to the case table. The grain difference is the whole point: an enforcement action is not an abstract event, it is an action taken against specific, nameable entities, and the defendant table is where those entities live. The columns are:
activity_id -- ICIS activity identifier for the enforcement action
case_number -- the human-readable enforcement case number
defendant_name -- the named defendant or respondent (party string)
named_in_complaint -- flag: party was named in the complaint / initiating document
named_in_settlement -- flag: party was named in the final settlement / order
Two of these columns are identifiers and one is the payload. The activity_id is the ICIS-side key for the enforcement activity; the case_number is the version a human would recognize on a court filing or a consent decree. Either one is the join key back to the case table, and through the case table onward to the facility, the statute, and the penalty. The defendant_name is the named party itself: a corporation, a municipal or county government, a utility district, a federal facility, or, in criminal and some civil matters, a named individual such as a corporate officer or plant operator. The two remaining columns, named_in_complaint and named_in_settlement, are boolean flags, and as the rest of this article will argue, the relationship between them is one of the most analytically interesting things in the entire dataset.
It is worth being precise about what this table is not. It is not the case table; it carries no penalty amount, no statute, no facility location, no dates. It is deliberately narrow: it is the bridge between an enforcement action and the parties that action ran against. That narrowness is a feature. A names-to-cases bridge, joined to the case file on one side and to facility identifiers on the other, is exactly the structure you need to ask the questions that the case table alone cannot answer: which parties recur across many cases, which corporate families absorb the most enforcement, and how often a party named at the outset of an action is still standing when the action resolves.
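To make the grain difference concrete, here is a minimal standard-library sketch of the four-party Clean Water Act example described above. The case number, party names, and the case-table fields (statute, penalty_usd) are invented for illustration; only the five defendant columns come from the table itself.

```python
from collections import Counter

# Hypothetical defendant rows for ONE Clean Water Act case: a parent company,
# two operating subsidiaries, and a plant manager (all names are invented).
CASE = "EX-2020-0001"
defendants = [
    {"activity_id": 1001, "case_number": CASE, "defendant_name": name,
     "named_in_complaint": True, "named_in_settlement": True}
    for name in ("EXAMPLE HOLDINGS INC", "EXAMPLE OPERATING LLC",
                 "EXAMPLE PIPELINE CO", "J. DOE")
]

# Hypothetical case table: one row per enforcement action.
cases = {CASE: {"statute": "CWA", "penalty_usd": 250_000}}

# Grain check: number of defendant rows contributed by each case.
roster_size = Counter(d["case_number"] for d in defendants)

# Attach case-level attributes to each party row via the case_number key.
joined = [{**d, **cases[d["case_number"]]} for d in defendants]

# The penalty is a case-level fact, so it repeats on every joined party row.
naive_total = sum(r["penalty_usd"] for r in joined)
true_total = sum(c["penalty_usd"] for c in cases.values())
print(roster_size[CASE], len(cases), naive_total, true_total)
```

The last two figures differ by a factor of four: once the tables are joined, summing a case-level amount over party rows multiplies it by the roster size, so dollar totals should be computed at the case grain.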
The ICIS data model and how a case links outward
The Integrated Compliance Information System (ICIS) is EPA's system of record for federal enforcement and compliance activity. It replaced a generation of separate, statute-specific legacy systems (the old Docket system and program-specific compliance databases) with a single integrated model in which an enforcement action is represented once and related to everything it touches. ICIS is the back end; ECHO (Enforcement and Compliance History Online, echo.epa.gov) is the public front end that exposes ICIS case and compliance data alongside the program databases for air, water, hazardous waste, and toxics. When this article refers to data drawn from ICIS and surfaced via ECHO, it means the ICIS enforcement-case records made publicly queryable through ECHO's case search and case web services.
The central object in ICIS is the case (modeled as an enforcement activity, hence the activity_id). A case is not a standalone fact; it is a hub with spokes radiating out to the things that give it meaning. The most important spokes are:
Defendants and respondents. The named parties: exactly the rows of epa_enforcement_defendants. A case can name one party or many, and the named set can change between the complaint and the final resolution.
Facilities. A case is linked to the regulated facility or facilities whose conduct is at issue. Through the EPA Facility Registry Service (FRS) and its Registry ID, that facility link is what connects an enforcement case to the facility's permits, its self-reported emissions and discharges, its inspection history, and its records in the air, water, and hazardous-waste program databases. The defendant (a legal entity) and the facility (a physical place) are distinct objects in the model precisely because one company can operate many facilities and one facility can change hands between companies.
Statutes and violations. Each case is tagged with the law or laws under which it is brought—Clean Water Act, Clean Air Act, RCRA, and so on—and, in the underlying detail, with the specific statutory and regulatory provisions alleged to have been violated. A single case can span multiple statutes when a facility's conduct violates several at once.
Penalties and relief. The case carries the monetary outcome—the civil penalty assessed or agreed, and frequently a separately tracked figure for the cost of injunctive relief (the capital that a defendant must spend to come back into compliance) and for any Supplemental Environmental Projects (SEPs), environmentally beneficial work a defendant undertakes as part of a settlement. The penalty lives on the case, not on the defendant row, which is why the defendant table must be joined to the case table to attach dollars to names.
Compliance schedules and milestones. Most significant settlements—especially consent decrees—do not merely impose a penalty; they impose a schedule of required actions with deadlines: install this control technology by this date, achieve this discharge limit by that date, submit these reports on this cadence. ICIS tracks those milestones, which is how the agency monitors whether a defendant is actually performing the terms it agreed to. For the analyst, this means a case is not a closed book the day it settles; it has an afterlife of obligations that the system follows for years.
The defendant table is the spoke that answers “who,” and it is most powerful in combination with the others. A defendant name, joined through the case to the facility and through the facility to FRS, becomes a corporate actor with a physical footprint, a regulatory history, and a dollar figure attached—rather than just a string.
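The hub-and-spoke shape described above can be sketched as a handful of Python dataclasses. This is an illustrative model only, not the ICIS schema: apart from activity_id, case_number, and the two naming flags, every class and field name here (Facility, Milestone, penalty_usd, party_view, and the sample values) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Defendant:
    name: str
    named_in_complaint: bool
    named_in_settlement: bool


@dataclass
class Facility:
    registry_id: str  # FRS Registry ID (the value used below is made up)
    name: str


@dataclass
class Milestone:
    description: str
    due_date: str  # ISO date string, e.g. "2021-03-31"


@dataclass
class Case:
    activity_id: int
    case_number: str
    statutes: List[str] = field(default_factory=list)
    defendants: List[Defendant] = field(default_factory=list)
    facilities: List[Facility] = field(default_factory=list)
    penalty_usd: Optional[float] = None  # lives on the case, not on a defendant
    milestones: List[Milestone] = field(default_factory=list)


def party_view(case):
    """Flatten one case into per-party records that carry case-level context."""
    return [
        {
            "defendant_name": d.name,
            "case_number": case.case_number,
            "statutes": list(case.statutes),
            "registry_ids": [f.registry_id for f in case.facilities],
            "penalty_usd": case.penalty_usd,
        }
        for d in case.defendants
    ]


example = Case(
    activity_id=1001,
    case_number="EX-2020-0001",
    statutes=["CWA"],
    defendants=[Defendant("EXAMPLE WATER CO", True, True)],
    facilities=[Facility("110000000001", "Example Treatment Plant")],
    penalty_usd=50000.0,
    milestones=[Milestone("Submit corrective action plan", "2021-03-31")],
)
records = party_view(example)
print(records[0])
```

Keeping Defendant and Facility as separate classes mirrors the point made above: the legal entity and the physical place are different objects, related only through the case.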
The statutes behind the cases
EPA does not enforce a single environmental law; it enforces a portfolio of them, each with its own regulated universe and its own theory of harm. The case behind any defendant row is brought under one or more of these statutes, and which statute applies frames everything about the case: the kind of defendant, the kind of violation, and the shape of the remedy.
The Clean Air Act (CAA) governs emissions to the air. CAA cases are frequently the largest by injunctive-relief cost, because the remedy is often the installation of expensive emission-control technology across an industrial sector. The defendants are refineries, power plants, cement kilns, chemical plants, and, in a distinctive line of cases, manufacturers and installers of devices that defeat vehicle emission controls.
The Clean Water Act (CWA) governs discharges to surface waters through the NPDES permit program. CWA defendants are an exceptionally broad set: industrial dischargers exceeding their permit limits, but also, and very commonly, municipalities and public authorities whose sewer systems overflow. Because NPDES permittees must file self-reported discharge monitoring data on a regular schedule (typically monthly), CWA violations are unusually visible, and the CWA accordingly accounts for a large share of the enforcement docket and of the defendant table.
The Resource Conservation and Recovery Act (RCRA) governs the cradle-to-grave management of hazardous waste. RCRA defendants are generators, transporters, and treatment, storage, and disposal facilities cited for improper handling, storage, manifesting, or disposal of hazardous waste, and for failing to perform required corrective action.
CERCLA—the Comprehensive Environmental Response, Compensation, and Liability Act, universally known as Superfund—governs the cleanup of contaminated sites and the recovery of cleanup costs. CERCLA produces a distinctive kind of defendant: the potentially responsible party (PRP). Because CERCLA imposes liability that is strict, joint, and several, a single site can generate dozens or hundreds of PRPs—everyone who owned or operated the site or sent waste there—and the resulting allocation disputes are a defining feature of the defendant data, as discussed below.
EPCRA (the Emergency Planning and Community Right-to-Know Act) governs chemical reporting and emergency planning; its defendants are facilities that failed to report hazardous chemical inventories or toxic releases. TSCA (the Toxic Substances Control Act) governs industrial chemicals and includes the long-running enforcement programs around PCBs, asbestos, and lead-based paint disclosure. SDWA (the Safe Drinking Water Act) governs public water systems and underground injection; its defendants are water utilities and injection-well operators. FIFRA (the Federal Insecticide, Fungicide, and Rodenticide Act) governs pesticides; its defendants are pesticide producers, distributors, and applicators cited for selling unregistered or misbranded products or for applying them in violation of their labels.
Complaint-named versus settlement-named defendants
The two flag columns—named_in_complaint and named_in_settlement—encode the life cycle of a party's involvement in a case, and the gap between them is where some of the most interesting analysis lives. An enforcement action typically begins with an initiating document—a complaint in a judicial case, an administrative complaint or show-cause order in an administrative one—that names the parties the government initially pursues. It typically ends with a resolving document—a consent decree, a settlement agreement, a final order—that names the parties bound by the outcome. The set of parties in the complaint and the set in the settlement are frequently not the same set.
Consider the four combinations of the two flags. A party flagged in both the complaint and the settlement is the ordinary case: named at the start, bound at the end. A party flagged in the complaint but not the settlement is a dropped party—named initially but removed before resolution. This happens for many legitimate reasons: the government concluded the party was not in fact liable; the party demonstrated it had sold the facility before the violations; a parent company was dropped once a subsidiary accepted responsibility; or the claims against that party were severed into a separate proceeding. A party flagged in the settlement but not the complaint is an added party—brought into the resolution without having been in the initiating document, often because the government discovered the right corporate entity only during the case, or because a related entity agreed to join the settlement to resolve the matter cleanly. And a party flagged in neither, but present in the table, signals a record whose flags are simply incomplete—a caveat in its own right.
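The four combinations can be expressed as a small classifier. This is a sketch: the status labels ("bound", "dropped", "added", "flags_missing") are names chosen here for illustration and do not appear in ICIS, and the sample flag pairs are invented.

```python
from collections import Counter


def party_status(named_in_complaint, named_in_settlement):
    """Map the two boolean flags to one of four life-cycle labels."""
    if named_in_complaint and named_in_settlement:
        return "bound"          # named at the start, bound at the end
    if named_in_complaint:
        return "dropped"        # named initially, absent from the resolution
    if named_in_settlement:
        return "added"          # joined the resolution without being in the complaint
    return "flags_missing"      # neither flag set: treat as incomplete data


# Invented (complaint, settlement) flag pairs; None stands in for a missing value.
sample = [(True, True), (True, False), (False, True), (None, None), (True, True)]
tally = Counter(party_status(c, s) for c, s in sample)
print(tally)
```

Missing values are treated as false here, so a row with empty flags lands in "flags_missing" rather than being silently counted as dropped or added.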
The settlement-named set is also where joint-and-several liability becomes visible. In CERCLA matters especially, multiple defendants are bound by a single settlement under which each is, in principle, liable for the whole, with the parties left to allocate the actual cost among themselves. A settlement that names a dozen parties for one contaminated site is the signature of joint-and-several allocation, and the defendant table—by listing all dozen against one case number—is the cleanest place to see the full cast of a multi-party settlement that a single penalty-on-the-case figure would flatten.
For the analyst, the complaint-to-settlement comparison is a measurable signal. A conversion rate—of the parties named in complaints, what fraction end up named in settlements—can be computed per defendant, per statute, or per region. A party that is repeatedly named in complaints but rarely in settlements may be a habitual co-defendant that the government routinely names and then drops; a high conversion rate is the profile of a party that, once pursued, tends to be held to account. None of this is in the case table; it requires the party-level grain of the defendant table and the two flags that the table preserves.
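A minimal sketch of that conversion rate, grouped by an arbitrary key, might look like the following. It assumes the defendant rows have already been joined to the case table so that each row carries a statute field; that field and the sample rows are assumptions of the example, not columns of the defendant table.

```python
from collections import defaultdict


def conversion_by(rows, key):
    """Of the complaint-named parties in each group, what share is settlement-named?"""
    named = defaultdict(int)
    kept = defaultdict(int)
    for r in rows:
        if not r["named_in_complaint"]:
            continue  # parties added only at settlement are outside the denominator
        named[r[key]] += 1
        kept[r[key]] += int(bool(r["named_in_settlement"]))
    return {group: kept[group] / named[group] for group in named}


def row(statute, complaint, settlement):
    return {"statute": statute, "named_in_complaint": complaint,
            "named_in_settlement": settlement}


# Invented rows: three complaint-named CWA parties (two settle), one CWA party
# added at settlement, and four complaint-named CERCLA parties (one settles).
rows = [
    row("CWA", True, True), row("CWA", True, True), row("CWA", True, False),
    row("CWA", False, True),
    row("CERCLA", True, True), row("CERCLA", True, False),
    row("CERCLA", True, False), row("CERCLA", True, False),
]
rates = conversion_by(rows, "statute")
for statute, rate in sorted(rates.items()):
    print(f"{statute:<8} {rate:.0%}")
```

Restricting the numerator to parties that were also in the complaint keeps the rate between 0 and 1; counting every settlement-named row would let added parties push it above 100 percent.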
Civil, administrative, and criminal tracks
EPA enforcement runs on three distinct tracks, and the defendants differ in kind across them. Understanding which track a case belongs to is essential to interpreting what a defendant row means.
Civil judicial cases are lawsuits filed in federal district court. EPA does not litigate on its own; it refers civil judicial matters to the Department of Justice, whose Environment and Natural Resources Division files and litigates the case on the United States' behalf. These are the largest and most consequential matters—the nationwide refinery settlements, the major municipal sewer decrees, the multi-party Superfund cost-recovery actions. They almost always resolve by consent decree: a negotiated, court-entered judgment that combines a civil penalty with a binding compliance program and ongoing milestones. Civil judicial defendants are predominantly entities—corporations, municipalities, authorities—though individuals can be named.
Administrative cases are handled by EPA itself, without going to court, under the agency's statutory authority to assess penalties and order compliance. The named parties are respondents rather than defendants. The instruments are administrative penalty orders, compliance orders, and consent agreements and final orders (CAFOs). Administrative actions are far more numerous than judicial ones—they are the workhorse of routine enforcement—and they account for the bulk of the defendant table by sheer count, even though each typically carries a much smaller penalty than a major judicial decree. The line between the tracks is partly a function of severity and penalty ceilings: matters beyond the agency's administrative penalty authority, or that need injunctive relief a court must order, go the judicial route.
Criminal cases are a different animal entirely. Environmental crimes—knowing violations, falsifying monitoring data, illegal dumping, lying to regulators—are investigated by EPA's Criminal Investigation Division and prosecuted by the Department of Justice. Criminal matters are where individual defendants appear most prominently: corporate officers, plant managers, and operators who personally directed or concealed the conduct, named alongside (or instead of) the corporate entity. The presence of named individuals, and of language indicating indictment or conviction, is the marker that a defendant row belongs to the criminal track rather than the civil or administrative one. Criminal cases are the smallest track by count but carry the most severe consequences, including incarceration, which no civil or administrative action can impose.
Notable case archetypes
The enforcement docket is not a random scatter of cases; it clusters into recognizable archetypes, and learning the archetypes makes the defendant table far easier to read. Four recur often enough to be worth describing.
Refinery Clean Air Act settlements. Over the past two decades EPA and DOJ have pursued a sustained, sector-wide initiative against petroleum refiners, resolved through large consent decrees that require the installation of advanced controls on the units that drive refinery emissions—fluid catalytic cracking units, heaters and boilers, sulfur recovery plants, flares, and leak-prone equipment. The defendant rows for these cases are corporate refining entities; the cases are characterized by very large injunctive-relief costs (the capital for the controls) relative to the cash penalty, and by long multi-year compliance schedules in ICIS. Because refiners operate through layered corporate structures, these cases are also a frequent site of parent-and-subsidiary co-naming.
Municipal sewer and CSO consent decrees. A large share of CWA enforcement targets not private industry but cities and public sewer authorities, for combined sewer overflows (CSOs) and sanitary sewer overflows—the discharge of untreated sewage when aging combined storm-and-sanitary systems are overwhelmed by rain. These cases resolve through consent decrees that commit a municipality to a long-term control plan, often a billion-dollar, decades-long program of sewer separation, storage tunnels, and treatment upgrades. The defendants are municipal and county governments and sewer districts—a reminder that the most expensive environmental remedies are frequently imposed on the public sector, not on corporations.
Superfund PRP allocation. CERCLA cost-recovery and cleanup actions produce the multi-defendant cases par excellence. For a contaminated site with a long industrial history, the government may name a long list of potentially responsible parties—past and present owners and operators, and the generators and transporters who sent waste to the site—all jointly and severally liable. The settlement then binds the cooperating PRPs, who allocate the cleanup cost among themselves and may pursue non-settling parties for contribution. In the defendant table these cases stand out as a single case number with an unusually long roster of named parties, and they are the clearest illustration of why a party-level table is indispensable: the case-level penalty tells you the site cost, but only the defendant rows tell you who shouldered it.
Diesel defeat-device cases. A distinctive modern CAA line targets the manufacture, sale, and installation of defeat devices—hardware and software that disable or circumvent the emission controls on diesel vehicles and engines, producing illegal excess emissions of nitrogen oxides and particulates. These cases run the gamut from major vehicle and engine manufacturers down to small aftermarket tuners and parts sellers, and they appear on both the civil and criminal tracks. They are a useful reminder that the defendant universe is not limited to large stationary facilities: it includes product manufacturers and distributors whose violations are embodied in the things they sell rather than in a smokestack at a fixed address.
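The long-roster signature of the Superfund archetype is easy to screen for at the party grain. The sketch below flags cases whose distinct-party count is several times the median; the factor of 5 is an arbitrary threshold chosen for illustration, and the case numbers and party names are invented.

```python
from collections import defaultdict
from statistics import median


def long_rosters(rows, factor=5):
    """Return {case_number: roster_size} for cases far above the median roster size."""
    parties = defaultdict(set)
    for case_number, name in rows:
        parties[case_number].add(name)
    sizes = {case: len(names) for case, names in parties.items()}
    typical = median(sizes.values())
    return {case: n for case, n in sizes.items() if n >= factor * typical}


# Invented (case_number, defendant_name) pairs: three ordinary cases and one
# case with a twelve-party roster.
rows = [("A", "P1"), ("B", "P1"), ("B", "P2"), ("C", "P3")]
rows += [("D", f"PRP {i}") for i in range(12)]

flagged = long_rosters(rows)
print(flagged)
```

A long roster is only a hint. Confirming that a flagged case is a CERCLA matter still requires joining to the case table for the statute.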
Real-world uses
A names-to-cases table, joined to the case file and to facility identifiers, supports a class of analysis that the case table alone cannot. The common thread is that these questions are about parties, and parties are exactly what this table is keyed on.
Corporate environmental risk screening. The most direct use is due diligence on a specific company. Given a target's name (and its known aliases and subsidiaries), the defendant table answers whether, how often, and under which statutes that company has been named in EPA enforcement—and, joined to the case table, with what penalties and compliance obligations. For an acquirer, a lender, an insurer, or an ESG analyst, an enforcement history keyed to the actual named legal entity is a far harder signal than a facility's self-reported emissions, because it reflects conduct the government found serious enough to formally pursue.
Repeat-defendant detection. Counting the distinct cases in which a normalized name appears surfaces the recidivists—parties named in enforcement actions again and again. Repeat-defendant analysis is valuable both as a risk signal (a company with a long enforcement history is a different counterparty than a first-time respondent) and as a window into enforcement strategy (which actors the agency returns to). The analysis is only as good as the name normalization behind it, which is why the worked example below devotes real attention to collapsing corporate-suffix and punctuation variants before counting.
Parent-subsidiary mapping. Large enterprises are named through whichever legal entity operated the offending facility, and across many cases a corporate family will appear under many different entity names. Clustering those names into corporate families—using the co-occurrence of related entities on the same case, shared facility links through FRS, and external corporate-structure data—turns a flat list of strings into a map of which ultimate parents carry the most enforcement exposure. This is the step that distinguishes naive name-counting (which scatters a conglomerate's history across a dozen subsidiaries) from a true enterprise-level risk view.
Settlement-rate analysis. The complaint and settlement flags support measuring how cases resolve—the conversion of complaint-named parties to settlement-named parties—across statutes, regions, time periods, and defendant types. This can illuminate which kinds of parties tend to be dropped before resolution, whether multi-defendant cases settle differently from single-defendant ones, and how the named-party set narrows or widens as a case moves from initiation to outcome. It is, again, analysis that is simply impossible at the case grain; it requires the party-level rows and the two flags.
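Of the signals mentioned under parent-subsidiary mapping, same-case co-occurrence is the one that can be computed from the defendant table alone. The sketch below groups names with a small union-find; it is one possible approach, not an established method for this dataset, and the names and case numbers are invented.

```python
from collections import defaultdict


def cluster_by_cooccurrence(rows):
    """Group names that are linked, directly or transitively, by sharing a case."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    by_case = defaultdict(list)
    for case_number, name in rows:
        by_case[case_number].append(name)
        find(name)  # register singletons too

    for names in by_case.values():
        for other in names[1:]:
            union(names[0], other)

    families = defaultdict(set)
    for name in list(parent):
        families[find(name)].add(name)
    return sorted((sorted(f) for f in families.values()), key=lambda f: f[0])


rows = [
    ("C1", "ACME REFINING"), ("C1", "ACME HOLDINGS"),
    ("C2", "ACME HOLDINGS"), ("C2", "ACME PIPELINE"),
    ("C3", "CITY OF EXAMPLE"),
]
families = cluster_by_cooccurrence(rows)
print(families)
```

Co-occurrence over-merges badly on multi-party Superfund cases, where co-defendants are usually unrelated companies, so in practice long-roster cases would need to be excluded and the resulting clusters checked against facility links and external corporate-structure data.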
Python workflow: pulling defendants and cases from the ECHO case API
EPA exposes ICIS enforcement-case data through ECHO's case web services at echodata.epa.gov/echo. The pattern is a two-step one: a case-search endpoint returns the cases matching a filter (by statute, state, date range, and more), and a case-detail endpoint returns the full record for a single case—including its list of named defendants. The script below searches Clean Water Act cases in Texas over a date range, fetches the defendants for each returned case, normalizes the defendant names, and then computes two of the analyses described above: the repeat defendants (ranked by the number of distinct cases they appear in) and the complaint-to-settlement conversion rate per defendant. No API key is required for public data. Because the exact case-service field and parameter names evolve between ECHO releases, the script isolates them in one place and any production use should be validated against the current case-service documentation; for genuinely national-scale work, the ICIS/ECHO bulk case download is far more efficient than iterating the per-case detail endpoint.
import requests
import pandas as pd
from collections import defaultdict

# EPA ECHO Case REST Services -- federal enforcement cases drawn from ICIS.
# No API key is required for public data. Two endpoints used here:
#   get_cases     -- search/summary of enforcement cases
#   get_case_info -- full detail for a single case, including defendants
BASE = "https://echodata.epa.gov/echo"


def search_cases(statute=None, state=None, from_date=None, to_date=None, rows=1000):
    # Returns a QID (query handle) plus a first page of case rows. The
    # parameter names below are the documented ICIS/case-service filters;
    # confirm against the live ECHO case-service schema, which evolves.
    params = {"output": "JSON", "responseset": rows}
    if statute:
        params["p_law"] = statute        # e.g. "CWA", "CAA", "RCRA"
    if state:
        params["p_st"] = state           # e.g. "TX"
    if from_date:
        params["p_dffdate"] = from_date  # filed/settled date lower bound
    if to_date:
        params["p_dftdate"] = to_date    # filed/settled date upper bound
    r = requests.get(f"{BASE}/case_rest_services.get_cases", params=params, timeout=120)
    r.raise_for_status()
    return r.json()


def case_defendants(case_number):
    # Full case detail, including the list of named defendants/respondents.
    params = {"output": "JSON", "p_case_number": case_number}
    r = requests.get(f"{BASE}/case_rest_services.get_case_info", params=params, timeout=60)
    r.raise_for_status()
    return r.json()


def norm(name):
    # Crude name normalization -- collapse the obvious corporate-suffix and
    # punctuation noise so that "ACME OIL, INC." and "Acme Oil Inc" collide.
    s = (name or "").upper()
    for token in (",", "."):
        s = s.replace(token, " ")
    # Collapse whitespace BEFORE suffix matching; otherwise the space left
    # behind by a trailing "." defeats endswith() and "INC." is never stripped.
    s = " ".join(s.split())
    for suffix in (" INCORPORATED", " INC", " LLC", " L L C", " CORP",
                   " CORPORATION", " CO", " COMPANY", " LP", " L P", " LTD"):
        if s.endswith(suffix):
            s = s[: -len(suffix)]
    return s.strip()


def as_flag(value):
    # The service may encode booleans as "Y", "1", or "TRUE"; accept all three.
    return str(value or "").strip().upper() in ("Y", "1", "TRUE")


# --- 1. Pull a slice of Clean Water Act cases and flatten the defendants ---
result = search_cases(statute="CWA", state="TX", from_date="01/01/2018",
                      to_date="12/31/2023")
cases = result.get("Results", {}).get("Cases", [])
print(f"CWA cases returned for TX 2018-2023: {len(cases)}")

rows = []
for c in cases:
    cn = c.get("CaseNumber") or c.get("ActivityId")
    if not cn:
        continue  # no usable key for the detail call
    detail = case_defendants(cn)
    defs = (detail.get("Results", {}) or {}).get("Defendants", []) or []
    for d in defs:
        rows.append({
            "case_number": cn,
            "defendant_name": d.get("DefendantName", ""),
            # These two flags mirror the named_in_complaint / named_in_settlement
            # columns of the local epa_enforcement_defendants table.
            "named_in_complaint": as_flag(d.get("NamedInComplaint")),
            "named_in_settlement": as_flag(d.get("NamedInSettlement")),
        })

df = pd.DataFrame(rows)
if df.empty:
    raise SystemExit("No defendant rows -- inspect the case-service response shape.")
df["norm_name"] = df["defendant_name"].map(norm)

# --- 2. Repeat defendants: parties named across multiple distinct cases ---
repeat = (
    df.groupby("norm_name")["case_number"]
    .nunique()
    .sort_values(ascending=False)
    .head(15)
)
print("\nTop repeat defendants (distinct CWA cases, TX 2018-2023):")
for name, n in repeat.items():
    print(f"  {name[:46]:<46} {n:>4} cases")

# --- 3. Complaint-to-settlement conversion by defendant ---
# How often does a party named in the complaint actually appear in the
# settlement? A low rate flags parties routinely dropped before resolution.
agg = defaultdict(lambda: {"complaint": 0, "settlement": 0})
for _, row in df.iterrows():
    if not row["named_in_complaint"]:
        continue  # parties added only at settlement are outside the denominator
    a = agg[row["norm_name"]]
    a["complaint"] += 1
    a["settlement"] += int(row["named_in_settlement"])

print("\nComplaint-to-settlement conversion (named >= 3 complaints):")
for name, a in sorted(agg.items(), key=lambda kv: -kv[1]["complaint"]):
    if a["complaint"] >= 3:
        rate = a["settlement"] / a["complaint"]
        print(f"  {name[:40]:<40} {a['settlement']:>3}/{a['complaint']:<3} {rate:5.0%}")
Two practical notes. First, the per-case detail call in the loop is convenient for a bounded slice like one statute in one state over a few years, but it does not scale: a national pull would mean hundreds of thousands of round-trips, and the right tool there is the bulk case export, which ships the defendant roster and the complaint and settlement flags directly, already joined to the case. Second, the conversion-rate output is only meaningful once names are normalized; without the norm step, a single defendant spelled three ways will be counted as three parties, and both the repeat-defendant ranking and the conversion rate will be quietly wrong. Treat name normalization as part of the analysis, not as preprocessing to be skipped.
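Suffix stripping will not catch outright spelling differences. One way to surface those for review is a pairwise similarity pass with the standard library's difflib; the sketch below is a starting point under stated assumptions, with an arbitrary 0.9 threshold and invented names, and it is not a substitute for real entity resolution.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(names, threshold=0.9):
    """Return (name_a, name_b, score) for pairs of distinct names that look alike."""
    pairs = []
    for a, b in combinations(sorted(set(names)), 2):
        score = SequenceMatcher(None, a, b).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 3)))
    return pairs


# Invented, already-normalized names; the second is a typo of the first.
names = ["ACME REFINING", "ACME REFINNG", "CITY OF EXAMPLE"]
candidates = near_duplicates(names)
for a, b, score in candidates:
    print(f"{a!r} ~ {b!r}  ({score})")
```

The comparison is quadratic in the number of distinct names, so on a full table it would need to be restricted to candidate blocks (for example, names sharing a first token). Matches are candidates for human review, not automatic merges.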
Limitations and analytical caveats
The defendant table is a uniquely useful bridge between EPA enforcement actions and the parties they ran against, but it is a narrow, names-centric table, and several structural features must be held in mind before drawing conclusions from it.
Name normalization is the central problem. The defendant_name column is a free-text party string captured from legal documents, and the same entity appears under many spellings—with and without Inc., LLC, or Corporation; with punctuation variants; with divisional or “doing business as” qualifiers; and occasionally with outright typographical differences. Any count of distinct defendants, any repeat-defendant ranking, and any parent-subsidiary roll-up is therefore only as reliable as the normalization applied. The crude normalizer in the worked example is a starting point, not a finished solution; serious entity resolution requires fuzzy matching, alias dictionaries, and ideally a join to an external corporate-registry identifier.
There is no penalty or statute on this table. The defendant table is deliberately a bridge: it carries the party, the case number, and the two flags, and nothing else. To attach a dollar figure, a statute, a facility, or a date to a defendant, you must join to the case file on activity_id or case_number. Treating the row count of this table as a measure of enforcement severity confuses the number of named parties with the magnitude of enforcement; a single multi-defendant Superfund settlement can contribute more rows than a far costlier single-defendant refinery decree.
The complaint and settlement flags can be incomplete. The interpretive power of the two flags depends on their being populated consistently, and they are not always. A party present in the table with neither flag set, or a case whose initiating or resolving document was never fully captured, will distort any complaint-to-settlement conversion measure. Conversion rates should be computed over the subset of records with the relevant flag present, and reported as such, rather than over the whole table as if every row carried clean flags.
Historical coverage is uneven. ICIS consolidated a generation of older, statute-specific legacy systems, and the depth and consistency of the data degrade as one goes back in time. Older cases may be sparsely represented or carry less structured detail than recent ones; the migration from legacy systems was not lossless. A count of defendants over time partly reflects changes in data capture and system coverage, not only changes in enforcement activity, and long time-series should be read with that in mind.
A named defendant is not a finding of guilt. Being named in a complaint means the government pursued a party; it is an allegation, not an adjudication. Most civil and administrative matters resolve by settlement without an admission of liability, and parties named in complaints are sometimes dropped precisely because they turned out not to be liable. The defendant table records who was named and how the naming evolved; it does not, by itself, establish that any particular party did what was alleged. That distinction matters especially for individuals and for risk screening, where conflating “named” with “guilty” is both analytically and ethically wrong.
Held with these caveats in mind, epa_enforcement_defendants is the table that puts names to the federal environmental enforcement record: 199,682 rows linking the companies, governments, and individuals the United States pursued to the cases it pursued them in, and recording—through two deceptively small flags—how each party's involvement began and how it ended.
Related writing
EPA RCRA Hazardous Waste Data: The Federal Database Behind 400,000 Regulated Facilities — Many of the defendants in this table are RCRA generators and disposal facilities, and the RCRA compliance record is where the violations that ripen into an enforcement case first appear; joining the defendant roster to RCRAInfo by facility shows what a named party was actually cited for before the case was filed.
EPA Pollutant Emissions: The Federal Database Behind 10 Million Facility-Level Air and Toxic Release Records — Clean Air Act enforcement defendants are frequently the heaviest emitters in the National Emissions Inventory and the Toxics Release Inventory, and because both datasets share the FRS Registry ID, a defendant's case can be tied directly to the facility-level emissions that motivated it.
OFAC Civil Penalties: The Federal Database Behind Sanctions Violations and Treasury Enforcement — A parallel federal enforcement record from a different agency: where EPA names environmental defendants in ICIS, Treasury's OFAC publishes the banks and corporations penalized for sanctions violations, and the two together illustrate how named-party enforcement data underpins corporate risk screening across regulatory domains.