
FDA FAERS: The Federal Adverse Event Reporting Database Behind Drug Safety Surveillance

16 min read · AI Analytics
Tags: FDA, FAERS, Drug Safety, Adverse Events, Federal Data

Every drug that reaches the U.S. market carries an implicit bargain: the FDA approves it based on clinical trial evidence, but clinical trials are too small and too short to detect rare or delayed harms. The FDA Adverse Event Reporting System (FAERS) is the mechanism by which that bargain is honored after approval. It receives roughly two million reports per year from healthcare professionals, consumers, and drug manufacturers, has accumulated more than twenty million records across decades, and has been a primary source of safety signals behind many of the major post-market drug withdrawals and label changes of the past thirty years.

This article covers the structure and scope of FAERS, its four main report types, the seven quarterly CSV data files and what each contains, the MedDRA medical terminology hierarchy used to code adverse reactions, the statistical methods FDA uses for signal detection, the documented limitations of spontaneous reporting systems, a survey of famous drug safety cases in which FAERS signals preceded or informed major regulatory actions, the openFDA API for programmatic access, and a Python script that queries FAERS via openFDA to profile the adverse event report landscape for a specific drug.

Post-market surveillance and the limits of clinical trials

FDA drug approval rests on the findings of randomized controlled clinical trials submitted in a New Drug Application or Biologics License Application. Those trials are powered to detect the primary efficacy endpoints and to characterize adverse events that occur at rates above roughly one in a hundred to one in a thousand, depending on trial size. Adverse events occurring in one in ten thousand or one in a hundred thousand patients — the rates at which many serious drug harms actually occur — are statistically invisible in a trial population of three to five thousand subjects. Post-market surveillance exists to detect those rarer signals in the vastly larger population of patients exposed to the drug after approval.
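The arithmetic behind that claim can be sketched in a few lines. This is a simplified illustration, assuming independent subjects and a fixed per-patient event rate; the trial size of 4,000 is an assumed midpoint of the range quoted above, and seeing a single event is of course much weaker than detecting a signal.

```python
def prob_at_least_one(rate: float, n_subjects: int) -> float:
    """Probability of seeing one or more events among n independent subjects."""
    return 1.0 - (1.0 - rate) ** n_subjects


TRIAL_SIZE = 4000  # assumption: midpoint of the 3,000-5,000 range in the text

for label, rate in [
    ("1 in 1,000", 1e-3),
    ("1 in 10,000", 1e-4),
    ("1 in 100,000", 1e-5),
]:
    p = prob_at_least_one(rate, TRIAL_SIZE)
    print(f"{label:>13}: P(at least one event in trial) = {p:.3f}")
```

Under these assumptions a 1-in-1,000 event almost certainly appears at least once, a 1-in-10,000 event appears in roughly a third of such trials, and a 1-in-100,000 event in only a few percent of them.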

FAERS is a passive surveillance system: reports are submitted voluntarily by healthcare professionals and consumers, or mandatorily by drug manufacturers. It is not an active registry, not a claims database, and not a clinical database. It does not follow patients prospectively, does not capture denominator information (how many patients are taking a given drug), and does not adjudicate causation. A report that a patient taking drug X experienced outcome Y does not mean drug X caused outcome Y. These limitations are fundamental and well-understood by pharmacovigilance researchers; they are frequently misunderstood by journalists and non-specialist readers who treat raw FAERS counts as incidence estimates.

Despite these limitations, FAERS remains the indispensable backbone of U.S. post-market drug safety. It has been the primary signal-generation mechanism for dozens of drug withdrawals and label changes since the system's predecessor (the Spontaneous Reporting System, SRS) was established in 1969. The database is publicly available as quarterly bulk downloads and via the openFDA API, making it accessible to academic researchers, epidemiologists, litigants, and journalists at no cost.

Report types and submission pathways

FAERS receives reports through four distinct pathways, each with different regulatory requirements and different implications for data quality.

Voluntary healthcare professional and consumer reports are submitted directly to FDA via MedWatch (Form FDA 3500 for voluntary reporters, FDA 3500A for mandatory reporters). Healthcare professionals — physicians, pharmacists, nurses, and other clinicians — may report any adverse event or medication error they observe. Consumer and patient reports are also accepted and constitute a growing share of the database as FDA has made direct consumer reporting easier through online MedWatch submission. Healthcare professional reports are generally more medically detailed: they tend to include diagnosis information, temporal relationship to drug exposure, relevant medical history, and concomitant medications. Consumer reports are often less complete but can capture experiences not reported through clinical channels.

Mandatory manufacturer 15-day expedited reports are the highest-volume category and the most consequential from a regulatory standpoint. Under 21 CFR 314.80(c)(1) for approved drugs and 21 CFR 600.80 for biologics, manufacturers are required to submit an expedited report to FDA within fifteen calendar days of receiving information about any adverse event that is both serious and unexpected (not already in the current labeling). “Serious” is defined by regulation to include death, life-threatening events, hospitalization (initial or prolonged), disability, congenital anomaly, or required medical or surgical intervention to prevent permanent impairment. Foreign spontaneous reports from post-market experience abroad are also subject to expedited reporting if they meet the serious-and-unexpected threshold.

Manufacturer periodic safety reports cover adverse events that are serious but expected (already described in labeling), or non-serious. Under 21 CFR 314.80(c)(2), manufacturers submit periodic adverse drug experience reports quarterly for the first three years after approval and annually thereafter. These reports consolidate large volumes of non-expedited reports into aggregate summaries; in the FAERS database they appear as individual records coded as periodic submissions. Manufacturer reports, taken together, constitute the majority of the FAERS database, typically 80-90% of total annual submissions.

IND safety reports from clinical trials are submitted for investigational drugs under 21 CFR 312.32. These are separate from post-market FAERS submissions and are handled through different reporting pathways, but they follow a similar expedited logic: serious unexpected suspected adverse reactions observed in clinical trials must be reported to FDA within fifteen calendar days, and within seven calendar days when the reaction is fatal or life-threatening.

The quarterly FAERS data files

FDA releases FAERS data as quarterly bulk downloads in ASCII format, linked from the FAERS Public Dashboard. The ASCII extracts are delimited text that loads like CSV, but the published files use "$" rather than a comma as the field separator. Each quarterly package contains seven tables that together describe the adverse event report, the drugs involved, the adverse reactions, the outcomes, the drug therapy dates, the drug indications, and the report sources. The tables are linked by a common report identifier (primaryid) and a case identifier (caseid).

| File | Key fields | Contents |
| --- | --- | --- |
| DEMO | age, sex, wt, country, event_dt, rept_cod | One row per report: patient demographics, event date, report type (direct / manufacturer expedited / manufacturer periodic), received date |
| DRUG | drugname, route, dose_vbm, dechal, role_cod | One row per drug per report: drug name, verbatim dose, route of administration, dechallenge result, and role (PS = primary suspect, SS = secondary suspect, C = concomitant, I = interacting) |
| REAC | pt, drug_rec_act | One row per adverse reaction per report: MedDRA Preferred Term (PT) for each reaction; drug_rec_act records the reaction that recurred when the drug was readministered (rechallenge), where reported |
| OUTCO | outc_cod | One row per outcome per report: DE (death), LT (life-threatening), HO (hospitalization), DS (disability), CA (congenital anomaly), RI (required intervention), OT (other serious) |
| INDI | indi_drug_seq, indi_pt | One row per drug indication per report: MedDRA PT for the indication (what the drug was prescribed for); linked to DRUG by drug sequence number |
| THER | start_dt, end_dt, dur, dur_cod | One row per drug therapy episode: drug therapy start and end dates, duration, and duration unit; allows time-to-onset analysis |
| RPSR | rpsr_cod | One row per report source per report, for example foreign, study, literature, consumer, or health professional. The reporter's occupation (physician, pharmacist, other health professional, lawyer, consumer) is recorded separately in DEMO as occp_cod |

The DRUG table's role_cod field is analytically critical. A report implicating fifteen concomitant medications alongside one primary suspect is very different from a report where a single drug is identified as the primary suspect. Disproportionality analyses should filter to primary and secondary suspect roles; including concomitant drugs as if they were suspected agents produces severely inflated signal counts. The DEMO table's rept_cod distinguishes direct reports (voluntary healthcare professional and consumer submissions) from manufacturer expedited and periodic reports, enabling stratified analysis by reporter type. The quarterly files do not include personally identifiable information; age is provided as a numeric value with a separate unit field, and patient weight is included when reported.
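A minimal sketch of the suspect-role filter, using only the standard library. The three sample rows are invented for illustration, real DRUG files carry many more columns, and the "$" delimiter is an assumption based on the published ASCII extracts; adjust it if your copy of the data is comma-separated.

```python
import csv
import io

# Hypothetical excerpt of a DRUG file; values are made up for illustration.
SAMPLE_DRUG = """primaryid$caseid$drug_seq$role_cod$drugname
1001$501$1$PS$OZEMPIC
1001$501$2$C$METFORMIN
1002$502$1$SS$OZEMPIC
"""

SUSPECT_ROLES = {"PS", "SS"}  # primary suspect, secondary suspect


def suspect_rows(text: str, delimiter: str = "$") -> list[dict]:
    """Parse delimited DRUG rows and keep only suspect-role drugs."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [row for row in reader if row["role_cod"] in SUSPECT_ROLES]


rows = suspect_rows(SAMPLE_DRUG)
for r in rows:
    print(r["primaryid"], r["drugname"], r["role_cod"])
```

The concomitant metformin row is dropped, so it cannot be counted as a suspected agent in any downstream drug-reaction tally.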

Duplicate reports are a significant data quality issue in FAERS. A single adverse event may generate multiple reports: the treating physician submits a MedWatch report, the hospital pharmacist submits a separate report, and the manufacturer of the drug (notified through their pharmacovigilance system) submits its own expedited 15-day report based on the same event. FDA provides a deduplication field (caseid) that groups reports representing the same underlying case, and the DEMO file includes a caseversion field to identify the most recent version of each case. Most rigorous FAERS analyses retain only the latest version of each case and apply additional deduplication heuristics based on patient demographics, event date, and drug name.
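The "latest version per case" step can be sketched as below. This covers only the caseid/caseversion rule described above; the additional heuristics on demographics, event date, and drug name are not shown, and the sample rows are invented.

```python
def latest_versions(demo_rows: list[dict]) -> list[dict]:
    """Keep one DEMO row per caseid: the one with the highest caseversion."""
    latest: dict[str, dict] = {}
    for row in demo_rows:
        current = latest.get(row["caseid"])
        if current is None or int(row["caseversion"]) > int(current["caseversion"]):
            latest[row["caseid"]] = row
    return list(latest.values())


# Hypothetical DEMO rows: case 501 was updated once, case 502 was not.
demo = [
    {"primaryid": "5011", "caseid": "501", "caseversion": "1"},
    {"primaryid": "5012", "caseid": "501", "caseversion": "2"},
    {"primaryid": "5021", "caseid": "502", "caseversion": "1"},
]

kept = latest_versions(demo)
print(sorted(r["primaryid"] for r in kept))
```

The surviving primaryid values can then be used to filter the other six tables before any joins.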

MedDRA: the terminology hierarchy behind FAERS reaction coding

Adverse reactions in FAERS are coded using the Medical Dictionary for Regulatory Activities (MedDRA), a hierarchical medical terminology system developed and maintained by the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) and licensed by the MedDRA Maintenance and Support Services Organization (MSSO). MedDRA is the international standard for adverse event coding in drug regulation, used by FDA, EMA (European Medicines Agency), and health authorities in Japan, Canada, Australia, and most other regulated markets.

The MedDRA hierarchy has five levels, from most specific to most general:

| Level | Abbreviation | Approx. count | Example |
| --- | --- | --- | --- |
| Lowest Level Term | LLT | ~80,000 | Nausea and vomiting |
| Preferred Term | PT | ~24,000 | Nausea |
| High Level Term | HLT | ~1,700 | Nausea and vomiting symptoms |
| High Level Group Term | HLGT | ~340 | Gastrointestinal signs and symptoms |
| System Organ Class | SOC | 27 | Gastrointestinal disorders |

FAERS stores adverse reactions at the PT level in the REAC file. All downstream signal detection and aggregate analysis operates on PTs, though researchers frequently roll up to HLT or HLGT for exploratory analyses. Approximately 24,000 PTs cover essentially every clinically meaningful adverse event across all body systems. Some of the most commonly reported PTs in FAERS are “Drug ineffective” (one of the single most frequent terms, because patients and clinicians report lack of efficacy through FAERS), “Off label use,” “Nausea,” “Fatigue,” “Death,” “Dyspnoea,” “Anaphylactic reaction,” and “Pneumonia.” The high frequency of “Drug ineffective” and “Off label use” reflects the voluntary reporting pipeline's capture of patient experience beyond strictly pharmacological harm.
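Rolling PTs up the hierarchy is a chain of parent lookups. The sketch below uses a toy mapping built only from the example row in the hierarchy table; the dictionary layout is hypothetical, and a real roll-up would read the parent links from a licensed MedDRA release.

```python
from collections import Counter

# Toy parent links from the single example row; not a real MedDRA extract.
PARENT = {
    "PT": {"Nausea": "Nausea and vomiting symptoms"},
    "HLT": {"Nausea and vomiting symptoms": "Gastrointestinal signs and symptoms"},
    "HLGT": {"Gastrointestinal signs and symptoms": "Gastrointestinal disorders"},
}
LEVELS = ["PT", "HLT", "HLGT", "SOC"]


def roll_up(pt: str, target: str):
    """Follow parent links from a PT up to the target level; None if unmapped."""
    term = pt
    for level in LEVELS[: LEVELS.index(target)]:
        term = PARENT[level].get(term)
        if term is None:
            return None
    return term


reaction_pts = ["Nausea", "Nausea", "Fatigue"]  # "Fatigue" is absent from the toy map
by_soc = Counter(roll_up(pt, "SOC") or "(unmapped)" for pt in reaction_pts)
print(by_soc)
```

Terms missing from the mapping are counted as unmapped rather than silently dropped, which makes gaps in the lookup visible in the aggregate.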

MedDRA version updates occur semi-annually (March and September) and add new terms, retire obsolete ones, and occasionally reorganize the hierarchy. FAERS records are coded to the version current at the time of processing; historical records retain their original coding. Cross-version comparisons require MedDRA history tables that map retired terms to their current equivalents. The MedDRA terminology itself is licensed and not freely distributable; the MSSO licenses it to academic researchers at reduced cost, and the ICH provides limited free access for non-commercial research.

Signal detection: PRR, EBGM, and disproportionality analysis

FDA's primary quantitative signal detection method is the Empirical Bayes Geometric Mean (EBGM), implemented in the Multi-Item Gamma Poisson Shrinker (MGPS) algorithm developed by William DuMouchel and adopted by FDA for data mining of spontaneous reports. The underlying concept is disproportionality analysis: if a drug-event pair (a specific drug and a specific adverse reaction PT) appears in FAERS substantially more often than would be expected given the overall reporting frequency of both the drug and the event independently, that disproportionality suggests a potential signal meriting further investigation.

The simpler version of the same idea is the Proportional Reporting Ratio (PRR), widely used in European pharmacovigilance and by academic researchers because it requires no specialized software. PRR for a drug-event pair is defined as:

PRR = (a / (a + b)) / (c / (c + d))

where a is the number of reports for the drug of interest with the reaction of interest; b is the number of reports for the drug with all other reactions; c is the number of reports for all other drugs with the reaction of interest; and d is the number of reports for all other drugs with all other reactions. Under the Evans criteria, one of the standard frameworks for PRR-based signal assessment, a drug-event pair counts as a statistical signal when the PRR is at least 2, the chi-square statistic is at least 4, and there are at least three reports (a ≥ 3).
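The PRR formula and a threshold check translate directly into code. The sketch below uses the commonly cited Evans thresholds (PRR of at least 2, chi-square of at least 4, at least three reports) and a Pearson chi-square without continuity correction, which is an assumption since implementations differ on that point. The counts are invented, and the functions assume no cell total is zero.

```python
def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional Reporting Ratio for a 2x2 table of report counts."""
    return (a / (a + b)) / (c / (c + d))


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def evans_signal(a: int, b: int, c: int, d: int) -> bool:
    """True when the pair meets PRR >= 2, chi-square >= 4, and a >= 3."""
    return a >= 3 and prr(a, b, c, d) >= 2.0 and chi_square(a, b, c, d) >= 4.0


# Hypothetical counts: 30 of 1,000 reports for the drug mention the reaction,
# against 200 of 99,000 reports for all other drugs.
a, b, c, d = 30, 970, 200, 98_800
print(f"PRR = {prr(a, b, c, d):.2f}, chi-square = {chi_square(a, b, c, d):.1f}")
print("Evans signal:", evans_signal(a, b, c, d))
```

With these counts the reaction makes up 3% of the drug's reports versus about 0.2% for everything else, so the pair is flagged. The same ratio built on only two reports would not be, because of the minimum-count rule.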

The EBGM improves on PRR by applying Bayesian shrinkage to account for sampling uncertainty in small counts: a drug-event pair observed only once or twice will have its EBGM shrunk toward the null (1.0) more aggressively than a pair with fifty observations. This prevents rare events from generating false signals purely due to random variation. FDA's Sentinel System, a complementary active surveillance program that analyzes large insurance claims and electronic health record databases, provides denominator-adjusted incidence estimates that FAERS cannot generate on its own.
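The shrinkage idea can be illustrated with a deliberately simplified model. This is not MGPS: it uses a single gamma prior and the posterior mean, whereas MGPS fits a two-component gamma mixture to the whole database and reports a geometric mean. The prior parameters here are hypothetical values chosen so that the prior mean is 1.

```python
def raw_ratio(observed: int, expected: float) -> float:
    """Observed-to-expected reporting ratio with no shrinkage."""
    return observed / expected


def shrunk_ratio(observed: int, expected: float,
                 alpha: float = 2.0, beta: float = 2.0) -> float:
    """Posterior mean of the ratio under a Gamma(alpha, beta) prior.

    With observed ~ Poisson(ratio * expected), the posterior is
    Gamma(alpha + observed, beta + expected). alpha and beta are
    illustrative assumptions, not fitted values.
    """
    return (observed + alpha) / (expected + beta)


# Two pairs with the same raw ratio of 10 but very different amounts of data.
for observed, expected in [(2, 0.2), (50, 5.0)]:
    print(
        f"n={observed:>2}: raw={raw_ratio(observed, expected):.1f}, "
        f"shrunk={shrunk_ratio(observed, expected):.2f}"
    )
```

The pair seen twice is pulled most of the way back toward 1, while the pair seen fifty times keeps most of its elevation, which is the behavior the paragraph above describes.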

A disproportionality signal is not a causal finding. It means that the drug-event pair is reported more frequently than baseline expectation, which may reflect a true pharmacological association, reporting bias (more reports for a new or publicized drug), protopathic bias (the drug was used to treat the early symptoms of the disease that then worsened), or channeling (the drug is preferentially prescribed to sicker patients who are more likely to experience the event regardless of drug use). Signal evaluation requires clinical plausibility assessment, review of the individual case narratives, epidemiological studies in claims or EHR databases, and often additional non-clinical data from animal studies.

Fundamental limitations of FAERS

Voluntary underreporting is the most fundamental limitation of FAERS. The classic estimate — that spontaneous reporting systems capture between one percent and ten percent of actual adverse events — derives from comparisons of FAERS report rates against incidence estimates from prescription databases and hospital records. The reporting rate varies widely by drug, event type, and clinical setting: events that are well-publicized, novel, or legally contested generate substantially more reports per actual occurrence than routine, expected adverse events associated with long-established drugs. This creates systematic reporting bias: new drugs appear more dangerous in FAERS not because they are inherently more dangerous but because adverse events associated with new drugs attract more reporting attention from healthcare professionals and consumers.

The absence of a denominator is the second critical limitation. FAERS contains no information on how many patients are taking a given drug. Without knowing total exposure (patient-years at risk), it is mathematically impossible to compute an adverse event incidence rate. A drug with 10,000 FAERS death reports might have an extremely low mortality rate if 50 million patients take it; a drug with 100 death reports might have a catastrophic mortality rate if only 5,000 patients have been exposed. FDA's Sentinel System, which does have denominator data from insurance claims covering approximately 300 million covered lives, addresses this limitation for active surveillance analyses.
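The two hypothetical drugs in that example can be worked through explicitly. The exposure figures are the ones given in the paragraph and are exactly what FAERS does not contain; the calculation also pretends every death was reported, which underreporting makes untrue in practice.

```python
def crude_rate(reports: int, exposed_patients: int) -> float:
    """Reports per exposed patient; needs an external exposure estimate."""
    return reports / exposed_patients


widely_used = crude_rate(10_000, 50_000_000)  # many reports, huge exposure
rarely_used = crude_rate(100, 5_000)          # few reports, tiny exposure

print(f"Widely used drug: {widely_used:.4%} of exposed patients")
print(f"Rarely used drug: {rarely_used:.4%} of exposed patients")
print(f"Ratio: {rarely_used / widely_used:.0f}x")
```

The drug with one hundredth as many reports has a crude rate one hundred times higher, which is why ranking drugs by raw FAERS counts is misleading.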

Confounding by indication is particularly problematic in FAERS. Drugs are not prescribed randomly; they are prescribed to patients with specific diseases, and those diseases independently affect the probability of adverse outcomes. A drug prescribed exclusively to patients with advanced heart failure will generate many reports of death and cardiovascular events because its patient population has a high baseline rate of those outcomes, not because the drug is causing them. Drug-event associations in FAERS must be interpreted with careful attention to the clinical population receiving the drug and what adverse events that population would experience even without any drug exposure.

Famous drug safety cases in FAERS

Rofecoxib (Vioxx, Merck) is the defining case of post-market pharmacovigilance failure. Vioxx was approved in 1999 as a COX-2 selective NSAID with a gastrointestinal safety advantage over non-selective NSAIDs. By 2000, FAERS data and the VIGOR trial had raised cardiovascular signals, specifically elevated rates of myocardial infarction in patients taking rofecoxib versus naproxen. Merck maintained that the cardiovascular excess in VIGOR reflected naproxen's cardioprotective effect rather than rofecoxib's harm. FDA issued labeling revisions but did not withdraw approval. By the time Merck voluntarily withdrew Vioxx in September 2004 — prompted by the APPROVE trial showing a two-fold increase in cardiovascular events in the rofecoxib arm — an estimated 20 million to 25 million Americans had taken the drug. David Graham, a senior FDA safety reviewer, testified before the Senate Finance Committee in 2004 that FAERS data had shown excess cardiovascular risk and that the signal had not received adequate institutional response. Subsequent epidemiological estimates attributed between 88,000 and 140,000 excess serious coronary events to rofecoxib exposure, with approximately 27,785 deaths.

Ranitidine (Zantac, originally GlaxoSmithKline; later generic) is a more recent case of a different kind: the hazard was a contaminant formed by degradation of the drug rather than a pharmacological effect. In 2019, Valisure (an online pharmacy with its own laboratory) filed a citizen petition with FDA reporting that ranitidine degrades to produce N-nitrosodimethylamine (NDMA), a probable human carcinogen, and that NDMA levels in ranitidine can increase substantially with storage at elevated temperatures. FDA confirmed the degradation findings and launched its own laboratory analysis. FAERS was queried for ranitidine-associated cancer reports as part of the safety signal assessment. FDA concluded that the unpredictable and potentially high NDMA exposure from ranitidine warranted market withdrawal, and in April 2020 requested all manufacturers to immediately withdraw all prescription and over-the-counter ranitidine products from the U.S. market. Regulators in other countries took similar actions, the Zantac brand was later reformulated with a different active ingredient (famotidine), and the episode triggered one of the largest mass tort litigations in pharmaceutical history.

Valdecoxib (Bextra, Pfizer) was another COX-2 inhibitor withdrawn in April 2005 following FDA's finding of serious cardiovascular risk and serious skin reactions (including Stevens-Johnson syndrome and toxic epidermal necrolysis) observed in FAERS reports and clinical trial data. The Bextra withdrawal occurred simultaneously with FDA-mandated black-box warnings on all remaining COX-2 inhibitors (celecoxib, remaining on the market) and traditional NSAIDs. Fen-phen (fenfluramine/phentermine), terfenadine (Seldane), cisapride (Propulsid), and troglitazone (Rezulin) are earlier cases where FAERS cardiac, arrhythmia, and hepatotoxicity signals preceded withdrawal.

GLP-1 receptor agonists (semaglutide/Ozempic/Wegovy, liraglutide/Victoza/Saxenda, tirzepatide/Mounjaro) represent the most active current FAERS signal landscape as of 2024–2025. Rapid uptake by tens of millions of patients for both diabetes and weight loss has generated large FAERS volumes for this class. Active signals under FDA review include suicidal ideation and self-injurious behavior (the European Medicines Agency issued a review in 2023; FDA's review concluded no clear causal link as of 2024), gastroparesis and ileus reports, and aspiration pneumonitis during anesthesia in patients with delayed gastric emptying. These signals illustrate the challenge of FAERS interpretation in the context of a rapidly expanding exposure base: increased reporting for any novel widely-used drug is expected, and separating pharmacological signal from stimulated reporting requires denominator-adjusted analysis through Sentinel or epidemiological studies.

The openFDA API

FDA's openFDA initiative, launched in 2014, provides programmatic JSON access to FAERS and several other FDA databases (drug labels, device adverse events, food enforcement actions). The FAERS endpoint is https://api.fda.gov/drug/event.json. No API key is required for basic queries; per the openFDA documentation, access without a key is limited to 240 requests per minute and 1,000 requests per day per IP address. Registering a free API key at open.fda.gov keeps the 240 requests per minute limit and raises the daily quota to 120,000 requests per key.
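One way to stay within the per-minute limit is to retry throttled requests with exponential backoff. The sketch below is generic and makes no network calls: `RateLimited` is a hypothetical exception that your own fetch function would raise when the server throttles a request (assumed here to surface as an HTTP 429 response), and the delays are illustrative.

```python
import time


class RateLimited(Exception):
    """Hypothetical signal that the server throttled a request (e.g. HTTP 429)."""


def with_backoff(fetch, retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call fetch(); on RateLimited, wait base_delay * 2**attempt and retry."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == retries:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


# Demonstration with a fake fetch that is throttled twice before succeeding.
attempts = {"n": 0}
waits = []


def fake_fetch():
    attempts["n"] += 1
    if attempts["n"] <= 2:
        raise RateLimited()
    return {"results": []}


result = with_backoff(fake_fetch, sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production code the default `time.sleep` applies.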

The API supports two query modes: search (filter reports) and count (aggregate counts by field). The search parameter accepts Lucene-style query syntax. Key fields for drug queries: patient.drug.medicinalproduct (drug name, free text) and patient.drug.medicinalproduct.exact (exact match, case-sensitive, required for precise brand name queries). Reaction queries: patient.reaction.reactionmeddrapt.exact returns exact PT term matches; combined with count on the same field, it produces frequency distributions of reaction terms. The serious top-level field filters to serious reports. Seriousness flags (seriousnessdeath, seriousnesslifethreatening, seriousnesshospitalization) allow outcome-stratified queries.
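A query combining a search filter with a count can be assembled with the standard library. The `build_url` helper is a hypothetical name introduced here for illustration; the field names are the ones listed above. Clauses are joined with " AND " using plain spaces and the URL encoder handles the escaping, so no hand-written "+" characters are needed.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://api.fda.gov/drug/event.json"


def build_url(search_terms, count_field=None, limit=None) -> str:
    """Build an openFDA drug event URL from search clauses and an optional count field."""
    params = {"search": " AND ".join(search_terms)}
    if count_field:
        params["count"] = count_field
    if limit is not None:
        params["limit"] = str(limit)
    return f"{BASE_URL}?{urlencode(params)}"


url = build_url(
    ['patient.drug.medicinalproduct.exact:"OZEMPIC"', "serious:1"],
    count_field="patient.reaction.reactionmeddrapt.exact",
    limit=20,
)
print(url)

# Round-trip check: decoding the query string recovers the original clauses.
decoded = parse_qs(urlsplit(url).query)
print(decoded["search"][0])
```

Requesting this URL would return the twenty most frequent reaction terms among serious reports naming the product, subject to the rate limits described earlier.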

The openFDA API is not a substitute for the quarterly bulk downloads for comprehensive analysis. It exposes some therapy-date, indication, and reporter information as nested fields on each report (for example patient.drug.drugstartdate, patient.drug.drugindication, and primarysource.qualification), but it caps how many records a query can page through, so full linkage across the seven FAERS tables (DEMO, DRUG, REAC, OUTCO, INDI, THER, RPSR) is impractical through the API. For time-to-onset analyses or any work that needs every record, the quarterly downloads are the appropriate data source. The dashboard URL is fda.gov/drugs/fda-adverse-event-reporting-system-faers/faers-public-dashboard; download links for quarterly ASCII files are listed there by quarter going back to 2004Q1.

Python: querying FAERS via openFDA for a specific drug

The following script queries the openFDA FAERS API to generate a report profile for a specific drug, in this case Ozempic (semaglutide), a GLP-1 receptor agonist. It covers the top twenty adverse reaction PT terms, the distribution of reports by seriousness flag, and the age and sex distributions of the patients described in the reports. The script requires only the third-party requests library (installable with pip) and uses basic error handling; it does not throttle itself, so heavy use should stay within the openFDA rate limits.

import requests  # third-party: pip install requests

# ---------------------------------------------------------------------------
# FDA FAERS openFDA API analysis
# Endpoint: https://api.fda.gov/drug/event.json
# No API key required for limited queries; register at open.fda.gov for a
# higher daily quota (see the openFDA authentication docs for current limits).
# ---------------------------------------------------------------------------

BASE_URL = "https://api.fda.gov/drug/event.json"
DRUG_NAME = "OZEMPIC"  # semaglutide GLP-1 receptor agonist


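def fda_get(params: dict) -> dict:
    """Thin wrapper around the openFDA API.

    openFDA answers a search with no matches with HTTP 404, so that case is
    returned as an empty dict; any other HTTP error is raised.
    """
    resp = requests.get(BASE_URL, params=params, timeout=30)
    if resp.status_code == 404:
        return {}
    resp.raise_for_status()
    return resp.json()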


# ---------------------------------------------------------------------------
# Part 1: Top 20 most commonly reported adverse reactions (MedDRA PT terms)
# ---------------------------------------------------------------------------
# search: drug name (medicinalproduct field, exact match for brand names)
# count: reaction preferred term (PT from MedDRA)
# limit: top 20 results

params_reactions = {
    "search": f'patient.drug.medicinalproduct.exact:"{DRUG_NAME}"',
    "count": "patient.reaction.reactionmeddrapt.exact",
    "limit": 20,
}

print(f"=== Top 20 Adverse Reactions for {DRUG_NAME} (MedDRA PT terms) ===")
print(f"  {'Rank':<5}  {'MedDRA Preferred Term':<45}  {'Report Count':>12}")
print("  " + "-" * 65)

data_reactions = fda_get(params_reactions)
for rank, item in enumerate(data_reactions.get("results", []), start=1):
    term = item.get("term", "Unknown")
    count = item.get("count", 0)
    print(f"  {rank:<5}  {term:<45}  {count:>12,}")

# ---------------------------------------------------------------------------
# Part 2: Outcome type distribution (death, hospitalization, disability, etc.)
# Report-level seriousness is exposed by openFDA as top-level flag fields
# (seriousnessdeath, seriousnesshospitalization, ...), set to 1 when the
# criterion applies. These are distinct from the per-reaction outcome codes
# in patient.reaction.reactionoutcome:
#   1 = Recovered/resolved
#   2 = Recovering/resolving
#   3 = Not recovered/not resolved
#   4 = Recovered/resolved with sequelae
#   5 = Fatal
#   6 = Unknown
# ---------------------------------------------------------------------------

OUTCOME_FIELDS = {
    "seriousnessdeath":             "Death",
    "seriousnesslifethreatening":   "Life-threatening",
    "seriousnesshospitalization":   "Hospitalization",
    "seriousnessdisabling":         "Permanent disability",
    "seriousnesscongenitalanomali": "Congenital anomaly",
    "seriousnessother":             "Other serious",
}

print(f"\n=== Outcome Distribution for {DRUG_NAME} Reports ===")
print(f"  {'Outcome':<30}  {'Reports with flag':>17}")
print("  " + "-" * 50)

base_search = f'patient.drug.medicinalproduct.exact:"{DRUG_NAME}"'

outcome_counts = {}
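for field, label in OUTCOME_FIELDS.items():
    params_outcome = {
        # Use spaces, not "+": requests URL-encodes the value, and a literal
        # "+" would reach the API as %2B and break the query.
        "search": f"{base_search} AND {field}:1",
        "count": field,
        "limit": 1,
    }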
    try:
        data_out = fda_get(params_outcome)
        results = data_out.get("results", [])
        count = results[0]["count"] if results else 0
        outcome_counts[label] = count
        print(f"  {label:<30}  {count:>17,}")
    except Exception as exc:
        print(f"  {label:<30}  [error: {exc}]")

# ---------------------------------------------------------------------------
# Part 3: Age distribution of patients in reports for the drug
# openFDA stores age at event onset as patient.patientonsetage, in the unit
# given by patient.patientonsetageunit (801 = years). We restrict to reports
# coded in years and count each bucket with a range search.
# ---------------------------------------------------------------------------

AGE_UNIT_YEARS = "patient.patientonsetageunit:801"
AGE_BUCKETS = [
    ("0-17",    "patient.patientonsetage:[0 TO 17]"),
    ("18-44",   "patient.patientonsetage:[18 TO 44]"),
    ("45-64",   "patient.patientonsetage:[45 TO 64]"),
    ("65-74",   "patient.patientonsetage:[65 TO 74]"),
    ("75+",     "patient.patientonsetage:[75 TO 120]"),
]

print(f"\n=== Age Distribution for {DRUG_NAME} Reports ===")
print(f"  {'Age Group':<12}  {'Report Count':>12}  {'Bar'}")
print("  " + "-" * 55)

age_counts = {}
for label, age_filter in AGE_BUCKETS:
    params_age = {
        # Spaces are encoded by requests; a literal "+" would be sent as %2B.
        "search": f"{base_search} AND {AGE_UNIT_YEARS} AND {age_filter}",
        "limit": 1,
    }
    try:
        data_age = fda_get(params_age)
        meta = data_age.get("meta", {})
        n = meta.get("results", {}).get("total", 0)
        age_counts[label] = n
    except Exception:
        age_counts[label] = 0

total_age = sum(age_counts.values()) or 1
for label, n in age_counts.items():
    pct = n / total_age * 100
    bar = "#" * int(pct / 2)
    print(f"  {label:<12}  {n:>12,}  {bar} ({pct:.1f}%)")

# ---------------------------------------------------------------------------
# Part 4: Sex distribution of patients in reports for the drug
# patient.patientsex: 0 = Unknown, 1 = Male, 2 = Female
# ---------------------------------------------------------------------------

SEX_CODES = {"Male": 1, "Female": 2, "Unknown": 0}

print(f"\n=== Sex Distribution for {DRUG_NAME} Reports ===")
print(f"  {'Sex':<10}  {'Report Count':>12}  {'Share':>7}")
print("  " + "-" * 33)

sex_counts = {}
for sex_label, sex_code in SEX_CODES.items():
    params_sex = {
        # Spaces are encoded by requests; a literal "+" would be sent as %2B.
        "search": f"{base_search} AND patient.patientsex:{sex_code}",
        "limit": 1,
    }
    try:
        data_sex = fda_get(params_sex)
        n = data_sex.get("meta", {}).get("results", {}).get("total", 0)
        sex_counts[sex_label] = n
    except Exception:
        sex_counts[sex_label] = 0

total_sex = sum(sex_counts.values()) or 1
for sex_label, n in sex_counts.items():
    pct = n / total_sex * 100
    print(f"  {sex_label:<10}  {n:>12,}  {pct:>6.1f}%")

# ---------------------------------------------------------------------------
# Part 5: Summary table
# ---------------------------------------------------------------------------
print("\n=== Summary: FAERS Report Profile for", DRUG_NAME, "===")
print(f"  Total reports (approx):         Retrieved via reaction query above")
print(f"  Top reaction (MedDRA PT):       See Part 1 table")
print(f"  Deaths reported:                {outcome_counts.get('Death', 0):,}")
print(f"  Hospitalizations reported:      {outcome_counts.get('Hospitalization', 0):,}")
print(f"  Life-threatening events:        {outcome_counts.get('Life-threatening', 0):,}")
print()
print("  Note: FAERS data reflects VOLUNTARY reporting. Report counts cannot")
print("  be used to compute incidence rates. Duplicate reports, confounding,")
print("  and reporting bias are inherent limitations of spontaneous-report data.")
print()
print("  Denominator (exposed patients) is unknown from FAERS alone.")
print("  Signal detection requires disproportionality analysis (PRR, EBGM),")
print("  not raw report counts.")

The script's output illustrates several characteristic FAERS patterns for a high-profile drug in a popular therapeutic class. The top reaction terms will typically include a mix of gastrointestinal events (nausea, vomiting, diarrhoea) consistent with known GLP-1 pharmacology, non-specific terms like “drug ineffective,” and terms that are under active safety review (ileus, suicidal ideation). The age distribution reflects the actual prescribing population for diabetes and obesity treatment — predominantly middle-aged adults — combined with any systematic underreporting in elderly or pediatric populations. The sex distribution reflects both the prescribing population and any sex-based differential in MedWatch reporting propensity (research suggests women report adverse events at higher rates than men for many drug classes).

Data limitations and research notes

Researchers working with FAERS for the first time consistently make two systematic errors: treating report counts as incidence rates, and failing to restrict the DRUG table to primary and secondary suspect role codes before computing drug-reaction associations. Both errors produce grossly misleading results. Report counts are not incidence rates because the denominator (exposure) is unknown. Including concomitant medications as if they were suspected agents inflates drug-event pair counts by orders of magnitude for drugs like acetaminophen, aspirin, and metformin that appear as concomitant medications in enormous numbers of reports across all indications.

For academic pharmacovigilance research using the bulk files, the recommended analytical workflow is: (1) download all quarterly packages for the time period of interest; (2) stack and deduplicate by caseid, retaining only the latest caseversion; (3) join DRUG, REAC, and OUTCO on primaryid; (4) filter DRUG to role_cod IN ('PS', 'SS') before computing drug-reaction pairs; (5) apply EBGM or PRR with appropriate minimum count thresholds; (6) validate signals against clinical plausibility and product labeling. Tools including OpenVigil 2.1 (an open-source web application from the University of Kiel) and WHO's VigiBase provide pre-built disproportionality analysis interfaces for FAERS and the international spontaneous reporting data, respectively.
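Steps (3) and (4) of that workflow, joining DRUG to REAC on primaryid and restricting to suspect roles, yield the four cell counts that PRR or EBGM then consume. The sketch below counts at the report level and assumes the rows have already been deduplicated; the sample rows are invented and the `contingency` helper is a hypothetical name.

```python
SUSPECT_ROLES = {"PS", "SS"}


def contingency(drug_rows, reac_rows, drug_name: str, pt: str):
    """Return report-level counts (a, b, c, d) for one drug-reaction pair."""
    suspects = {
        r["primaryid"] for r in drug_rows
        if r["drugname"] == drug_name and r["role_cod"] in SUSPECT_ROLES
    }
    with_pt = {r["primaryid"] for r in reac_rows if r["pt"] == pt}
    all_ids = {r["primaryid"] for r in reac_rows}  # reports with a coded reaction
    a = len(suspects & with_pt)                # drug and reaction
    b = len((suspects & all_ids) - with_pt)    # drug, other reactions only
    c = len(with_pt - suspects)                # other drugs, reaction
    d = len(all_ids - suspects - with_pt)      # other drugs, other reactions
    return a, b, c, d


# Hypothetical rows; report 5 lists the drug only as a concomitant medication.
drug = [
    {"primaryid": "1", "drugname": "OZEMPIC", "role_cod": "PS"},
    {"primaryid": "1", "drugname": "METFORMIN", "role_cod": "C"},
    {"primaryid": "2", "drugname": "OZEMPIC", "role_cod": "SS"},
    {"primaryid": "3", "drugname": "METFORMIN", "role_cod": "PS"},
    {"primaryid": "4", "drugname": "ASPIRIN", "role_cod": "PS"},
    {"primaryid": "5", "drugname": "OZEMPIC", "role_cod": "C"},
]
reac = [
    {"primaryid": "1", "pt": "Nausea"},
    {"primaryid": "1", "pt": "Vomiting"},
    {"primaryid": "2", "pt": "Fatigue"},
    {"primaryid": "3", "pt": "Nausea"},
    {"primaryid": "4", "pt": "Fatigue"},
    {"primaryid": "5", "pt": "Nausea"},
]

cells = contingency(drug, reac, "OZEMPIC", "Nausea")
print(cells)
```

Because report 5 names the drug only as concomitant, it lands in the "other drugs" column instead of inflating the count for the drug of interest.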

FAERS data is published under open-access terms and is freely downloadable for any use. The openFDA API's terms of service permit commercial use. MedDRA terminology requires a separate license for commercial use; the LLT-to-PT mapping and SOC hierarchy are not included in FAERS downloads. The quarterly files include only PT terms, not MedDRA codes, so full hierarchy navigation (rolling up from PT to HLT, HLGT, SOC) requires a licensed MedDRA release for systematic work, though the PT terms themselves are human-readable and queryable as strings for most purposes.
