Since 2014 the Centers for Medicare & Medicaid Services has published a searchable federal database listing every payment, meal, royalty, research grant, and ownership stake that drug and medical device companies transfer to physicians and teaching hospitals in the United States. The database—Open Payments—covered $12.7 billion in transfers in program year 2022 alone, spanning roughly 2,700 reporting manufacturers and nearly 900,000 covered recipients. It is the most granular public record of financial relationships between the pharmaceutical and device industries and the clinicians who prescribe their products.
The Sunshine Act: legal foundation
Open Payments is the implementation of the Physician Payments Sunshine Act, enacted as Section 6002 of the Affordable Care Act in March 2010. Before the Sunshine Act, industry-to-physician financial relationships were disclosed only piecemeal—some states had their own reporting laws, a handful of companies made voluntary disclosures, and ProPublica had assembled a partial picture from those voluntary disclosures under its “Dollars for Docs” project, launched the same year. The federal law replaced that fragmented landscape with a uniform mandatory reporting system administered by CMS.
The statutory framework is codified at 42 U.S.C. § 1320a-7h. CMS issued the implementing regulations in February 2013 (78 Fed. Reg. 9458), with data collection beginning on August 1, 2013. The first public release of data, covering the partial year from August through December 2013, appeared on September 30, 2014. Annual releases covering full calendar program years have followed, typically by June 30 of the year after the program year.
Applicable manufacturers and GPOs
The reporting obligation falls on “applicable manufacturers”—entities operating in the United States that produce or sell a covered drug, device, biological, or medical supply that requires a prescription or that is reimbursable under Medicare, Medicaid, or CHIP. The implementing regulations set a de minimis threshold: manufacturers with total annual US revenues below $100 million from covered products are exempt. In practice this threshold captures all major pharmaceutical and medical device companies while excluding very small domestic producers.
Group purchasing organizations (GPOs)—entities that negotiate purchasing contracts on behalf of hospitals and health systems—must report ownership or investment interests held by covered recipients. They are not required to report general payments or research payments, a narrower obligation reflecting their intermediary rather than manufacturing role.
Covered recipients: the 2022 expansion
For the first eight program years (2013–2020), “covered recipients” comprised physicians (doctors of medicine, osteopathy, dental surgery, dental medicine, podiatric medicine, optometry, and licensed chiropractors) and teaching hospitals. The SUPPORT for Patients and Communities Act of 2018 amended the Sunshine Act to expand covered recipients, beginning with program year 2021 data, to include five additional practitioner types: physician assistants, nurse practitioners, clinical nurse specialists, certified registered nurse anesthetists, and certified nurse-midwives. This expansion added hundreds of thousands of mid-level practitioners to the disclosure universe and accounts for a noticeable increase in record counts beginning with the program year 2021 data, first published in 2022.
Teaching hospitals are defined for Open Payments purposes as hospitals that receive indirect medical education (IME) payments from Medicare, direct graduate medical education (DGME) payments, or Medicare psychiatric hospital IME payments—a list CMS publishes annually on the Open Payments website. Payments to teaching hospitals are reported separately from payments to individual physicians; a consulting fee paid to a department within a teaching hospital that does not go to a specific identifiable physician is a teaching hospital payment, not a physician payment.
The three reporting streams
The Sunshine Act and implementing regulations divide reportable transfers into three legally distinct streams, each published as a separate dataset in Open Payments.
General payments
General payments are the broadest category, covering any payment or other transfer of value from an applicable manufacturer to a covered recipient that is not a research payment or an ownership/investment interest. The regulation at 42 C.F.R. § 403.904 specifies the nature-of-payment categories that must be used when classifying each transfer. CMS provides a closed list of permitted values:
- Consulting fee: payment for advice or expertise rendered in a formal consulting arrangement—advisory board membership, promotional consulting, clinical advisory services
- Compensation for services other than consulting: speaker bureau payments, training fees, promotional speaking, manufacturer-sponsored medical education events
- Food and beverage: meals provided in a practice setting, at a conference, or during a sales call; aggregated across all occurrences during the year and reported as a single annual total per physician per manufacturer
- Travel and lodging: airfare, hotel, and related expenses paid directly or reimbursed
- Education: textbooks, subscriptions, continuing medical education tuition, anatomical models
- Gift: any non-cash item of value not classifiable under another category
- Entertainment: tickets, recreational activities, event attendance
- Royalty or license: payments for use of intellectual property, including royalties on medical device designs co-developed by the physician
- Honoraria: payments for speaking or other events where the recipient is recognized for their standing
- Grant: unrestricted educational grants that do not meet the definition of research payments
- Charitable contribution: donations made in a physician's name or at their direction
The reporting threshold for general payments is $10 per individual transfer. Transfers below $10 need not be reported unless the aggregate from a single manufacturer to a single covered recipient during the program year exceeds $100, in which case all transfers must be reported regardless of individual amount. Food and beverage is subject to a $10 per-meal threshold, and all food/beverage from a single manufacturer to a single recipient is aggregated into one annual record rather than reported per-meal, though individual meal dates and amounts must be captured internally for dispute purposes.
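The threshold rule above can be sketched in a few lines of Python. This is a minimal illustration, not CMS's validation logic: the `Transfer` tuple and the `reportable_transfers` function are hypothetical names, the $10 and $100 figures are the ones quoted above, and the sketch assumes the $100 aggregate is computed over all transfers from one manufacturer to one recipient in the program year.

```python
from collections import defaultdict
from typing import NamedTuple


class Transfer(NamedTuple):
    manufacturer: str
    recipient_npi: str
    amount: float


def reportable_transfers(transfers, per_transfer_min=10.0, annual_aggregate=100.0):
    """Return the transfers that must be reported under the rule described above."""
    # First pass: annual total for each manufacturer-recipient pair.
    totals = defaultdict(float)
    for t in transfers:
        totals[(t.manufacturer, t.recipient_npi)] += t.amount

    # Second pass: keep a transfer if it meets the per-transfer threshold,
    # or if the pair's annual total exceeds the aggregate threshold.
    reportable = []
    for t in transfers:
        pair_total = totals[(t.manufacturer, t.recipient_npi)]
        if t.amount >= per_transfer_min or pair_total > annual_aggregate:
            reportable.append(t)
    return reportable


# Invented example data (company names and NPIs are placeholders).
example = (
    [Transfer("Acme Pharma", "1111111111", 8.0)] * 2
    + [Transfer("Acme Pharma", "2222222222", 9.0)] * 12
    + [
        Transfer("Beta Devices", "1111111111", 25.0),
        Transfer("Beta Devices", "1111111111", 5.0),
    ]
)
kept = reportable_transfers(example)
```

In this invented example the two $8 transfers are dropped, all twelve $9 transfers are kept because they sum to $108, and only the $25 transfer from the second manufacturer is kept.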
Research payments
Research payments cover transfers made in connection with a research agreement or research protocol. CMS distinguishes research payments from general payments because research funding is considered to have a different character than direct physician compensation: a manufacturer funding a clinical trial is not straightforwardly “buying” prescribing behavior in the same way a speaker fee might. Nonetheless, research payments are publicly disclosed because the financial relationship between a manufacturer and the researcher who publishes on its product creates a potential conflict of interest relevant to readers of that research.
Research payment records include the name of the research project, the name of the principal investigator, the total amount of the research payment, whether the research involves human subjects, and the ClinicalTrials.gov identifier if one has been assigned. Research payments account for the second-largest share of total Open Payments dollars, after ownership and investment interests: in 2022 the research payments dataset reported approximately $4 billion of the total $12.7 billion, reflecting the high cost of clinical trials and the size of per-site payments to teaching hospitals for multi-year trials.
Ownership and investment interests
The ownership and investment interest stream captures situations where a covered recipient holds an equity stake, stock options, partnership share, or other proprietary interest in an applicable manufacturer. These records do not represent a cash payment in the period; they represent a disclosed ongoing financial relationship. The values reported include the dollar value of the interest as of December 31 of the program year using fair market value or the value assigned in the agreement, as applicable.
Ownership interests account for the largest absolute dollar figure in Open Payments—approximately $6 billion of the 2022 total—because the valuation of equity stakes in pharmaceutical and medical device companies can be very large for founders, scientific advisory board members with early-stage equity, and physicians who participated in a company's formation. These are disclosed financial relationships, not payments flowing from manufacturer to physician in the program year. The practical importance of the ownership dataset for conflict-of-interest analysis is substantial: a cardiologist who holds $2 million in stock options in a device company whose pacemaker they implant has a different kind of conflict than one who received a $500 dinner.
Scale and leading companies
The program year 2022 Open Payments data reported $12.7 billion in total transfers across all three streams: general payments approximately $2.5 billion, research payments approximately $4.0 billion, and ownership/investment interests approximately $6.2 billion. These figures have grown steadily since program year 2014, the first full year of reporting, which showed approximately $6.5 billion in total transfers. The growth reflects expanded manufacturer participation, the statutory expansion of covered recipient categories that took effect with program year 2021, and growth in the underlying industry's spending on physician relationships.
Approximately 2,700 applicable manufacturers and GPOs submitted data in the 2022 program year, but the distribution is highly concentrated. Companies consistently appearing at the top of general payment totals include:
- Amgen: large consulting and speaker fees associated with biologics including Repatha (evolocumab, PCSK9 inhibitor), Evenity (romosozumab), Otezla (apremilast), and Enbrel; Amgen also carries major research payment activity from oncology trials
- Pfizer: broad portfolio spanning infectious disease (COVID-19 antivirals), oncology, cardiology, and rheumatology; speaker and consulting fees across a large recipient base
- Johnson & Johnson / Janssen: oncology (Darzalex, Imbruvica), immunology (Stelara, Tremfya), and the device subsidiary DePuy Synthes contributing royalty payments for orthopedic implant designs
- AbbVie: Humira (adalimumab) was the world's top-selling drug through 2022, and AbbVie's general payments to rheumatologists, gastroenterologists, and dermatologists who prescribe it are extensive; Skyrizi and Rinvoq speaker programs grew as Humira biosimilar competition approached
- Medtronic: the largest pure-play medical device company; royalty payments to orthopedic, spinal, and cardiac surgeons who co-design or license technology; research payments for device trials
- Intuitive Surgical: the maker of the da Vinci robotic surgical system pays royalties to surgeons involved in developing robotic techniques and instruments; these royalty payments, disclosed under “royalty or license,” are among the largest individual physician payments in the dataset, sometimes exceeding $1 million per recipient per year
Among individual physician recipients, the highest payments typically appear in two categories: device royalties to orthopedic and neurosurgeons who hold patents on implant designs, and ownership/investment interests for physician-founders or early scientific advisors of biotechnology companies. Individual royalty recipients receiving more than $500,000 in a single year are not uncommon; payments to a handful of surgeon-inventors with foundational spinal implant or joint replacement patents have exceeded $10 million in a single year.
Dataset structure and field schema
CMS publishes Open Payments data at openpaymentsdata.cms.gov, with bulk CSV downloads and a Socrata API. Each program year produces three separate datasets: General Payments, Research Payments, and Ownership/Investment Interests. The schemas differ somewhat across streams but share a common core of manufacturer and recipient identification fields.
Key fields in the General Payments dataset:
- applicable_manufacturer_or_gpo_making_payment_name: the name of the reporting company as registered with CMS; the primary field for manufacturer-level filtering
- covered_recipient_npi: the 10-digit NPI of the physician or non-physician practitioner receiving the payment; teaching hospital payments do not carry an NPI in this field
- covered_recipient_first_name / covered_recipient_last_name: name of the recipient as registered in the NPI registry
- covered_recipient_primary_type_1: physician credential type (e.g., “Medical Doctor,” “Doctor of Dental Surgery”); not the clinical specialty
- covered_recipient_specialty_1: the primary specialty of the covered recipient as reported by the manufacturer, typically derived from the NPI registry taxonomy code; the field to use for specialty-level analysis (e.g., “Allopathic & Osteopathic Physicians|Cardiology”)
- total_amount_of_payment_usdollars: the dollar value of the individual transfer; food and beverage is reported as an annual aggregate per manufacturer–recipient pair
- date_of_payment: the date the payment was made or the transfer occurred
- nature_of_payment_or_transfer_of_value: the closed-list category from the regulated taxonomy described above
- product_category_or_therapeutic_area: the therapeutic area associated with the payment (e.g., “Cardiovascular,” “Hematology/Oncology”) as reported by the manufacturer
- name_of_drug_or_biological_or_device_or_medical_supply_1: the specific product name associated with the payment, if any; a single payment record can be associated with up to five products using sequentially numbered versions of this field (_2 through _5)
- dispute_status_for_publication: indicates whether the covered recipient has disputed this record; disputed records are published but flagged
- record_id: unique identifier for the payment record; stable across the life of a program year release
NPI linkage is one of the most valuable features of the Open Payments schema. Because covered recipient NPI is a standard field, Open Payments records can be joined directly to the CMS National Plan and Provider Enumeration System (NPPES) NPI registry, which provides physician specialty (via taxonomy codes), practice address, and other attributes. This enables analysts to enrich Open Payments data with geographic and specialty detail beyond what is in the Open Payments schema itself, and to cross-reference with other NPI-linked datasets such as Medicare Part D prescriber data to study the relationship between payments and prescribing behavior.
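A join of this kind can be sketched with pandas. The snippet below is illustrative only: `enrich_with_nppes` is a hypothetical helper, the small in-memory frames stand in for real extracts, and the NPPES column names (`NPI`, `Healthcare Provider Taxonomy Code_1`, `Provider Business Practice Location Address State Name`) are assumed to follow the public NPPES download layout and should be checked against the file actually in use.

```python
import pandas as pd


def enrich_with_nppes(payments: pd.DataFrame, nppes: pd.DataFrame) -> pd.DataFrame:
    """Left-join payment rows to NPPES attributes on NPI, keeping unmatched rows."""
    left = payments.copy()
    right = nppes.copy()
    # Compare NPIs as stripped strings so integer and text encodings still match.
    left["covered_recipient_npi"] = left["covered_recipient_npi"].astype(str).str.strip()
    right["NPI"] = right["NPI"].astype(str).str.strip()
    # One registry row per NPI, so the join cannot multiply payment rows.
    right = right.drop_duplicates(subset="NPI")
    return left.merge(
        right,
        how="left",
        left_on="covered_recipient_npi",
        right_on="NPI",
        indicator="nppes_match",  # "both" or "left_only" for each payment row
    )


# Invented rows; the last payment has no NPI, as for a teaching hospital record.
payments = pd.DataFrame({
    "covered_recipient_npi": ["1234567890", "1234567890", "9999999999", ""],
    "total_amount_of_payment_usdollars": ["150.00", "22.50", "5000.00", "75000.00"],
})
nppes = pd.DataFrame({
    "NPI": [1234567890, 5555555555],
    "Healthcare Provider Taxonomy Code_1": ["207RC0000X", "207Q00000X"],
    "Provider Business Practice Location Address State Name": ["OH", "TX"],
})
enriched = enrich_with_nppes(payments, nppes)
```

Rows whose `nppes_match` value is `left_only` are worth inspecting separately: they are either records without an NPI or NPIs that the registry extract does not contain.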
Public access, journalism, and research use
ProPublica's “Dollars for Docs” project, launched in October 2010 from voluntary company disclosures, was the proof of concept that physician payment data could be assembled, cleaned, and made searchable for the public. ProPublica built the first searchable database covering seven companies—including Eli Lilly, AstraZeneca, Cephalon, and GlaxoSmithKline—that had made voluntary or settlement-mandated disclosures. By 2013 ProPublica had expanded the database to cover 15 companies and approximately $2 billion in payments. When CMS launched the federal Open Payments database in September 2014, ProPublica updated “Dollars for Docs” to use the federal data as its primary source and has continued to update it annually.
Mainstream investigative reporting routinely uses Open Payments. The New York Times, Wall Street Journal, NPR, and Bloomberg News have all published major investigations using the database, examining which oncologists receive the largest payments from checkpoint inhibitor manufacturers; how speaker fees for psychiatric drugs correlated with off-label prescribing that was itself under investigation; how opioid manufacturers targeted high-prescribing physicians with speaking and consulting payments in the years before the crisis; and how device royalty payments to surgeons create incentives to implant proprietary hardware.
Academic medical journals have used Open Payments as a primary data source for conflict-of-interest research. The dataset appears in hundreds of published studies in JAMA, the New England Journal of Medicine, JAMA Internal Medicine, BMJ Open, and Health Affairs. Common research designs include: linking a physician's Open Payments history to their Medicare Part D prescribing data; measuring whether journal article authors correctly disclosed conflicts present in Open Payments; and analyzing the size and composition of industry payments in specific specialties or therapeutic areas.
Research on prescribing impact
The central question that motivated the Sunshine Act—whether pharmaceutical and device payments to physicians influence clinical decisions—has been examined extensively using Open Payments and its predecessor datasets. The evidence is consistent across multiple methods and therapeutic areas, though the causal mechanism remains contested.
A landmark study by Carey, Lieber, and Whitney, published in the American Economic Review: Insights in 2021, used a difference-in-differences design exploiting the timing of pharmaceutical company sales representative visits and meal payments. The study found that receipt of even a single meal from a drug company—average value approximately $20—was associated with a statistically and economically significant increase in prescribing of the sponsor's drug compared to therapeutic alternatives. The effect was detectable for three drug classes (statins, antidepressants, and hypertension medications) and persisted for months after the meal. The magnitude was not trivial: meals were associated with a 4.6–5.0 percentage point increase in the probability of prescribing the promoted drug.
DeJong et al. (2016), published in JAMA Internal Medicine, used Open Payments data linked to Medicare Part D prescribing to examine whether physicians who received any industry meal for a given drug were more likely to prescribe that drug. The study covered four drug classes and found significant associations for all four: physicians receiving brand-drug meals prescribed the promoted drug at higher rates than non-recipients, controlling for physician and geographic factors. The dose-response relationship was also significant: physicians receiving more meals prescribed at higher rates.
Yeh et al. (2016), also in JAMA Internal Medicine, focused on oncology and the relationship between payments from cancer drug manufacturers and prescribing of high-cost biologics. Oncologists receiving payments from targeted therapy manufacturers had significantly higher rates of prescribing those manufacturers' drugs, after adjusting for patient mix and practice setting.
Psychiatry has been a focal area for payment-prescribing research because of the extensive speaker bureau programs that antipsychotic and antidepressant manufacturers ran in the 2000s and 2010s. Kornfield et al. (2013) found that psychiatrists who received antipsychotic speaker fees prescribed more atypical antipsychotics, and that second-generation antipsychotics with large speaker programs gained market share among speaker-physicians faster than would be predicted by clinical evidence alone.
The methodological debate centers on selection versus causation. Pharmaceutical companies do not select physicians randomly for speaker programs and consulting arrangements—they target physicians who are already high prescribers of their products, who have large practices, and who have influence over other prescribers in their network. Under this selection interpretation, the observed correlation reflects companies finding their natural allies rather than creating new ones. The most credible studies address this by exploiting quasi-random variation in payment receipt or by using within-physician longitudinal designs that control for fixed physician characteristics. These designs generally still find positive effects of payments on prescribing, suggesting the observed correlation is not entirely explained by selection.
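The within-physician idea can be illustrated with a toy panel. This is a sketch of a generic fixed-effects (within) estimator, not the specification of any study cited here; the column names `npi`, `received_payment`, and `brand_share`, and all of the data, are invented for illustration.

```python
import pandas as pd


def within_physician_slope(panel, unit="npi", x="received_payment", y="brand_share"):
    """Slope of y on x after subtracting each physician's own mean from both."""
    demeaned = panel[[x, y]] - panel.groupby(unit)[[x, y]].transform("mean")
    return float((demeaned[x] * demeaned[y]).sum() / (demeaned[x] ** 2).sum())


# Physician A starts receiving payments in 2020; physician B is a high
# prescriber who is paid in every year (the selection story described above).
panel = pd.DataFrame({
    "npi": ["A", "A", "A", "B", "B", "B"],
    "year": [2019, 2020, 2021, 2019, 2020, 2021],
    "received_payment": [0, 1, 1, 1, 1, 1],
    "brand_share": [0.50, 0.55, 0.55, 0.80, 0.80, 0.80],
})

# Naive comparison: paid physician-years versus unpaid physician-years.
naive_gap = (
    panel.loc[panel["received_payment"] == 1, "brand_share"].mean()
    - panel.loc[panel["received_payment"] == 0, "brand_share"].mean()
)
# Within-physician comparison: only changes inside each physician count.
within = within_physician_slope(panel)
```

In this invented panel the naive paid-versus-unpaid gap is 0.20, while the within-physician slope is 0.05, because physician B is a high prescriber in every year and contributes no within-physician variation.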
Payment patterns by specialty
The distribution of Open Payments dollars across physician specialties reflects the economics of the pharmaceutical and device industries. High-value targets for industry payments are specialties with large prescribing volume of expensive products, specialties with influence over other prescribers, and surgeons with the ability to select implants and devices.
Orthopedic surgery and neurosurgery receive the largest device royalty payments in the dataset. Hip and knee implant systems, spinal fusion hardware, and neurostimulators are frequently developed in collaboration between device companies and practicing surgeons, who hold patents and receive per-unit royalties on sales of their co-designed products. It is not unusual for a prominent orthopedic surgeon who holds foundational patents on a widely adopted implant system to receive several million dollars per year in royalties disclosed under the “royalty or license” nature-of-payment category.
Cardiology receives substantial payments from cardiac device companies (pacemakers, implantable cardioverter-defibrillators, transcatheter heart valves) and from pharmaceutical manufacturers of lipid-lowering agents (statins, PCSK9 inhibitors), anticoagulants, and heart failure drugs. Cardiologists serve on advisory boards, conduct clinical trials, and deliver educational talks for companies whose products they use daily; cumulative payments to the specialty are among the highest in the dataset after orthopedic surgery.
Primary care physicians (internal medicine and family medicine) receive lower average payments per physician than specialists but collectively represent a large share of general payment activity because of the scale of the specialty. Pharmaceutical companies invest heavily in primary care prescribing influence for products prescribed across a broad population (statins, antidepressants, antihypertensives, diabetes drugs). Speaker fees and meals are the predominant payment types in primary care; royalty payments are rare.
Oncology combines pharmaceutical consulting and speaking fees (for checkpoint inhibitors, targeted therapies, and supportive care drugs) with research payment activity from clinical trials. Oncologists who serve as principal investigators on industry-funded trials appear in both the general and research payment streams. The cost of cancer drugs has made oncology one of the most commercially important specialties for pharmaceutical companies, and the size of speaker and advisory board payments reflects this.
Psychiatry and neurology have historically been among the specialties with the most prominent speaker bureau activity, particularly for antidepressants, antipsychotics, and mood stabilizers. Enforcement actions against off-label promotion of antipsychotics—resulting in billion-dollar settlements with AstraZeneca, Eli Lilly, Pfizer, and others—curtailed some of the most aggressive programs, but psychiatric drug payments continue to appear prominently in Open Payments data.
Drug-specific linkage and emerging patterns
One of the most analytically powerful features of the Open Payments schema is the product-level linkage. Because manufacturers must report the specific drug, biologic, device, or medical supply associated with each payment, it is possible to track industry spending at the product level: how much did manufacturers of PCSK9 inhibitors pay cardiologists in a given year? Which physicians received payments from Novo Nordisk for semaglutide (Ozempic and Wegovy) as the drug moved from a niche diabetes treatment to a mass-market obesity product?
The semaglutide case illustrates how Open Payments data can track the evolution of a commercial launch. In the 2020 and 2021 program years, semaglutide-linked payments were concentrated among endocrinologists and diabetologists. Beginning in 2022, as Wegovy received FDA approval for obesity and Novo Nordisk began a large speaker and advisory program, semaglutide-linked payments expanded dramatically to include primary care physicians, obesity medicine specialists, and bariatric surgeons—a clear signature in the Open Payments data of a commercial strategy pivot toward the broader obesity market. Analysts following the GLP-1 agonist market can use Open Payments to track Eli Lilly's competing tirzepatide (Mounjaro/Zepbound) commercial investment in real time as each annual release appears.
For AbbVie and Humira (adalimumab), Open Payments data from 2014 through 2022 documents the investment AbbVie made in maintaining prescriber relationships for the world's top-selling drug throughout that period. As Humira biosimilars entered the US market in 2023, the 2022 program year data shows both continued Humira-associated payments and a notable ramp-up in payments associated with Skyrizi (risankizumab) and Rinvoq (upadacitinib)—AbbVie's planned successor products. This sequential pattern of investment is visible in the product-level payment data before it appears in market share figures.
Insulin manufacturers—Eli Lilly (Humalog, Basaglar), Novo Nordisk (Novolin, Levemir, Tresiba), and Sanofi (Lantus, Toujeo)—have appeared prominently in Open Payments data for payments to endocrinologists and primary care physicians. These payment records became newsworthy in the context of insulin pricing controversies: critics argued that manufacturer investment in physician relationships was partly a strategy to maintain branded insulin prescribing in the face of biosimilar competition.
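The product-level tracking described in this section can be sketched with pandas on an already-downloaded general payments extract. The helper names `payments_mentioning_product` and `dollars_by_specialty` are hypothetical, the sample rows are invented, and the sketch assumes the five numbered product-name fields follow the `_1` through `_5` pattern described in the field schema.

```python
import pandas as pd

PRODUCT_FIELDS = [
    f"name_of_drug_or_biological_or_device_or_medical_supply_{i}" for i in range(1, 6)
]


def payments_mentioning_product(df: pd.DataFrame, fragment: str) -> pd.DataFrame:
    """Rows where any of the numbered product-name fields contains the fragment."""
    mask = pd.Series(False, index=df.index)
    for col in PRODUCT_FIELDS:
        if col in df.columns:
            hits = df[col].astype("string").fillna("").str.contains(
                fragment, case=False, regex=False
            )
            mask = mask | hits.astype(bool)
    return df[mask]


def dollars_by_specialty(df: pd.DataFrame) -> pd.Series:
    """Total payment dollars per reported specialty, largest first."""
    amounts = pd.to_numeric(df["total_amount_of_payment_usdollars"], errors="coerce")
    totals = amounts.groupby(df["covered_recipient_specialty_1"]).sum()
    return totals.sort_values(ascending=False)


# Invented sample rows standing in for a real extract.
sample = pd.DataFrame({
    "covered_recipient_specialty_1": [
        "Endocrinology", "Family Medicine", "Family Medicine", "Cardiology",
    ],
    "total_amount_of_payment_usdollars": ["2500.00", "18.40", "125.00", "900.00"],
    "name_of_drug_or_biological_or_device_or_medical_supply_1": [
        "OZEMPIC", "Rybelsus", "WEGOVY", "Repatha",
    ],
    "name_of_drug_or_biological_or_device_or_medical_supply_2": [
        None, "Ozempic", None, None,
    ],
})
ozempic = payments_mentioning_product(sample, "ozempic")
by_specialty = dollars_by_specialty(ozempic)
```

Checking every numbered product field matters because a product listed second or third on a record would be missed by a filter on the `_1` field alone. Running the same grouping on each program year's extract gives the specialty mix over time.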
The dispute process and data limitations
Before Open Payments records are published each year, CMS provides a 45-day review and dispute window during which covered recipients can examine the records attributed to them and initiate disputes. A dispute does not prevent publication: disputed records are published with the field dispute_status_for_publication set to “Yes,” flagging them for users. If a dispute is resolved in the recipient's favor, because the manufacturer either corrects or retracts the record, the change (a corrected or deleted record) appears in the next data refresh. The dispute process is intended to catch genuine errors (misattributed NPIs, incorrect amounts, records that belong to another physician with a similar name) rather than to allow recipients to suppress accurate records.
Despite the dispute mechanism, data quality issues persist. Common problems include name variations—a manufacturer may report “Jonathan” while the NPI registry shows “Jon,” making exact-match queries miss records—and specialty misclassification, where the specialty reported by the manufacturer does not match the taxonomy code in the NPI registry. NPI matching itself can fail if the manufacturer uses a physician's NPI from a prior employer or if the physician has multiple active NPIs. CMS has improved matching algorithms over successive annual releases but acknowledges residual error rates.
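One way to soften the name-variation problem is to normalize names and accept a prefix match on the first name as a candidate for review. The sketch below is a heuristic under stated assumptions, not a CMS matching algorithm: `normalize_name` and `loose_name_match` are hypothetical helpers, and the NPI remains the preferred join key whenever it is present and trustworthy.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and drop everything except the letters a-z."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z]", "", ascii_only.lower())


def loose_name_match(first_a: str, last_a: str, first_b: str, last_b: str) -> bool:
    """True if last names agree and one first name is a prefix of the other."""
    if normalize_name(last_a) != normalize_name(last_b):
        return False
    fa, fb = normalize_name(first_a), normalize_name(first_b)
    if not fa or not fb:
        return False
    return fa.startswith(fb) or fb.startswith(fa)
```

With this heuristic, `loose_name_match("Jonathan", "Smith", "Jon", "SMITH")` is treated as a candidate match, while "Jon" versus "John" is not. Candidate matches should still be confirmed against another attribute, such as state or specialty, before records are merged.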
The Sunshine Act has significant coverage gaps that users should understand. The law does not capture:
- Payments below the thresholds: transfers under $10 per transaction that do not aggregate to $100 from a single manufacturer; these can be numerous even if individually small
- Pharma company employment: a physician who is a part-time or full-time employee of a manufacturer is not necessarily captured in Open Payments in the same way as a consulting relationship; employment compensation may be excluded if it meets certain criteria
- Spouse or family member holdings: if a physician's spouse holds stock in a drug company whose products the physician prescribes, that holding does not appear in Open Payments (though it may be required to be disclosed in journal articles and CMS enrollment forms)
- Non-covered products: payments tied to products that are not covered drugs, devices, biologics, or medical supplies—including most dietary supplements, cosmetics, and certain over-the-counter products—are not required to be reported
- Non-US manufacturers: the reporting obligation applies to manufacturers “operating in the United States”; a foreign manufacturer with no US operations and no direct sales to US physicians is not covered, though US distributors or subsidiaries may be
Researchers who rely on Open Payments to characterize a physician's financial conflicts should treat the database as a disclosure floor, not a complete inventory of all financial relationships. A clean Open Payments record is necessary but not sufficient evidence that a physician has no conflicts.
API access and Python example
CMS publishes Open Payments through the Socrata platform at openpaymentsdata.cms.gov. The API follows standard Socrata conventions: each dataset has a unique identifier, queries are constructed using the Socrata Query Language (SoQL) via URL parameters, and output formats include JSON, CSV, and GeoJSON. No API key is required for read access at typical analytical query volumes, though heavy bulk downloads are better accomplished via the direct CSV download links rather than the API.
The following Python script demonstrates four common query patterns: fetching all payments to a specific physician by NPI; retrieving large consulting fees above a dollar threshold; pulling all payments from a named manufacturer; and searching for payments linked to a specific drug by name. Summary functions analyze payment type distribution and identify the top physician recipients within a result set.
import requests
import pandas as pd
import io
# ---------------------------------------------------------------------------
# CMS Open Payments -- Socrata API access
# Source: openpaymentsdata.cms.gov (mirrors to data.cms.gov/open-payments)
# Three datasets, each with an annual Socrata endpoint:
# General Payments (GP) -- consulting fees, meals, speaker fees, etc.
# Research Payments (RP) -- industry-funded research activities
# Ownership/Investment -- stock, options, partnership interests
#
# Dataset IDs below are for the program year 2022 release (as of 2025).
# CMS assigns a new Socrata ID for each annual release; verify at:
# https://openpaymentsdata.cms.gov/dataset
# No API key required for read access at typical query volumes.
# ---------------------------------------------------------------------------
SOCRATA_BASE = "https://openpaymentsdata.cms.gov/resource"
# 2022 program year dataset IDs (verify against CMS catalog for later years)
GP_DATASET_ID = "w4ky-vbzm" # General Payments 2022
RP_DATASET_ID = "vq63-hu5i" # Research Payments 2022
OI_DATASET_ID = "e8vh-33q6" # Ownership / Investment Interests 2022
def fetch_general_payments_by_npi(npi: str, limit: int = 10000) -> pd.DataFrame:
    """
    Fetch all general payment records for a specific covered recipient NPI.
    NPI is the 10-digit National Provider Identifier.
    """
    url = f"{SOCRATA_BASE}/{GP_DATASET_ID}.csv"
    params = {
        "$where": f"covered_recipient_npi = '{npi}'",
        "$limit": limit,
        "$order": "total_amount_of_payment_usdollars DESC",
    }
    resp = requests.get(url, params=params, timeout=60)
    resp.raise_for_status()
    df = pd.read_csv(io.StringIO(resp.text), dtype=str, low_memory=False)
    return df
def fetch_large_consulting_fees(
    min_amount: float = 50000,
    limit: int = 5000,
) -> pd.DataFrame:
    """
    Fetch general payment records where nature_of_payment is 'Consulting Fee'
    and total_amount exceeds min_amount.
    """
    url = f"{SOCRATA_BASE}/{GP_DATASET_ID}.csv"
    where = (
        f"nature_of_payment_or_transfer_of_value = 'Consulting Fee' "
        f"AND total_amount_of_payment_usdollars > {min_amount}"
    )
    params = {
        "$where": where,
        "$limit": limit,
        "$order": "total_amount_of_payment_usdollars DESC",
    }
    resp = requests.get(url, params=params, timeout=120)
    resp.raise_for_status()
    df = pd.read_csv(io.StringIO(resp.text), dtype=str, low_memory=False)
    return df
def fetch_manufacturer_payments(
    manufacturer_name_fragment: str,
    limit: int = 100000,
) -> pd.DataFrame:
    """
    Fetch all general payment records from a manufacturer whose name contains
    the given fragment (case-insensitive substring match via Socrata LIKE).
    """
    url = f"{SOCRATA_BASE}/{GP_DATASET_ID}.csv"
    params = {
        "$where": (
            f"UPPER(applicable_manufacturer_or_gpo_making_payment_name) "
            f"LIKE '%{manufacturer_name_fragment.upper()}%'"
        ),
        "$limit": limit,
    }
    resp = requests.get(url, params=params, timeout=180)
    resp.raise_for_status()
    df = pd.read_csv(io.StringIO(resp.text), dtype=str, low_memory=False)
    return df


def fetch_drug_payments(drug_name_fragment: str, limit: int = 50000) -> pd.DataFrame:
    """
    Fetch general payment records tied to a specific drug or biological product.
    The name_of_drug_or_biological_or_device_or_medical_supply_1 field links
    each payment to the product the manufacturer associated with it.
    Only this first product field is searched.
    """
    url = f"{SOCRATA_BASE}/{GP_DATASET_ID}.csv"
    # Double any single quotes so the fragment cannot end the SoQL string literal
    fragment = drug_name_fragment.upper().replace("'", "''")
    params = {
        "$where": (
            f"UPPER(name_of_drug_or_biological_or_device_or_medical_supply_1) "
            f"LIKE '%{fragment}%'"
        ),
        "$limit": limit,
        "$order": "total_amount_of_payment_usdollars DESC",
    }
    resp = requests.get(url, params=params, timeout=120)
    resp.raise_for_status()
    df = pd.read_csv(io.StringIO(resp.text), dtype=str, low_memory=False)
    return df


def summarize_by_nature(df: pd.DataFrame) -> None:
    """
    Summarize a general payments DataFrame by nature_of_payment, showing
    total dollars and record count for each payment type.
    """
    df = df.copy()
    # Amounts arrive as strings (dtype=str); unparseable values become NaN
    df["amount"] = pd.to_numeric(df["total_amount_of_payment_usdollars"], errors="coerce")
    summary = (
        df.groupby("nature_of_payment_or_transfer_of_value")["amount"]
        .agg(total="sum", count="count")
        .sort_values("total", ascending=False)
    )
    print(f"{'Payment Type':<40} {'Total ($)':>14} {'Records':>10}")
    print("-" * 66)
    for ptype, row in summary.iterrows():
        # iterrows() upcasts the row to float, so cast the count back to int
        print(f"{str(ptype):<40} ${row['total']:>13,.0f} {int(row['count']):>10,}")
    print(f"{'TOTAL':<40} ${summary['total'].sum():>13,.0f} {int(summary['count'].sum()):>10,}")


def top_physician_recipients(df: pd.DataFrame, n: int = 20) -> None:
    """
    Print the top-N individual physician recipients by total dollar amount
    received from the records in df. Rows without an NPI (for example,
    teaching hospitals) are dropped by the groupby.
    """
    df = df.copy()
    df["amount"] = pd.to_numeric(df["total_amount_of_payment_usdollars"], errors="coerce")
    df["full_name"] = (
        df["covered_recipient_first_name"].fillna("") + " " +
        df["covered_recipient_last_name"].fillna("")
    ).str.strip()
    # Note: primary_type_1 holds the recipient type (e.g. Medical Doctor); the
    # taxonomy-based specialty is, to our knowledge, published separately in
    # covered_recipient_specialty_1.
    df["specialty"] = df["covered_recipient_primary_type_1"].fillna("Unknown")
    agg = (
        df.groupby(["covered_recipient_npi", "full_name", "specialty"])["amount"]
        .agg(total="sum", payments="count")
        .sort_values("total", ascending=False)
        .head(n)
    )
    print(f"{'NPI':<12} {'Physician':<35} {'Specialty':<30} {'Total':>12} {'Pmts':>6}")
    print("-" * 99)
    for (npi, name, spec), row in agg.iterrows():
        # iterrows() upcasts the row to float, so cast the count back to int
        print(
            f"{str(npi):<12} {name[:34]:<35} {spec[:29]:<30} "
            f"${row['total']:>11,.0f} {int(row['payments']):>6,}"
        )


def main() -> None:
    # -----------------------------------------------------------------------
    # Example 1: Large consulting fees (> $50,000 per payment)
    # -----------------------------------------------------------------------
    print("=== Large Consulting Fees (> $50,000) ===")
    consult_df = fetch_large_consulting_fees(min_amount=50000)
    print(f"Records returned: {len(consult_df):,}")
    top_physician_recipients(consult_df)
    print()

    # -----------------------------------------------------------------------
    # Example 2: Payments tied to semaglutide (Ozempic / Wegovy)
    # -----------------------------------------------------------------------
    print("=== Semaglutide-Linked Payments (Ozempic / Wegovy) ===")
    sema_df = fetch_drug_payments("SEMAGLUTIDE")
    print(f"Records: {len(sema_df):,}")
    top_physician_recipients(sema_df)
    print()

    # -----------------------------------------------------------------------
    # Example 3: All AbbVie general payments (Humira, Skyrizi, Rinvoq)
    # -----------------------------------------------------------------------
    print("=== AbbVie General Payments 2022 ===")
    abbvie_df = fetch_manufacturer_payments("ABBVIE")
    print(f"Records: {len(abbvie_df):,}")
    summarize_by_nature(abbvie_df)


if __name__ == "__main__":
    main()

The manufacturer name search uses a Socrata LIKE clause with % wildcards; applying UPPER() to the column and uppercasing the search fragment in Python makes the match case-insensitive. The drug name search queries only the first product field (name_of_drug_or_biological_or_device_or_medical_supply_1); records with multiple associated products use fields _2 through _5, so a comprehensive drug search should either query all five fields joined with OR or load the full dataset and filter in pandas. Each fetch function also returns at most `limit` rows, so a result whose length equals the limit has probably been truncated. For bulk analysis of a full program year (tens of millions of rows across all three streams), the direct CSV download from openpaymentsdata.cms.gov is considerably faster than paginating through the API.
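The multi-field search described above can be sketched as a small helper that builds the `$where` expression. This is a minimal illustration, not part of the script: the function name `build_drug_where_clause` and its `n_fields` parameter are hypothetical, and it assumes the five product-name fields follow the `_1` through `_5` suffix pattern mentioned in the text.

```python
def build_drug_where_clause(fragment: str, n_fields: int = 5) -> str:
    """Return a SoQL $where expression matching `fragment` in any product-name field.

    Hypothetical helper: assumes the product fields are named
    name_of_drug_or_biological_or_device_or_medical_supply_1 .. _{n_fields}.
    """
    # Double single quotes so the fragment cannot end the SoQL string literal.
    safe = fragment.upper().replace("'", "''")
    base = "name_of_drug_or_biological_or_device_or_medical_supply"
    clauses = [
        f"UPPER({base}_{i}) LIKE '%{safe}%'"
        for i in range(1, n_fields + 1)
    ]
    # A record matches if the fragment appears in any one of the fields.
    return " OR ".join(clauses)


if __name__ == "__main__":
    print(build_drug_where_clause("semaglutide"))
```

The returned string can be passed as the `$where` parameter in place of the single-field expression used in `fetch_drug_payments`. Any `%` or `_` characters in the fragment are still treated as LIKE wildcards.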
Cross-referencing Open Payments with the NPPES NPI registry (the full file is published as a monthly CSV on the CMS NPPES download page, download.cms.gov/nppes/NPI_Files.html; npiregistry.cms.hhs.gov is the record-by-record lookup) unlocks geography- and taxonomy-based analysis: which zip codes have the highest per-physician payment concentrations? How do payments to cardiologists in the Midwest compare to those in academic medical centers on the coasts? Which taxonomy codes receive the largest aggregate payments from a given manufacturer? Because both datasets use NPI as the primary physician identifier, the join is straightforward and adds NPPES practice-location and taxonomy fields to every Open Payments record that carries an NPI.
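As a rough sketch of that join, the snippet below merges a general payments DataFrame with a slice of the NPPES file and totals payments by state. The helper names (`join_payments_to_nppes`, `total_by_state`) and the short column aliases are illustrative choices, not part of either dataset. The NPPES column headers are the ones we understand the dissemination file to use and should be checked against the file's own header row. The demo rows are made-up placeholders.

```python
import pandas as pd

# NPPES dissemination-file columns (assumed header names), mapped to short aliases.
NPPES_COLS = {
    "NPI": "npi",
    "Provider Business Practice Location Address State Name": "state",
    "Provider Business Practice Location Address Postal Code": "zip",
    "Healthcare Provider Taxonomy Code_1": "taxonomy_code",
}


def join_payments_to_nppes(payments: pd.DataFrame, nppes: pd.DataFrame) -> pd.DataFrame:
    """Left-join payment rows to at most one NPPES row per NPI."""
    providers = (
        nppes.rename(columns=NPPES_COLS)[list(NPPES_COLS.values())]
        .drop_duplicates("npi")
        # NPPES postal codes are often ZIP+4; keep the 5-digit prefix.
        .assign(zip5=lambda d: d["zip"].str[:5])
    )
    return payments.merge(
        providers,
        how="left",
        left_on="covered_recipient_npi",
        right_on="npi",
        validate="many_to_one",  # fail loudly if an NPI appears twice in providers
        indicator=True,          # "_merge" flags payments with no NPPES match
    )


def total_by_state(merged: pd.DataFrame) -> pd.Series:
    """Sum payment amounts per practice-location state, largest first."""
    amount = pd.to_numeric(merged["total_amount_of_payment_usdollars"], errors="coerce")
    return (
        merged.assign(amount=amount)
        .groupby("state", dropna=False)["amount"]  # keep unmatched rows as NaN state
        .sum()
        .sort_values(ascending=False)
    )


if __name__ == "__main__":
    # Made-up placeholder rows, for illustration only.
    payments = pd.DataFrame({
        "covered_recipient_npi": ["1000000001", "1000000001", "1000000002"],
        "total_amount_of_payment_usdollars": ["100.00", "50.00", "25.50"],
    })
    nppes = pd.DataFrame({
        "NPI": ["1000000001", "1000000002"],
        "Provider Business Practice Location Address State Name": ["IL", "MA"],
        "Provider Business Practice Location Address Postal Code": ["606110000", "021140000"],
        "Healthcare Provider Taxonomy Code_1": ["207RC0000X", "207Q00000X"],
    })
    print(total_by_state(join_payments_to_nppes(payments, nppes)))
```

Both frames should be read with `dtype=str` so that NPIs and postal codes compare as text. Checking the `_merge` column before aggregating shows how many payment rows found no NPPES match.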
For journalists and compliance officers who need a searchable interface rather than raw data access, the CMS Open Payments search tool at openpaymentsdata.cms.gov and ProPublica's “Dollars for Docs” at projects.propublica.org/docdollars both provide name-search interfaces over the full program year history. Both should be used with awareness that searching by name is unreliable without NPI confirmation, because physician name collisions are common in a database of 900,000 covered recipients.
For the hospital-side view of CMS financial data, see CMS Medicare Inpatient Provider Data: The Hospital-Level Payment Records Behind $170 Billion in Annual DRG Reimbursements, which covers the IPPS payment formula, DRG relative weights, the charge-to-payment gap, and how to query hospital-level Medicare payment records via the Socrata API.
For the patent and market exclusivity context that shapes which branded drugs drive the largest Open Payments activity, see FDA Orange Book: The Patent and Exclusivity Database Behind Every Generic Drug Launch, which explains how Orange Book patent listings and exclusivity designations determine when branded-drug manufacturers face generic competition—and therefore how long the commercial incentive to invest in physician relationships for a given product persists.