ORI Research Misconduct Database: The Federal Record Behind Scientific Fraud and Fabrication

AI Analytics
ORI · Research Misconduct · Scientific Integrity · NIH · Federal Data

The HHS Office of Research Integrity maintains the authoritative federal database of research misconduct findings — every case where a researcher funded by the Public Health Service has been found to have fabricated data, falsified results, or committed plagiarism, with findings going back to 1992 and covering hundreds of scientists at major research universities and medical centers.

What ORI Is

The Office of Research Integrity is a component of the U.S. Department of Health and Human Services, created by the National Institutes of Health Revitalization Act of 1993 and operating within the framework of the Public Health Service Act. ORI's statutory mandate is to protect the integrity of the federally funded research enterprise by overseeing institutional investigations of research misconduct allegations, conducting independent investigations, and maintaining the official federal record of misconduct findings.

ORI's jurisdiction extends to any research supported by funding from a Public Health Service agency. That umbrella covers the National Institutes of Health, by far the largest single source of PHS-funded research, as well as the Centers for Disease Control and Prevention, the Substance Abuse and Mental Health Services Administration, the Health Resources and Services Administration, the Food and Drug Administration, the Agency for Healthcare Research and Quality, and the Indian Health Service. A researcher who receives an NIH R01, a CDC cooperative agreement, or an HRSA training grant is subject to ORI jurisdiction for the duration of their funded project and for any research activity connected to that funding.

ORI's jurisdiction does not extend to research funded exclusively by the National Science Foundation. The NSF has its own Office of Inspector General, which handles misconduct allegations in NSF-funded work under a parallel regulatory framework. The boundary matters in practice because some researchers hold concurrent NIH and NSF funding, and the funding source for the specific project under investigation determines which federal office has oversight authority. When misconduct spans both funding streams, ORI and the NSF OIG may coordinate, but each operates under its own statutory authority.

ORI's core responsibilities comprise five functions. First, it receives and reviews allegations of research misconduct forwarded by institutions or submitted directly by whistleblowers. Second, it oversees institutional investigations — ORI does not conduct the initial investigation itself; that responsibility rests with the employing institution. Third, ORI reviews institutional findings and may accept, modify, or reject them, and may conduct its own independent investigation if the institutional process was inadequate. Fourth, ORI publishes findings of research misconduct in the Federal Register and on its public case summary website. Fifth, ORI submits an Annual Report to Congress documenting allegations received, cases in process, findings made, and debarment actions taken during the prior fiscal year.

The Legal Definition of Research Misconduct

The controlling regulation is 42 CFR Part 93, “Public Health Service Policies on Research Misconduct,” which took effect in 2005. Under Part 93, research misconduct is defined as fabrication, falsification, or plagiarism — universally abbreviated in the scientific integrity community as FFP — in proposing, performing, or reviewing research, or in reporting research results.

Fabrication is the making up of data or results and recording or reporting them as if they were real. It is the most severe form of misconduct because it introduces entirely fictitious information into the scientific record. A researcher who reports experimental results from experiments that were never conducted, invents patient outcome data for a clinical trial, or populates a dataset with numbers generated to support a predetermined conclusion has committed fabrication. The results never existed; the fraud is total.

Falsification is the manipulation of research materials, equipment, or processes, or the changing or omitting of data or results such that the research is not accurately represented in the research record. Unlike fabrication, falsification involves real experiments or observations that have been selectively altered. Common forms include image manipulation — adjusting contrast, splicing lanes from different gels, or reusing the same image in multiple figures representing different experimental conditions — as well as deleting unfavorable data points, misrepresenting the scale of error bars, or altering measurement readings. Image manipulation cases have become a substantial portion of ORI's caseload as journals and institutions have developed forensic image analysis capabilities.

Plagiarism is the appropriation of another person's ideas, processes, results, or words without giving appropriate credit. In the research context this covers unattributed copying of text from published papers or grant applications, presenting another researcher's unpublished data or hypotheses as one's own, and misappropriating ideas from peer review — a reviewer who uses confidential knowledge of an unpublished manuscript to advance their own research before the manuscript is published has committed a form of plagiarism with heightened ethical gravity.

The regulation explicitly excludes three categories from the definition of research misconduct. Honest error — a genuine mistake in calculation, measurement, or interpretation that the researcher believed to be correct — is not misconduct. Differences of scientific opinion — disagreements about the interpretation of data, the validity of a methodology, or the significance of a result — are not misconduct. Authorship disputes and allegations of questionable research practices (QRPs) that do not rise to the level of FFP are also outside ORI's jurisdiction. QRPs include practices like selective reporting of favorable results, inadequate record keeping, failure to share data, and undisclosed conflicts of interest; these practices harm scientific integrity but are addressed through institutional policy and professional norms rather than federal misconduct proceedings.

A finding of research misconduct under 42 CFR Part 93 also requires that the misconduct represent a significant departure from accepted practices of the relevant research community, that it was committed intentionally, knowingly, or recklessly, and that the allegation be proven by a preponderance of the evidence. The preponderance standard — more likely than not — is the civil standard, not the criminal beyond-reasonable-doubt standard, which reflects the administrative rather than criminal nature of ORI proceedings.

The Investigation Process

When an allegation of research misconduct reaches an institution, the process unfolds in two formal stages under 42 CFR Part 93. The first is the inquiry, a preliminary assessment that must be completed within 60 days. The inquiry's purpose is to determine whether the allegation has sufficient substance to warrant a full investigation. The institution reviews available evidence, may interview the respondent and complainant, and assesses whether the allegation falls within the definition of research misconduct and involves PHS-funded research. If the inquiry concludes there is insufficient substance, the case closes without a formal investigation. If the inquiry finds sufficient evidence to proceed, it triggers the second stage.

The formal investigation must be completed within 120 days, with possible extensions requested from ORI. During the investigation, the institution convenes an investigation committee — typically composed of faculty with relevant scientific expertise, at least one member from outside the institution, and administrative support — that reviews all relevant research records, interviews the respondent and relevant witnesses, consults independent scientific experts where the technical issues exceed committee members' expertise, and produces a written investigation report. The respondent has the right to respond to the draft report, and that response must be included or addressed in the final report. Research records — laboratory notebooks, raw data files, electronic records, correspondence — must be secured at the time the inquiry begins to prevent alteration or destruction.
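The two regulatory clocks described above reduce to simple date arithmetic. The following is a minimal sketch, assuming both limits are counted in calendar days from the start of each stage; the function name and the start dates are hypothetical, and extensions granted by ORI are not modeled.

```python
from datetime import date, timedelta

INQUIRY_DAYS = 60         # inquiry limit stated in the text
INVESTIGATION_DAYS = 120  # investigation limit stated in the text


def stage_deadlines(inquiry_start: date, investigation_start: date) -> dict:
    """Return the nominal due date for each stage (calendar days assumed)."""
    return {
        "inquiry_due": inquiry_start + timedelta(days=INQUIRY_DAYS),
        "investigation_due": investigation_start + timedelta(days=INVESTIGATION_DAYS),
    }


# Hypothetical start dates, for illustration only
deadlines = stage_deadlines(date(2024, 1, 15), date(2024, 4, 1))
for stage, due in deadlines.items():
    print(f"{stage}: {due.isoformat()}")
```

An institution tracking real cases would also need to record any extension ORI grants, since that moves the investigation deadline.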

The institution sends the final investigation report and all supporting evidence to ORI for review. At this stage ORI has several options. It may accept the institutional findings in full, including both the factual determinations and the proposed administrative actions. It may modify the findings — for example, finding misconduct on additional counts not addressed by the institution, or reducing the scope of findings where the evidence does not support the institutional conclusion. Or ORI may determine that the institutional investigation was procedurally deficient or substantively inadequate and conduct its own independent investigation, with ORI investigators reviewing the evidence and potentially reaching different conclusions.

When ORI makes a final finding of research misconduct, it publishes a notice in the Federal Register and a case summary on its public website. The respondent has the right to appeal the finding through two mechanisms: first, a request for an opportunity to present evidence to a hearing officer from the HHS Departmental Appeals Board; and second, if that proceeding confirms the finding, a final appeal to the DAB itself. The DAB process provides a formal adversarial hearing, with both ORI and the respondent presenting evidence and arguments, and operates under procedural rules that afford the respondent due process protections. ORI findings that survive appeal become the permanent federal record of the misconduct.

Historical Cases and the Shape of the Data

ORI has published more than 500 findings of research misconduct since its inception, with approximately 10 to 20 findings per year in recent years. The Annual Reports to Congress document the pipeline: ORI receives roughly 200 allegations per year, the majority of which are screened out at the inquiry stage or found to fall outside ORI jurisdiction. Approximately 15 to 25 cases per year result in institutional investigations, and a subset of those produce findings of misconduct that ORI accepts or issues independently.

The most consequential individual case in ORI's history involved Eric Poehlman, a menopause and aging researcher at the University of Vermont who became the first scientist in United States history to be sentenced to federal prison for research fraud. Poehlman fabricated data in at least 17 federal grant applications and 10 published papers over more than a decade, inventing results about the effects of menopause on cardiovascular risk factors, bone density, and body composition. He pleaded guilty in 2005 and was sentenced in 2006 to one year and one day in federal prison under 18 USC 1001, the false statements statute, for submitting fabricated data in grant applications to NIH. The criminal conviction was a consequence of the false statements to the federal government embedded in the grant applications; ORI's misconduct finding was the administrative predicate.

The Dipak Das case at the University of Connecticut illustrated large-scale misconduct sustained over years of funded research. Das, who studied the cardiovascular benefits of resveratrol, was found by a university investigation to have committed 145 counts of fabrication and falsification across 26 published papers and numerous grant applications to NIH. ORI's 2012 finding imposed a seven-year debarment, among the longest in ORI history, and triggered the retraction of a substantial portion of Das's published work. The case attracted particular attention because Das's research had received wide popular coverage and his findings had been cited in marketing materials for resveratrol supplements.

Image manipulation cases have come to represent a growing share of ORI findings as forensic image analysis has become more systematic. Researchers have been found to have reused Western blot images across multiple papers representing different experimental conditions, digitally spliced or duplicated bands within gel images to simulate the appearance of experimental results, and adjusted image brightness and contrast in ways that eliminated or introduced signals. The proliferation of image manipulation findings reflects both increasing misconduct and, more importantly, improving detection: journals and research integrity offices now routinely screen submitted manuscripts and published papers using image analysis software, and post-publication peer review communities — most prominently PubPeer — have made the crowd-sourced scrutiny of published figures a substantial part of the research integrity ecosystem.

Survey research on research misconduct provides context for the ORI case numbers. Meta-analyses of self-report surveys consistently find that approximately 2% of scientists admit to having fabricated, falsified, or modified data or results at least once, and roughly 14% report having observed colleagues doing so. The ORI finding rate, on the order of 10 to 20 per year across the full PHS-funded research enterprise of hundreds of thousands of active investigators, is not the true prevalence of misconduct. It is the portion of misconduct that survives the full chain from detection to allegation to institutional investigation to ORI finding. The true prevalence is substantially higher.

The Findings Database

ORI's public case summary database is accessible at ori.hhs.gov/case_summary. Each published finding includes the respondent's name, their institutional affiliation at the time of the misconduct, their academic degree, the type of misconduct found (fabrication, falsification, plagiarism, or some combination), a narrative description of the specific misconduct including which grant numbers were affected and which figures or data were falsified or fabricated, the debarment period imposed, and any voluntary exclusion terms the respondent agreed to. The Federal Register notices linked from each case summary constitute the official legal record.

The case summaries are searchable by name and browsable chronologically. Because they are published in HTML without a structured data export, researchers working with the full dataset rely on web scraping. ORI does not provide a bulk data download, an API, or a machine-readable version of the findings, which is a notable gap given that the federal government's open data policies nominally apply to agencies including HHS. The case summaries do include standardized fields — grant numbers appear in the body of the description in consistent format, debarment periods are stated explicitly, and finding types are categorized — making structured extraction feasible with careful parsing.
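As a rough illustration of what such parsing might look like, the sketch below pulls grant numbers, finding types, and an exclusion period out of a block of summary text with regular expressions. The sample text is invented, and both patterns are assumptions: the grant pattern assumes an NIH-style layout (activity code, two-letter institute code, six-digit serial), and the period pattern assumes phrasing along the lines of "a period of three (3) years". Real case summaries may need broader patterns.

```python
import re

# Assumed NIH-style grant number, e.g. "R01 HL012345"
GRANT_RE = re.compile(r"\b([A-Z]\d{2})\s?([A-Z]{2})\s?(\d{6})\b")

# Assumed phrasing for the exclusion period, e.g. "period of three (3) years"
PERIOD_RE = re.compile(
    r"period\s+of\s+(?:[a-z]+\s+)?\(?(\d+)\)?\s+years?", re.IGNORECASE
)

# Word stems used to detect each finding type in the narrative
FINDING_STEMS = (
    ("fabricat", "fabrication"),
    ("falsif", "falsification"),
    ("plagiar", "plagiarism"),
)


def extract_fields(text: str) -> dict:
    """Extract grant numbers, finding types, and exclusion years from text."""
    grants = sorted({"".join(parts) for parts in GRANT_RE.findall(text)})
    period = PERIOD_RE.search(text)
    lowered = text.lower()
    return {
        "grants": grants,
        "finding_types": [name for stem, name in FINDING_STEMS if stem in lowered],
        "debarment_years": int(period.group(1)) if period else None,
    }


# Invented sample text, not taken from any real case summary
sample = (
    "Respondent engaged in research misconduct by falsifying and fabricating "
    "data in research supported by R01 HL012345 and P01 CA098765. Respondent "
    "agreed to a voluntary exclusion for a period of three (3) years."
)
record = extract_fields(sample)
print(record)
```

Keyword matching on stems is deliberately crude; a production parser would want to confirm each match against the Federal Register notice.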

ORI's Annual Report to Congress is the authoritative source for aggregate statistics. Published each fiscal year, the report documents the number of allegations received, cases in inquiry, cases in investigation, findings issued, debarments imposed, and appeals filed. It also reports on ORI education and training activities, including compliance reviews of institutional research integrity programs. The annual report data is the basis for trend analysis; case-level data from the case summary database provides the individual records behind those aggregates.

Grant numbers embedded in ORI case summaries can be cross-referenced against the NIH Research Portfolio Online Reporting Tools (RePORTER) database, which contains records for every NIH-funded project. This linkage enables researchers to determine the total funding associated with each misconduct case, the duration of the funded period during which misconduct occurred, and the specific aims that were advanced using fabricated or falsified data. RePORTER data is accessible via the RePORTER API v2, which accepts grant number queries and returns full project records including total cost, funding period, and abstract.
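A grant-number lookup against RePORTER might be sketched as follows. The endpoint URL, the `project_nums` criteria key, and the `results` key in the response are assumptions based on the publicly documented v2 project search and should be checked against the current API documentation; the grant number is hypothetical. The network call is wrapped in a function and not executed here.

```python
import json
import urllib.request

# Assumed endpoint for the RePORTER API v2 project search
REPORTER_SEARCH_URL = "https://api.reporter.nih.gov/v2/projects/search"


def build_query(grant_numbers, limit=50):
    """Build a search payload for a list of grant numbers."""
    return {
        "criteria": {"project_nums": list(grant_numbers)},
        "limit": limit,
        "offset": 0,
    }


def search_projects(grant_numbers):
    """POST the query and return the list under 'results' (empty if absent)."""
    body = json.dumps(build_query(grant_numbers)).encode("utf-8")
    request = urllib.request.Request(
        REPORTER_SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("results", [])


# Hypothetical grant number; only the payload is built and printed here
payload = build_query(["R01HL012345"])
print(json.dumps(payload, indent=2))
```

Calling `search_projects(["R01HL012345"])` would send the request; the fields available on each returned record depend on the API version in use.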

ORI Findings and Allegations by Year (from Annual Reports)

Year    Allegations Received    Findings of Misconduct
2015    195                     12
2016    196                     11
2017    192                     8
2018    193                     15
2019    218                     14
2020    214                     10
2021    195                     11
2022    210                     13
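
The yearly figures above imply a rough ratio of findings to allegations. The short calculation below uses only the numbers from the table. It is an approximation, since a finding issued in one year usually resolves an allegation received in an earlier year, so the per-year ratios are not true conversion rates.

```python
years = [2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022]
allegations = [195, 196, 192, 193, 218, 214, 195, 210]
findings = [12, 11, 8, 15, 14, 10, 11, 13]

# Findings per 100 allegations, year by year
for year, received, found in zip(years, allegations, findings):
    print(f"{year}: {100 * found / received:.1f} findings per 100 allegations")

# Pooled ratio over the whole period smooths out the year-to-year lag
overall = sum(findings) / sum(allegations)
print(f"2015-2022 pooled: {100 * overall:.1f} findings per 100 allegations")
```

Over the eight years shown, 94 findings against 1,613 allegations works out to roughly 6 findings per 100 allegations.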

Debarment and Consequences

When ORI issues a finding of research misconduct, the primary administrative consequence is debarment: the respondent is excluded from receiving PHS funds for a specified period. Debarment periods in ORI cases typically run from two to five years, with the length reflecting the severity and duration of the misconduct, the degree of harm to the research record, and whether the respondent cooperated with the investigation. In exceptional cases — the most severe fabrication cases with long-running schemes involving many grant applications — debarments have extended to seven years or longer. During the debarment period the respondent may not receive NIH, CDC, FDA, or other PHS grants, serve as a principal investigator or key personnel on PHS-funded projects, or act as a peer reviewer of PHS grant applications.

The debarment is government-wide, not limited to the respondent's employing institution. A debarred researcher who moves to a new institution remains debarred, and the new institution cannot include them on PHS-funded projects. ORI maintains a list of currently debarred individuals, and NIH grants management staff check proposed personnel against that list during the review process. Voluntary exclusion — agreed to in lieu of a contested finding — has the same practical effect as a formal debarment and is the mechanism through which many ORI cases are resolved without a full hearing.
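The personnel check described above amounts to testing whether a date falls inside an exclusion window. The sketch below is a minimal illustration, assuming the period runs in whole years from an effective date and that the end date itself is no longer excluded; the function names and dates are hypothetical, and this is not how any agency system is actually implemented.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Add whole years, mapping Feb 29 to Feb 28 in non-leap target years."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def is_excluded(effective: date, years: int, on: date) -> bool:
    """True if `on` falls within [effective, effective + years)."""
    return effective <= on < add_years(effective, years)


# Hypothetical three-year exclusion beginning 1 March 2020
effective = date(2020, 3, 1)
print(is_excluded(effective, 3, date(2022, 6, 15)))
print(is_excluded(effective, 3, date(2023, 3, 1)))
```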

The consequences beyond debarment depend on institutional policy and publication ethics. Most ORI findings trigger journal retractions of the affected publications, because the institutional and ORI misconduct reports identify specifically which papers contained falsified or fabricated data. In major cases, the retraction process may involve dozens of papers across multiple journals. Journals vary in how quickly and thoroughly they act on ORI findings; some retract promptly upon notification while others move slowly or issue expressions of concern rather than full retractions. The Retraction Watch database tracks retracted papers globally and cross-references ORI findings where the connection is documented.

Employment consequences are typically severe. Tenure-track and tenured faculty found guilty of research misconduct almost invariably lose their positions, either through termination by the institution or by resignation under pressure. Research staff and postdoctoral researchers whose misconduct comes to light lose their positions and face substantial barriers to future employment in research. In cases involving fabricated grant applications, criminal prosecution under 18 USC 1001 (false statements to the federal government) or 18 USC 287 (false claims) is possible, though pursued in only a small fraction of ORI cases. The Eric Poehlman case remains the most prominent criminal prosecution arising from ORI-investigated misconduct.

Retraction Watch and the Open Data Ecosystem

The Retraction Watch Database, maintained by the Center for Scientific Integrity, tracks retracted scientific papers across journals and disciplines globally. As of 2024 the database contains approximately 47,000 retraction records, growing by several thousand entries per year as journals address accumulated backlogs and as ongoing misconduct is detected. The database includes retraction notices from journals across all scientific disciplines — not limited to PHS-funded biomedical research — and records the stated reason for retraction: data problems or falsification, plagiarism, duplication, error without fraud, concerns about peer review, and numerous other categories.

Approximately 20% of retractions in the Retraction Watch database are attributed to fraud or fabrication. The remainder reflect a mix of honest error, duplicate publication, plagiarism, and procedural violations. The most prolific sources of retractions in the database are not cases of individual misconduct but systematic problems at specific journals: Tumor Biology, then a Springer journal, retracted 107 papers in 2017 after discovering that peer review had been manipulated through a fraudulent reviewer network, a qualitatively different phenomenon from the individual researcher misconduct that ORI investigates.

Cross-referencing ORI findings with the Retraction Watch database is feasible using PubMed identifiers (PMIDs) and DOIs. ORI case summaries reference the publications affected by misconduct but do not uniformly include PMIDs or DOIs; the cross-reference requires matching by author name, journal, and approximate publication date. PubMed itself flags retracted publications with a retraction notice linked from the original abstract, and Crossref maintains retraction metadata that is accessible via their API. PubPeer — a post-publication peer review platform where researchers post concerns about specific papers — serves as an early-warning system; many ORI cases were preceded by PubPeer discussions that brought potential image manipulation to the attention of editors and research integrity offices.
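Where no PMID or DOI is available, the name, journal, and date matching described above can be approximated with string similarity. The sketch below uses the standard library's `difflib`; the record layout, the 0.85 similarity threshold, the one-year window, and the DOIs are all hypothetical choices for illustration, and any candidate match would still need manual confirmation.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def is_candidate_match(ori_pub, retraction, threshold=0.85, year_window=1):
    """Same surname, similar journal title, and publication years close together."""
    same_author = ori_pub["author"].lower() == retraction["author"].lower()
    journal_close = similar(ori_pub["journal"], retraction["journal"]) >= threshold
    year_close = abs(ori_pub["year"] - retraction["year"]) <= year_window
    return same_author and journal_close and year_close


# Invented records: one publication named in a case summary, three retractions
ori_pub = {"author": "Doe", "journal": "Journal of Hypothetical Biology", "year": 2012}
retractions = [
    {"doi": "10.0000/example.1", "author": "doe",
     "journal": "journal of hypothetical biology ", "year": 2012},
    {"doi": "10.0000/example.2", "author": "Doe",
     "journal": "Journal of Hypothetical Biology", "year": 2018},
    {"doi": "10.0000/example.3", "author": "Doe",
     "journal": "Annals of Imaginary Chemistry", "year": 2012},
]

matches = [r["doi"] for r in retractions if is_candidate_match(ori_pub, r)]
print(matches)
```

Matching on surname alone will produce false positives for common names, so a real pipeline would add co-authors or title words as further checks.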

The reproducibility crisis in science, the finding that a substantial fraction of published results, particularly in psychology, biomedicine, and economics, cannot be replicated by independent researchers, is distinct from but related to the misconduct landscape. The Reproducibility Project: Psychology found that approximately 36% of 100 replication attempts produced a statistically significant result in the same direction as the original study. Reproducibility failures may reflect fraud, but more commonly reflect underpowered studies, publication bias, analytic flexibility, and the file drawer problem. A committee of the National Academies of Sciences, Engineering, and Medicine (NASEM) has noted that outright misconduct (FFP) and systemic methodological problems (QRPs and poor reproducibility) are analytically distinct but share an underlying failure of the incentive structures that reward novel, statistically significant results over rigorous, replicable science.

COPE — the Committee on Publication Ethics — publishes guidelines for journal editors on how to handle allegations of misconduct in submitted or published papers. COPE flowcharts cover scenarios from suspected fabrication in a submitted manuscript through post-publication concerns about data integrity, and they explicitly address the interaction between journal processes and ORI institutional investigations. Journals affiliated with COPE are expected to cooperate with institutional and ORI investigations by providing access to submitted manuscript files, peer review correspondence, and author communications.

Python: Accessing ORI Case Data and Retraction Watch Statistics

ORI does not provide a structured API or bulk data export for its case database. Access to the full dataset requires web scraping the case summary pages at ori.hhs.gov/case_summary. The following script demonstrates a basic approach: fetching the index page, extracting case summary links, and printing summary metadata alongside the aggregate annual statistics drawn from ORI Annual Reports to Congress. It also prints the key Retraction Watch statistics that contextualise the ORI finding counts.

```python
from urllib.parse import urljoin

import pandas as pd
import requests
from bs4 import BeautifulSoup

# ORI case summaries: web scraping (no official API)
# Annual report data available at ori.hhs.gov
ORI_URL = "https://ori.hhs.gov/case_summary"
resp = requests.get(ORI_URL, timeout=20)
resp.raise_for_status()  # fail on HTTP errors rather than parse an error page
soup = BeautifulSoup(resp.text, "html.parser")

# Find case summary links, resolving relative hrefs and skipping duplicates
cases = []
seen = set()
for link in soup.find_all("a", href=True):
    href = link["href"]
    if "/case-" in href or "/cases/" in href:
        url = urljoin(ORI_URL, href)
        if url not in seen:
            seen.add(url)
            cases.append({"title": link.get_text(strip=True), "url": url})

print(f"ORI case summary links found: {len(cases)}")
for c in cases[:10]:
    print(f"  {c['title']}")

# Retraction Watch: no official public API, but the database can be downloaded
# See: https://retractionwatch.com/retraction-watch-database-user-guide/
# The figures below are hard-coded approximations, not live queries.
print("\nRetraction Watch database statistics (approximate, 2023):")
print("  Total retractions tracked:    ~47,000")
print("  Year 2022 retractions:         ~4,500")
print("  Top journal by retractions:    Tumor Biology (107 in 2017)")
print("  Top reason (2010-2022):        Data problems/falsification")
print("  Estimated % fraudulent:        ~20% of retractions")
print("  ORI-investigated retractions:  cross-reference via PMID/DOI")

# Annual ORI finding counts (from published annual reports)
ori_data = {
    "Year": [2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022],
    "Findings": [12, 11, 8, 15, 14, 10, 11, 13],
    "Debarments": [12, 11, 8, 15, 14, 10, 11, 13],
    "Allegations_Received": [195, 196, 192, 193, 218, 214, 195, 210],
}
df = pd.DataFrame(ori_data)
print("\nORI Annual Findings (from Annual Reports to Congress):")
print(df.to_string(index=False))
```

Several notes on implementation. The ORI case summary index page structure changes periodically; the link pattern used here targets paths containing /case- or /cases/, which has covered the site's URL scheme across recent redesigns. For a complete scraping pass, each linked case summary page must be fetched individually and its content parsed to extract structured fields: respondent name, institution, finding type, affected grant numbers, debarment period, and Federal Register citation. Grant numbers extracted from case summaries can be queried against the NIH RePORTER API v2 to retrieve full project records including total funding, project dates, and abstract text.

The Retraction Watch database is available for download by registered users at their website; it is a CSV file with one row per retraction and fields including DOI, PMID, retraction date, reason codes, journal, and country of corresponding author. Merging the Retraction Watch download with ORI case summary data on PMID or DOI produces a joined dataset showing which ORI-implicated papers have been formally retracted, when the retraction occurred relative to the ORI finding, and whether the retraction notice explicitly cites the ORI finding as the basis. The lag between ORI finding and journal retraction is often substantial, ranging from months to years, and some papers implicated in ORI findings remain uncorrected in the published record indefinitely.
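The join and lag calculation described above can be sketched with the standard library alone. The column names (`doi`, `retraction_date`, `finding_date`), the DOIs, and the dates below are all invented stand-ins; the real Retraction Watch export and any scraped ORI table will use their own headers, which would need to be mapped first.

```python
import csv
import io
from datetime import date

# Stand-in for the Retraction Watch CSV; column names are assumptions
RW_CSV = """doi,retraction_date
10.0000/example.1,2014-09-30
10.0000/example.2,2011-05-02
"""

# Stand-in for scraped ORI records; field names are assumptions
ori_records = [
    {"doi": "10.0000/example.1", "finding_date": "2013-03-15"},
    {"doi": "10.0000/example.3", "finding_date": "2016-07-01"},
]

# Index retraction dates by lower-cased DOI for a case-insensitive join
retracted_on = {
    row["doi"].lower(): date.fromisoformat(row["retraction_date"])
    for row in csv.DictReader(io.StringIO(RW_CSV))
}

joined = []
for rec in ori_records:
    retraction = retracted_on.get(rec["doi"].lower())
    finding = date.fromisoformat(rec["finding_date"])
    joined.append({
        "doi": rec["doi"],
        "retracted": retraction is not None,
        # Positive lag: retraction came after the finding; None: not retracted
        "lag_days": (retraction - finding).days if retraction else None,
    })

for row in joined:
    print(row)
```

Records with `retracted` set to False correspond to the papers that remain uncorrected in the published record.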

ORI in the Broader Research Integrity Landscape

ORI operates alongside a network of institutional, federal, and professional bodies that collectively constitute the U.S. research integrity infrastructure. Research universities are required under 42 CFR Part 93 to maintain written research integrity policies, designate a Research Integrity Officer, provide a process for receiving and investigating allegations, and certify to ORI annually that they are in compliance. ORI conducts compliance reviews of institutional programs and has the authority to require corrective action when institutional programs are found deficient.

The NSF OIG, the Department of Energy Office of Inspector General, and the Department of Defense Inspector General handle misconduct in research funded by their respective agencies. Coordination among these bodies occurs through the Federal Research Integrity Officers Group, which promotes consistent practices across agencies. A finding by one agency's oversight body does not automatically create debarment under another agency's funding streams, though agencies may share information and a researcher found guilty of misconduct by ORI faces practical difficulties in obtaining NSF or DOE funding even without a formal NSF debarment.

The research misconduct regulatory framework applies to the conduct of research and the content of grant applications and published papers. It does not extend to conflicts of interest, which are governed by a separate PHS regulation (42 CFR Part 50, Subpart F) requiring institutional financial conflict of interest policies and disclosure of significant financial interests. Nor does it address clinical research misconduct such as informed consent violations or IRB non-compliance, which are the province of the Office for Human Research Protections. The result is a segmented regulatory structure in which different categories of research integrity failure are handled by different federal offices under different regulatory frameworks.

