CMS Provider Ownership: The Federal Database Behind Private Equity in Nursing Homes, Home Health, and Hospice

12 min read · AI Analytics
Tags: CMS, Private Equity, Nursing Homes, Healthcare Ownership, Federal Data

For nearly every nursing home, home health agency, hospice, and hospital that bills Medicare, the federal government now publishes a list of who owns it — not just the company on the sign out front, but the holding companies, management firms, real-estate trusts, and private equity funds stacked behind it, each tagged with its role, its percentage stake, and whether the owning entity was created specifically to carry out an acquisition. The CMS all-owners files, mandated by the ownership-disclosure rules at 42 CFR 455.104, are the closest thing the country has to an X-ray of who actually controls American post-acute and institutional care: roughly 279,707 ownership records for skilled nursing facilities alone, plus about 101,100 for home health agencies, 71,133 for hospices, and 147,332 for hospitals.

These files exist because of a simple, hard-won recognition: in the most heavily financialized corners of American healthcare, the name of the provider tells you almost nothing. A single nursing home may be operated by one company, owned in real estate by a second, financed by a third through a real estate investment trust, managed by a fourth, and ultimately controlled by a private equity fund through a chain of holding companies that never appears on any door, bill, or inspection report. The all-owners files pull that chain into the open, one row at a time. They are the data behind nearly every serious investigation of private equity in nursing homes, every map of hospice roll-ups, and every reconstruction of the real-estate structures that brought down hospital chains. This article is a field guide to what they contain and how to use them.

What it is, and the 2021+ ownership-transparency push

The legal spine of these files is old. Section 455.104 of Title 42 of the Code of Federal Regulations has, for decades, required Medicare and Medicaid providers to disclose the identity of any person or entity with a direct or indirect ownership interest of five percent or more, anyone with operational or managerial control, and certain managing employees. The disclosure was always collected at enrollment, through the Provider Enrollment, Chain, and Ownership System (PECOS). What changed in the 2020s was not the underlying obligation but the decision to publish it — to take the ownership data that had sat inside PECOS as administrative plumbing and release it as structured, downloadable open data on data.cms.gov.

The catalyst was a sustained policy push that began in earnest in 2021. The Biden administration made nursing-home ownership transparency an explicit priority: a nursing-home reform agenda announced alongside the March 2022 State of the Union address, followed by a set of CMS actions, framed the opacity of nursing home ownership, and the role of private equity in particular, as a patient-safety problem in its own right. The reasoning was that you cannot hold an owner accountable for the care delivered in a facility if you cannot identify the owner, and that the layering of holding companies and related parties had made identification nearly impossible from the outside. In September 2022, CMS released the enrollment-derived ownership records for skilled nursing facilities as a public file for the first time.

The push culminated in a formal rule. In November 2023, CMS finalized the Medicare and Medicaid Programs ownership-disclosure rule for skilled nursing facilities, which sharpened the definitions and, critically, required SNFs to identify whether each owner or managing entity is a private equity company or a real estate investment trust (REIT), along with additional detail on management companies and the parties that exercise control. That rule is the reason the published files now carry explicit entity-type flags for private equity and REIT ownership rather than leaving analysts to infer financial structure from corporate names. The all-owners files are the operational output of that rule, refreshed periodically and stored in our catalog as cms_snf_owners, cms_hha_owners, cms_hospice_owners, and cms_hospital_owners, one family of files per provider type.

The schema and role codes

The all-owners files share a common shape across provider types, which is what makes them so powerful to work with: the same code that parses the nursing-home file parses the hospice file. The fundamental unit is not the provider and not the owner but the relationship between them. Each row records one owner's association with one provider, so a facility with a dozen disclosed owners contributes a dozen rows, and a single owner that controls fifty facilities appears in fifty rows. The roughly 279,707 SNF figure is a count of these provider-owner relationships, not a count of distinct nursing homes (there are on the order of 15,000 of those) and not a count of distinct owners.
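
The three counting units are easy to confuse, so a tiny illustration may help. The sketch below uses invented enrollment IDs and owner names (they are not taken from the CMS file) and plain Python rather than the real download, just to show how one list of relationship rows yields three different totals.

```python
# Toy (provider, owner) relationship rows. All IDs and names are invented
# for illustration; the real file has many more columns per row.
rows = [
    {"enrollment_id": "E001", "owner": "Maple Grove Holdings LLC"},
    {"enrollment_id": "E001", "owner": "Maple Grove Management LLC"},
    {"enrollment_id": "E001", "owner": "Jane Doe"},
    {"enrollment_id": "E002", "owner": "Maple Grove Holdings LLC"},
    {"enrollment_id": "E002", "owner": "John Roe"},
    {"enrollment_id": "E003", "owner": "Maple Grove Holdings LLC"},
]

# One row per relationship, so the row count is the relationship count.
n_relationships = len(rows)
# Distinct facilities: deduplicate on the provider key.
n_facilities = len({r["enrollment_id"] for r in rows})
# Distinct owners: deduplicate on the owner key (raw name here).
n_owners = len({r["owner"] for r in rows})

print(n_relationships, n_facilities, n_owners)
```

Any headline number drawn from these files should say which of the three it is reporting.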

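The spine of each row carries: the provider's identifiers (the enrollment ID, alongside the CMS Certification Number used for joins to other CMS datasets); the owner's identity, recorded as an organization name or, for individuals, a first and last name, together with an address; a role code describing the relationship, such as a direct or indirect ownership interest, operational or managerial control, or a managing-employee role; the association date; the ownership percentage, where one applies; a set of yes/no entity-type flags marking the owner as, for example, a holding company, a management services company, a private equity company, or a REIT; and a flag recording whether the owner was created for an acquisition. Exact column names drift between releases.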

The entity-type flags and the created-for-acquisition flag are the fields the 2023 rule added muscle to, and they are why these files are analytically interesting rather than merely administrative. Without them, an analyst would have to guess whether “Maple Grove Holdings LLC” is a family business or a private equity vehicle. With them, the file says so directly — subject, always, to the self-reporting caveats below.

How to trace a chain of ownership

The single most important thing to understand about post-acute healthcare ownership is that it is deliberately split into layers, and the all-owners file is built to let you walk those layers. The canonical structure is the operator / propco split. A nursing home is typically divided into an operating company (the “opco”), which holds the Medicare license, employs the staff, and bills the program, and a property company (the “propco”), which owns the building and land. The opco pays rent to the propco. When the propco is a REIT — or sells the real estate to a REIT in a sale-leaseback — the rent flows out of the operating entity to investors, and the operating company can be left thinly capitalized.

This matters because it determines where the money is and where the liability sits. The operating company is the entity that gets sued when care goes wrong and the entity whose margins determine staffing. If rent and management fees pull cash out of the opco and into affiliated propco and management entities, the operating company can run lean — or, in bankruptcy, can be the shell that absorbs the losses while the real estate and the fees are safely housed elsewhere. The all-owners file lets you see the split: the operating entity appears as the enrolled provider, while the propco, the management company, and the holding companies appear as owners with their respective role codes and entity-type flags.

Tracing a chain in practice means following the indirect ownership records and the holding-company flags upward. You start from a provider, list its disclosed owners, and look for entities flagged as holding companies or carrying indirect ownership interests. Each such entity is a pointer to a higher layer. The file does not always give you a clean parent-child edge from one entity to the next — that is one of its real limitations — but the combination of indirect-interest role codes, holding-company flags, shared owner addresses, and repeated owner names across many facilities lets you reconstruct the group structure with reasonable confidence. The signature of a sophisticated owner is a cluster of facilities that share the same handful of holding-company and management-company owners, with a private equity or REIT entity sitting at the apex.
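
One way to approximate this group reconstruction is to treat facilities and owners as nodes in a graph and collect the connected components: any two facilities that share an organizational owner, directly or through a chain of shared owners, land in the same cluster. The following is a minimal sketch under stated assumptions. The facility IDs and owner names are invented, the names are assumed to be already normalized, and role codes and entity-type flags are ignored.

```python
from collections import defaultdict

# Invented (facility, organisational owner) pairs for illustration only.
pairs = [
    ("E001", "ALPHA HOLDCO LLC"),
    ("E001", "ALPHA MGMT LLC"),
    ("E002", "ALPHA MGMT LLC"),
    ("E002", "BETA PROPCO LLC"),
    ("E003", "BETA PROPCO LLC"),
    ("E004", "GAMMA FAMILY LLC"),
]

parent = {}


def find(node):
    """Return the representative of node's component (union-find)."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]  # path halving
        node = parent[node]
    return node


def union(a, b):
    """Merge the components containing a and b."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a


# Tag nodes by kind so a facility ID can never collide with an owner name.
for facility, owner in pairs:
    union(("facility", facility), ("owner", owner))

groups = defaultdict(set)
for facility, _ in pairs:
    groups[find(("facility", facility))].add(facility)

clusters = sorted(sorted(group) for group in groups.values())
print(clusters)
```

In this toy data E001 and E003 share no owner directly but are linked through E002, which is the kind of indirect grouping the paragraph above describes. On real data a widely shared but unrelated entity can merge clusters that do not belong together, so the output is a starting point for review rather than a finding.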

The reason this reconstruction is worth the effort is related-party transactions. A large share of nursing-home spending flows not to arms-length vendors but to companies owned by the same people who own the facility — the propco that collects rent, the management company that collects a management fee, the staffing agency, the therapy company, the pharmacy. Money paid to a related party leaves the operating entity's books as an expense, which depresses reported facility profit even as it enriches the common owner. Cost reports capture the dollars; the all-owners file is what lets you identify which counterparties are related — which vendor is secretly the owner's own affiliate — by revealing the shared ownership behind the transacting entities. Without the ownership graph, a related-party payment is indistinguishable from an ordinary vendor payment.
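
As a rough illustration of that matching step, the sketch below flags payments whose payee is also a disclosed owner or manager of the paying facility and computes the related-party share of spending. Everything in it is hypothetical: the payment list stands in for cost-report data, the names and dollar amounts are invented, and exact string equality is used where real work would need name normalization.

```python
# Disclosed owners per facility, as they might be derived from the
# all-owners file. Names are invented and assumed already normalised.
owners_by_facility = {
    "E001": {"ALPHA PROPCO LLC", "ALPHA MGMT LLC", "ALPHA HOLDCO LLC"},
}

# Hypothetical payments (facility, payee, dollars) standing in for
# cost-report data; none of these figures are real.
payments = [
    ("E001", "ALPHA PROPCO LLC", 1_200_000),  # rent to the propco
    ("E001", "ALPHA MGMT LLC", 400_000),      # management fee
    ("E001", "CITY FOOD SUPPLY INC", 300_000),
    ("E001", "REGIONAL LINEN CO", 100_000),
]

related_total = 0
grand_total = 0
related_payees = set()
for facility, payee, amount in payments:
    grand_total += amount
    # A payee that also appears among the facility's disclosed owners
    # is treated as a related party.
    if payee in owners_by_facility.get(facility, set()):
        related_total += amount
        related_payees.add(payee)

related_share = related_total / grand_total
print(sorted(related_payees), f"{related_share:.0%}")
```

This only catches vendors that are themselves disclosed as owners of the facility. A vendor that merely shares an upstream owner with the facility would need the clustered ownership graph, not a direct name match.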

Private equity in nursing homes

The reason this dataset became a federal priority is the body of research connecting private equity ownership of nursing homes to worse outcomes for residents. The landmark study is the National Bureau of Economic Research working paper by Atul Gupta, Sabrina Howell, Constantine Yannelis, and Abhinav Gupta, which assembled private equity nursing-home deals and matched them to resident-level Medicare data. Its central finding is stark: going to a private equity-owned nursing home was associated with a statistically significant increase in short-term mortality among residents, on the order of roughly ten percent relative to comparison facilities — an estimate the authors translated into thousands of additional deaths over the period studied. The paper also documented declines in frontline nurse staffing and increases in the kinds of charges and prescribing patterns consistent with cost-cutting and revenue-maximization after a PE buyout.

The mechanism the literature points to is the financial structure the all-owners file exposes. After a leveraged buyout, the facility carries debt, pays rent to a propco (often after a sale-leaseback that extracts the real-estate value upfront), and pays fees to affiliated management, therapy, and staffing entities. The pressure to service debt and generate returns falls on the operating margin, and the largest controllable cost in a nursing home is labor — specifically, registered-nurse hours. Lower staffing is the most consistently documented downstream effect of PE ownership, and lower nurse staffing is, separately, one of the most robust predictors of worse resident outcomes in the entire long-term-care literature. The mortality finding and the staffing finding are two ends of the same chain.

The cautionary corporate histories are well known in the sector. HCR ManorCare, one of the largest nursing-home operators in the country, was taken private by the Carlyle Group in a roughly six-billion-dollar leveraged buyout in 2007. Carlyle executed a large sale-leaseback of ManorCare's real estate, converting owned buildings into leased ones and saddling the operating company with substantial rent obligations. In the years that followed, ManorCare's operating finances deteriorated, regulatory and quality problems mounted, and the company ultimately landed in bankruptcy — a textbook illustration of the opco/propco/leverage pattern and its consequences. Genesis HealthCare, another of the nation's largest post-acute operators, followed a parallel arc: heavy lease obligations stemming from sale-leaseback financing, thin operating margins, and a slow-motion financial collapse that left one of the biggest names in the industry fighting for survival. Both stories are legible in ownership data: the propco split, the REIT-held real estate, the layered holding companies. The all-owners file is the instrument that makes the next ManorCare identifiable before the bankruptcy filing rather than after.

The home health and hospice rollup wave

Nursing homes get the headlines, but the most aggressive consolidation has been in home health and especially hospice, and the cms_hha_owners and cms_hospice_owners files are the record of it. Hospice is structurally attractive to financial buyers for a specific reason: it is reimbursed by Medicare on a per-diem basis, paying a fixed daily rate for each enrolled patient largely regardless of the services actually delivered on a given day. A hospice that enrolls patients with long expected stays and keeps service intensity low can generate high margins on that per-diem, and there is no inpatient real estate to buy. The combination of recurring government revenue, low capital requirements, and fragmented mom-and-pop ownership is precisely the profile private equity roll-up strategies are built for.

The result has been a wave of acquisitions in which financial sponsors buy up dozens or hundreds of small agencies and consolidate them under shared holding and management companies. The same structure recurs: a platform company, a stack of holding entities, a management services company collecting fees, and a private equity owner at the top. The home health and hospice files expose this through exactly the fields built for it — the entity-type flags identify the PE and management-services owners, and the created-for-acquisition flag, read across association dates, traces the tempo of the buying spree. Counting created-for-acquisition events by year in the hospice file is one of the cleanest available measures of when and how fast the roll-up has proceeded.

The policy concern is sharpest in hospice because the per-diem incentive can run directly against patient interest: enrolling patients who are not actually terminally ill, or providing minimal care to maximize the margin on each enrolled day. CMS and the HHS Office of Inspector General have repeatedly flagged fraud and quality problems concentrated in newly enrolled for-profit hospices, particularly in a handful of states that saw explosive growth in hospice certifications. The ownership files are the tool for asking whether the agencies driving that growth share common owners — whether an apparent proliferation of independent hospices is in fact a single financial structure operating under many names.

REIT-financed hospital real estate

The same real-estate financialization reached general hospitals, and the most spectacular recent failure in American healthcare is the case study. Steward Health Care, a large for-profit hospital system, sold the real estate under its hospitals to Medical Properties Trust (MPT), a hospital-focused REIT, in a series of sale-leaseback deals. The transactions converted Steward's owned hospital buildings into leased ones, generating a large upfront cash infusion in exchange for long-term rent obligations — the hospital version of the opco/propco split that defines the nursing-home sector. Steward kept operating the hospitals; MPT collected the rent.

The structure unwound badly. Saddled with rent and debt, Steward's operating finances deteriorated; reports surfaced of unpaid vendors, deferred maintenance, and service cuts at hospitals serving vulnerable communities, and in 2024 Steward filed for one of the largest hospital bankruptcies in U.S. history, throwing the future of numerous community hospitals into doubt and drawing intense congressional scrutiny of both Steward and MPT. The episode became the emblematic warning about REIT-financed hospital real estate: that extracting the real-estate value from a hospital system and converting it into a rent obligation can hollow out the operating entity and put patient access at risk when the model strains. The hospital all-owners file (cms_hospital_owners) is where these REIT relationships are disclosed — the REIT entity flag and the ownership records are what let an analyst identify which hospitals sit inside a sale-leaseback structure before, rather than during, a crisis.

What you can do with it

Because the files establish ownership structure rather than judgment, their value is in mapping, counting, and joining. Several uses recur.

Map PE-owned and REIT-owned facilities. The most direct use is to filter the files to owners flagged as private equity companies or REITs and produce a national map of which facilities sit inside financial-sponsor or real-estate-trust structures — by state, by chain, by provider type. This is the foundational layer for almost every downstream question, and it is exactly what the published entity-type flags were added to enable.

Count created-for-acquisition events over time. Reading the created-for-acquisition flag against association dates turns the files into a time series of consolidation. Counting acquisition-flagged ownership entities by year, per provider type, produces a defensible measure of the tempo of merger-and-acquisition activity in nursing homes, home health, and hospice — the kind of trend line that is otherwise hard to assemble without paid deal databases.

Link owners across sectors to find multi-sector roll-ups. Because the SNF, home health, hospice, and hospital files share the same owner-identity structure, normalizing owner names and addresses across all four lets you find owners that operate in more than one sector — a private equity platform that holds nursing homes and the home health agencies and hospices that feed and follow them. These cross-sector roll-ups, which capture a patient across the continuum of post-acute care under common ownership, are invisible in any single file and only emerge when the owner identities are joined across all four.
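
The cross-file join can be sketched in a few lines. The example below assumes the owner names from each of the four files have already been extracted into lists; the names are invented, and `crude_key` is a deliberately simple stand-in (case and whitespace only) for a proper name-and-address normalization pass.

```python
from collections import defaultdict

# Invented owner names per provider file, for illustration only.
files = {
    "snf": ["Alpha Capital Partners LLC", "Gamma Family LLC"],
    "hha": ["ALPHA CAPITAL PARTNERS LLC", "Delta Home Care Inc"],
    "hospice": ["alpha capital partners llc ", "Delta Home Care Inc"],
    "hospital": ["Omega Health System"],
}


def crude_key(name):
    """Uppercase and collapse whitespace; a placeholder for real normalisation."""
    return " ".join(name.upper().split())


# Record every sector in which each owner key appears.
sectors_by_owner = defaultdict(set)
for sector, owner_names in files.items():
    for name in owner_names:
        sectors_by_owner[crude_key(name)].add(sector)

# Keep only owners seen in more than one provider file.
multi_sector = {
    owner: sorted(sectors)
    for owner, sectors in sectors_by_owner.items()
    if len(sectors) > 1
}
print(multi_sector)
```

With a key this crude, spelling variants of the same sponsor would be missed, so the result should be read as a lower bound on cross-sector ownership.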

Join to Care Compare quality and staffing. The highest-value join is to outcomes. The provider enrollment ID and CMS Certification Number link the ownership files to the CMS Care Compare quality datasets and to the payroll-based staffing data CMS publishes for nursing homes. Joined on the facility key, the ownership graph supplies the financial-structure dimension and Care Compare supplies the quality and staffing dimension, so you can ask directly whether PE-owned or REIT-leased facilities staff differently and perform differently than their peers — the empirical question at the center of the whole policy debate.
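
A minimal sketch of that join, under loud assumptions: the CCNs, the PE flags, and the registered-nurse hours per resident day are all invented, and the staffing lookup is a plain dictionary standing in for the real Care Compare or payroll-based staffing extract. The point it illustrates is the order of operations: collapse ownership rows to one record per facility before joining, so a facility with many owners is not counted many times.

```python
from statistics import mean

# Invented ownership rows: several rows per facility, one per owner.
ownership_rows = [
    {"ccn": "015001", "is_pe": True},
    {"ccn": "015001", "is_pe": False},
    {"ccn": "015002", "is_pe": False},
    {"ccn": "015003", "is_pe": True},
    {"ccn": "015004", "is_pe": False},
]

# Invented staffing measure (RN hours per resident day) keyed by CCN.
rn_hours = {"015001": 0.50, "015002": 0.75, "015003": 0.25, "015004": 1.00}

# Step 1: collapse to facility level. A facility counts as PE-owned if
# any of its ownership rows carries the PE flag.
all_ccns = {row["ccn"] for row in ownership_rows}
pe_ccns = {row["ccn"] for row in ownership_rows if row["is_pe"]}

# Step 2: join on the facility key, skipping facilities with no staffing record.
pe_values = [rn_hours[c] for c in sorted(pe_ccns) if c in rn_hours]
other_values = [rn_hours[c] for c in sorted(all_ccns - pe_ccns) if c in rn_hours]

# Step 3: compare group averages.
pe_mean = mean(pe_values)
other_mean = mean(other_values)
print(f"PE-owned: {pe_mean:.3f}  other: {other_mean:.3f}")
```

A raw difference in means like this is descriptive only; facility size, case mix, and state all need to be controlled for before drawing any conclusion about ownership.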

A worked example in Python

The workhorse analysis on these files is the ownership-structure scan: flag the private equity and REIT owners, find the largest multi-facility owners, and chart acquisition activity over time. The script below pulls the skilled nursing facility all-owners file from data.cms.gov through the data-api endpoint, normalizes the entity-type flags, computes how many facilities carry a PE or REIT owner, ranks the largest organizational owners by facility count, and counts created-for-acquisition events by year. Because all four provider files share the same shape, swapping the dataset identifier points the identical code at the home health, hospice, or hospital file.

import requests
import pandas as pd

# ---------------------------------------------------------------
# CMS data.cms.gov -- Skilled Nursing Facility All Owners file
# Catalog page: https://data.cms.gov/provider-characteristics/
#                hospitals-and-other-facilities/
#                skilled-nursing-facility-all-owners
#
# CMS publishes a parallel All Owners file for each provider type:
#   - SNF (skilled nursing facilities)
#   - Home Health Agency
#   - Hospice
#   - Hospital
# They share the same column shape, so the same code works on each;
# only the dataset UUID changes. Resolve the current UUID from the
# catalog page if a request 404s -- CMS re-versions these files.
#
# This script:
#   1. Pages the full SNF all-owners file through the data.cms.gov API
#   2. Flags owners whose entity type marks them as PE or REIT
#   3. Ranks the largest multi-facility owners
#   4. Counts "created for acquisition" M&A events by year
# ---------------------------------------------------------------

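# Dataset UUID for the SNF All Owners file. This is a placeholder: copy the
# current UUID from the catalog page above before running, because CMS
# re-versions these files and the identifier can change.
DATASET_UUID = "REPLACE_WITH_CURRENT_SNF_OWNERS_UUID"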
BASE = f"https://data.cms.gov/data-api/v1/dataset/{DATASET_UUID}/data"


def fetch_all(page_size: int = 5000) -> pd.DataFrame:
    """Page through the all-owners datastore endpoint.

    The data.cms.gov API returns JSON arrays; we walk size/offset
    until an empty page signals the end. The SNF file is on the order
    of 280,000 ownership rows -- one row per (provider, owner) pair,
    so a single facility with twelve owners contributes twelve rows.
    """
    rows: list[dict] = []
    offset = 0
    while True:
        params = {"size": page_size, "offset": offset}
        resp = requests.get(BASE, params=params, timeout=120)
        resp.raise_for_status()
        page = resp.json()
        if not page:
            break
        rows.extend(page)
        print(f"  Fetched {len(rows):,} rows so far...")
        # Advance by the rows actually returned. Stopping on a short page
        # would silently truncate the download if the server caps the page
        # size below `page_size`; an empty page is the unambiguous end.
        offset += len(page)
    return pd.DataFrame(rows)


# ---------------------------------------------------------------
# Step 1: Download the SNF all-owners file.
# ---------------------------------------------------------------
print("Downloading CMS SNF All Owners file...")
raw = fetch_all()
print(f"Raw ownership rows (provider x owner): {len(raw):,}")

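# Column names drift across releases; normalise to lowercase snake case.
# Collapse every run of non-alphanumeric characters, so that a header
# published in a form like "ASSOCIATE ID - OWNER" becomes
# "associate_id_owner" rather than "associate_id_-_owner", which none of
# the candidate names below would match.
raw.columns = (
    raw.columns.astype(str)
    .str.strip()
    .str.lower()
    .str.replace(r"[^0-9a-z]+", "_", regex=True)
    .str.strip("_")
)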


def pick(df: pd.DataFrame, *candidates: str) -> str:
    """Return the first candidate column that exists in df."""
    for c in candidates:
        if c in df.columns:
            return c
    raise KeyError(f"none of {candidates} found in columns")


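enroll_col = pick(raw, "enrollment_id", "associate_id", "enrollment_id_")
# Provider (facility) name. Do not fall back to an owner column here:
# a wrong match would silently label facilities with owner identifiers.
prov_col = pick(raw, "organization_name", "provider_name")
# Owner identifier. Individuals are split into first/last name columns,
# so a lone "first_name" is not a usable owner key and is not a fallback.
owner_col = pick(raw, "associate_id_owner", "owner_name")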
# The entity-type flags are published as separate yes/no columns.
# Their exact names vary, so detect them defensively.
TYPE_FLAGS = {
    "pe": pick(raw, "type_owner_pe", "private_equity_company_owner", "pe_owner"),
    "reit": pick(raw, "type_owner_reit", "reit_owner"),
    "holding": pick(raw, "type_owner_holding_company", "holding_company_owner"),
    "mgmt": pick(raw, "type_owner_management_services", "management_services_owner"),
    "forprofit": pick(raw, "type_owner_for_profit", "for_profit_owner"),
}
ACQ_FLAG = pick(raw, "created_for_acquisition_owner", "created_for_acquisition")
ROLE_COL = pick(raw, "role_text_owner", "role_code_owner", "association_type_owner")
PCT_COL = pick(raw, "percentage_ownership", "percent_ownership", "ownership_percentage")
DATE_COL = pick(raw, "association_date_owner", "association_date")


def is_yes(series: pd.Series) -> pd.Series:
    """CMS encodes the boolean flags as 'Y'/'N' (sometimes 'true')."""
    return series.astype(str).str.strip().str.upper().isin({"Y", "YES", "TRUE", "1"})


# ---------------------------------------------------------------
# Step 2: Flag PE and REIT ownership.
# ---------------------------------------------------------------
raw["is_pe"] = is_yes(raw[TYPE_FLAGS["pe"]])
raw["is_reit"] = is_yes(raw[TYPE_FLAGS["reit"]])
raw["is_pe_or_reit"] = raw["is_pe"] | raw["is_reit"]

pe_rows = raw[raw["is_pe"]]
reit_rows = raw[raw["is_reit"]]
pe_facilities = pe_rows[enroll_col].nunique()
reit_facilities = reit_rows[enroll_col].nunique()
total_facilities = raw[enroll_col].nunique()

print("\nOwnership-type prevalence across SNFs")
print("-" * 52)
print(f"  Distinct facilities (by enrollment id): {total_facilities:,}")
print(
    f"  With at least one PE owner:   {pe_facilities:,} "
    f"({pe_facilities / total_facilities:.1%})"
)
print(
    f"  With at least one REIT owner: {reit_facilities:,} "
    f"({reit_facilities / total_facilities:.1%})"
)


# ---------------------------------------------------------------
# Step 3: Rank the largest multi-facility owners.
#         Group by the owner identity and count the distinct
#         facilities each owner is associated with. For organisation
#         owners the org name is the natural key; individuals would
#         need first+last+address to disambiguate, so we restrict to
#         named organisational owners here.
# ---------------------------------------------------------------
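# Prefer the organisation name, as the comment above describes; rows for
# individual owners leave it empty and are dropped by the filter below.
org_owner_col = pick(raw, "organization_name_owner", "owner_name")
# Test for nulls as well as blanks: astype(str) alone turns a missing
# value into the string "nan", which would pass a not-equal-to-"" check.
owner_names = raw[org_owner_col]
named = raw[owner_names.notna() & owner_names.astype(str).str.strip().ne("")].copy()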

top_owners = (
    named.groupby(org_owner_col)[enroll_col]
    .nunique()
    .rename("facilities")
    .reset_index()
    .sort_values("facilities", ascending=False)
)
# Mark whether each large owner is ever flagged PE or REIT.
flagged = (
    raw.groupby(org_owner_col)[["is_pe", "is_reit"]].max().reset_index()
)
top_owners = top_owners.merge(flagged, on=org_owner_col, how="left")

print("\nLargest multi-facility owners (top 25)")
print("-" * 52)
print(top_owners.head(25).to_string(index=False))


# ---------------------------------------------------------------
# Step 4: Acquisition activity by year.
#         The "created for acquisition" flag marks owners spun up to
#         hold a newly bought facility -- a usable M&A proxy. Pair it
#         with the association date to chart consolidation over time.
# ---------------------------------------------------------------
raw["created_for_acq"] = is_yes(raw[ACQ_FLAG])
raw["assoc_year"] = pd.to_datetime(raw[DATE_COL], errors="coerce").dt.year

acq = raw[raw["created_for_acq"]].dropna(subset=["assoc_year"])
by_year = (
    acq.groupby(acq["assoc_year"].astype(int))[enroll_col]
    .nunique()
    .rename("facilities_acquired")
    .reset_index()
    .sort_values("assoc_year")
)

print("\n'Created for acquisition' events by year")
print("-" * 52)
print(by_year.to_string(index=False))

The two fragile steps are the unit of analysis and the owner-name key. Every count has to be explicit about whether it is counting ownership rows, distinct facilities (deduplicated on the enrollment ID), or distinct owners — the roughly 279,707 SNF figure is rows, not facilities, and conflating the three is the most common mistake made with these files. The owner key is the second trap: organizational owners can be aggregated on their legal name with reasonable success, but the same fund routinely appears under slightly different spellings and through differently named holding companies, so any serious roll-up analysis needs a name-and-address normalization pass — collapsing “ABC Capital LLC” and “ABC Capital, L.L.C.” and a shared headquarters address into one owner — before the facility counts can be trusted. The script above does the naive grouping; production work adds the normalization layer.
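
A first pass at the name half of that normalization layer might look like the sketch below. The suffix list and the function name `normalize_owner_name` are illustrative choices, not part of any CMS tooling, and address matching is left out entirely; the sketch only shows how spelling variants of one legal name can be collapsed to a single key.

```python
import re

# Trailing entity suffixes to drop. This list is an illustrative
# assumption and would need extending for real data.
SUFFIXES = {"LLC", "INC", "CORP", "CO", "LP", "LLP", "LTD"}


def normalize_owner_name(name):
    """Collapse case, punctuation, and trailing entity suffixes."""
    # Remove periods first so "L.L.C." becomes "LLC" rather than "L L C".
    text = name.upper().replace(".", "")
    # Turn every other run of punctuation or whitespace into one space.
    text = re.sub(r"[^A-Z0-9]+", " ", text)
    tokens = text.split()
    # Strip suffixes only from the end, so "CO" inside a name survives.
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


variants = ["ABC Capital LLC", "ABC Capital, L.L.C.", "abc capital  llc"]
keys = {normalize_owner_name(v) for v in variants}
print(keys)
```

Grouping on the normalized key instead of the raw name is a drop-in change to the ranking step. It will not merge differently named holding companies of the same sponsor, which is where shared addresses and the ownership graph have to take over.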

Caveats and limits

Four limits govern any honest use of the all-owners files. The first, and most important, is that the data is self-reported. The ownership records originate in what the provider disclosed to CMS at enrollment and revalidation. An owner who wishes to obscure a relationship has incentives and, often, the structural means to do so, and CMS does not independently audit every disclosure. The entity-type flags — including the private-equity and REIT flags that make the file so useful — are likewise reported by the provider, which means a financial owner that does not characterize itself as private equity may not be flagged as such. The files are the best available view of ownership; they are not a verified ledger.

The second is indirect-ownership opacity. The whole point of the layered opco/propco/holding-company structure is to put distance between the named provider and the ultimate owner, and while the file captures indirect ownership interests, it does not always give a clean, traversable edge from each entity to its parent. Chains can break: an indirect owner may be disclosed without the intervening entity that connects it being identifiable, and the ultimate beneficial owner can remain a layer beyond what was reported. Reconstructing the full chain often requires combining the file with corporate-registry data, and even then the top of the stack can be a fund whose own investors are not public.

The third is name normalization. Owner names — both organizational and individual — are entered as free text and are wildly inconsistent: punctuation, abbreviations, “LLC” versus “L.L.C.,” trailing entity suffixes, misspellings, and the use of many distinct single-purpose entities by one sponsor all conspire to fragment a single real-world owner across many string values. Any analysis that counts facilities per owner, or that tries to link an owner across the four provider files, lives or dies on the quality of the name-and-address normalization, and naive grouping on the raw name will systematically undercount the largest, most sophisticated owners — precisely the ones most worth measuring.

The fourth is percentage gaps. The ownership-percentage field is incomplete by design: it is blank for control and managing-employee roles, it is reported only above the five-percent disclosure threshold so small stakes vanish, and the disclosed percentages across a single provider frequently fail to sum to one hundred — because indirect interests are reported at the level of each layer rather than reconciled to a single cap table, and because gaps and omissions are common. Treating the percentage field as a precise cap table will mislead; it is best read as an indicator of the presence and rough magnitude of an interest rather than an exact equity split. Taken with those four caveats in mind, the CMS all-owners files are the authoritative, openly downloadable record of who controls American post-acute and institutional care — and the only public instrument that makes the financial structures behind a facility visible before, rather than after, they fail.

Related writing

CMS Doctors and Clinicians covers the individual-clinician file that shares the same NPI and enrollment plumbing as these ownership records and links physicians to the groups and hospitals they bill under.

CMS hospital quality data covers the facility-level Care Compare outcomes and staffing measures that the provider enrollment ID and CCN in these ownership files join into.

EPA enforcement defendants is another federal database built on named-party records and corporate structure, with its own challenges around entity normalization and repeat-actor detection.