
NVD CVE Database: The Federal Record of Every Known Software Vulnerability

12 min read · AI Analytics
Tags: NIST, NVD, CVE, Cybersecurity, Federal Data

Every patch advisory, every penetration-test finding, every vulnerability scanner alert, and every security headline about a critical flaw points at the same kind of identifier — a string like CVE-2021-44228. The NIST National Vulnerability Database is the federal record that turns those identifiers into structured, comparable data. The tables described here hold roughly 459,000 catalogued vulnerabilities — about 109,053 in the current set and 350,125 in the historical archive — each one carrying a CVE ID, a description, a CVSS severity score, a CWE weakness type, the affected products, and the reference URLs. It is the layer that makes the world's catalogue of known software vulnerabilities something you can actually query, rank, and prioritize.

What it is, and the CVE-versus-NVD distinction

The single most common point of confusion in vulnerability data is the difference between a CVE and the NVD, so it is worth settling first. The two are produced by different organizations and do different jobs, and almost every practical question about the data depends on keeping them separate.

CVE — Common Vulnerabilities and Exposures — is the naming system. It is run by MITRE, a non-profit operating a federally funded research center, under sponsorship from the Cybersecurity and Infrastructure Security Agency (CISA). The CVE program does one thing extremely well: it assigns a unique, stable identifier to each publicly disclosed vulnerability so that everyone — researchers, vendors, scanners, defenders — can refer to the same flaw by the same name. The identifier has the form CVE-YYYY-NNNNN, where the year is the year the ID was reserved (not necessarily the year the flaw was found or disclosed) and the numeric suffix is a sequence that has grown well past five digits. Crucially, the IDs are issued in a federated way: MITRE delegates blocks of numbers to CNAs, CVE Numbering Authorities, which are vendors and organizations — Microsoft, Red Hat, Google, GitHub, and hundreds of others — authorized to assign CVE IDs and write the initial description for vulnerabilities in their own products or scope. The base CVE record is therefore lightweight: an ID, a description, references, and increasingly some CNA-supplied metadata.
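The identifier format is simple enough to validate mechanically. The sketch below is one way to do it; the helper name parse_cve_id is illustrative, and the pattern assumes a four-digit year followed by a sequence of four or more digits, which covers both older short IDs and the longer ones issued today.

```python
import re

# Four-digit year, then a numeric sequence of four or more digits.
CVE_ID_PATTERN = re.compile(r"^CVE-(\d{4})-(\d{4,})$")


def parse_cve_id(cve_id: str) -> tuple[int, int]:
    """Split a CVE ID into (year, sequence); raise ValueError if malformed."""
    match = CVE_ID_PATTERN.match(cve_id.strip().upper())
    if match is None:
        raise ValueError(f"not a CVE identifier: {cve_id!r}")
    return int(match.group(1)), int(match.group(2))


year, sequence = parse_cve_id("CVE-2021-44228")
print(year, sequence)
```

Remember that the year component is the reservation year, so grouping by it is not the same as grouping by disclosure date.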

NVD, the National Vulnerability Database, is the enrichment layer. It is run by NIST, the National Institute of Standards and Technology, a US Department of Commerce agency. NVD does not assign CVE IDs and does not decide what counts as a vulnerability; it ingests the CVE list and adds the analytical structure that turns a free-text advisory into machine-usable data. For each CVE, NVD analysts (historically) attach a CVSS severity score, map the flaw to a CWE weakness type, and enumerate the affected products using CPE identifiers. That enrichment is what makes the NVD the workhorse of vulnerability management: a scanner or a risk model wants the score, the weakness category, and the product match, and those come from NVD, not from the bare CVE entry.

The relationship, then, is layered. MITRE and its CNAs assign and describe; NIST's NVD scores, categorizes, and matches. The dataset documented here is the NVD view — the CVE list seen through NIST's enrichment — which is why each row carries not just the ID and description but the CVSS scores, the CWE mapping, the CPE-matched products, and the published and last-modified dates that track when NVD created and revised its analysis. The split into a current table and a larger archive simply reflects scale: the full corpus spans from the CVE program's late-1990s origins to the present, and the archive holds the long historical tail while the current table carries the recent, actively maintained records.

-- nvd_cves           ~109,053 current CVE records
-- nvd_cves_archive   ~350,125 historical CVE records
-- Combined           ~459,178 catalogued vulnerabilities, keyed by CVE ID.

cve_id                 TEXT     -- the CVE identifier, e.g. CVE-2021-44228 (CVE-YYYY-NNNNN)
description            TEXT     -- English summary of the flaw, supplied by the CNA
cvss_v3_base_score     REAL     -- CVSS v3.1 base score, 0.0 - 10.0 (often NULL on older CVEs)
cvss_v3_severity       TEXT     -- v3.1 qualitative band: NONE / LOW / MEDIUM / HIGH / CRITICAL
cvss_v3_vector         TEXT     -- v3.1 vector string, CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
cvss_v2_base_score     REAL     -- legacy CVSS v2 base score, 0.0 - 10.0 (pre-2016 CVEs)
cvss_v2_severity       TEXT     -- v2 qualitative band: LOW / MEDIUM / HIGH
cvss_v2_vector         TEXT     -- v2 vector string, AV:N/AC:L/Au:N/C:C/I:C/A:C
cwe_id                 TEXT     -- weakness type from the CWE catalogue, e.g. CWE-79, CWE-89
affected_products      TEXT     -- matched CPE URIs (cpe:2.3:a:vendor:product:version:...) as array
published_date         DATE     -- date the CVE was published to the NVD
last_modified_date     DATE     -- date the NVD record was last revised
references             TEXT     -- advisory, patch, and exploit URLs with reference tags (array)
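Because the corpus is split across two tables with the same columns, most questions start by reading them as one. The sketch below shows the idea with Python's built-in sqlite3 module and an in-memory database. The engine, the view name nvd_all, the reduced column set, and every row are assumptions made for illustration (the CVE IDs are synthetic); only the table and column names come from the listing above.

```python
import sqlite3

# A reduced column set, enough to show the pattern.
COLUMNS = "cve_id TEXT PRIMARY KEY, cvss_v3_severity TEXT, cwe_id TEXT"

conn = sqlite3.connect(":memory:")
for table in ("nvd_cves", "nvd_cves_archive"):
    conn.execute(f"CREATE TABLE {table} ({COLUMNS})")

# Synthetic rows: these IDs do not refer to real vulnerabilities.
conn.executemany("INSERT INTO nvd_cves VALUES (?, ?, ?)", [
    ("CVE-2099-0001", "CRITICAL", "CWE-787"),
    ("CVE-2099-0002", None, None),          # published but not yet enriched
])
conn.executemany("INSERT INTO nvd_cves_archive VALUES (?, ?, ?)", [
    ("CVE-2098-0001", "HIGH", "CWE-79"),
    ("CVE-2098-0002", "HIGH", "CWE-89"),
])

# One view over both tables, so queries need not care where a row lives.
conn.execute(
    "CREATE VIEW nvd_all AS "
    "SELECT * FROM nvd_cves UNION ALL SELECT * FROM nvd_cves_archive"
)

# Count by severity, keeping unscored records visible instead of dropping them.
severity_counts = dict(conn.execute(
    "SELECT COALESCE(cvss_v3_severity, 'UNSCORED'), COUNT(*) "
    "FROM nvd_all GROUP BY 1"
).fetchall())
print(severity_counts)
```

Mapping NULL severities to an explicit UNSCORED bucket keeps unenriched records in the denominator of any share you compute.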

The CVSS scoring system

The field that does the most work in this dataset is the CVSS score. CVSS, the Common Vulnerability Scoring System, is an open standard maintained by FIRST (the Forum of Incident Response and Security Teams) for expressing the severity of a vulnerability as a number from 0.0 to 10.0, paired with a qualitative band. It is what lets an organization with thousands of open vulnerabilities sort them into an order of attention.

CVSS is built in three metric groups, and understanding the division explains both the power and the limits of the score. The Base group captures the intrinsic, immutable characteristics of the vulnerability itself: how it can be exploited and what an exploit would cost the victim. The Temporal group adjusts for factors that change over time, such as whether mature exploit code exists and whether a patch is available. The Environmental group adjusts for the specific deployment: how critical the affected asset is to a given organization, and what compensating controls are in place. The decisive practical point is that NVD supplies only the base score. Temporal and environmental adjustments are meant to be applied by the consuming organization, because only the organization knows its own exposure. Treating an NVD base score as a complete risk assessment is the most common misuse of the data: the base score is the starting input to prioritization, not the final answer.

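The base score is computed from a small set of metrics encoded in the vector string, and reading the vector is more informative than the headline number. In CVSS v3.1, a vector looks like CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H, and the components are:

AV (Attack Vector): Network (N), Adjacent (A), Local (L), or Physical (P). How close the attacker has to be.
AC (Attack Complexity): Low (L) or High (H). Whether the attack depends on conditions outside the attacker's control.
PR (Privileges Required): None (N), Low (L), or High (H). The access the attacker needs before exploiting.
UI (User Interaction): None (N) or Required (R). Whether a victim has to do something.
S (Scope): Unchanged (U) or Changed (C). Whether the exploit can reach beyond the vulnerable component.
C, I, A (Confidentiality, Integrity, Availability): High (H), Low (L), or None (N) for each. The impact of a successful exploit.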

This is why the canonical worst case, the 9.8 Critical, reads the way it does. A 9.8 is almost always AV:N/AC:L/PR:N/UI:N with high impact to confidentiality, integrity, and availability: an attacker anywhere on the network, with no privileges and no victim interaction, can reliably and fully compromise the system. A perfect 10.0 additionally requires a changed scope (S:C), and Log4Shell, CVE-2021-44228, scored exactly 10.0 because its vector carries one. The qualitative bands follow the number: 0.0 None, 0.1–3.9 Low, 4.0–6.9 Medium, 7.0–8.9 High, and 9.0–10.0 Critical.
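To make the 9.8 versus 10.0 distinction concrete, here is a sketch of the v3.1 base-score arithmetic as published in the FIRST specification. It is not NVD's own code: the function names are illustrative, and the sketch assumes a well-formed v3.1 vector containing all eight base metrics.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when scope changes.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined for v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score from a 'CVSS:3.1/AV:.../...' vector string."""
    m = dict(item.split(":") for item in vector.split("/")[1:])
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - WEIGHTS["C"][m["C"]]) * (1 - WEIGHTS["I"][m["I"]]) * (1 - WEIGHTS["A"][m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # unchanged scope
print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"))  # changed scope
```

The two calls differ only in the S metric, which is the whole gap between the common 9.8 and the rare 10.0.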

The dataset carries scores under two versions, and the transition matters for any longitudinal analysis. CVSS v2 was the standard through roughly 2015; its vector uses a different, coarser model (for example an Authentication metric and only a three-band Low/Medium/High severity scale, with the High band starting at 7.0). CVSS v3.0 arrived in 2015 and v3.1 in 2019, introducing the Scope metric, replacing Authentication with Privileges Required, adding a separate User Interaction metric, and adding the Critical band. NVD scored CVEs published from late 2015 onward primarily under v3, while retaining v2 for older records and dual-scoring during the overlap. The consequence is concrete: v2 and v3 scores are not interchangeable, the same flaw can land in different severity bands under each, and a trend line that mixes v2 and v3 base scores across the 2015–2016 boundary is comparing two different rulers. (CVSS v4.0 was published in 2023 and is beginning to appear, adding still more granularity, but the bulk of the corpus remains v3.1 and v2.)
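The band mismatch is easy to see in code. This sketch maps a numeric score to its qualitative band under each version using the ranges given above; the function names are illustrative.

```python
def v2_band(score: float) -> str:
    """CVSS v2 has three bands, and HIGH runs from 7.0 to 10.0."""
    if score >= 7.0:
        return "HIGH"
    if score >= 4.0:
        return "MEDIUM"
    return "LOW"


def v3_band(score: float) -> str:
    """CVSS v3.x adds NONE at 0.0 and CRITICAL from 9.0 upward."""
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


for score in (3.9, 7.0, 9.3):
    print(score, v2_band(score), v3_band(score))
```

A count of "High" records therefore means something different on each side of the transition, even before the underlying metrics are considered.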

The CWE weakness taxonomy

Where CVSS answers “how bad is it,” CWE answers “what kind of mistake is it.” CWE — the Common Weakness Enumeration, also a MITRE-stewarded community standard — is a hierarchical catalogue of software and hardware weakness types: the underlying flaw categories that give rise to vulnerabilities. A CVE is a specific vulnerability in a specific product; the CWE it maps to is the general class of defect behind it. NVD records this mapping in the cwe_id field, and it is what lets you analyze vulnerabilities by root cause rather than one at a time.

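The catalogue is large (many hundreds of weakness entries arranged in parent/child relationships), but in practice a small number of categories dominate, and MITRE publishes an annual CWE Top 25 Most Dangerous Software Weaknesses ranking the categories that appear most often in real, frequently exploited vulnerabilities. The perennial leaders are worth knowing by sight:

CWE-79: Cross-site Scripting (XSS)
CWE-89: SQL Injection
CWE-787: Out-of-bounds Write
CWE-125: Out-of-bounds Read
CWE-20: Improper Input Validation
CWE-22: Path Traversal
CWE-352: Cross-Site Request Forgery (CSRF)
CWE-78: OS Command Injection
CWE-416: Use After Free
CWE-862: Missing Authorization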

Two caveats temper the cwe_id field. First, not every CVE carries a usable CWE: NVD uses the placeholders NVD-CWE-noinfo (insufficient information to map) and NVD-CWE-Other (the weakness falls outside the subset of CWE categories NVD maps to), and a meaningful fraction of records, especially recent ones awaiting analysis, are simply unmapped. Second, a CWE mapping is a judgment call, and the same root cause can plausibly map to a parent or a child category, so cross-source CWE statistics need the hierarchy in mind. Used carefully, though, the CWE field is the lens that turns the NVD into an instrument for studying why software fails, not just which products did.

CPE product matching, and why it is imperfect

For a vulnerability to be actionable, you have to know what it affects, and that is the job of CPE, the Common Platform Enumeration. CPE is a structured naming scheme for products, with a formatted-string form such as cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*, where the fields encode the part (a for application, o for operating system, h for hardware), the vendor, the product, the version, and a series of further qualifiers. NVD's enrichment attaches to each CVE a set of CPE applicability statements (in the data, the matched product configurations) that express which products and version ranges are vulnerable, including range expressions like “all versions before 2.15.0.” This is the machinery that lets a scanner say “you are running an affected version” and lets an analyst pivot from a CVE to every product it touches, or from a product to every CVE against it.
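A sketch of both halves, splitting a CPE string into named fields and testing a version against an exclusive upper bound, might look like this. It is deliberately naive: it assumes the string contains no escaped colons and that versions are purely numeric and dot-separated, neither of which holds for every real entry. The helper names are illustrative.

```python
# The eleven attributes that follow the "cpe:2.3" prefix, in order.
CPE_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe(cpe: str) -> dict[str, str]:
    """Split a CPE 2.3 formatted string into its named attributes."""
    pieces = cpe.split(":")
    if len(pieces) != 13 or pieces[:2] != ["cpe", "2.3"]:
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, pieces[2:]))


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '2.14.1' into (2, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_before(version: str, end_excluding: str) -> bool:
    """True if version falls under an 'all versions before X' range."""
    return version_tuple(version) < version_tuple(end_excluding)


fields = parse_cpe("cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*")
print(fields["vendor"], fields["product"], fields["version"])
print(is_before(fields["version"], "2.15.0"))
```

Comparing versions as integer tuples rather than strings matters: as plain text, "2.9.0" would sort after "2.15.0".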

It is also the most error-prone part of the dataset, and honesty about that is essential. CPE matching is imperfect for several structural reasons. The vendor and product strings are drawn from a controlled dictionary that lags reality, so newly affected products may not yet have a CPE, and the same product can appear under inconsistent vendor or product names across entries, fragmenting a clean join. Version-range expressions are only as good as the analyst's reading of the advisory, and a subtly mis-stated boundary can include or exclude versions wrongly. Many CVEs — particularly recent ones during the enrichment backlog discussed below — carry no CPE data at all, which means a query that filters by affected product will silently miss them. And CPE describes products at a coarseness that does not always capture the configuration nuance that determines real exposure. The upshot for practitioners: CPE is indispensable for automated matching, but a clean CPE join is a lower bound on affected systems, not a complete one, and serious vulnerability management treats CPE coverage gaps as an expected condition rather than an anomaly.

The 2024 NVD enrichment backlog

Everything above describes how the NVD is supposed to work. In 2024 it visibly stopped working at the usual pace, and the episode reshaped the vulnerability-data ecosystem in ways that still matter.

Beginning around February 2024, NIST sharply slowed its enrichment of incoming CVEs. New identifiers kept arriving from the CNAs, but NVD stopped attaching CVSS scores, CWE mappings, and CPE data to most of them — the records were published, yet they sat unanalyzed. Within months a large fraction of newly published CVEs lacked the enrichment that downstream tools depend on, and the backlog grew into the tens of thousands. NIST attributed the slowdown to a combination of an increasing volume of CVEs, changes in interagency support, and a need to rework its processes, and it stood up contractor support and consortium efforts to work the queue down, but the recovery was gradual and the backlog of unenriched and partially enriched records persisted well beyond 2024.

The crisis had two lasting consequences. The first is the rise of CISA Vulnrichment: CISA launched a program to add enrichment — CVSS, CWE, and exploitation/automatability context in the form of SSVC decision data — directly to CVE records, publishing it in the CVE program's own data feed rather than waiting on NVD. The second is a structural shift toward CNA-provided CVSS: the CVE record format had already been extended so that CNAs can include their own CVSS scores at the time of assignment, and the backlog accelerated reliance on those vendor-supplied scores as a substitute for NVD's. The practical effect is that the NVD is no longer the sole authoritative source of CVSS scores. For recent CVEs especially, a score may come from the CNA, from CISA, or from NVD, and these can differ. Any analysis touching post-2023 vulnerabilities has to reckon with which source provided the enrichment, and with the simple fact that a missing NVD score does not mean the vulnerability is unimportant — it may mean only that NVD has not gotten to it. This is why the data model distinguishes Primary (NVD) from Secondary (CNA) scoring, and why the worked example below prefers but does not require the Primary score.
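Tracking provenance can be as simple as listing every score on a record next to who supplied it. The sketch below reads the same metrics structure the worked example relies on (cvssMetricV31, type, cvssData, baseScore); the source field and the sample record are assumptions for illustration, and the CVE ID is synthetic.

```python
def scores_by_source(cve: dict) -> list[tuple[str, str, float]]:
    """Return (type, source, base score) for every v3.x score on a record."""
    rows = []
    metrics = cve.get("metrics", {})
    for key in ("cvssMetricV31", "cvssMetricV30"):
        for metric in metrics.get(key, []):
            score = metric.get("cvssData", {}).get("baseScore")
            if score is not None:
                rows.append((metric.get("type", "?"), metric.get("source", "?"), score))
    return rows


def sources_disagree(cve: dict) -> bool:
    """True when the record carries more than one distinct base score."""
    return len({score for _, _, score in scores_by_source(cve)}) > 1


# Synthetic record: one NVD (Primary) score and one CNA (Secondary) score.
record = {
    "id": "CVE-2099-0001",
    "metrics": {"cvssMetricV31": [
        {"source": "nvd@nist.gov", "type": "Primary", "cvssData": {"baseScore": 9.8}},
        {"source": "security@example.com", "type": "Secondary", "cvssData": {"baseScore": 8.1}},
    ]},
}

for row in scores_by_source(record):
    print(row)
print("disagreement:", sources_disagree(record))
```

Flagging disagreements rather than silently picking one score lets an analyst decide which authority to trust for a given product.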

How NVD relates to CISA KEV and EPSS

A CVSS score tells you how severe a vulnerability could be in the abstract, but it says nothing about whether anyone is actually attacking it. Two complementary federal-adjacent datasets fill that gap, and using the NVD well means joining to them.

CISA KEV — the Known Exploited Vulnerabilities catalog — is CISA's authoritative list of CVEs for which reliable evidence of active, in-the-wild exploitation exists. It is a small, curated subset of the CVE universe — on the order of a thousand-plus entries against nearly half a million CVEs — precisely because most catalogued vulnerabilities are never actually exploited. KEV is binary and high-confidence: a CVE is either on it or not, and inclusion comes with remediation due dates for federal agencies. Joining the NVD to KEV on CVE ID is one of the highest-value moves in vulnerability management, because it lets you isolate the vulnerabilities that are not merely theoretically severe but demonstrably being used by attackers right now.
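The join itself is a set-membership test on CVE ID. The sketch below assumes the KEV catalog has been downloaded as JSON with a top-level vulnerabilities list whose entries carry a cveID field; the inline catalog and findings are stand-ins, and every ID except Log4Shell's is synthetic.

```python
# Stand-in for a locally cached KEV download (structure assumed).
kev_catalog = {
    "vulnerabilities": [
        {"cveID": "CVE-2021-44228"},
        {"cveID": "CVE-2099-0002"},
    ]
}

# Stand-in for the open findings in an environment, keyed like the NVD table.
open_findings = [
    {"cve_id": "CVE-2021-44228", "cvss_v3_base_score": 10.0},
    {"cve_id": "CVE-2099-0001", "cvss_v3_base_score": 9.8},
    {"cve_id": "CVE-2099-0003", "cvss_v3_base_score": 5.3},
]

# Build the lookup once, then filter.
kev_ids = {entry["cveID"] for entry in kev_catalog["vulnerabilities"]}
exploited = [f for f in open_findings if f["cve_id"] in kev_ids]

for finding in exploited:
    print(finding["cve_id"], finding["cvss_v3_base_score"])
```

Note that the synthetic 9.8 drops out while the KEV-listed record stays: severity and confirmed exploitation are separate questions.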

EPSS — the Exploit Prediction Scoring System — is the probabilistic complement, also maintained under FIRST. Rather than a binary “exploited / not,” EPSS assigns each CVE a probability, from 0 to 1, that it will be exploited in the wild within the next 30 days, generated by a model trained on exploit, threat-intelligence, and vulnerability features and refreshed daily. EPSS is designed to answer the prioritization question that CVSS alone cannot: among the thousands of High and Critical CVEs you cannot patch all at once, which are most likely to actually be attacked. The maturing best practice combines all three: CVSS for intrinsic severity, KEV for confirmed exploitation, and EPSS for forward-looking exploitation likelihood — with the NVD as the spine that supplies the CVSS score and the CVE-keyed record everything else hangs off.

What you can do with it

The NVD is most valuable as the backbone of vulnerability prioritization and trend analysis. The most immediate use is vulnerability-management prioritization: rank the open vulnerabilities in an environment by CVSS base score to triage them, then layer the contextual signals on top — intersect with CISA KEV to surface the actively exploited ones that demand immediate action, and sort by EPSS to order the rest by exploitation likelihood rather than severity alone. A ranking that uses CVSS, KEV, and EPSS together consistently produces a far more defensible patch queue than CVSS severity by itself, which tends to drown teams in a sea of undifferentiated High and Critical findings.

A second use is CVE trend analysis. Because every record carries a vendor and product (via CPE), a weakness type (via CWE), a severity, and publication dates, the corpus supports questions about the shape of the vulnerability landscape over time: which vendors and products accumulate the most CVEs, how the mix of CWE weakness types is shifting (the long rise of out-of-bounds writes, for instance), and whether the Critical share of disclosures is trending up or down. A third use is operational: measuring time-to-patch by pairing a CVE's published date with the date an organization actually remediated it, turning the NVD into the reference clock for an organization's own remediation SLAs. A fourth is SBOM and CPE matching: given a software bill of materials, resolve each component to its CPE and join against the NVD's applicability data to enumerate the known vulnerabilities a piece of software inherits from its dependencies — the automated core of modern software supply-chain security. And underneath all of them is the join to KEV for an actively-exploited focus, which repeatedly proves to be the single most effective filter for cutting a vast vulnerability list down to the part that matters this week.
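The time-to-patch measurement is plain date arithmetic once the two dates sit side by side. In this sketch the published date would come from the NVD record and the remediation date from an organization's own ticketing or scanning data; both the field names and the sample rows are hypothetical.

```python
from datetime import date
from statistics import median


def days_to_patch(published: date, remediated: date) -> int:
    """Days between NVD publication and local remediation."""
    return (remediated - published).days


# Synthetic remediation log: (CVE ID, NVD published date, date fixed locally).
remediation_log = [
    ("CVE-2099-0001", date(2024, 1, 10), date(2024, 1, 24)),
    ("CVE-2099-0002", date(2024, 2, 1), date(2024, 3, 2)),
    ("CVE-2099-0003", date(2024, 3, 5), date(2024, 3, 12)),
]

durations = [days_to_patch(pub, fixed) for _, pub, fixed in remediation_log]
print("days per CVE:", durations)
print("median days to patch:", median(durations))
```

A median is usually a better SLA summary than a mean here, since a few long-lived findings can dominate an average.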

A worked Python example

The NVD exposes its data through the NVD 2.0 REST API at services.nvd.nist.gov, which is the modern replacement for the retired 1.x feeds. The API is queryable by publication and modification date windows (bounded to 120 days per request), by CVSS severity band, by CWE, by CPE, by keyword, and more, and it pages with startIndex and resultsPerPage up to 2,000 records per page. No API key is required, but the keyless rate limit is strict (roughly five requests per rolling thirty seconds), so any real workload wants a free API key (which raises the limit roughly tenfold) and generous sleeps between pages. The script below works without a key by pacing itself.

It pulls a quarter of recently published CVEs, aggregates them by CWE weakness type and by CVSS v3.1 severity band, and then walks several quarters to compute the Critical share of disclosures over time — using a count-only query (asking for a single result purely to read totalResults) to get the numerator and denominator for each quarter without downloading every record. Note how it prefers the NVD's own Primary CVSS analysis but falls back to a CNA-supplied Secondary score, and how it treats unscored records explicitly — both direct accommodations to the enrichment backlog described above.

import requests
import time
from collections import Counter

# ---------------------------------------------------------------------------
# NIST NVD 2.0 REST API: pull recent CVEs by severity, aggregate by CWE,
# and track the Critical share over time.
#
# Endpoint: https://services.nvd.nist.gov/rest/json/cves/2.0
# Docs:     https://nvd.nist.gov/developers/vulnerabilities
#
# No API key is required, but the public rate limit is punishing:
#   ~5 requests per rolling 30 seconds without a key,
#   ~50 requests per rolling 30 seconds with a free key.
# Request a key at https://nvd.nist.gov/developers/request-an-api-key and
# pass it in the "apiKey" header. We sleep generously between pages.
#
# The 2.0 API pages with startIndex/resultsPerPage (max 2,000 per page) and
# is bounded by a 120-day window per pubStartDate/pubEndDate request.
# ---------------------------------------------------------------------------

BASE_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"
PAGE_SIZE = 2000           # API maximum
SLEEP_SECONDS = 6.5        # stay under the keyless 5-per-30s ceiling
API_KEY = None             # put your free key here to raise the limits


def fetch_window(pub_start: str, pub_end: str, cvss_severity: str | None = None) -> list[dict]:
    """Page through all CVEs published in a <= 120-day window.

    pub_start / pub_end are ISO-8601 with a time component and offset, e.g.
    "2024-01-01T00:00:00.000Z". cvss_severity filters server-side on the
    v3.1 band: LOW, MEDIUM, HIGH, or CRITICAL.
    """
    headers = {"apiKey": API_KEY} if API_KEY else {}
    results: list[dict] = []
    start_index = 0
    while True:
        params = {
            "pubStartDate": pub_start,
            "pubEndDate": pub_end,
            "startIndex": start_index,
            "resultsPerPage": PAGE_SIZE,
        }
        if cvss_severity:
            params["cvssV3Severity"] = cvss_severity
        resp = requests.get(BASE_URL, params=params, headers=headers, timeout=60)
        resp.raise_for_status()
        payload = resp.json()
        batch = payload.get("vulnerabilities", [])
        results.extend(batch)
        total = payload.get("totalResults", 0)
        print(f"  Fetched {len(results):,} / {total:,} CVEs...")
        start_index += PAGE_SIZE
        if start_index >= total or not batch:
            break
        time.sleep(SLEEP_SECONDS)   # respect the rate limit between pages
    return results


def primary_cvss_v3(cve: dict) -> tuple[float | None, str | None]:
    """Extract the v3.1 base score and severity, preferring the NVD's
    own (Primary) analysis over a CNA-supplied (Secondary) score."""
    metrics = cve.get("metrics", {})
    candidates = metrics.get("cvssMetricV31", []) + metrics.get("cvssMetricV30", [])
    primary = [m for m in candidates if m.get("type") == "Primary"]
    chosen = (primary or candidates or [None])[0]
    if not chosen:
        return None, None
    data = chosen.get("cvssData", {})
    return data.get("baseScore"), data.get("baseSeverity")


def primary_cwe(cve: dict) -> str:
    """Return the first concrete CWE id; failing that, the first NVD
    placeholder ('NVD-CWE-noinfo' / 'NVD-CWE-Other'); else 'UNMAPPED'."""
    placeholder = None
    for weakness in cve.get("weaknesses", []):
        for desc in weakness.get("description", []):
            value = desc.get("value", "")
            if value.startswith("CWE-"):
                return value
            # Remember a placeholder, but keep looking for a concrete CWE.
            if placeholder is None and value in ("NVD-CWE-noinfo", "NVD-CWE-Other"):
                placeholder = value
    return placeholder or "UNMAPPED"


# Human-readable labels for the CWE Top 25 mainstays we expect to see.
CWE_LABELS = {
    "CWE-79":  "Cross-site Scripting (XSS)",
    "CWE-89":  "SQL Injection",
    "CWE-787": "Out-of-bounds Write",
    "CWE-125": "Out-of-bounds Read",
    "CWE-20":  "Improper Input Validation",
    "CWE-22":  "Path Traversal",
    "CWE-352": "Cross-Site Request Forgery (CSRF)",
    "CWE-78":  "OS Command Injection",
    "CWE-416": "Use After Free",
    "CWE-862": "Missing Authorization",
}


# ---------------------------------------------------------------------------
# Step 1: Pull one quarter of CVEs and aggregate by CWE weakness type.
# ---------------------------------------------------------------------------
WIN_START = "2024-01-01T00:00:00.000Z"
WIN_END = "2024-03-31T23:59:59.999Z"   # <= 120 days, satisfies the API bound

print("Fetching CVEs published 2024-Q1 ...")
cves_q1 = fetch_window(WIN_START, WIN_END)
print(f"Retrieved {len(cves_q1):,} CVE records for the window.\n")

cwe_counts: Counter = Counter()
severity_counts: Counter = Counter()
for entry in cves_q1:
    cve = entry["cve"]
    cwe_counts[primary_cwe(cve)] += 1
    _, sev = primary_cvss_v3(cve)
    severity_counts[sev or "UNSCORED"] += 1

print("Top 12 Weakness Types (CWE) in 2024-Q1")
print("-" * 60)
print(f"{'CWE':<10} {'Count':>7}  {'Weakness':<38}")
print("-" * 60)
for cwe_id, n in cwe_counts.most_common(12):
    label = CWE_LABELS.get(cwe_id, "(see cwe.mitre.org)")
    print(f"{cwe_id:<10} {n:>7,}  {label:<38}")

print("\nSeverity Distribution (CVSS v3.1)")
print("-" * 36)
for band in ["CRITICAL", "HIGH", "MEDIUM", "LOW", "UNSCORED"]:
    n = severity_counts.get(band, 0)
    print(f"{band:<10} {n:>7,}")


# ---------------------------------------------------------------------------
# Step 2: Critical share over time. Walk several quarters and, for each,
#         ask the API only for the Critical count vs the total count.
#         totalResults on a filtered query gives us the numerator cheaply.
# ---------------------------------------------------------------------------
def window_count(pub_start: str, pub_end: str, cvss_severity: str | None = None) -> int:
    """Return just totalResults for a window, without downloading records."""
    headers = {"apiKey": API_KEY} if API_KEY else {}
    params = {
        "pubStartDate": pub_start,
        "pubEndDate": pub_end,
        "resultsPerPage": 1,        # we only want the count
    }
    if cvss_severity:
        params["cvssV3Severity"] = cvss_severity
    resp = requests.get(BASE_URL, params=params, headers=headers, timeout=60)
    resp.raise_for_status()
    return resp.json().get("totalResults", 0)


QUARTERS = [
    ("2023-Q4", "2023-10-01T00:00:00.000Z", "2023-12-31T23:59:59.999Z"),
    ("2024-Q1", "2024-01-01T00:00:00.000Z", "2024-03-31T23:59:59.999Z"),
    ("2024-Q2", "2024-04-01T00:00:00.000Z", "2024-06-30T23:59:59.999Z"),
]

print("\nCritical Share by Quarter")
print("-" * 48)
print(f"{'Quarter':<10} {'Total':>8} {'Critical':>9} {'Crit %':>8}")
print("-" * 48)
for label, start, end in QUARTERS:
    total = window_count(start, end)
    time.sleep(SLEEP_SECONDS)
    crit = window_count(start, end, cvss_severity="CRITICAL")
    time.sleep(SLEEP_SECONDS)
    pct = (100.0 * crit / total) if total else 0.0
    print(f"{label:<10} {total:>8,} {crit:>9,} {pct:>7.1f}%")

# NOTE: a rising Critical share can reflect genuinely worse vulnerabilities,
# or it can reflect the enrichment backlog -- if NVD has not yet scored a
# tranche of recent CVEs, "UNSCORED" inflates and the scored mix skews.
# Always read the share against the UNSCORED count from Step 1.

The pattern generalizes readily. Swap the date-window parameters for a cpeName filter to pull every CVE matched to a specific product and version (the SBOM use case), or for a cweId filter to study a single weakness class across the whole corpus. Replace the severity aggregation with a join, on CVE ID, to a locally cached copy of the CISA KEV catalog to reduce any result set to its actively-exploited subset, or to an EPSS export to attach exploitation probabilities and re-rank. The NVD record is the spine: the CVSS score, the CWE class, and the CPE match come from it, and each additional source (KEV, EPSS, your own asset inventory) hangs a further dimension off the same CVE ID.
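Extending the script beyond a single quarter runs into the 120-day bound on publication-date queries, so a longer range has to be cut into windows first. A minimal sketch of that step follows; the function names are illustrative, and the timestamp format simply mirrors the strings used in the script above.

```python
from datetime import datetime, timedelta

MAX_WINDOW = timedelta(days=120)   # the per-request bound described above


def split_windows(start: datetime, end: datetime) -> list[tuple[datetime, datetime]]:
    """Cut [start, end] into consecutive windows of at most 120 days."""
    windows = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + MAX_WINDOW, end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows


def api_timestamp(moment: datetime) -> str:
    """Format a datetime the way the script's pubStartDate strings look."""
    return moment.strftime("%Y-%m-%dT%H:%M:%S.000Z")


for window_start, window_end in split_windows(datetime(2023, 1, 1), datetime(2024, 1, 1)):
    print(api_timestamp(window_start), "->", api_timestamp(window_end))
```

Adjacent windows share a boundary instant, so a record published exactly on a boundary could be returned twice; deduplicating on CVE ID after merging the pages avoids double counting.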

Caveats and limits

CPE coverage is incomplete, so product joins undercount. A large number of CVEs carry no CPE applicability data — chronically for some recent records, and acutely for the tranche affected by the enrichment backlog — and the controlled product dictionary lags new and renamed products. A query that filters by affected product therefore returns a floor, not a complete set, and version-range boundaries can be subtly wrong. Treat a CPE-based affected-systems count as a lower bound and reconcile vendor and product name variants before trusting an aggregate.

The enrichment backlog distorts recent data. Since early 2024, a substantial share of newly published CVEs has lacked NVD-supplied CVSS, CWE, and CPE for extended periods. This means a missing score is not evidence of low severity, recent CVE counts and severity mixes can be artifacts of what NVD has and has not analyzed rather than of the underlying vulnerabilities, and any trend line crossing 2024 must account for the gap. The rise of CISA Vulnrichment and CNA-provided CVSS partly fills the void, but it also means scores for recent CVEs now come from multiple authorities that can disagree, and you must track provenance (Primary vs Secondary) deliberately.

CVSS base scores are debated and can be inflated. The base score deliberately omits temporal and environmental context, so a 9.8 in the NVD is a worst-case intrinsic rating, not a statement that the flaw is dangerous in your deployment. Critics note that the scoring distribution skews high — a great many vulnerabilities land in High or Critical — which erodes the score's usefulness as a triage signal when used alone, and that scoring is partly subjective, so independent analysts sometimes assign different bases to the same CVE. The remedy is not to discard CVSS but to refuse to use it in isolation: pair it with KEV and EPSS, and apply environmental context, before treating any number as a priority.

Reserved and rejected CVEs are part of the count. A CVE ID can be reserved before details are public, appearing as a placeholder with no description or scores; and an ID can be rejected or marked a duplicate, withdrawn after assignment, yet it persists in the record for stability. Raw counts of CVE IDs therefore overstate the number of real, current vulnerabilities unless you filter on status, and a year-over-year “CVE volume” figure that ignores reserved and rejected entries is measuring identifier issuance as much as actual flaw discovery.

Read with these limits in mind, the NVD remains the indispensable, openly accessible federal record of what is known to be wrong with the world's software: the scored, categorized, product-matched foundation that nearly every other vulnerability tool is built on top of.


Related writing

FDA Device Classification Database is another federal taxonomy built around a risk-graded identifier — product codes and Class I/II/III tiers rather than CVEs and CVSS bands — where, as with the NVD, a single key joins a catalogue to the downstream records of real-world activity.

SEC EDGAR Company Registry is the canonical example of a federal index whose value lies in a stable join key (the CIK), the same connective-tissue role the CVE ID plays across the vulnerability ecosystem of NVD, KEV, and EPSS.

FAA Aircraft Registry is a comparable federal record where a permanent identifier resolves to a rich attribute set and ties together otherwise separate datasets, and where coverage gaps and stale fields demand the same analytical caution.