
EIA Energy Data: The Federal Database Behind Oil Prices, Natural Gas Storage, and Electricity Generation


The Energy Information Administration is the statistical agency of the Department of Energy and the primary federal authority for US energy data. By statute its data collection and publication functions are independent of the policy arm of DOE — EIA's numbers cannot be altered for political purposes. The result is a comprehensive, mandatory, publicly available record of every significant dimension of US energy: production, consumption, trade, prices, and stocks across petroleum, natural gas, coal, electricity, and renewable energy. Markets move on EIA releases. Policy is set from EIA forecasts. Academic energy economics runs on EIA data.

EIA's Mandate and Independence

Congress established the EIA in 1977 under the Department of Energy Organization Act. The statutory framework deliberately separates EIA from the policy functions of DOE. EIA's administrator reports to the Secretary of Energy but the agency's analysis and forecasts are not subject to review or approval by DOE or the Executive Office of the President. EIA publishes what the data show, not what any administration would prefer them to show. This independence is not merely customary; it is enforced through the Federal Energy Administration Act, which prohibits EIA from suppressing, altering, or delaying release of its statistical and analytical outputs.

EIA's primary data collection authority covers all US energy: petroleum production, refinery inputs, product supplied, and inventory; natural gas production, storage, imports, exports, and consumption; coal production, stocks, and consumption; electricity generation, retail sales, prices, and utility financial data; and nuclear, wind, solar, geothermal, and biomass energy. For petroleum and natural gas, EIA data reporting is mandatory for companies above threshold size, with civil penalties for non-compliance. The mandatory reporting framework gives EIA data coverage that private data services — which rely on voluntary participation or purchase agreements — cannot match.

Short-Term Energy Outlook (STEO)

The Short-Term Energy Outlook is EIA's monthly flagship publication: a forward forecast for the full US and global energy market that extends through the end of the following calendar year. STEO is typically released on the first Tuesday after the first Thursday of each month, at noon Eastern time, and covers:

STEO data tables are published in Excel format and via the EIA API, allowing programmatic extraction of all forecast series. Historical STEO archives go back to 1995, providing a decades-long record of how EIA's forecasts compared to actual outcomes — a useful input to any model that must account for forecast uncertainty.
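One way to put the forecast archive to work is to score past forecasts against realized values, grouped by how far ahead each forecast was made. The sketch below assumes the archive has already been reshaped into (horizon, target month, forecast) tuples; the function name, the input layout, and the sample numbers are illustrative and are not real STEO values.

```python
from collections import defaultdict


def mean_abs_pct_error_by_horizon(forecasts, actuals):
    """
    forecasts: iterable of (horizon_months, target_month, forecast_value)
    actuals:   dict mapping target_month -> realized value
    Returns {horizon_months: mean absolute percentage error, in percent}.
    """
    errors = defaultdict(list)
    for horizon, target, forecast in forecasts:
        actual = actuals.get(target)
        if actual is None or actual == 0:
            continue  # skip targets with no usable realized value
        errors[horizon].append(abs(forecast - actual) / abs(actual) * 100.0)
    return {h: sum(v) / len(v) for h, v in sorted(errors.items())}


# Illustrative numbers only, not real STEO forecasts or prices
sample_forecasts = [
    (1, "2023-06", 72.0),
    (1, "2023-07", 78.0),
    (12, "2023-06", 90.0),
    (12, "2023-07", 60.0),
]
sample_actuals = {"2023-06": 75.0, "2023-07": 80.0}
mape = mean_abs_pct_error_by_horizon(sample_forecasts, sample_actuals)
```

A table like this, built from real vintages, gives a rough empirical error band to attach to each forecast horizon.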

The political dimension of STEO is significant. EIA's WTI price forecasts are closely watched against OPEC+ production decisions, US shale output trends, and macroeconomic demand conditions. When STEO's crude price forecast diverges substantially from futures market pricing, which reflects the market's aggregate view, the discrepancy itself becomes news. STEO forecast revisions also feed into debates over Strategic Petroleum Reserve releases, which are typically justified in terms of the market impact that EIA data projects.

Weekly Petroleum Status Report

The Weekly Petroleum Status Report (WPSR) is the most market-moving regular EIA release. Published every Wednesday at 10:30 AM Eastern time, the WPSR reports petroleum inventory levels and flows for the week ending the previous Friday. Within minutes of release, WTI futures prices move in response to the headline inventory number.

The single most market-sensitive figure in the WPSR is commercial crude oil inventories at Cushing, Oklahoma. Cushing is the physical delivery point for NYMEX WTI crude oil futures contracts: when a trader holds a futures contract to expiry and takes physical delivery, the crude arrives at Cushing storage terminals. Cushing stocks therefore directly affect the cost of carrying futures positions and the WTI spot-futures spread. An unexpected weekly build of 3 to 5 million barrels at Cushing can push WTI down by $1 to $2 per barrel within minutes; an unexpected draw of similar magnitude can push it up by the same amount. The weekly Cushing stock figure (EIA series code PET.W_EPC0_SAX_YCUOK_MBBL.W) is the closest thing in the federal data catalog to a real-time demand gauge for the US crude oil market.
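The build and draw figures quoted above are simply week-over-week differences in the reported stock level. A minimal sketch, assuming the levels arrive in thousand barrels (the unit the Cushing series uses) and are ordered oldest first; the function name and the sample levels are hypothetical.

```python
def weekly_changes(stocks_mbbl):
    """
    stocks_mbbl: list of (week_ending, level_in_thousand_barrels), oldest first.
    Returns a list of (week_ending, change_in_million_barrels, label),
    where label is "build", "draw", or "flat".
    """
    out = []
    for (_, prev), (week, level) in zip(stocks_mbbl, stocks_mbbl[1:]):
        change = (level - prev) / 1000.0  # thousand barrels -> million barrels
        label = "build" if change > 0 else "draw" if change < 0 else "flat"
        out.append((week, change, label))
    return out


# Illustrative levels only, not actual Cushing data
sample_levels = [
    ("2024-01-05", 30000),
    ("2024-01-12", 33500),
    ("2024-01-19", 32000),
]
changes = weekly_changes(sample_levels)
```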

Beyond Cushing, the WPSR covers:

Natural Gas Weekly Update and Storage Report

The EIA Natural Gas Weekly Update is published each Thursday and reports the headline number that moves Henry Hub gas futures: the weekly net injection or withdrawal figure from the Weekly Natural Gas Storage Report, which is compiled from Form EIA-912 survey responses. The storage number is released at 10:30 AM Eastern on Thursdays. Natural gas futures and options traders call it simply “the storage number,” and a surprise of 10 billion cubic feet (Bcf) or more in either direction relative to the consensus analyst estimate can move front-month Henry Hub futures by 5 to 15 cents per MMBtu.
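The surprise that traders react to is the reported net change minus the consensus estimate. The sketch below assumes the usual sign convention (injections positive, withdrawals negative) and a 10 Bcf threshold taken from the text; the function name and the bullish/bearish labels are a conventional market reading added for illustration, not an EIA definition.

```python
def storage_surprise(actual_bcf, consensus_bcf, threshold_bcf=10.0):
    """
    Compare the reported weekly net change in storage with the consensus.
    Injections are positive and withdrawals negative (assumed convention).
    More gas in storage than expected is read as bearish for prices.
    """
    surprise = actual_bcf - consensus_bcf
    if abs(surprise) < threshold_bcf:
        signal = "in line"
    elif surprise > 0:
        signal = "bearish"
    else:
        signal = "bullish"
    return surprise, signal


# Illustrative figures only
winter_week = storage_surprise(-150, -135)  # larger withdrawal than expected
summer_week = storage_surprise(85, 70)      # larger injection than expected
quiet_week = storage_surprise(60, 55)       # within the threshold
```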

US natural gas storage facilities are categorized by EIA into five regions: East, Midwest, Mountain, Pacific, and South Central.

The injection and withdrawal pattern is highly seasonal: net injections accumulate storage from April through October as summer production exceeds demand, building stocks in anticipation of winter heating demand. Net withdrawals run from November through March. The magnitude of the winter draw depends on temperatures — measured in heating degree days (HDD), each HDD being one degree below 65°F averaged over a day. A colder-than-normal winter draws more gas, drives storage to lower seasonal lows, and pushes spring Henry Hub prices higher as the market prices in a tighter supply balance heading into the next injection season. A warm winter leaves storage high, suppresses spring and summer prices, and reduces the incentive to inject. The weekly storage number and the weekly HDD tracking from NOAA are the two numbers natural gas traders watch most closely through the winter months.
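The heating degree day arithmetic described above can be sketched in a few lines. This assumes the input is a list of daily mean temperatures in degrees Fahrenheit and uses the 65°F base from the text; the function name and sample temperatures are illustrative.

```python
def heating_degree_days(daily_mean_temps_f, base_f=65.0):
    """
    Sum of heating degree days over a period.
    Each day contributes max(0, base - daily mean temperature);
    days at or above the base contribute nothing.
    """
    return sum(max(0.0, base_f - t) for t in daily_mean_temps_f)


# Four illustrative days: two cold, one exactly at the base, one warm
sample_hdd = heating_degree_days([30, 50, 65, 70])
```

Here the two cold days contribute 35 and 15 HDD and the other two contribute zero, for a total of 50.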

The 2022 European energy crisis illustrated how EIA storage data affects global natural gas pricing. As European countries scrambled to replace Russian pipeline gas with US LNG exports, Henry Hub prices rose from under $4/MMBtu in early 2022 to a peak above $9/MMBtu in August 2022 — the highest level in 14 years. EIA storage data showed US storage inventories tracking below the five-year average through 2022 as LNG exports competed with domestic demand for available supply, amplifying Henry Hub price volatility. The storage number each Thursday became a global market event during this period.
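The comparison against the five-year average can be approximated by averaging the same calendar week across the five preceding years. This is a simplified sketch: the function name, the (year, week) keying, and the sample levels are assumptions for illustration, and EIA's published five-year average may be computed with details not reproduced here.

```python
def five_year_deviation(levels, year, week):
    """
    levels: dict mapping (year, week_of_year) -> working gas in storage (Bcf).
    Returns (deviation_from_average, five_year_average) for the given week,
    using the same week in the five preceding years.
    """
    prior = [levels[(y, week)] for y in range(year - 5, year) if (y, week) in levels]
    if len(prior) < 5:
        raise ValueError("need five prior years of data for this week")
    average = sum(prior) / 5
    return levels[(year, week)] - average, average


# Illustrative storage levels only, not EIA data
sample_levels = {
    (2017, 30): 3000, (2018, 30): 2700, (2019, 30): 2900,
    (2020, 30): 3300, (2021, 30): 3100, (2022, 30): 2700,
}
deviation, average = five_year_deviation(sample_levels, 2022, 30)
```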

EIA-860 Annual Electric Generator Report

EIA Form 860 is the authoritative census of every utility-scale power plant in the United States. More than 15,000 generators are tracked, covering every facility with 1 megawatt or more of nameplate capacity. Form 860 is filed annually by plant owners and operators and includes:

Form 860 is the source for all grid capacity analysis: total installed capacity by fuel type and region, the pipeline of planned additions, the schedule of announced retirements, and the geographic distribution of generation assets. Because EIA assigns persistent Plant Codes, Form 860 data is linkable across years to track the full lifecycle of each facility — from planned status through construction to commercial operation to eventual retirement — and across datasets to link generator characteristics to monthly generation from Form 923.

EIA-923 Power Plant Operations Report

EIA Form 923 is the monthly operational counterpart to Form 860. Where Form 860 tracks generator characteristics and status, Form 923 tracks what generators actually did each month: how much they generated, how much fuel they burned, and what that fuel cost. Form 923 is mandatory for all plants at or above the 1 MW reporting threshold.

The primary Form 923 data elements are:

Form 923 links to Form 860 via EIA Plant Code and generator ID, enabling plant-level analysis that connects generator characteristics (Form 860) to monthly performance (Form 923). This linkage is the foundation for every plant-level study of capacity factors, fuel switching, retirement economics, and the efficiency difference between new combined cycle gas plants and aging coal steam units.
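The linkage described above amounts to a join on the (Plant Code, generator ID) pair followed by a ratio. The sketch below uses plain dictionaries with hypothetical field names (plant_code, generator_id, nameplate_mw, net_generation_mwh); the actual column names in the published Form 860 and Form 923 files differ and should be checked against the files themselves.

```python
def capacity_factors(generators, generation, hours_in_month):
    """
    generators: list of dicts with plant_code, generator_id, nameplate_mw
                (Form 860 style characteristics)
    generation: list of dicts with plant_code, generator_id, net_generation_mwh
                (Form 923 style monthly output)
    Returns {(plant_code, generator_id): capacity factor between 0 and 1}.
    """
    by_key = {(g["plant_code"], g["generator_id"]): g for g in generators}
    result = {}
    for row in generation:
        key = (row["plant_code"], row["generator_id"])
        gen = by_key.get(key)
        if gen is None or gen["nameplate_mw"] <= 0:
            continue  # no matching generator record, or unusable capacity
        max_output_mwh = gen["nameplate_mw"] * hours_in_month
        result[key] = row["net_generation_mwh"] / max_output_mwh
    return result


# Illustrative records only
sample_generators = [{"plant_code": 1001, "generator_id": "1", "nameplate_mw": 500.0}]
sample_generation = [
    {"plant_code": 1001, "generator_id": "1", "net_generation_mwh": 186000.0},
    {"plant_code": 9999, "generator_id": "A", "net_generation_mwh": 1000.0},
]
factors = capacity_factors(sample_generators, sample_generation, hours_in_month=744)
```

The second generation record has no matching generator and is dropped, which is the behavior of an inner join.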

Electric Power Monthly

The Electric Power Monthly (EPM) is EIA's comprehensive monthly compilation of US electricity statistics. Published approximately 60 days after the reference month, EPM aggregates Form 923 generation data, Form 861 retail sales data, and additional price and capacity factor series into a single reference publication covering the national and state-level electricity sector.

Key EPM series include net electricity generation by fuel type for the US total and each state, broken into utility, independent power producer, and combined heat and power sectors; retail electricity sales in megawatt-hours and revenues in dollars by sector (residential, commercial, industrial, transportation) at the state and national levels; average retail electricity prices by sector and state; and generation capacity factors by fuel type and region, showing how intensively each technology type is being utilized.

The EPM's value as a historical archive of the US energy transition is substantial. The coal-to-gas displacement of the 2010s, the wind build-out in the Great Plains and Texas, the solar inflection that began around 2018 as module costs fell below $0.40/watt, the nuclear capacity factor improvements from relicensing and uprates, and the recent emergence of battery storage as a grid resource are all quantified in EPM's monthly state-level generation tables. Because EPM data extends back to 2001 in consistent format and is available via the EIA API, it is the preferred source for long-run electricity market research.

Petroleum Supply Monthly

The Petroleum Supply Monthly (PSM) provides detailed monthly statistics on US crude oil and petroleum product supply, disposition, and inventories at the national and state levels, with a lag of approximately 60 days. PSM expands on the weekly WPSR data with significantly more geographic and product-level detail:

EIA Open Data API

The EIA Open Data API at api.eia.gov is the programmatic access point for more than 500,000 time series across all EIA publication areas. A free API key obtained at eia.gov/opendata/ is required; there is no paid tier. The current version is v2, which uses a faceted query structure where callers navigate a category hierarchy and filter dimensions using named facets.

The API v2 base URL is https://api.eia.gov/v2/. Top-level categories include petroleum, natural-gas, electricity, coal, nuclear-outages, total-energy, and aeo (the Annual Energy Outlook). Each category has sub-routes for specific datasets. For example:

Legacy v1 series IDs (structured as DATASET.SERIESNAME.FREQUENCY) can be accessed via the v2 compatibility endpoint at https://api.eia.gov/v2/seriesid/SERIES_ID/data/. This is the most reliable approach for known series IDs, as the v1-to-v2 migration preserved all series codes while restructuring the category navigation. Pagination is handled through offset and length parameters (maximum 5,000 records per page). The API response structure is response.data (the records array), response.total (total matching record count), and response.description (series metadata).
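A faceted v2 request can be assembled as an ordinary query string. The sketch below only builds the URL and the page offsets, with no network call. The helper names are hypothetical, and the electricity/retail-sales route, the stateid and sectorid facets, and the price column are assumptions that should be confirmed against the metadata the API returns for that route.

```python
from urllib.parse import urlencode

EIA_V2_BASE = "https://api.eia.gov/v2/"


def build_v2_url(route, api_key, data_columns, facets=None,
                 frequency=None, offset=0, length=5000):
    """Assemble a v2 data URL from a route, data columns, and facet filters."""
    params = [("api_key", api_key)]
    if frequency:
        params.append(("frequency", frequency))
    for i, column in enumerate(data_columns):
        params.append((f"data[{i}]", column))
    for facet_name, values in (facets or {}).items():
        for value in values:
            params.append((f"facets[{facet_name}][]", value))
    params.extend([("offset", offset), ("length", length)])
    # urlencode percent-encodes the square brackets in the parameter names
    return EIA_V2_BASE + route.strip("/") + "/data/?" + urlencode(params)


def page_offsets(total, length=5000):
    """Offsets needed to page through `total` records, `length` at a time."""
    return list(range(0, int(total), length))


url = build_v2_url(
    "electricity/retail-sales",
    api_key="YOUR_EIA_API_KEY",
    data_columns=["price"],
    facets={"stateid": ["CO"], "sectorid": ["RES"]},
    frequency="monthly",
)
```

Passing the total through int() in page_offsets is a defensive choice in case the record count arrives as a string rather than a number.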

The Political Dimension: Energy Independence and Policy Debates

EIA data sits at the center of several politically charged policy debates, and its independence is tested precisely because the data matters so much to those debates.

The question of US energy independence, a phrase with a long and often misleading political history, has an empirically precise answer in EIA data. The US became a net exporter of petroleum (crude oil and petroleum products combined) on an annual basis in 2020, the first time since at least 1949, after recording its first monthly net exports in late 2019. It became a net exporter of natural gas on an annual basis in 2017 as LNG export terminals came online. The primary driver was the shale revolution: US crude oil production rose from 5.0 million barrels per day in 2008 to a pre-pandemic peak of 12.9 million barrels per day in November 2019. EIA's Petroleum Supply Monthly and its production estimates are the primary public record of this transformation.

STEO crude oil price forecasts interact with OPEC+ production management decisions in ways that create feedback loops. When EIA projects that US shale production will fill a supply gap that OPEC+ cuts are intended to create, OPEC+ may deepen or extend cuts, which changes the supply balance EIA is forecasting. The STEO releases closest to OPEC+ ministerial meetings are watched especially closely as signals of how EIA expects the market to absorb the group's production decisions. EIA forecasters explicitly incorporate futures prices and OPEC+ guidance into their price path assumptions, making STEO a hybrid of quantitative modeling and stated policy assumptions.

The Strategic Petroleum Reserve (SPR) — the 700-million-barrel emergency crude oil stockpile held in salt caverns on the US Gulf Coast — is managed by the Department of Energy, but SPR release decisions are justified and evaluated using EIA data. The Biden administration's 2022 SPR releases, totaling approximately 180 million barrels over about a year, were framed in terms of the supply gap created by Russia's invasion of Ukraine and the anticipated market impact measured against EIA's supply-demand balance projections. EIA's weekly inventory data then recorded the drawdown as it occurred and tracked how quickly market prices responded — providing the empirical record for evaluating whether the release achieved its stated objectives.

The natural gas market provides another example of EIA data's political salience. The 2022 European energy crisis, driven by Russia's reduction of pipeline gas flows to Europe, created a demand surge for US LNG exports that EIA quantified in real time through its LNG export terminal utilization and export volume tracking. EIA's projection that European LNG demand would keep Henry Hub above $5/MMBtu through 2023 informed debates about permitting new LNG export terminals — a debate where both sides cited EIA data selectively to support their case.

Python: Cushing Crude Stocks and Henry Hub Gas Price, Dual-Axis Chart

The following script uses the EIA Open Data API to pull two weekly time series: crude oil inventories at Cushing, Oklahoma and the Henry Hub natural gas spot price. It plots them on a dual-axis chart, which makes the independent dynamics of the two markets visible while allowing visual inspection of periods when both responded to common macro drivers such as the 2020 demand collapse or the 2022 European energy crisis spike. The script uses the v2 series compatibility endpoint with legacy series IDs, the most reliable approach for well-known EIA series with long histories.

import requests
import pandas as pd
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

EIA_API_BASE = "https://api.eia.gov/v2/seriesid/"

# EIA v1-style series IDs still work via the v2 compatibility endpoint
# PET.WCRSTUS1.W  - Weekly crude oil stocks, total US (thousand barrels)
# PET.W_EPC0_SAX_YCUOK_MBBL.W - Weekly crude stocks at Cushing OK (thousand barrels)
# NG.RNGWHHD.W   - Henry Hub natural gas spot price (dollars per million BTU)

SERIES = {
    "cushing_stocks": "PET.W_EPC0_SAX_YCUOK_MBBL.W",
    "henry_hub_price": "NG.RNGWHHD.W",
}

def fetch_series(api_key, series_id, start="2015-01-01", end="2025-12-31"):
    """
    Pull a weekly EIA time series using the v2 series endpoint.
    Returns a DataFrame with columns: period (date), value (numeric).
    """
    params = {
        "api_key": api_key,
        "frequency": "weekly",
        "data[0]": "value",
        "start": start,
        "end": end,
        "sort[0][column]": "period",
        "sort[0][direction]": "asc",
        "offset": 0,
        "length": 5000,
    }
    url = EIA_API_BASE + series_id + "/data/"
    records = []
    while True:
        resp = requests.get(url, params=params, timeout=30)
        resp.raise_for_status()
        body = resp.json()
        page = body["response"]["data"]
        records.extend(page)
        total = int(body["response"]["total"])  # cast: the count may arrive as a string
        params["offset"] += len(page)
        if params["offset"] >= total or not page:
            break
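    if not records:
        # Nothing matched (bad series ID or empty date range): return an empty frame
        return pd.DataFrame(columns=["period", "value"])
    df = pd.DataFrame(records)
    df["period"] = pd.to_datetime(df["period"])
    df["value"] = pd.to_numeric(df["value"], errors="coerce")
    return df.dropna(subset=["value"]).sort_values("period").reset_index(drop=True)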


def plot_cushing_vs_henry_hub(api_key, output_path="eia_cushing_henry_hub.png"):
    """
    Dual-axis chart: Cushing crude stocks (left axis, million barrels)
    and Henry Hub spot price (right axis, dollars per MMBtu).
    The inverse relationship between Cushing inventory and gas price is
    structurally weak but both series respond to the same macro drivers:
    industrial demand, weather extremes, and supply disruptions.
    """
    cushing = fetch_series(api_key, SERIES["cushing_stocks"])
    henryhub = fetch_series(api_key, SERIES["henry_hub_price"])

    # Convert Cushing from thousand barrels to million barrels
    cushing["value_mmbbl"] = cushing["value"] / 1000.0

    # Align on common date index via merge
    merged = pd.merge(
        cushing[["period", "value_mmbbl"]].rename(columns={"value_mmbbl": "cushing"}),
        henryhub[["period", "value"]].rename(columns={"value": "henry_hub"}),
        on="period",
        how="inner",
    )

    fig, ax1 = plt.subplots(figsize=(14, 6))

    color_cushing = "#0b4a8f"
    color_henryhub = "#d97706"

    ax1.fill_between(
        merged["period"],
        merged["cushing"],
        alpha=0.25,
        color=color_cushing,
        label="_nolegend_",
    )
    ax1.plot(
        merged["period"],
        merged["cushing"],
        color=color_cushing,
        linewidth=1.6,
        label="Cushing crude stocks (MMbbl)",
    )
    ax1.set_ylabel("Cushing, OK Crude Stocks (million barrels)", color=color_cushing, fontsize=10)
    ax1.tick_params(axis="y", labelcolor=color_cushing)

    ax2 = ax1.twinx()
    ax2.plot(
        merged["period"],
        merged["henry_hub"],
        color=color_henryhub,
        linewidth=1.6,
        label="Henry Hub spot price ($/MMBtu)",
        alpha=0.9,
    )
    ax2.set_ylabel("Henry Hub Natural Gas Price ($/MMBtu)", color=color_henryhub, fontsize=10)
    ax2.tick_params(axis="y", labelcolor=color_henryhub)

    # Annotate the 2022 European energy crisis spike
    spike_date = pd.Timestamp("2022-08-22")
    mask = abs(merged["period"] - spike_date) < pd.Timedelta(days=10)
    if mask.any():
        spike_price = float(merged.loc[mask, "henry_hub"].max())
        ax2.annotate(
            "Aug 2022 spike\n$" + str(round(spike_price, 2)) + "/MMBtu",
            xy=(spike_date, spike_price),
            xytext=(pd.Timestamp("2021-06-01"), spike_price * 0.85),
            fontsize=8,
            color=color_henryhub,
            arrowprops={"arrowstyle": "->", "color": color_henryhub},
        )

    lines1, labels1 = ax1.get_legend_handles_labels()
    lines2, labels2 = ax2.get_legend_handles_labels()
    ax1.legend(lines1 + lines2, labels1 + labels2, loc="upper left", fontsize=9)

    ax1.set_title(
        "Cushing Crude Oil Stocks vs. Henry Hub Natural Gas Price (Weekly, EIA)",
        fontsize=13,
        fontweight="bold",
    )
    ax1.set_xlabel("Week ending date")
    fig.tight_layout()
    plt.savefig(output_path, dpi=150)
    print("Saved chart to " + output_path)
    return merged


# --- Usage ---
# Register for a free EIA API key at https://www.eia.gov/opendata/
# api_key = "YOUR_EIA_API_KEY"
# df = plot_cushing_vs_henry_hub(api_key)
# print(df.tail(10))

The Cushing series (PET.W_EPC0_SAX_YCUOK_MBBL.W) reports in thousand barrels; the script converts to million barrels for readability. The Henry Hub series (NG.RNGWHHD.W) reports in dollars per MMBtu directly. The August 2022 spike annotation appears automatically if the merged dataset includes observations within 10 days of August 22, 2022; the script annotates the highest price in that window. To extend the script to include total US crude stocks alongside Cushing, add series PET.WCRSTUS1.W and plot it on a third axis or a separate panel. Note that PET.WCRSTUS1.W includes the Strategic Petroleum Reserve; the commercial-only series is PET.WCESTUS1.W. The ratio of Cushing stocks to total US stocks is a useful measure of the concentration of WTI pricing pressure at the Cushing hub.
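The Cushing-to-total ratio can be computed from two frames shaped like the output of fetch_series (columns period and value, both in thousand barrels). The helper below is a hypothetical addition, not part of the script above, and the sample frames hold illustrative numbers rather than EIA data.

```python
import pandas as pd


def cushing_share(cushing, total_us):
    """
    cushing, total_us: DataFrames with columns period and value (thousand barrels).
    Returns a frame with both levels and Cushing's share of the total in percent.
    """
    merged = pd.merge(
        cushing[["period", "value"]].rename(columns={"value": "cushing"}),
        total_us[["period", "value"]].rename(columns={"value": "total_us"}),
        on="period",
        how="inner",  # keep only weeks present in both series
    )
    merged["cushing_share_pct"] = 100.0 * merged["cushing"] / merged["total_us"]
    return merged


# Illustrative frames only
sample_cushing = pd.DataFrame({
    "period": pd.to_datetime(["2024-01-05", "2024-01-12"]),
    "value": [30000.0, 33000.0],
})
sample_total = pd.DataFrame({
    "period": pd.to_datetime(["2024-01-05", "2024-01-12", "2024-01-19"]),
    "value": [400000.0, 440000.0, 430000.0],
})
share = cushing_share(sample_cushing, sample_total)
```

Because both inputs are in the same unit, no conversion is needed before taking the ratio.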


EIA data is built into the federal economic accounting system. The Bureau of Economic Analysis uses EIA petroleum and natural gas price data as source inputs in constructing the GDP deflators and the Personal Consumption Expenditures price index. For the broader context of how federal statistical agencies produce and revise national economic accounts, see BEA GDP and National Accounts: The Federal Dataset That Measures the US Economy.

Energy input costs drive producer price inflation across manufacturing, transportation, and agriculture. The BLS Producer Price Index measures selling prices received by domestic producers and is the primary leading indicator for how energy cost shocks propagate into consumer prices. See BLS PPI: The Producer Price Index and the Federal Inflation Dataset That Leads CPI.

The Federal Highway Administration's traffic and freight data tracks motor vehicle miles traveled and freight tonnage — the two primary drivers of gasoline and diesel demand that EIA uses in its petroleum consumption forecasts. See FHWA Highway Data: The Federal Dataset Behind Bridge Conditions, Pavement Quality, and Traffic Counts.