EIA Energy Data: The Federal Database Behind Oil Prices, Natural Gas Storage, and Electricity Generation
The Energy Information Administration is the statistical agency of the Department of Energy and the primary federal authority for US energy data. By statute its data collection and publication functions are independent of the policy arm of DOE — EIA's numbers cannot be altered for political purposes. The result is a comprehensive, mandatory, publicly available record of every significant dimension of US energy: production, consumption, trade, prices, and stocks across petroleum, natural gas, coal, electricity, and renewable energy. Markets move on EIA releases. Policy is set from EIA forecasts. Academic energy economics runs on EIA data.
EIA's Mandate and Independence
Congress established the EIA in 1977 under the Department of Energy Organization Act. The statutory framework deliberately separates EIA from the policy functions of DOE. EIA's administrator reports to the Secretary of Energy, but the agency's analysis and forecasts are not subject to review or approval by DOE or the Executive Office of the President. EIA publishes what the data show, not what any administration would prefer them to show. This independence is not merely customary; it is written into Section 205(d) of the Department of Energy Organization Act, under which the administrator does not need approval from any other DOE officer or employee to collect or analyze information, or to publish the substance of EIA's statistical and forecasting reports.
EIA's primary data collection authority covers all US energy: petroleum production, refinery inputs, product supplied, and inventory; natural gas production, storage, imports, exports, and consumption; coal production, stocks, and consumption; electricity generation, retail sales, prices, and utility financial data; and nuclear, wind, solar, geothermal, and biomass energy. For petroleum and natural gas, EIA data reporting is mandatory for companies above threshold size, with civil penalties for non-compliance. The mandatory reporting framework gives EIA data coverage that private data services — which rely on voluntary participation or purchase agreements — cannot match.
Short-Term Energy Outlook (STEO)
The Short-Term Energy Outlook is EIA's monthly flagship publication: a 12-to-18-month forward forecast for the full US and global energy market. STEO is released once a month, usually on a Tuesday in the first half of the month and typically at noon Eastern time, and covers:
- Crude oil prices — West Texas Intermediate (WTI) and Brent crude spot price forecasts in dollars per barrel, monthly, through the forecast horizon. WTI is the US benchmark and the settlement price for NYMEX crude futures; Brent is the international benchmark set in the North Sea.
- Natural gas prices — Henry Hub spot price forecast in dollars per million BTU (MMBtu). Henry Hub in Erath, Louisiana is the delivery point for NYMEX natural gas futures and the US gas market benchmark.
- Retail gasoline and diesel prices — regular-grade retail gasoline and on-highway diesel forecasts by region (US average, East Coast, Midwest, Gulf Coast, Rocky Mountain, West Coast).
- Electricity prices — average retail electricity price forecasts by sector (residential, commercial, industrial) and national average.
- Energy production — forecast US crude oil production (barrels per day), natural gas production (billion cubic feet per day), and electricity generation by fuel type.
- Demand by sector — petroleum product demand (gasoline, distillate, jet fuel, residual fuel oil), natural gas consumption by sector (residential, commercial, industrial, electric power), and electricity demand.
STEO data tables are published in Excel format and via the EIA API, allowing programmatic extraction of all forecast series. Historical STEO archives go back to 1995, providing a decades-long record of how EIA's forecasts compared to actual outcomes — a useful input to any model that must account for forecast uncertainty.
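Archived STEO vintages can be scored against realized values to get a rough sense of forecast uncertainty by horizon. The sketch below assumes the forecast and actual pairs have already been extracted from the archive; the function name is hypothetical and the sample numbers are placeholders, not EIA data.

```python
from collections import defaultdict


def mean_abs_pct_error_by_horizon(rows):
    """rows: iterable of (horizon_months, forecast, actual) tuples.

    Returns {horizon_months: mean absolute percentage error}.
    """
    errors = defaultdict(list)
    for horizon, forecast, actual in rows:
        errors[horizon].append(abs(forecast - actual) / abs(actual) * 100.0)
    return {h: sum(v) / len(v) for h, v in sorted(errors.items())}


# Placeholder values for illustration only; not actual STEO forecasts.
sample = [
    (1, 80.0, 82.0),
    (1, 75.0, 72.0),
    (12, 70.0, 85.0),
    (12, 90.0, 60.0),
]
mape = mean_abs_pct_error_by_horizon(sample)
```

Grouping by horizon matters because a one-month-ahead forecast and a twelve-month-ahead forecast are not comparable; a model that consumes STEO would normally apply a wider error band to the longer horizons.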
The political dimension of STEO is significant. EIA's WTI price forecasts are closely watched against OPEC+ production decisions, US shale output trends, and macroeconomic demand conditions. When STEO's crude price forecast diverges substantially from futures market pricing — which reflects the market's aggregate view — the discrepancy itself becomes news. EIA STEO forecast revisions are also used to time debates over Strategic Petroleum Reserve releases, which require justification in terms of the market impact EIA data projects.
Weekly Petroleum Status Report
The Weekly Petroleum Status Report (WPSR) is the most market-moving regular EIA release. Published every Wednesday at 10:30 AM Eastern time, the WPSR reports petroleum inventory levels and flows for the week ending the previous Friday. Within minutes of release, WTI futures prices move in response to the headline inventory number.
The single most market-sensitive figure in the WPSR is commercial crude oil inventories at Cushing, Oklahoma. Cushing is the physical delivery point for NYMEX WTI crude oil futures contracts: when a trader holds a futures contract to expiry and takes physical delivery, the crude arrives at Cushing storage terminals. Cushing stocks therefore directly affect the cost of carrying futures positions and the WTI spot-futures spread. An unexpected weekly build of 3 to 5 million barrels at Cushing can push WTI down by $1 to $2 per barrel within minutes; an unexpected draw of similar magnitude can push it up by the same amount. The weekly Cushing stock figure (EIA series code PET.W_EPC0_SAX_YCUOK_MBBL.W) is the closest thing in the federal data catalog to a real-time demand gauge for the US crude oil market.
Beyond Cushing, the WPSR covers:
- Total commercial petroleum stocks — crude oil plus all finished petroleum products held in primary storage across the US. The broadest inventory measure; compared to the five-year seasonal average to assess whether stocks are above or below historical norms.
- Refinery utilization rates — the percentage of operable refinery capacity that is actively processing crude. Utilization affects the product supply balance: low utilization (due to turnarounds, hurricanes, or demand weakness) tightens gasoline and distillate stocks; high utilization builds product inventories. US refinery capacity runs between 17 and 18 million barrels per day; utilization routinely reaches 90 to 95 percent during peak driving season.
- Gasoline and distillate fuel oil stocks — total commercial inventories of motor gasoline and distillate (diesel plus heating oil). Gasoline stocks are the primary driver of retail gasoline price expectations; distillate stocks affect diesel prices and, in the US Northeast where heating oil is still common, home heating costs.
- Crude oil imports and exports — weekly volumes by country of origin for imports; total export volumes. The US became a net petroleum exporter on an annual basis in 2020, for the first time since at least 1949, a transition visible in the WPSR export data.
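The comparison against the five-year seasonal average mentioned in the list above can be sketched as follows. This assumes stocks are keyed by (year, week-of-year) and uses a plain mean of the same week in the five preceding years; EIA's own published range may be computed differently, and the sample values are placeholders.

```python
def five_year_average(stocks, year, week):
    """Mean of the same week over the five years before `year`."""
    prior = [stocks[(y, week)] for y in range(year - 5, year) if (y, week) in stocks]
    if len(prior) < 5:
        raise ValueError("need five prior years of data for this week")
    return sum(prior) / len(prior)


def deviation_from_norm(stocks, year, week):
    """Return (absolute deviation, percent deviation) from the five-year mean."""
    avg = five_year_average(stocks, year, week)
    current = stocks[(year, week)]
    return current - avg, (current / avg - 1.0) * 100.0


# Placeholder inventory levels in million barrels; not EIA data.
stocks = {
    (2019, 10): 440.0,
    (2020, 10): 450.0,
    (2021, 10): 460.0,
    (2022, 10): 430.0,
    (2023, 10): 470.0,
    (2024, 10): 441.0,
}
diff, pct = deviation_from_norm(stocks, 2024, 10)
```

A negative result means stocks sit below the seasonal norm for that week, which is the condition usually read as a tighter supply balance.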
Natural Gas Weekly Update and Storage Report
The EIA Natural Gas Weekly Update is published each Thursday. The headline number that moves Henry Hub gas futures, the weekly net injection or withdrawal figure, comes from the Weekly Natural Gas Storage Report, which is compiled from Form EIA-912 and released at 10:30 AM Eastern on Thursdays. Natural gas futures and options traders call it simply “the storage number,” and a miss of roughly 10 billion cubic feet (Bcf) in either direction relative to the consensus analyst estimate can move front-month Henry Hub futures by 5 to 15 cents per MMBtu.
US natural gas storage facilities are categorized by EIA into five regions:
- East — includes the Mid-Atlantic and Appalachian producing region
- Midwest — the large Midwest storage complex serving residential and industrial demand
- Mountain — Rocky Mountain region
- Pacific — California and Pacific Northwest
- South Central — the Gulf Coast storage hub, divided between salt cavern and depleted field storage; the largest region by capacity
The injection and withdrawal pattern is highly seasonal: net injections accumulate storage from April through October as summer production exceeds demand, building stocks in anticipation of winter heating demand. Net withdrawals run from November through March. The magnitude of the winter draw depends on temperatures, measured in heating degree days (HDD): each degree by which a day's average temperature falls below 65°F counts as one HDD. A colder-than-normal winter draws more gas, drives storage to lower seasonal lows, and pushes spring Henry Hub prices higher as the market prices in a tighter supply balance heading into the next injection season. A warm winter leaves storage high, suppresses spring and summer prices, and reduces the incentive to inject. The weekly storage number and the weekly HDD tracking from NOAA are the two numbers natural gas traders watch most closely through the winter months.
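The heating degree day arithmetic described above can be written out directly. This is a minimal sketch that takes the daily average as the midpoint of the day's high and low; the function names are illustrative, and the temperatures in the example are made up.

```python
BASE_TEMP_F = 65.0  # reference temperature for degree-day calculations


def heating_degree_days(daily_high_f, daily_low_f, base=BASE_TEMP_F):
    """HDD for one day: degrees by which the daily mean falls below the base."""
    mean_temp = (daily_high_f + daily_low_f) / 2.0
    return max(0.0, base - mean_temp)


def weekly_hdd(daily_temps):
    """Sum HDD over an iterable of (high, low) pairs in degrees Fahrenheit."""
    return sum(heating_degree_days(high, low) for high, low in daily_temps)


# Three illustrative days: two cold, one mild enough to contribute nothing.
example_week = [(40.0, 20.0), (50.0, 30.0), (70.0, 66.0)]
total_hdd = weekly_hdd(example_week)
```

Days with a mean at or above 65°F contribute zero rather than a negative value, so a mild spell cannot offset a cold one within the same week.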
The 2022 European energy crisis illustrated how EIA storage data affects global natural gas pricing. As European countries scrambled to replace Russian pipeline gas with US LNG exports, Henry Hub prices rose from under $4/MMBtu in early 2022 to a peak above $9/MMBtu in August 2022 — the highest level in 14 years. EIA storage data showed US storage inventories tracking below the five-year average through 2022 as LNG exports competed with domestic demand for available supply, amplifying Henry Hub price volatility. The storage number each Thursday became a global market event during this period.
EIA-860 Annual Electric Generator Report
EIA Form 860 is the authoritative census of every utility-scale power plant in the United States. More than 15,000 generators are tracked, covering every facility with 1 megawatt or more of nameplate capacity. Form 860 is filed annually by plant owners and operators and includes:
- Plant identification — EIA Plant Code (persistent integer), plant name, operator name and ID, owner name, NERC reliability region, balancing authority, state, county, latitude, and longitude.
- Generator characteristics — generator ID, prime mover technology (steam turbine, combined cycle, combustion turbine, wind turbine, photovoltaic, etc.), primary energy source, nameplate capacity (MW AC for solar, MW for other technologies), summer and winter capacity ratings, commercial operation date, and planned retirement date if announced.
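- Operational status — one of a set of status codes. Existing generators: OP (operating), SB (standby/backup), OA (out of service but expected to return to service within the next calendar year), OS (out of service and not expected to return within the next calendar year), RE (retired). Proposed generators: P (planned, regulatory approvals not initiated), L (regulatory approvals pending), T (regulatory approvals received, not under construction), U (under construction, 50 percent or less complete), V (under construction, more than 50 percent complete), TS (construction complete, not yet in commercial operation), IP (planned generator canceled or indefinitely postponed).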
- Associated storage — co-located battery storage capacity and technology for generators paired with storage systems.
Form 860 is the source for all grid capacity analysis: total installed capacity by fuel type and region, the pipeline of planned additions, the schedule of announced retirements, and the geographic distribution of generation assets. Because EIA assigns persistent Plant Codes, Form 860 data is linkable across years to track the full lifecycle of each facility — from planned status through construction to commercial operation to eventual retirement — and across datasets to link generator characteristics to monthly generation from Form 923.
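As a rough illustration of the capacity analysis described above, the sketch below totals nameplate capacity by energy source for generators in a chosen set of status codes. The record layout, field names, and plant codes are hypothetical simplifications; real Form 860 files carry many more columns under different names.

```python
from collections import defaultdict

# Hypothetical, simplified generator records; values are placeholders.
generators = [
    {"plant_code": 1001, "generator_id": "1", "energy_source": "NG", "status": "OP", "nameplate_mw": 550.0},
    {"plant_code": 1001, "generator_id": "2", "energy_source": "NG", "status": "SB", "nameplate_mw": 50.0},
    {"plant_code": 2002, "generator_id": "W1", "energy_source": "WND", "status": "OP", "nameplate_mw": 200.0},
    {"plant_code": 3003, "generator_id": "PV1", "energy_source": "SUN", "status": "U", "nameplate_mw": 120.0},
    {"plant_code": 4004, "generator_id": "1", "energy_source": "BIT", "status": "RE", "nameplate_mw": 600.0},
    {"plant_code": 5005, "generator_id": "GT1", "energy_source": "NG", "status": "OP", "nameplate_mw": 150.0},
]


def capacity_by_source(records, statuses=("OP",)):
    """Sum nameplate MW by energy source for generators in the given statuses."""
    totals = defaultdict(float)
    for rec in records:
        if rec["status"] in statuses:
            totals[rec["energy_source"]] += rec["nameplate_mw"]
    return dict(totals)


operating = capacity_by_source(generators)
under_construction = capacity_by_source(generators, statuses=("U",))
```

Passing a different status tuple turns the same function into a view of the construction pipeline or of retired capacity.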
EIA-923 Power Plant Operations Report
EIA Form 923 is the monthly operational counterpart to Form 860. Where Form 860 tracks generator characteristics and status, Form 923 tracks what generators actually did each month: how much they generated, how much fuel they burned, and what that fuel cost. Form 923 is mandatory for all plants at or above the 1 MW reporting threshold.
The primary Form 923 data elements are:
- Net generation — electricity sent to the grid in megawatt-hours, by month, fuel type, and prime mover. The difference between gross generation at the turbine and net generation at the busbar reflects plant auxiliary loads.
- Fuel consumption — physical quantities consumed in the reporting month by fuel type: short tons for coal, thousand cubic feet (Mcf) for natural gas, barrels for petroleum liquids, pounds of uranium for nuclear, and MWh for pumped hydro.
- Fuel receipts — for coal and petroleum plants, monthly fuel delivery quantities, heat content, and cost per million BTU. This fuel cost data combined with heat rates (BTU/kWh) yields the variable cost of electricity generation by plant and fuel type.
- Heat rates — BTU consumed per net kWh generated, calculated from the fuel consumption and generation figures. Heat rate is the measure of a thermal plant's efficiency; combined with fuel cost, it determines dispatch economics.
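The heat rate and fuel cost arithmetic in the list above can be sketched as follows. The function names are illustrative and the inputs are round placeholder numbers; the calculation covers fuel cost only, not other variable operating costs.

```python
def heat_rate_btu_per_kwh(fuel_mmbtu, net_mwh):
    """BTU of fuel burned per net kWh generated."""
    return (fuel_mmbtu * 1_000_000) / (net_mwh * 1_000)


def fuel_cost_per_mwh(heat_rate, fuel_cost_per_mmbtu):
    """Fuel component of generation cost in dollars per MWh.

    A heat rate of 1,000 BTU/kWh equals 1 MMBtu/MWh, hence the divide by 1,000.
    """
    return heat_rate / 1_000.0 * fuel_cost_per_mmbtu


# Placeholder month: 700,000 MMBtu burned to produce 100,000 MWh at $3/MMBtu.
rate = heat_rate_btu_per_kwh(fuel_mmbtu=700_000, net_mwh=100_000)
cost = fuel_cost_per_mwh(rate, fuel_cost_per_mmbtu=3.0)
```

With these inputs the plant burns 7,000 BTU per kWh, so each MWh carries $21 of fuel cost; ranking plants by this figure is the simplest approximation of a dispatch order.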
Form 923 links to Form 860 via EIA Plant Code and generator ID, enabling plant-level analysis that connects generator characteristics (Form 860) to monthly performance (Form 923). This linkage is the foundation for every plant-level study of capacity factors, fuel switching, retirement economics, and the efficiency difference between new combined cycle gas plants and aging coal steam units.
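A minimal sketch of that linkage joins the two forms on plant code and generator ID and derives a monthly capacity factor. It assumes generator-level generation is available and uses hypothetical field names and placeholder values; real files need column mapping first.

```python
import calendar


def hours_in_month(year, month):
    """Number of hours in a calendar month."""
    return calendar.monthrange(year, month)[1] * 24


def capacity_factors(form860_rows, form923_rows):
    """Attach a capacity factor to each generation row with a matching generator."""
    capacity = {
        (r["plant_code"], r["generator_id"]): r["nameplate_mw"] for r in form860_rows
    }
    results = []
    for row in form923_rows:
        mw = capacity.get((row["plant_code"], row["generator_id"]))
        if not mw:
            continue  # no matching generator, or zero capacity
        max_mwh = mw * hours_in_month(row["year"], row["month"])
        results.append({**row, "capacity_factor": row["net_mwh"] / max_mwh})
    return results


# Placeholder rows; plant codes and values are invented for illustration.
form860 = [{"plant_code": 1001, "generator_id": "1", "nameplate_mw": 100.0}]
form923 = [
    {"plant_code": 1001, "generator_id": "1", "year": 2023, "month": 1, "net_mwh": 37_200.0},
    {"plant_code": 9999, "generator_id": "1", "year": 2023, "month": 1, "net_mwh": 500.0},
]
linked = capacity_factors(form860, form923)
```

Rows without a match are dropped silently here; a production pipeline would more likely log them, since unmatched keys often indicate an ID formatting mismatch between the two files.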
Electric Power Monthly
The Electric Power Monthly (EPM) is EIA's comprehensive monthly compilation of US electricity statistics. Published approximately 60 days after the reference month, EPM aggregates Form 923 generation data, Form 861 retail sales data, and additional price and capacity factor series into a single reference publication covering the national and state-level electricity sector.
Key EPM series include net electricity generation by fuel type for the US total and each state, broken into utility, independent power producer, and combined heat and power sectors; retail electricity sales in megawatt-hours and revenues in dollars by sector (residential, commercial, industrial, transportation) at the state and national levels; average retail electricity prices by sector and state; and generation capacity factors by fuel type and region, showing how intensively each technology type is being utilized.
The EPM's value as a historical archive of the US energy transition is substantial. The coal-to-gas displacement of the 2010s, the wind build-out in the Great Plains and Texas, the solar inflection that began around 2018 as module costs fell below $0.40/watt, the nuclear capacity factor improvements from relicensing and uprates, and the recent emergence of battery storage as a grid resource are all quantified in EPM's monthly state-level generation tables. Because EPM data extends back to 2001 in consistent format and is available via the EIA API, it is the preferred source for long-run electricity market research.
Petroleum Supply Monthly
The Petroleum Supply Monthly (PSM) provides detailed monthly statistics on US crude oil and petroleum product supply, disposition, and inventories at the national and state levels, with a lag of approximately 60 days. PSM expands on the weekly WPSR data with significantly more geographic and product-level detail:
- Crude oil production by state — monthly barrels per day of field production for each producing state, extending back decades. The shale revolution in Texas (Permian Basin and Eagle Ford) and North Dakota (Bakken) is quantified in this series: combined Texas and North Dakota output rose from under 2 million barrels per day in 2010 to over 6 million barrels per day by 2023.
- Crude oil imports by country of origin — monthly volumes from each supplying country, revealing which exporters supply US refineries and how that mix has shifted as domestic production increased and OPEC+ supply management changed global trade flows.
- Refinery processing and product output — monthly crude inputs to refineries and output of each refined product: motor gasoline, distillate fuel oil, jet fuel, residual fuel oil, liquefied petroleum gases, and other products. Refinery yields — the percentage of each product in the output barrel — vary by crude type, refinery configuration, and seasonal demand signals.
- Product supplied — EIA's proxy for petroleum product demand, calculated as production plus imports minus exports minus inventory change. Product supplied is an imperfect demand measure because it reflects supply-side adjustments, but it is the only monthly demand estimate available for the US at the national product level.
- Inventory changes — monthly stock changes by product at primary storage facilities, pipelines, and in transit, providing a more detailed view of the supply balance than the weekly WPSR numbers.
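The product supplied identity described in the list above is simple enough to state as code. The function name and the sample figures are illustrative placeholders, with all quantities in the same unit (for example, million barrels per day).

```python
def product_supplied(production, imports, exports, stock_change):
    """Implied demand: production plus imports, minus exports and inventory change.

    stock_change is positive for a build and negative for a draw.
    """
    return production + imports - exports - stock_change


# Placeholder month with a small inventory build.
with_build = product_supplied(production=9.5, imports=0.6, exports=0.9, stock_change=0.1)

# Same flows with an equal-sized inventory draw instead.
with_draw = product_supplied(production=9.5, imports=0.6, exports=0.9, stock_change=-0.1)
```

The sign convention is the part that is easy to get wrong: a stock build means some supply went into tanks rather than to consumers, so it lowers product supplied, while a draw raises it.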
EIA Open Data API
The EIA Open Data API at api.eia.gov is the programmatic access point for more than 500,000 time series across all EIA publication areas. A free API key obtained at eia.gov/opendata/ is required; there is no paid tier. The current version is v2, which uses a faceted query structure where callers navigate a category hierarchy and filter dimensions using named facets.
The API v2 base URL is https://api.eia.gov/v2/. Top-level categories include petroleum, natural-gas, electricity, coal, nuclear-outages, total-energy, and aeo (the Annual Energy Outlook). Each category has sub-routes for specific datasets. For example:
- /petroleum/pri/spt/data/ — crude oil spot prices (WTI and Brent); the legacy series ID for weekly WTI is PET.RWTC.W
- /natural-gas/pri/fut/data/ — natural gas futures prices; the legacy series for weekly Henry Hub spot is NG.RNGWHHD.W
- /petroleum/stoc/wstk/data/ — weekly petroleum stocks; the series for total US commercial crude is PET.WCRSTUS1.W and for Cushing crude is PET.W_EPC0_SAX_YCUOK_MBBL.W
- /natural-gas/stor/wkly/data/ — weekly natural gas storage by region
- /electricity/electric-power-operational-data/data/ — Form 923 monthly generation by fuel type and sector
- /electricity/rto/fuel-type-data/data/ — EIA-930 hourly generation by fuel type and balancing authority
Legacy v1 series IDs (structured as DATASET.SERIESNAME.FREQUENCY) can be accessed via the v2 compatibility endpoint at https://api.eia.gov/v2/seriesid/SERIES_ID/data/. This is the most reliable approach for known series IDs, as the v1-to-v2 migration preserved all series codes while restructuring the category navigation. Pagination is handled through offset and length parameters (maximum 5,000 records per page). The API response structure is response.data (the records array), response.total (total matching record count), and response.description (series metadata).
The Political Dimension: Energy Independence and Policy Debates
EIA data sits at the center of several politically charged policy debates, and its independence is tested precisely because the data matters so much to those debates.
The question of US energy independence — a phrase with a long and often misleading political history — has an empirically precise answer in EIA data. The US became a net exporter of petroleum on an annual basis in 2020, the first time since at least 1949. It became a net exporter of natural gas on a sustained basis in 2017 as LNG export terminals came online. The primary driver was the shale revolution: US crude oil production rose from 5.0 million barrels per day in 2008 to a pre-pandemic peak of 12.9 million barrels per day in November 2019. EIA's Petroleum Supply Monthly and its production estimates are the primary public record of this transformation.
STEO crude oil price forecasts interact with OPEC+ production management decisions in ways that create feedback loops. When EIA projects that US shale production will fill a supply gap that OPEC+ cuts are intended to create, OPEC+ may deepen or extend cuts, which changes the supply balance EIA is forecasting. The STEO releases closest to OPEC+ ministerial meetings are watched especially closely as signals of how EIA expects the market to absorb the group's production decisions. EIA forecasters explicitly incorporate futures prices and OPEC+ guidance into their price path assumptions, making STEO a hybrid of quantitative modeling and stated policy assumptions.
The Strategic Petroleum Reserve (SPR) — the 700-million-barrel emergency crude oil stockpile held in salt caverns on the US Gulf Coast — is managed by the Department of Energy, but SPR release decisions are justified and evaluated using EIA data. The Biden administration's 2022 SPR releases, totaling approximately 180 million barrels over about a year, were framed in terms of the supply gap created by Russia's invasion of Ukraine and the anticipated market impact measured against EIA's supply-demand balance projections. EIA's weekly inventory data then recorded the drawdown as it occurred and tracked how quickly market prices responded — providing the empirical record for evaluating whether the release achieved its stated objectives.
The natural gas market provides another example of EIA data's political salience. The 2022 European energy crisis, driven by Russia's reduction of pipeline gas flows to Europe, created a demand surge for US LNG exports that EIA quantified in real time through its LNG export terminal utilization and export volume tracking. EIA's projection that European LNG demand would keep Henry Hub above $5/MMBtu through 2023 informed debates about permitting new LNG export terminals — a debate where both sides cited EIA data selectively to support their case.
Python: Cushing Crude Stocks and Henry Hub Gas Price, Dual-Axis Chart
The following script uses the EIA Open Data API to pull two weekly time series: crude oil inventories at Cushing, Oklahoma and the Henry Hub natural gas spot price. It plots them on a dual-axis chart, which makes the independent dynamics of the two markets visible while allowing visual inspection of periods when both responded to common macro drivers such as the 2020 demand collapse or the 2022 European energy crisis spike. The script uses the v2 series compatibility endpoint with legacy series IDs, the most reliable approach for well-known EIA series with long histories.
import requests
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to a file, no display needed
import matplotlib.pyplot as plt

EIA_API_BASE = "https://api.eia.gov/v2/seriesid/"

# EIA v1-style series IDs still work via the v2 compatibility endpoint
# PET.WCRSTUS1.W - Weekly crude oil stocks, total US (thousand barrels)
# PET.W_EPC0_SAX_YCUOK_MBBL.W - Weekly crude stocks at Cushing OK (thousand barrels)
# NG.RNGWHHD.W - Henry Hub natural gas spot price (dollars per million BTU)
SERIES = {
    "cushing_stocks": "PET.W_EPC0_SAX_YCUOK_MBBL.W",
    "henry_hub_price": "NG.RNGWHHD.W",
}


def fetch_series(api_key, series_id, start="2015-01-01", end="2025-12-31"):
    """
    Pull a weekly EIA time series using the v2 series endpoint.
    Returns a DataFrame with columns: period (date), value (numeric).
    """
    params = {
        "api_key": api_key,
        "frequency": "weekly",
        "data[0]": "value",
        "start": start,
        "end": end,
        "sort[0][column]": "period",
        "sort[0][direction]": "asc",
        "offset": 0,
        "length": 5000,
    }
    url = EIA_API_BASE + series_id + "/data/"
    records = []
    # Page through the results until every matching record has been read
    while True:
        resp = requests.get(url, params=params, timeout=30)
        resp.raise_for_status()
        body = resp.json()
        page = body["response"]["data"]
        records.extend(page)
        # Cast defensively in case the count arrives as a string
        total = int(body["response"]["total"])
        params["offset"] += len(page)
        if params["offset"] >= total or not page:
            break
    df = pd.DataFrame(records)
    df["period"] = pd.to_datetime(df["period"])
    df["value"] = pd.to_numeric(df["value"], errors="coerce")
    return df.dropna(subset=["value"]).sort_values("period").reset_index(drop=True)


def plot_cushing_vs_henry_hub(api_key, output_path="eia_cushing_henry_hub.png"):
    """
    Dual-axis chart: Cushing crude stocks (left axis, million barrels)
    and Henry Hub spot price (right axis, dollars per MMBtu).
    The inverse relationship between Cushing inventory and gas price is
    structurally weak but both series respond to the same macro drivers:
    industrial demand, weather extremes, and supply disruptions.
    """
    cushing = fetch_series(api_key, SERIES["cushing_stocks"])
    henryhub = fetch_series(api_key, SERIES["henry_hub_price"])

    # Convert Cushing from thousand barrels to million barrels
    cushing["value_mmbbl"] = cushing["value"] / 1000.0

    # Align on common date index via merge
    merged = pd.merge(
        cushing[["period", "value_mmbbl"]].rename(columns={"value_mmbbl": "cushing"}),
        henryhub[["period", "value"]].rename(columns={"value": "henry_hub"}),
        on="period",
        how="inner",
    )

    fig, ax1 = plt.subplots(figsize=(14, 6))
    color_cushing = "#0b4a8f"
    color_henryhub = "#d97706"

    # Left axis: Cushing stocks as a shaded area with an outline
    ax1.fill_between(
        merged["period"],
        merged["cushing"],
        alpha=0.25,
        color=color_cushing,
        label="_nolegend_",
    )
    ax1.plot(
        merged["period"],
        merged["cushing"],
        color=color_cushing,
        linewidth=1.6,
        label="Cushing crude stocks (MMbbl)",
    )
    ax1.set_ylabel("Cushing, OK Crude Stocks (million barrels)", color=color_cushing, fontsize=10)
    ax1.tick_params(axis="y", labelcolor=color_cushing)

    # Right axis: Henry Hub spot price
    ax2 = ax1.twinx()
    ax2.plot(
        merged["period"],
        merged["henry_hub"],
        color=color_henryhub,
        linewidth=1.6,
        label="Henry Hub spot price ($/MMBtu)",
        alpha=0.9,
    )
    ax2.set_ylabel("Henry Hub Natural Gas Price ($/MMBtu)", color=color_henryhub, fontsize=10)
    ax2.tick_params(axis="y", labelcolor=color_henryhub)

    # Annotate the 2022 European energy crisis spike
    spike_date = pd.Timestamp("2022-08-22")
    mask = abs(merged["period"] - spike_date) < pd.Timedelta(days=10)
    if mask.any():
        spike_price = float(merged.loc[mask, "henry_hub"].max())
        ax2.annotate(
            "Aug 2022 spike\n$" + str(round(spike_price, 2)) + "/MMBtu",
            xy=(spike_date, spike_price),
            xytext=(pd.Timestamp("2021-06-01"), spike_price * 0.85),
            fontsize=8,
            color=color_henryhub,
            arrowprops={"arrowstyle": "->", "color": color_henryhub},
        )

    # Combine the legend entries from both axes into one box
    lines1, labels1 = ax1.get_legend_handles_labels()
    lines2, labels2 = ax2.get_legend_handles_labels()
    ax1.legend(lines1 + lines2, labels1 + labels2, loc="upper left", fontsize=9)
    ax1.set_title(
        "Cushing Crude Oil Stocks vs. Henry Hub Natural Gas Price (Weekly, EIA)",
        fontsize=13,
        fontweight="bold",
    )
    ax1.set_xlabel("Week ending date")
    fig.tight_layout()
    plt.savefig(output_path, dpi=150)
    print("Saved chart to " + output_path)
    return merged


# --- Usage ---
# Register for a free EIA API key at https://www.eia.gov/opendata/
# api_key = "YOUR_EIA_API_KEY"
# df = plot_cushing_vs_henry_hub(api_key)
# print(df.tail(10))
The Cushing series (PET.W_EPC0_SAX_YCUOK_MBBL.W) reports in thousand barrels; the script converts to million barrels for readability. The Henry Hub series (NG.RNGWHHD.W) reports in dollars per MMBtu directly. The August 2022 spike annotation appears automatically if the merged dataset contains any week within 10 days of August 22, 2022; the script annotates the highest price in that window. To extend the script to include total US commercial crude stocks alongside Cushing, add series PET.WCRSTUS1.W and plot it on a third axis or a separate panel; the ratio of Cushing stocks to total US stocks is a useful measure of the concentration of WTI pricing pressure at the Cushing hub.
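The Cushing-to-total ratio mentioned above could be computed along these lines. This sketch works on raw record lists shaped like the API's data array (dicts with "period" and "value"), assumes both series are in the same unit, and uses placeholder values in place of fetched data; the function name is hypothetical.

```python
def cushing_share(cushing_records, total_records):
    """Cushing stocks as a percent of total US commercial crude, by period.

    Only periods present in both series, with a nonzero total, are returned.
    """
    total_by_period = {r["period"]: float(r["value"]) for r in total_records}
    shares = {}
    for rec in cushing_records:
        total = total_by_period.get(rec["period"])
        if total:
            shares[rec["period"]] = float(rec["value"]) / total * 100.0
    return shares


# Placeholder records in thousand barrels; not EIA data.
cushing_rows = [
    {"period": "2024-01-05", "value": "30000"},
    {"period": "2024-01-12", "value": "33600"},
    {"period": "2024-01-19", "value": "31000"},
]
total_rows = [
    {"period": "2024-01-05", "value": "420000"},
    {"period": "2024-01-12", "value": "420000"},
]
shares = cushing_share(cushing_rows, total_rows)
```

Values are passed through float() because the API may deliver them as strings; the week missing from the total series is skipped rather than treated as zero.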
EIA data is built into the federal economic accounting system. The Bureau of Economic Analysis uses EIA petroleum and natural gas price data as source inputs in constructing the GDP deflators and the Personal Consumption Expenditures price index. For the broader context of how federal statistical agencies produce and revise national economic accounts, see BEA GDP and National Accounts: The Federal Dataset That Measures the US Economy.
Energy input costs drive producer price inflation across manufacturing, transportation, and agriculture. The BLS Producer Price Index measures selling prices received by domestic producers and is the primary leading indicator for how energy cost shocks propagate into consumer prices. See BLS PPI: The Producer Price Index and the Federal Inflation Dataset That Leads CPI.
The Federal Highway Administration's traffic and freight data tracks motor vehicle miles traveled and freight tonnage — the two primary drivers of gasoline and diesel demand that EIA uses in its petroleum consumption forecasts. See FHWA Highway Data: The Federal Dataset Behind Bridge Conditions, Pavement Quality, and Traffic Counts.