Population-Weighted HDD/CDD — Why Where People Live Changes Energy Demand Forecasts

What Standard HDD/CDD Gets Wrong

Heating degree days (HDD) and cooling degree days (CDD) have been the standard measure of weather-sensitive energy demand since the 1940s, when utilities developed them to estimate fuel oil consumption. The arithmetic is simple: one HDD accumulates for each degree Fahrenheit that the daily average temperature falls below 65°F, and one CDD accumulates for each degree above it. Sum across days, and you get a running measure of the seasonal heating or cooling load.
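As a minimal sketch of that arithmetic, the Python below computes daily HDD and CDD against the 65°F base and sums them over a week. The function names and the sample temperatures are illustrative assumptions, not data from any particular service.

```python
BASE_TEMP_F = 65.0  # conventional degree-day base temperature


def hdd(t_avg_f, base=BASE_TEMP_F):
    """Heating degree days for one day: degrees below the base, never negative."""
    return max(0.0, base - t_avg_f)


def cdd(t_avg_f, base=BASE_TEMP_F):
    """Cooling degree days for one day: degrees above the base, never negative."""
    return max(0.0, t_avg_f - base)


# Hypothetical daily average temperatures (°F) for one week.
daily_avg_f = [30.0, 42.5, 65.0, 71.0, 58.0, 80.0, 64.0]

# Accumulate across days to get the running heating and cooling load.
total_hdd = sum(hdd(t) for t in daily_avg_f)
total_cdd = sum(cdd(t) for t in daily_avg_f)

print(total_hdd, total_cdd)  # 65.5 21.0
```

A day at exactly 65°F contributes nothing to either total, and any single day contributes to at most one of the two.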

The problem is not the formula. It is the aggregation. Most commercial HDD/CDD services report regional figures using one of two methods: averaging readings across a set of National Weather Service observation stations, or averaging grid point temperatures across a geographic area. Both methods treat all points in the region as equally important. A grid cell in the high plains of eastern Montana receives the same weight as a grid cell covering suburban Chicago.

From a demand perspective, this is wrong. Energy consumption — residential heating, commercial cooling, industrial load — is concentrated where people live and work. The high plains of Montana cover roughly 90,000 square miles and contain around 300,000 people. The Chicago metropolitan area covers about 10,000 square miles and contains nearly 10 million. A temperature anomaly of the same magnitude in both locations produces energy demand responses that differ by a factor of more than 30 — not because of anything inherent to the cold, but because of who is there to turn up the thermostat.

The Core Problem

Equal-weight geographic averages systematically over-represent lightly populated regions and under-represent densely populated demand centers. The bias is most severe in the Northeast and South Central regions, where population density is highly uneven across the regional geographic footprint.

Why this produces forecast errors

When a simple regional average weights a cold snap in sparsely populated grid cells equally with those in dense metro areas, the resulting HDD figure can overstate demand by a material margin. Conversely, when the anomaly is concentrated in population centers — a heat dome settling over Dallas, a polar vortex spilling into the Great Lakes — simple geographic averages may understate the actual demand signal.

These are not edge cases. The largest demand-driven price events in natural gas markets tend to be episodes where an anomaly hits a population-dense region hard. The geographic average may correctly describe the regional temperature, but it answers the wrong question. For energy markets, the question is not "how cold was the region?" but "how many people needed more heat, and by how much?"

Why Energy Demand Follows People, Not Geography

US natural gas demand in the residential and commercial sectors — the weather-sensitive portion of the demand stack — is dominated by space heating in winter and, to a lesser extent, space cooling in summer. This load is inherently tied to where people spend time in buildings. Industrial demand has some weather sensitivity too, but it is smaller in magnitude and far less correlated with short-term temperature swings. For a gas trader trying to forecast weekly storage draws or near-term basis moves, residential and commercial weather-driven demand is the variable that matters most.

US population is highly concentrated. The top ten metropolitan statistical areas account for roughly a quarter of the national population. The Northeast corridor from Boston to Washington, D.C., packs more than 50 million people into a relatively narrow geographic band. The Gulf Coast from Houston through New Orleans contains another 8 million in a high-CDD environment. The Mountain region, by contrast, stretches across enormous territory (Nevada, Utah, Colorado, Wyoming, Idaho, Montana, Arizona, New Mexico) with total population comparable to a single Northeast city cluster.

When you build an HDD or CDD figure that does not account for this distribution, you are measuring the climate of the land, not the climate of the people. These are related but not the same thing. For energy demand purposes, only one of them matters.

Why the Normals Must Be Weighted Too

Population weighting the forecast temperature is only half of the problem. If you weight the forecast by population but compare it against a normal that was computed using a simple geographic average — the standard NWS climate normal — your deviation is measuring the wrong thing. The forecast says one thing; the baseline says another. Any anomaly you calculate mixes two different methodologies.

The right approach: apply identical population weights to both forecast and normals, using the same 91 grid points and the same regional groupings. When the forecast and the normal use the same structure, the deviation you calculate is genuinely apples-to-apples. A +4 HDD anomaly in the South Central region then means that the same population-weighted quantity used to define "normal" in that region is running 4 degree days above the 30-year average for that date, which in heating season is roughly 4°F colder in population-weighted terms.
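The inconsistency can be made concrete with a small sketch. The example below assumes a hypothetical three-point region in which the forecast matches the climatological normal at every grid point, so the true anomaly is zero. All temperatures, populations, and names are invented for illustration.

```python
def weighted_hdd(temps_f, weights, base=65.0):
    """Weighted average HDD across grid points (weights need not sum to 1)."""
    total_w = sum(weights)
    return sum(max(0.0, base - t) * w for t, w in zip(temps_f, weights)) / total_w


# Hypothetical region: one large metro and two rural grid points.
population = [9_000_000, 500_000, 500_000]  # illustrative populations
equal = [1.0, 1.0, 1.0]                     # simple geographic average

normal_f = [35.0, 20.0, 15.0]   # hypothetical normals for the date (°F)
forecast_f = list(normal_f)     # forecast exactly equals normal everywhere

# Same weights on both sides: the deviation is zero, as it should be.
consistent = weighted_hdd(forecast_f, population) - weighted_hdd(normal_f, population)

# Population-weighted forecast against an equal-weight normal: a spurious anomaly.
mixed = weighted_hdd(forecast_f, population) - weighted_hdd(normal_f, equal)

print(consistent, round(mixed, 2))
```

With matched weights the deviation is exactly zero. With mixed weights the calculation reports a warm anomaly of nearly 10 HDD that is entirely a product of the methodology, not of the weather.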

The degree-day baseline: 65°F

The 65°F base temperature in the standard HDD/CDD formula is a convention, not a physical constant. It was set in the mid-twentieth century as an approximation of the outdoor temperature above which most buildings do not require active heating. In an era of better insulation, passive solar design, and internal heat gains from electronics and occupants, the actual balance point for modern buildings is somewhat lower — closer to 55–60°F in many residential applications. The 65°F standard persists because decades of historical data are built on it, and its absolute value matters less for market analysis than its consistency: traders and storage analysts use the same baseline, which means the signals are comparable across time and across market participants.
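One way to see why consistency matters more than the absolute level is to compute the same deviation at two different bases. The sketch below uses hypothetical temperatures and a 60°F alternative base chosen only for illustration.

```python
def hdd(t_avg_f, base):
    """Heating degree days for one day at a given base temperature (°F)."""
    return max(0.0, base - t_avg_f)


def deviation(forecast_f, normal_f, base):
    """Forecast HDD minus normal HDD, both measured at the same base."""
    return hdd(forecast_f, base) - hdd(normal_f, base)


bases = (65.0, 60.0)

# Hypothetical mid-winter day: forecast 28°F, normal 34°F (below either base).
winter = {base: deviation(28.0, 34.0, base) for base in bases}

# Hypothetical shoulder-season day: forecast 62°F, normal 63°F (near the base).
shoulder = {base: deviation(62.0, 63.0, base) for base in bases}

print(winter)    # same deviation at both bases
print(shoulder)  # deviation depends on the base
```

When both temperatures sit below either base, the base cancels out of the deviation. Near the base it does not, which is why the forecast and the normal still need to share one convention.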

How SoftSignal Builds Population-Weighted HDD/CDD

SoftSignal's population-weighted HDD/CDD is built on three datasets (a forecast, a set of normals, and a population grid), organized on a shared set of grid points and regions. The design goal is that the same grid structure and the same weights apply to both the forecast and the normals, so that when you look at a deviation from normal, you are comparing like to like across every step of the calculation.

The components

1. 91 grid points, continental US. Temperature forecast and normal values are read at 91 fixed grid points distributed across the continental United States. Grid point selection prioritizes coverage of major population centers within each of the nine energy regions, while maintaining sufficient geographic spread to capture regional temperature gradients.

2. Population weights. Each grid point is assigned a weight proportional to the resident population it represents, derived from census data. Points in high-density metros receive large weights; points in rural or sparsely populated areas receive small ones. Weights are normalized to sum to 1.0 within each region and again across the full continental domain.

3. GFS 16-day forecast. Daily maximum and minimum temperature forecasts from the Global Forecast System (GFS) are extracted at each of the 91 grid points. Average daily temperature is computed as (Tmax + Tmin) / 2. The 16-day forecast horizon provides a forward demand view that the weekly EIA storage report cannot: it lets you see whether demand is trending toward or away from normal before the storage data lands.

4. ERA5 climate normals, 1991–2020. The 30-year baseline for each calendar date is derived from the ERA5 global reanalysis dataset, covering the standard 1991–2020 climate normal period. ERA5 normals are computed at the same 91 grid points with the same population weights applied. This is the critical design decision: by using ERA5 for both the historical baseline and as the backbone for the current-season anomaly, the forecast-minus-normal deviation is free of the methodological inconsistency that would arise from mixing GFS forecasts against NWS station-based normals.

5. 9 US energy regions. Grid points are assigned to one of nine energy regions aligned with the EIA's standard US regional structure. Regional population-weighted HDD and CDD are computed by summing the product of each grid point's HDD (or CDD) and its normalized weight within the region. National totals are a further population-weighted sum across regions.
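The components above can be sketched end to end on a toy grid. The example below is an assumption-laden illustration, not SoftSignal's implementation: it uses four hypothetical grid points instead of 91, invented populations and temperatures, and made-up point names.

```python
from collections import defaultdict

# Hypothetical grid points: (name, region, population represented, Tmax °F, Tmin °F)
points = [
    ("metro_a", "East North Central", 6_000_000, 30.0, 18.0),
    ("rural_b", "East North Central", 1_000_000, 24.0, 10.0),
    ("metro_c", "Mountain", 2_000_000, 50.0, 30.0),
    ("rural_d", "Mountain", 1_000_000, 40.0, 20.0),
]

# Population totals used to normalize weights within each region and nationally.
region_pop = defaultdict(float)
for _, region, pop, _, _ in points:
    region_pop[region] += pop
national_pop = sum(region_pop.values())

rows = []
for name, region, pop, tmax, tmin in points:
    rows.append({
        "name": name,
        "region": region,
        "t_avg": (tmax + tmin) / 2,             # daily average temperature
        "w_region": pop / region_pop[region],   # sums to 1.0 within the region
        "w_national": pop / national_pop,       # sums to 1.0 across all points
    })

# Regional PW-HDD: sum of each point's HDD times its within-region weight.
region_hdd = defaultdict(float)
for r in rows:
    region_hdd[r["region"]] += max(0.0, 65.0 - r["t_avg"]) * r["w_region"]

# National PW-HDD: population-weighted sum across regions.
national_hdd = sum(
    region_hdd[reg] * region_pop[reg] / national_pop for reg in region_hdd
)

print(dict(region_hdd), national_hdd)
```

Because the regional weights and the regional population shares come from the same population figures, rolling regions up to the national level gives the same answer as weighting every grid point by its national weight directly.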

The formula

Population-Weighted HDD — Regional Calculation

HDD_i = max(0, 65 − T_avg_i)

PW-HDD_region = Σ (HDD_i × w_i) / Σ w_i

Deviation = PW-HDD_forecast − PW-HDD_normal

Where i = each grid point in the region, T_avg_i = (Tmax_i + Tmin_i) / 2, and w_i = population weight for grid point i. The same formula applies to both forecast and ERA5 normals.
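The three lines of the formula translate almost directly into code. The sketch below is a minimal illustration with hypothetical grid-point values; the function names are assumptions, and the CDD branch simply mirrors the HDD definition with the sign reversed.

```python
def degree_days(t_max_f, t_min_f, kind="HDD", base=65.0):
    """Degree days at one grid point from daily max and min temperature (°F)."""
    t_avg = (t_max_f + t_min_f) / 2
    if kind == "HDD":
        return max(0.0, base - t_avg)
    if kind == "CDD":
        return max(0.0, t_avg - base)
    raise ValueError("kind must be 'HDD' or 'CDD'")


def pw_degree_days(points, kind="HDD"):
    """Population-weighted degree days: sum(DD_i * w_i) / sum(w_i).

    points is a list of (t_max_f, t_min_f, weight) tuples for one region.
    """
    numerator = sum(degree_days(tmax, tmin, kind) * w for tmax, tmin, w in points)
    denominator = sum(w for _, _, w in points)
    return numerator / denominator


# Hypothetical two-point region; weights are relative populations.
forecast = [(40.0, 20.0, 7.0), (30.0, 10.0, 3.0)]
normal = [(44.0, 26.0, 7.0), (36.0, 14.0, 3.0)]

pw_forecast = pw_degree_days(forecast)
pw_normal = pw_degree_days(normal)
deviation = pw_forecast - pw_normal

print(pw_forecast, pw_normal, deviation)  # 38.0 33.0 5.0
```

Dividing by the sum of the weights means the function works whether or not the weights have already been normalized to 1.0.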

The nine regions

The regional structure follows EIA's standard US energy consumption geographic breakdown. Each region corresponds to a distinct climate and demand zone.

New England: Boston, Providence, Hartford, Portland

Middle Atlantic: New York, Philadelphia, Baltimore, Washington

East North Central: Chicago, Detroit, Cleveland, Milwaukee

West North Central: Minneapolis, Kansas City, Omaha, St. Louis

South Atlantic: Charlotte, Atlanta, Miami, Richmond

East South Central: Nashville, Birmingham, Memphis, Louisville

West South Central: Houston, Dallas, New Orleans, Oklahoma City

Mountain: Denver, Phoenix, Salt Lake City, Las Vegas

Pacific: Los Angeles, Seattle, San Francisco, Portland

Key Design Principle

Identical weights, identical grid, identical base period — for both forecast and normals. This structural consistency is what makes deviation meaningful. Any service that weights the forecast differently from its normals is measuring the difference between two methodologies as much as it is measuring weather anomaly.

What the Difference Looks Like in Practice

The gap between simple geographic averages and population-weighted figures is not always large — but when it is large, it tends to be large at exactly the moments that matter most for market positioning. The clearest examples come from cold events with uneven geographic footprints: a polar vortex outbreak that drives extreme cold across the sparsely populated northern plains while leaving the major demand centers in the Great Lakes and Northeast only moderately below normal, or conversely, a trough that drops temperatures sharply into the Texas and Oklahoma population belt while leaving the Rockies relatively unaffected.

A constructed illustration

Consider a cold front that moves across the West North Central region in early January. The surface map shows a broad area of −10 to −15°F anomaly (relative to normal) stretching from eastern Montana through the Dakotas and western Minnesota. Simultaneously, the large population centers of the region — Minneapolis–Saint Paul, Kansas City, Omaha, Des Moines — are showing a more modest −4 to −6°F anomaly as the front stalls and weakens before reaching them.

Simple geographic average: +9.3 HDD above normal, West North Central. Weights the extreme cold in the sparsely populated Dakotas and western Minnesota equally with Minneapolis, Omaha, and Kansas City.

Population-weighted: +4.1 HDD above normal, West North Central. Concentrates weight on Minneapolis–Saint Paul, Kansas City, and Omaha, where most of the region's heating demand actually resides.

Same cold front, same regional boundary, same date — but the demand-relevant signal is less than half what the simple average implies.

That difference — 9.3 vs. 4.1 HDDs above normal, for a single day in a single region — has direct implications for storage draw estimates. A regional model anchored to the simple geographic average might forecast a withdrawal materially above consensus. A population-weighted forecast anchors to the demand reality. Over a week-long forecast window, the compounding effect of that mismatch can shift a national storage draw estimate by 10–20 Bcf.
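The mechanics behind that kind of gap can be sketched with a toy region. The numbers below are hypothetical and are not the inputs behind the 9.3 and 4.1 figures; they only show how the same set of grid-point anomalies yields two different regional values depending on the weights.

```python
# Hypothetical grid points: (name, HDD above normal, population represented)
points = [
    ("rural_1", 12.0, 100_000),
    ("rural_2", 14.0, 150_000),
    ("rural_3", 13.0, 50_000),
    ("metro_1", 5.0, 3_700_000),
    ("metro_2", 4.0, 2_200_000),
    ("metro_3", 6.0, 1_000_000),
]

# Simple geographic average: every grid point counts equally.
simple = sum(anom for _, anom, _ in points) / len(points)

# Population-weighted average: each point counts in proportion to its population.
total_pop = sum(pop for _, _, pop in points)
weighted = sum(anom * pop for _, anom, pop in points) / total_pop

print(round(simple, 2), round(weighted, 2))  # 9.0 5.17
```

The three rural points make up half of the equal-weight average but only about 4 percent of the population, so their extreme readings dominate the simple figure and barely move the weighted one.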

The opposite case: anomaly concentrated in dense metros

The same logic runs in reverse. When a heat dome settles over Houston, Dallas, and San Antonio in August (one of the highest-CDD concentrations in the country), a simple geographic average for the West South Central region also includes the comparatively modest warmth in the rural portions of Oklahoma and Arkansas. The population-weighted CDD for that week assigns most of the regional weight to the large Texas metros and returns a figure that more accurately reflects the spike in air conditioning load and peak power demand.

In this direction, the population-weighted measure will exceed the simple average, and the demand signal will be stronger than a geographic-average service would indicate. Understanding which direction the bias runs — and when — is itself a market edge.

What This Means for Energy Market Signals

The population-weighted HDD/CDD framework is the demand layer of SoftSignal's Energy Intelligence Report. It is not a standalone product — it is the input that makes the forward demand view meaningful. Here is how it connects to the signals that matter in natural gas and power markets.

▸ Storage draw forecasting. The EIA weekly natural gas storage report is the single most-watched data release in the US natural gas market. Surprises relative to consensus are the primary driver of intraday price moves on Thursday mornings. Population-weighted HDD/CDD deviation from normal is the most reliable leading indicator of those surprises. When our regional HDD figures show persistent above-normal demand accumulating across high-weight regions (Middle Atlantic, East North Central, West South Central), the probability of a below-consensus (tighter) storage outcome rises.

▸ 16-day demand trajectory. The EIA report reflects what happened in the previous week. The GFS 16-day forecast, weighted by population, tells you what is likely to happen in the next two weeks before the pipeline data or next storage print confirms it. This forward window is where population-weighted HDD/CDD does its most actionable work: identifying runs of above- or below-normal demand before the market has fully priced them.

▸ Regional basis signals. US natural gas basis differentials, the spread between a regional hub price and the NYMEX Henry Hub prompt, reflect local supply/demand imbalances before they show up in aggregate national figures. When population-weighted HDDs are running materially above normal in the Northeast (New England + Middle Atlantic regions), basis at Algonquin, Transco, and Tennessee Zone 6 tends to firm. The same regional specificity that makes population weighting analytically superior also maps cleanly onto the geographic structure of basis markets.

▸ Forecast divergence as a signal. When population-weighted HDD/CDD diverges meaningfully from what the simple NWS regional averages or commercial weather service products show, that divergence itself is information. It suggests the simple-average signal used by many market participants is mispricing the demand outlook, and that a reversal toward the population-weighted view is likely as the market incorporates actual demand data.

▸ ENSO seasonal context. Population-weighted demand signals are most interpretable when layered against the seasonal ENSO outlook. A La Niña winter historically produces cooler-than-normal conditions across the northern tier of the US, and that anomaly is more or less demand-relevant depending on whether it is concentrated in the heavily populated Great Lakes and Northeast corridor or in the less-populated northern plains. The combination of ENSO phase and population-weighted demand provides a fuller picture than either alone.
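Spotting a run of above-normal demand in the 16-day window is a small bookkeeping exercise once the daily population-weighted deviations exist. The sketch below assumes a hypothetical 16-day series for one region; the values, the flat normal, and the variable names are invented for illustration.

```python
from itertools import accumulate, groupby

# Hypothetical daily PW-HDD normal and forecast for one region, 16 days.
normal = [30.0] * 16
offsets = [-2.0, -1.0, 0.0, 1.0, 3.0, 4.0, 5.0, 4.0,
           2.0, 1.0, -1.0, -2.0, 0.0, 1.0, 2.0, 1.0]
forecast = [n + d for n, d in zip(normal, offsets)]

# Daily deviation from normal and its running total over the forecast window.
daily_dev = [f - n for f, n in zip(forecast, normal)]
cumulative = list(accumulate(daily_dev))

# Longest consecutive run of days with above-normal heating demand.
longest_run = max(
    (len(list(group)) for above, group in groupby(daily_dev, key=lambda d: d > 0) if above),
    default=0,
)

print(cumulative[-1], longest_run)  # 18.0 7
```

The cumulative figure summarizes the net demand surplus over the window, while the run length distinguishes a sustained cold stretch from scattered single-day deviations that sum to the same total.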

In the Energy Intelligence Report

Each weekly report includes population-weighted HDD and CDD deviations from normal for all nine US energy regions, across both the observed trailing week and the 16-day GFS outlook. Regional heat maps highlight where demand is running materially above or below normal. The same framework is available via the SoftSignal MCP for quantitative integration into models and spreadsheet workflows.

What population weighting cannot tell you

Population-weighted HDD/CDD is a demand-side tool. It tells you how much weather-driven demand there is and how it compares to normal. It says nothing about supply: storage levels, pipeline maintenance windows, LNG export commitments, production freeze-offs, or the direction of flows from Canada. A high-demand reading against a tight supply background is a very different market condition from the same demand reading against a well-supplied background. The HDD/CDD figures are one layer of a fuller fundamental picture — the layer that anchors the demand side.

The 65°F base temperature is also a simplification. In exceptionally cold weather, demand responds nonlinearly — gas consumption per HDD rises as temperatures fall further below baseline, because both residential and commercial heating systems are working harder per unit of cold. The standard HDD framework linearizes this relationship. Our approach captures the population distribution correctly; it does not claim to model nonlinear demand response curves.