In part one of this series, I presented four load forecasting challenges that have arisen as a result of the deep penetration of behind-the-meter (BTM) distributed energy resources (e.g., solar generation), time-of-use rates, demand response programs, BTM storage, and electric vehicle charging. These four challenges are:

- Challenge 1. Growing Disconnect between Measured Load and Demand for Electricity Services
- Challenge 2. The Relationship between Measured Load and Weather is Becoming Cloudy
- Challenge 3. Increased Load Forecast Errors and Error Volatility
- Challenge 4. Constructing Load Forecast Confidence Bounds

From a load forecasting perspective, these challenges are leading to an evolution in the way we develop models and forecasts. In this blog, I introduce the steps that Itron is taking to lead this evolution.

**Incorporating BTM Solar Generation into a Real-time Load Forecast**. In 2010, one of Itron’s retail energy clients operating in Belgium first introduced me to the problem of load forecast performance eroding under a deep penetration of BTM solar generation. At that time, government-sponsored incentives had led to a ramp-up from roughly 16,000 solar power installations in 2009 to approximately 60,000 at the end of 2010. At the same time, our client, which supplies a significant portion of the Belgian market with power, experienced a significant erosion in their day-ahead forecast Mean Absolute Percentage Error (MAPE). At first, I was skeptical that BTM solar generation could have such a visible impact on a load forecast. Then I saw firsthand how much 60,000 solar power installations can swing loads in mere minutes. If you have had the good fortune to visit the lovely country of Belgium, the one thing you will remember (besides the chocolate and beer) is that it can be very cloudy there. In fact, the day I arrived at the Brussels airport, a blanket of clouds covered most of Belgium. The next day the clouds broke up around noon, and what followed was a lovely sunny day. Itron’s client had access to real-time SCADA measurements of Belgium’s transmission loads. To convince me that BTM solar was having a significant impact, they showed me the Belgian loads from the day before up to the current time. It was a moment I will never forget, because right there in the load data I could see a dramatic drop in loads just as the clouds disappeared and all 60,000 solar panels went into full generation mode at virtually the same moment. That drop in loads represented roughly 7% of this client’s total power requirements. In the span of 15 minutes, the load forecast error jumped by almost 7%. That wasn’t an erosion; that was a calamity.

This is the operating reality of many system operators throughout the world. In a matter of minutes, what is measured as load can drop or rise with nothing more than the passing of clouds overhead. From a load forecasting perspective, the load data that we build our models on have grown more volatile during the sunlight hours. This bouncing around of load erodes forecast accuracy as measured by statistics like MAPE.
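To make the MAPE statistic concrete, here is a minimal sketch in Python. The load values are hypothetical numbers chosen only to contrast a calm day with a day of cloud-driven swings; they are not data from the Belgian client.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, expressed in percent."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equal in length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical midday loads (MW) against the same flat forecast of 1,000 MW.
forecast = [1000.0, 1000.0, 1000.0, 1000.0]
stable_day = [1000.0, 1010.0, 990.0, 1000.0]    # little solar-driven movement
volatile_day = [1000.0, 930.0, 1070.0, 1000.0]  # loads swing as clouds pass

stable_mape = mape(stable_day, forecast)
volatile_mape = mape(volatile_day, forecast)
print(f"stable day MAPE:   {stable_mape:.2f}%")
print(f"volatile day MAPE: {volatile_mape:.2f}%")
```

The forecast is identical on both days; only the volatility of the measured load changes, and that alone is enough to raise the MAPE.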

To fix the load forecast problem, it helps to understand how BTM solar generation impacts a load forecast model. At a very high level, the process of estimating the model coefficients is an averaging of the historical load data, where the explanatory variables segment the load data over which the averages are taken. While this is not an exact description of the least squares approach, it is a useful metaphor when describing how solar PV impacts the estimated coefficients of a short-term load forecast model. Over time, an increased penetration of solar PV has the net effect of reducing measured load on average. This implies that the estimated model coefficients embody this reduction in measured loads. That is, the model coefficients are tuned to measured load under the average solar PV production that occurred over the model estimation period. As a result, the short-term load forecast model produces a forecast under average solar PV production conditions. The challenge is that on any given day actual solar PV production will not necessarily align with the average solar PV production. On cloudy days, when solar PV production is smaller than average, the model will under-forecast loads because it fails to reflect the bump up in loads due to lower solar PV production. On sunny days, when solar PV production is greater than average, the model will over-forecast loads because it fails to reflect the drop in loads due to higher solar PV production.

The following examples illustrate how solar PV generation can impact a load forecast. In these examples, assume the demand for electricity at noon, regardless of how it is sourced, is 1,300 MW.

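**No Solar PV Generation**. Under this first example, there is no solar PV generation. As a result, Measured Load, which is the load that a system operator sees, equals actual Demand for electricity services. That is,

$$\text{MeasuredLoad}_t = \text{Demand}_t$$

Where,

- $\text{MeasuredLoad}_t$ is the load the system operator sees at noon on day $t$
- $\text{Demand}_t$ is the actual demand for electricity services at noon on day $t$, here 1,300 MW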

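Now consider developing a forecasting model of measured load. If there is a year’s worth of measured load, the following regression model can be used.

$$\text{MeasuredLoad}_t = \beta \times \text{Intercept}_t + e_t$$

Where,

- $\text{Intercept}_t$ is a variable equal to 1 for every observation
- $\beta$ is the coefficient to be estimated
- $e_t$ is the model residual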

In this case, the estimated coefficient on the Intercept variable will be equal to the average measured load, or 1,300 MW. As a result, the forecast from the estimated model will provide an accurate forecast of both measured load and actual demand.
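The claim that the coefficient on an Intercept variable equals the average measured load can be checked numerically. The sketch below is illustrative only: it assumes a regression with a single explanatory variable that equals 1 on every day, and the function name `fit_intercept_only` is hypothetical.

```python
def fit_intercept_only(y):
    """Least squares estimate of b in y_t = b * x_t + e_t, with x_t = 1 for all t.

    With a single regressor the estimate is sum(x * y) / sum(x * x),
    which reduces to the sample mean of y when every x_t equals 1.
    """
    x = [1.0] * len(y)
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# One year of noon measured load with no solar PV generation (MW).
measured_load = [1300.0] * 365

coefficient = fit_intercept_only(measured_load)
average_load = sum(measured_load) / len(measured_load)
print(coefficient, average_load)
```

Both values come out at 1,300 MW, which is why the forecast from this model matches measured load and actual demand alike.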

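That is,

$$\widehat{\text{MeasuredLoad}}_t = 1{,}300 \times \text{Intercept}_t = 1{,}300 \text{ MW} = \text{Demand}_t$$

Where,

- $\widehat{\text{MeasuredLoad}}_t$ is the model forecast of measured load at noon on day $t$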

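**With Constant Solar PV Generation.** Now, assume that 100 MW of solar PV generation is produced every day at noon. The measured load can be rewritten as follows:

$$\text{MeasuredLoad}_t = \text{Demand}_t - \text{SolarGeneration}_t = 1{,}300 \text{ MW} - 100 \text{ MW} = 1{,}200 \text{ MW}$$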

Because measured load will be 100 MW lower, regressing the new lower measured load on the Intercept variable will lead to an estimated coefficient of 1,200 MW. In this case, the resulting model forecast will accurately forecast measured load, but will underpredict demand by 100 MW.

From the perspective of system operations, the fact that the forecast model underpredicts demand for electricity is not a concern, since in this unrealistic example they can rely on the 100 MW of solar generation being there all the time.

**With Volatile Solar PV Generation.** In reality, solar PV generation is not as reliable as the above example suggests. One can introduce uncertainty into the amount of solar generation that is available by assuming that half of the time, cloud cover is thick enough to drive the solar generation to 0 MW. The other days are perfectly clear and the solar generation is 100 MW. This means that half of the time measured load equals 1,300 MW and the other half of the time measured load equals 1,200 MW. If the cloudy and sunny days are equal in number, the average measured load over the year of data will be 1,250 MW. This implies the estimated coefficient on the Intercept variable will be equal to 1,250 MW. That is,

$$\widehat{\text{MeasuredLoad}}_t = 1{,}250 \times \text{Intercept}_t = 1{,}250 \text{ MW}$$

Now consider using this model on two types of days: a Cloudy Day and a Sunny Day. On a Cloudy Day, solar PV generation is 0 MW and measured load will be equal to 1,300 MW, computed as (1,300 MW – 0 MW). In this case, the model forecast of 1,250 MW underpredicts measured load by 50 MW. On a Sunny Day, solar PV generation is 100 MW, resulting in a measured load of 1,200 MW, computed as (1,300 MW – 100 MW). In this case, the model forecast of 1,250 MW overpredicts measured load by 50 MW.

The variability in solar generation means that the statistical model that was fitted to measured load will underpredict measured loads on cloudy days and overpredict measured loads on sunny days. From the perspective of system operations, this means they will need additional spinning reserves available to cover the load variability and subsequent load forecast error introduced by the volatile solar PV generation. The inherent bias that arises from fitting statistical models to measured load implies that a growing penetration of solar PV generation will lead to an erosion of the forecast accuracy of load forecast models that do not account for this impact.
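The volatile solar example can be reproduced in a few lines of Python. This is a sketch under the example's own assumptions (1,300 MW of demand at noon, solar generation of either 0 MW or 100 MW), plus one assumption of mine: a sample of 364 days that alternates cloudy and sunny so that the two day types are equal in number.

```python
DEMAND_MW = 1300.0

# Alternate cloudy (0 MW) and sunny (100 MW) days: 182 of each.
solar_mw = [0.0, 100.0] * 182
measured_mw = [DEMAND_MW - s for s in solar_mw]

# An intercept-only model fitted to measured load forecasts its average.
model_forecast_mw = sum(measured_mw) / len(measured_mw)

# Forecast error = forecast minus actual measured load.
cloudy_error_mw = model_forecast_mw - measured_mw[0]  # negative: underprediction
sunny_error_mw = model_forecast_mw - measured_mw[1]   # positive: overprediction

print(f"model forecast:   {model_forecast_mw:.0f} MW")
print(f"cloudy day error: {cloudy_error_mw:+.0f} MW")
print(f"sunny day error:  {sunny_error_mw:+.0f} MW")
```

The model is unbiased on average, yet it is off by 50 MW every single day, with the sign of the miss determined entirely by cloud cover.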

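**Accounting for Average Solar Generation.** Is it possible to improve the accuracy of the load forecast? Assuming a perfect forecast of cloud cover can be obtained, it is possible to accurately predict how much solar generation will be available tomorrow. It seems reasonable to adjust the baseline load forecast with the forecast of solar generation. Specifically, the adjusted forecast of measured load can be constructed as:

$$\text{AdjustedForecast}_t = \widehat{\text{MeasuredLoad}}_t + \left(\overline{\text{SolarGeneration}} - \widehat{\text{SolarGeneration}}_t\right)$$

Where,

- $\widehat{\text{MeasuredLoad}}_t$ is the predicted value from the model of measured load
- $\overline{\text{SolarGeneration}}$ is the average solar PV generation over the model estimation period
- $\widehat{\text{SolarGeneration}}_t$ is the forecast of solar PV generation for day $t$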

Following the example from above, the average solar PV generation over the model estimation period is equal to 50 MW, computed as (50% of the days at 0 MW + 50% of the days at 100 MW). On a sunny day, the forecast of measured load will be equal to the predicted value of 1,250 MW from the model of measured load plus (50 MW – 100 MW), or 1,200 MW. On a cloudy day, the forecast of measured load will be equal to the predicted value of 1,250 MW from the model of measured load plus (50 MW – 0 MW), or 1,300 MW. On a sunny day, this approach lowers the forecast of measured load by 50 MW which is the additional solar generation that occurs on a sunny day versus an average day. Conversely, on a cloudy day, this approach raises the forecast of measured load by an additional 50 MW to account for no solar generation taking place on that day.

These examples illustrate that a statistical model of measured load will capture in the estimated model coefficients the average impact of solar generation. Accordingly, with volatile solar PV generation, the model-based forecast of measured load needs to be adjusted to account for the solar PV generation not already accounted for by the estimated model coefficients.
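The adjustment described in the worked example can be written as a one-line function. The sketch below uses the example's numbers; the function and parameter names are hypothetical, and the solar forecast is assumed to be perfect, as in the example.

```python
def adjusted_forecast(model_forecast_mw, average_solar_mw, solar_forecast_mw):
    """Adjust a measured-load forecast for non-average solar generation.

    The fitted model already embeds average solar output, so only the
    gap between average and forecast solar output is added back.
    """
    return model_forecast_mw + (average_solar_mw - solar_forecast_mw)


MODEL_FORECAST_MW = 1250.0  # intercept fitted to measured load
AVERAGE_SOLAR_MW = 0.5 * 0.0 + 0.5 * 100.0  # half cloudy days, half sunny days

sunny_mw = adjusted_forecast(MODEL_FORECAST_MW, AVERAGE_SOLAR_MW, 100.0)
cloudy_mw = adjusted_forecast(MODEL_FORECAST_MW, AVERAGE_SOLAR_MW, 0.0)
print(f"sunny day:  {sunny_mw:.0f} MW")
print(f"cloudy day: {cloudy_mw:.0f} MW")
```

With a perfect solar forecast, the adjusted values match measured load on both day types. In practice, the error in the solar forecast carries straight through to the adjusted load forecast.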

So how do we fix our current set of load forecast models? In practice, I have seen the following approaches implemented.

**Option 1. Do Nothing**. Not all systems have seen deep penetration of BTM solar panels. For now, a do-nothing approach is working for them. If it’s not broken, why fix it?

**Option 2. Error Correction**. Under this approach, load forecasters recognize that their model forecasts are wrong. So, they make an *ex post* bias adjustment of the raw model forecast to account for non-average solar generation output. For example, on sunny days when solar generation output is expected to be greater than average, the *ex post* bias adjustment lowers the raw model forecast. In contrast, on cloudy days when solar generation output is expected to be lower than average, the *ex post* bias adjustment raises the raw model forecast.

**Option 3. Shorten the Model Estimation Period**. Under this approach, there is recognition that the average load has been reduced as a result of the penetration of BTM solar panels. Syncing the load forecast to the new lower average load requires fitting the load forecast model coefficients to the most recent load data. In many cases, this is all the load forecaster can do because the load forecast model is a black box. By syncing the load forecast to the most recent data, the problem of systematic over-forecasting goes away, but the load forecast will continue to miss the swings in load that are driven by swings in BTM solar generation. But really, what else can you do with a black box model?

**Option 4. Fit the Load Forecast Model to Reconstituted Loads**. Under this approach, the load forecaster creates or purchases an historical and forecasted time series of BTM solar generation output. Then, they add these data to the historical time series of measured load to form a time series of electricity demand. The load forecast model is fit to this time series of reconstituted load. A forecast of BTM solar generation is then subtracted from the forecast of reconstituted loads to form a forecast of measured loads. The challenge of this approach is developing and maintaining an historical time series of BTM solar generation.
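The reconstituted load steps can be sketched as follows. This is a toy illustration, not Itron's implementation: the four-day history is hypothetical, and an intercept-only model (a simple average) stands in for whatever load forecast model is actually used.

```python
# Hypothetical history of noon measured load and BTM solar output (MW).
measured_mw = [1300.0, 1200.0, 1300.0, 1200.0]
btm_solar_mw = [0.0, 100.0, 0.0, 100.0]

# Step 1: add BTM solar back to measured load to reconstitute demand.
reconstituted_mw = [m + s for m, s in zip(measured_mw, btm_solar_mw)]

# Step 2: fit the load model to reconstituted load.
# Stand-in model: an intercept-only regression, i.e. the average.
demand_forecast_mw = sum(reconstituted_mw) / len(reconstituted_mw)


def forecast_measured_load(solar_forecast_mw):
    """Step 3: subtract the BTM solar forecast from the demand forecast."""
    return demand_forecast_mw - solar_forecast_mw


sunny_forecast_mw = forecast_measured_load(100.0)
cloudy_forecast_mw = forecast_measured_load(0.0)
print(demand_forecast_mw, sunny_forecast_mw, cloudy_forecast_mw)
```

Subtracting the full solar forecast in step 3 amounts to fixing the coefficient on BTM solar generation at -1.0, so any scaling error in the solar history flows directly into the measured load forecast.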

**Option 5. Include BTM Solar Generation as an Explanatory Variable**. At the end of the day, I believe this will prove to be the best approach to this problem. It is the most difficult approach to implement, but if you can get the model variables right, the resulting load forecast model will capture not only the direct impact of BTM solar generation, but also the indirect behavioral changes that come with homes and businesses investing in solar panels. Under this approach, the load forecaster creates or purchases an historical and forecasted time series of BTM solar generation output. Unlike the reconstituted load approach, which assumes the coefficient on the BTM solar generation variable is equal to -1.0, you include BTM solar generation as an explanatory variable in your existing load forecast model. By doing this, we allow the process of parameter estimation to determine the optimal coefficient to hang on this variable. The most straightforward advantage is that if the historical time series of BTM solar generation is in the wrong scale (e.g., it is too low by a factor of two), then the estimated coefficient can adjust for scale. But the real power is in capturing behavioral changes that are associated with investments in solar panels. These behavioral changes can be teased out of the data by including key interaction terms with the BTM solar generation data. The interaction terms can capture things like a trend toward homes keeping their air conditioners on while the occupants are at work in order to use the “free” electricity that comes with the solar panels.
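The scale-correction point can be illustrated with a small least squares fit. This is a minimal sketch with hypothetical data: measured load is built as 1,300 MW of demand minus true solar output, the solar series handed to the model is deliberately too low by a factor of two, and behavioral interaction terms are left out entirely. The helper `fit_line` is my own name for a textbook one-variable regression.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical truth: 1,300 MW demand, solar alternating 0 MW and 100 MW.
true_solar_mw = [0.0, 100.0, 0.0, 100.0, 0.0, 100.0]
measured_mw = [1300.0 - s for s in true_solar_mw]

# The solar series available to the forecaster is too low by a factor of two.
reported_solar_mw = [s / 2.0 for s in true_solar_mw]

intercept, solar_coefficient = fit_line(reported_solar_mw, measured_mw)
print(f"intercept:         {intercept:.1f} MW")
print(f"solar coefficient: {solar_coefficient:.2f}")
```

The estimated coefficient lands near -2.0 rather than -1.0, absorbing the scale error. A reconstituted load approach fed the same mis-scaled series would add back only half of the solar generation.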

Itron is working diligently with several system operators both in the US and abroad to determine the right set of behavioral interactions. Once we have isolated these factors, we will have a solution to Challenges 1 through 3 that we will share with the industry. The preliminary work we have completed for the California Energy Commission (CEC) demonstrates that we are on the right path. The study we completed for the CEC, *Improving Short-term Load Forecasts by Incorporating Solar PV Generation*, CEC EPC-14-001 (draft, February 2017), provides a template for how to improve short-term load forecast models by incorporating forecasts of BTM solar generation.

I will address approaches to Challenge 4, Constructing Load Forecast Confidence Bounds, in my next blog.