Quantitative Performance of ARIMA, kNN, ANN, and GA/ANN

Models built on the historical data of 2009 and 2010 (the shaded area in Figure 15.7) are applied to the 2011 data (unshaded area) without modification or retraining. Because power-output variability is strongly seasonal, as seen in Figure 15.1, we expect the accuracy of the predicted values to be strongly seasonal as well. To study the influence of this factor, we consider three solar-variability seasons, or periods, that are subsets of the total error-evaluation dataset. The three periods are defined based on the solar-variability study summarized in Figure 15.1 as follows:

• High variability, from January 1, 2011, to April 30, 2011 (P1)

• Medium variability, from May 1, 2011, to June 30, 2011 (P2)

• Low variability, from July 1, 2011, to August 15, 2011 (P3)

All of the statistical error metrics are calculated for the three periods. Table 15.1 lists them for the 1 h and 2 h forecasting horizons. “P1,” “P2,” and “P3” identify the error values for the three subsets, and “Total” identifies those for the entire validation dataset. The boldfaced values identify the best model for a given error metric and a given dataset.
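The per-period evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the period boundaries come from the list above, while the function names, the record layout, the sign convention for MBE (forecast minus measured), and the definition of R2 as 1 - SS_res/SS_tot are assumptions made for the example.

```python
import math
from datetime import date

# Period boundaries taken from the text (both ends inclusive).
PERIODS = {
    "P1": (date(2011, 1, 1), date(2011, 4, 30)),   # high variability
    "P2": (date(2011, 5, 1), date(2011, 6, 30)),   # medium variability
    "P3": (date(2011, 7, 1), date(2011, 8, 15)),   # low variability
}


def error_metrics(measured, forecast):
    """Return MAE, MBE, RMSE (kW) and R2 for paired values."""
    n = len(measured)
    # Assumed sign convention: error = forecast - measured.
    errors = [f - m for m, f in zip(measured, forecast)]
    ss_res = sum(e * e for e in errors)
    mean_m = sum(measured) / n
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MBE": sum(errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        # Assumed definition of R2; undefined if the measurements are constant.
        "R2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


def metrics_by_period(samples):
    """samples: iterable of (day, measured_kw, forecast_kw), daytime only."""
    groups = {name: ([], []) for name in ("Total", *PERIODS)}
    for day, measured, forecast in samples:
        names = ["Total"]
        names += [p for p, (start, end) in PERIODS.items() if start <= day <= end]
        for name in names:
            groups[name][0].append(measured)
            groups[name][1].append(forecast)
    # Skip periods with fewer than two samples, where the metrics mean little.
    return {name: error_metrics(m, f)
            for name, (m, f) in groups.items() if len(m) > 1}
```

Calling `metrics_by_period` once per model and per forecasting horizon would produce one block of rows in a table laid out like Table 15.1.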

The scatter plots for 1 h ahead forecasting are depicted in Figure 15.9; those for 2 h ahead are shown in Figure 15.10. In these figures, each row corresponds to a different forecasting model and each column corresponds to a different variability period. Morning and afternoon values in the scatter plots are identified by different symbols. The dispersion of the forecast values about the identity line is the same for morning and afternoon values. This is clear evidence that the models are free of systematic errors related to daily solar variation.
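A numerical counterpart to this visual check is to compute the bias and the spread of the forecast errors separately for morning and afternoon samples. The sketch below is illustrative only: the function name, the record layout, and the split at 12:00 are assumptions, since the text does not state how morning and afternoon values were separated.

```python
import math


def half_day_bias(records, noon_hour=12.0):
    """records: iterable of (hour_of_day, measured_kw, forecast_kw).

    Returns MBE and RMSE (kW) for the morning and afternoon subsets.
    The split at noon_hour is an assumption made for this example.
    """
    halves = {"morning": [], "afternoon": []}
    for hour, measured, forecast in records:
        key = "morning" if hour < noon_hour else "afternoon"
        halves[key].append(forecast - measured)

    summary = {}
    for key, errors in halves.items():
        if not errors:
            continue  # no samples in this half of the day
        n = len(errors)
        summary[key] = {
            "MBE": sum(errors) / n,
            "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        }
    return summary
```

Comparable MBE and RMSE values in the two halves would be consistent with the absence of a systematic error tied to the time of day; a marked difference would point to one.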

Table 15.1 shows that the two ANN-based methods, ANN and GA/ANN, clearly outperform the others. Only in terms of MBE is the GA/ANN worse than the ARIMA for certain periods. The table also shows how strongly the accuracy of the methods depends on season. For all models, the error metrics for P3 are substantially better than those for the other two periods, as expected. The table shows that, for all its simplicity, kNN performs very well in low-variability situations. This is not surprising, given that in those cases the mapping from pattern to forecast becomes “almost” deterministic. However, for the periods of medium and high variability, kNN performs the worst for most error metrics.

TABLE 15.1 Statistical Error Metrics for the 1 h and 2 h Ahead, Hourly-Average Forecasts for Several Stochastic Methodologies

| Horizon | Metric | Dataset | Clear-sky index persistence | ARIMA | kNN | ANN | GA/ANN |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 h ahead | MAE | Total | 61.7 | 72.8 | 61.9 | 53.5 | 43.0 |
| 1 h ahead | MAE | P1 | 61.3 | 79.6 | 71.7 | 61.2 | 48.0 |
| 1 h ahead | MAE | P2 | 66.9 | 73.0 | 69.2 | 53.8 | 43.0 |
| 1 h ahead | MAE | P3 | 56.1 | 51.8 | 22.9 | 29.5 | 24.8 |
| 1 h ahead | MBE | Total | 29.5 | 0.5 | -0.6 | 1.6 | 1.1 |
| 1 h ahead | MBE | P1 | 24.5 | -0.9 | 2.4 | -1.6 | 0.5 |
| 1 h ahead | MBE | P2 | 32.5 | -0.5 | -4.5 | 0.3 | -2.1 |
| 1 h ahead | MBE | P3 | 40.8 | 0.8 | -4.5 | 13.0 | 6.9 |
| 1 h ahead | RMSE | Total | 107.5 | 105.7 | 116.5 | 88.2 | 72.9 |
| 1 h ahead | RMSE | P1 | 109.8 | 115.6 | 129.2 | 98.2 | 80.6 |
| 1 h ahead | RMSE | P2 | 110.1 | 104.2 | 124.1 | 87.6 | 72.5 |
| 1 h ahead | RMSE | P3 | 96.3 | 69.8 | 42.1 | 47.2 | 42.2 |
| 1 h ahead | R2 | Total | 0.92 | 0.92 | 0.91 | 0.95 | 0.96 |
| 1 h ahead | R2 | P1 | 0.91 | 0.90 | 0.87 | 0.93 | 0.95 |
| 1 h ahead | R2 | P2 | 0.92 | 0.93 | 0.90 | 0.95 | 0.97 |
| 1 h ahead | R2 | P3 | 0.94 | 0.97 | 0.99 | 0.98 | 0.99 |
| 2 h ahead | MAE | Total | 91.1 | 102.8 | 87.8 | 89.1 | 62.5 |
| 2 h ahead | MAE | P1 | 91.7 | 113.8 | 104.4 | 100.1 | 72.9 |
| 2 h ahead | MAE | P2 | 95.3 | 102.8 | 92.7 | 92.0 | 57.5 |
| 2 h ahead | MAE | P3 | 83.9 | 67.0 | 30.6 | 52.0 | 37.3 |
| 2 h ahead | MBE | Total | 44.2 | -0.7 | -3.4 | 4.5 | 0.2 |
| 2 h ahead | MBE | P1 | 37.8 | -1.9 | -0.8 | -6.8 | -0.7 |
| 2 h ahead | MBE | P2 | 45.5 | -0.1 | -8.1 | 8.8 | -3.4 |
| 2 h ahead | MBE | P3 | 62.0 | 2.5 | -5.6 | 33.4 | 7.6 |
| 2 h ahead | RMSE | Total | 160.8 | 144.3 | 162.4 | 142.7 | 104.3 |
| 2 h ahead | RMSE | P1 | 164.3 | 158.0 | 182.4 | 154.3 | 117.5 |
| 2 h ahead | RMSE | P2 | 160.9 | 142.7 | 167.6 | 149.6 | 98.3 |
| 2 h ahead | RMSE | P3 | 149.3 | 93.4 | 55.6 | 85.3 | 59.1 |
| 2 h ahead | R2 | Total | 0.83 | 0.86 | 0.82 | 0.86 | 0.93 |
| 2 h ahead | R2 | P1 | 0.79 | 0.81 | 0.75 | 0.82 | 0.89 |
| 2 h ahead | R2 | P2 | 0.83 | 0.87 | 0.82 | 0.85 | 0.94 |
| 2 h ahead | R2 | P3 | 0.85 | 0.94 | 0.98 | 0.95 | 0.98 |

Note: No nighttime values are considered. All values in kW except R2, which is nondimensional.

The results also show that the GA/ANN represents a large improvement over the ANN predictor for both forecasting horizons (total RMSE falls from 88.2 to 72.9 kW at 1 h and from 142.7 to 104.3 kW at 2 h). Notably, this improvement is more substantial for the periods of higher variability, P1 and P2. Scatter plots m/n from Figure 15.9 show a clear clustering of the data close to the identity line when compared to plots j/k. For the 2 h ahead forecasting, the improvement is even greater, as seen in Figure 15.10.

Table 15.2 compares the ARIMA, kNN, and ANN models with respect to the persistence model in terms of RMSE for the entire validation period. A positive value indicates a decrease in RMSE relative to the persistence model; a negative value indicates an increase. The table shows that overall only kNN performs worse than the persistence model. The ARIMA model shows substantial improvement for the 2 h time horizon. Both ANN models outperform the others, with the GA/ANN hybrid yielding consistent improvements of more than 30% in relation to persistence for a wide range of conditions.

FIGURE 15.9 Scatter plots for 1 h ahead forecasts (kW). Each row corresponds to a different model. Row 1: clear-sky index persistence; row 2: kNN; row 3: ARIMA; row 4: ANN; row 5: GA/ANN. Each column corresponds to forecasting for a different variability period. Left: January-April 2011 (high variability); middle: May-June 2011 (medium variability); right: July-August 2011 (low variability). Squares identify morning values; circles identify afternoon values.

FIGURE 15.10 2 h ahead forecasts for the models shown in Figure 15.9. The horizontal axis of each panel is the measured value.

TABLE 15.2 Forecasting-Skill Improvement over the Clear-Sky Persistence Model

| Forecasting horizon | 1 h | 2 h |
| --- | --- | --- |
| Persistence RMSE | 107.48 kW | 160.79 kW |
| ARIMA | 1.7% | 10.3% |
| kNN | -8.4% | -1.0% |
| ANN | 17.9% | 11.2% |
| GA/ANN | 32.2% | 35.1% |

Note: Measured by the decrease in RMSE for the validation dataset. Negative values indicate an increase in RMSE.
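The percentages in Table 15.2 can be recomputed from the RMSE values as a relative decrease with respect to persistence. The sketch below assumes the improvement is defined as 100 × (1 − RMSE_model / RMSE_persistence), which matches the table note; the function and variable names are illustrative. The model RMSE values are the rounded "Total" entries of Table 15.1, so the results agree with the published percentages only to within about 0.1 percentage point.

```python
def rmse_improvement(rmse_model, rmse_persistence):
    """Percentage decrease in RMSE relative to persistence (negative = worse)."""
    return 100.0 * (1.0 - rmse_model / rmse_persistence)


# Persistence RMSE from Table 15.2; model RMSE ("Total" rows) from Table 15.1.
# All values in kW.
PERSISTENCE_RMSE = {"1 h": 107.48, "2 h": 160.79}
MODEL_RMSE = {
    "ARIMA":  {"1 h": 105.7, "2 h": 144.3},
    "kNN":    {"1 h": 116.5, "2 h": 162.4},
    "ANN":    {"1 h": 88.2,  "2 h": 142.7},
    "GA/ANN": {"1 h": 72.9,  "2 h": 104.3},
}

improvement = {
    model: {h: rmse_improvement(v, PERSISTENCE_RMSE[h]) for h, v in by_h.items()}
    for model, by_h in MODEL_RMSE.items()
}

for model, by_h in improvement.items():
    print(model, {h: round(v, 1) for h, v in by_h.items()})
```

The same function applied to the P1, P2, and P3 rows of Table 15.1 would give the per-period improvement, which the text summarizes only qualitatively.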

Updated: August 25, 2015 — 4:21 am