Given two historics of measures produced by a sensor which performs 96 daily measures:
- power consumption from 01/01/2010 to 16/02/2010
- temperature from 01/01/2010 to 17/02/2010
we had to forecast power consumption on the 17/02/2010.
We used the following models:
- linear regression (LR)
- exponential smoothing (Holt-Winters seasonal method) (HW)
- autoregressive integrated moving average (ARIMA)
- neural network auto-regressive (NNAR)
We took 46 days as training set and 1 day as validation set, measures of the 16/02/2010. Taking 1 day of data as validation may be a bit suspicious, probably 2 days would have been better. Nevertheless our idea was that since we only had to forecast one day, that is 96 measures of the same day, we should take a validation set which would most likely look similar as the test set.
We observe that both time series have seasonality. Although power consumption time series does not have trend while temperature time series has a slight trend.
On boxplots we notice that for power consumption most of the measures on a day have low variance. Measures near 8 am and 5 pm have high variance though. It could be expected since it is when the people leave and go back home. In the contrary in the case of temperature we notice high variance for each measure of the day. Indeed temperature cannot be explained only by trend and seasonality.
The second baseline model looks like the following (alhough here it has been aggregated over hours instead of daily measures):
On validation set we get the following forecasts:
On validation set we get the following results:
We performed the following models:
On validation set we get the following forecasts:
On validation set we get the following results:
We performed the following models:
On validation set we get the following forecasts:
On validation set we get the following results:
We performed the autoregressive integrated moving average model for seasonality (SARIMA):
On validation set we get the following forecasts:
On validation set we get the following results:
We performed the following models:
On validation set we get the following forecasts:
On validation set we get the following results:
We were supposed to provide two forecasts one without temperature, the second with temperature. It should be noted though that we could not achieve better results with temperature. Best result was achieved by using additive exponential smoothing in log space.
On test set we performed the following models:
On test set we submitted the following forecasts:

























