Short-Term Load Forecasting with Dilated Recurrent Attention Networks in Presence of Missing Data

  • Changkyu Choi, UiT The Arctic University of Norway
Keywords: Attention, Short-Term Load Forecasting, Recurrent, Neural Networks, Dilated RNN, Dilation


Forecasting the dynamics of time-varying systems is essential to maintaining their sustainability.
Recent studies have found that Recurrent Neural Networks (RNNs) applied to forecasting tasks outperform conventional models such as the AutoRegressive Integrated Moving Average (ARIMA).
However, because a vanilla RNN is structurally limited to unit-length internal connections, the representation it learns from time series with missing data can be severely biased.
The goal of this paper is to provide an RNN architecture that is robust to the bias introduced by missing data.
We propose Dilated Recurrent Attention Networks (DRAN).
The proposed model is a stack of multiple RNN layers, each of which has internal connections of a different length.
This structure allows incorporating previous information at different time scales.
DRAN updates its state by a weighted average of the layers.
To focus on the layers that carry reliable information despite the bias from missing data, it uses an attention mechanism that learns the distribution of attention weights among the layers.
We report that our model outperforms conventional ones in forecast accuracy on two benchmark datasets, including a real-world electricity load dataset.
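
The two ideas described above, recurrent connections of different lengths per layer and an attention-weighted average over the layers, can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the tanh cell, the dilations 1, 2, 4, the dot-product scoring vector `score_vec`, and the random (untrained) weights are all assumptions made for illustration.

```python
import numpy as np

def dilated_rnn_layer(x, W_in, W_rec, dilation):
    """Run a tanh RNN whose recurrent link skips `dilation` time steps.

    x has shape (T, input_dim); returns hidden states of shape (T, hidden_dim).
    """
    T = x.shape[0]
    hidden_dim = W_rec.shape[0]
    h = np.zeros((T, hidden_dim))
    for t in range(T):
        # State from `dilation` steps back; zeros before the sequence starts.
        prev = h[t - dilation] if t >= dilation else np.zeros(hidden_dim)
        h[t] = np.tanh(x[t] @ W_in + prev @ W_rec)
    return h

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_over_layers(layer_states, score_vec):
    """Average per-layer states with attention weights (hypothetical scoring).

    layer_states has shape (L, T, hidden_dim); score_vec has shape (hidden_dim,).
    Returns the combined state (T, hidden_dim) and the weights (L, T).
    """
    scores = layer_states @ score_vec    # (L, T)
    weights = softmax(scores, axis=0)    # sums to 1 over layers at each time step
    combined = (weights[..., None] * layer_states).sum(axis=0)
    return combined, weights

# Toy run: three stacked layers with assumed dilations 1, 2, 4.
rng = np.random.default_rng(0)
T, input_dim, hidden_dim = 24, 1, 8
dilations = [1, 2, 4]
x = rng.normal(size=(T, input_dim))

states = []
layer_input = x
for d in dilations:
    W_in = rng.normal(scale=0.3, size=(layer_input.shape[1], hidden_dim))
    W_rec = rng.normal(scale=0.3, size=(hidden_dim, hidden_dim))
    layer_input = dilated_rnn_layer(layer_input, W_in, W_rec, d)  # feeds next layer
    states.append(layer_input)

layer_states = np.stack(states)          # (L, T, hidden_dim)
score_vec = rng.normal(size=hidden_dim)  # would be learned in a real model
combined, weights = attention_over_layers(layer_states, score_vec)
```

In this sketch `weights[:, t]` is a distribution over the three layers at time step `t`, so a trained scoring function could shift weight toward layers whose longer connections bypass a missing window.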

