American Journal of Applied Mathematics and Statistics. 2022, 10(1), 22-27
DOI: 10.12691/AJAMS-10-1-4
Original Research

Clustering Time Related Data: A Regression Tree Approach

K.A.D. Deshani1, Liwan Liyanage-Hansen2 and Dilhari T. Attygalle1

1Department of Statistics, University of Colombo, Colombo 03, Sri Lanka

2School of Computing, Engineering and Mathematics, University of Western Sydney, Campbelltown, Australia

Pub. Date: March 23, 2022

Cite this paper

K.A.D. Deshani, Liwan Liyanage-Hansen and Dilhari T. Attygalle. Clustering Time Related Data: A Regression Tree Approach. American Journal of Applied Mathematics and Statistics. 2022; 10(1):22-27. doi: 10.12691/AJAMS-10-1-4

Abstract

With the advancement of technology, a wide range of processes now generate vast time-related databases. Analyzing such data can be very useful, but because of the large volumes and the dependence on time, extracting useful information and implementing models can be complex and time consuming. A comprehensive exploratory study that extracts hidden features of the data can reduce this complexity to a great extent. Clustering is one way to extract such features, but it can be demanding with time-related data, especially when the series has a trend. This paper proposes an algorithm, based on a regression tree approach, to cluster a time series with a trend, along with other relevant variables. The importance of the algorithm lies in avoiding the misleading cluster allocations that can arise from clustering a differenced time series. The algorithm first identifies a suitable, consistent time window with no trend and fits a separate regression tree to each window to obtain the clusters. By exploring the clusters generated from these trees, a general cluster formation suitable for all windows is identified. The approach is illustrated using hourly electricity demand in Sri Lanka for five consecutive years. Six meaningful clusters were identified based on the day of the week, specialty, and the time of the day. These cluster memberships provide useful additional information on the data structure, independent of the trend component, and can be used as an additional feature for improving model accuracy.
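The windowed procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the windows are assumed to be calendar years, the predictors `hour` and `is_weekend` and the tree depth of 2 are hypothetical choices, and the small regression tree is hand-written so that the example has no dependencies. Each leaf of a window's tree is treated as one cluster, and the leaf rules are then compared across windows.

```python
import random

FEATURES = ("hour", "is_weekend")  # hypothetical predictors for this sketch

def sse(ys):
    """Sum of squared deviations from the mean."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(rows, ys, min_leaf):
    """Find the (feature index, threshold) that most reduces SSE, or None."""
    best, best_cost = None, sse(ys)
    for f in range(len(FEATURES)):
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            if min(len(left), len(right)) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (f, t), cost
    return best

def grow(rows, ys, depth, min_leaf=5):
    """Grow a regression tree; each leaf is treated as one cluster."""
    split = best_split(rows, ys, min_leaf) if depth > 0 else None
    if split is None:
        return {"mean": sum(ys) / len(ys)}
    f, t = split
    left = [(r, y) for r, y in zip(rows, ys) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth - 1, min_leaf),
        "right": grow([r for r, _ in right], [y for _, y in right], depth - 1, min_leaf),
    }

def leaf_rules(tree, prefix=()):
    """List the rule that defines each leaf (cluster), ignoring leaf means."""
    if "mean" in tree:
        return [" and ".join(prefix) or "all rows"]
    name, t = FEATURES[tree["feature"]], tree["threshold"]
    return (leaf_rules(tree["left"], prefix + (f"{name} <= {t}",))
            + leaf_rules(tree["right"], prefix + (f"{name} > {t}",)))

def simulate_window(level, rng):
    """Synthetic hourly demand for 28 days with a constant level (no trend)."""
    rows, ys = [], []
    for day in range(28):
        for hour in range(24):
            weekend = int(day % 7 >= 5)
            demand = level + (40 if hour >= 18 else 0) - 15 * weekend + rng.gauss(0, 2)
            rows.append((hour, weekend))
            ys.append(demand)
    return rows, ys

rng = random.Random(0)
# The level rises between windows (the trend) but is flat inside each window.
windows = {year: simulate_window(100 + 50 * year, rng) for year in range(3)}
# One regression tree per window, as in the abstract's description.
trees = {year: grow(rows, ys, depth=2) for year, (rows, ys) in windows.items()}
rules = {year: leaf_rules(tree) for year, tree in trees.items()}

for year, window_rules in rules.items():
    print(year, window_rules)
```

In this toy setting every window yields the same four leaf rules even though the demand level differs from window to window, which is the sense in which a general cluster formation can be read off the per-window trees. Real data would need a justified choice of window and of tree size, neither of which is specified in the abstract.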

Keywords

clustering, regression tree, load forecasting, trend, data exploration

Copyright

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

References

[1]  Aghabozorgi, S., Shirkhorshidi, A. S., & Wah, T. Y. (2015). Time-series clustering – A decade review. Information Systems, 53, 16-38.
 
[2]  Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. (1984). Classification and Regression Trees. Taylor & Francis.
 
[3]  Chicco, G., Napoli, R., & Piglione, F. (2006). Comparisons among Clustering Techniques for Electricity Customer Classification. IEEE Transactions on Power Systems, 933-940.
 
[4]  Deshani K.A.D, Liyanage-Hansen L. and Attygalle D. (2019). Artificial Neural Network for Dynamic Iterative Forecasting: Forecasting Hourly Electricity Demand, American Journal of Applied Mathematics and Statistics, Vol. 7, No. 1, January 2019.
 
[5]  Deshani K.A.D, Attygalle D., Liyanage-Hansen L. and Lakraj G.P., (2017). Dynamic Short Term Load Forecasting using Functional Principal Component Regression, International Conference on Machine Learning and Data Engineering Sydney, Australia.
 
[6]  Deshani, K.A.D, Attygalle, M.D.T, Hansen, L. L., & Karunarathne, A. (2014). An Exploratory Analysis on Half-Hourly Electricity Load Patterns Leading to Higher Performances in Neural Network Predictions. International Journal of Artificial Intelligence & Applications (IJAIA), Vol. 5, No. 3, May 2014 (pp. 37-51)
 
[7]  Deshani, K.A.D, Hansen, L. L., Attygalle, M.D.T, & Karunarathne, A. (2014). Improved Neural Network Prediction Performances of Electricity Demand: Modifying Inputs through Clustering. Second International Conference on Computational Science and Engineering (pp. 137-147). India: AIRCC.
 
[8]  Gładysz, Barbara and Kuchta, Dorota, (2008). Application of regression trees in the analysis of electricity load, Operations Research and Decisions, 4, issue, p. 19-28.
 
[9]  Hambali, M., Akinyemi, A., Oladunjoye, J., & Yusuf, N. (2016). Electric Power Load Forecast Using Decision Tree Algorithms. Computing, Information Systems, Development Informatics & Allied Research Journal, 29-42.
 
[10]  Hernández, L., Baladrón, C., Aguiar, J., Carro, B., & Sánchez-Esguevillas, A. (2012). Classification and Clustering of Electricity Demand Patterns in Industrial Parks. Energies 2012, 5215-5228.
 
[11]  Lee, K., Cha, Y., & Park, J. (1992). Shoert Term Load Forecasting using Artificial Neural Networks. Transactions on Power Systems, 124-132.
 
[12]  López, M., Valero, S., Senabre, C., & Aparicio, J. (2011). A SOM Neural Network Approach to Load Forecasting. Meteorological and Time Frame Influence. Proceedings of the 2011 International Conference on Power Engineering, Energy and Electrical Drives. Torremolinos.
 
[13]  Räsänen, T., Voukantsis, D., Niska, H., Karatzas, K., & Kolehmainen, M. (2010). Data-based method for creating electricity use load profiles using large amount of customer-specific hourly measured electricity use data. Applied Energy, 87, 3538-3545.
 
[14]  Ranaweera, D., Hubele, N., & Papalexopoulos, A. (1995). Application of radial basis function neural network model for short-term load forecasting. IEE Proceedings-Generation, Transmission and Distribution, 142, 45-50.
 
[15]  Steinberg, D., & Colla, P. (2009). CART: classification and regression trees. In The top ten algorithms in data mining, (p. 179).