Abstract
Supervised learning on time series data has been extensively studied for the case of a categorical target variable. In some application domains, e.g., energy, environment and health monitoring, the target variable is instead numerical, and the problem is known as time series extrinsic regression (TSER). In the literature, some well-known time series classifiers have been extended to TSER problems. Because the first benchmarking studies have focused on predictive performance, interpretability has received very little attention. To fill this gap, we propose an extension of a Bayesian method for robust and interpretable feature construction and selection in the context of TSER. Our approach tackles TSER in a relational way: (i) we build various simple representations of the time series, which are stored in a relational data schema; (ii) a propositionalisation technique (based on classical aggregation/selection functions from the relational data field) is applied to build interpretable features from the secondary tables and thereby "flatten" the data; and (iii) the constructed features are filtered through a Bayesian Maximum A Posteriori approach. The resulting transformed data can be processed with various existing regressors. Experimental validation on various benchmark data sets demonstrates the benefits of the suggested approach.
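The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the three representations (raw values, first differences, cumulative sum), the time windows and the toy data are all assumptions made for illustration. In particular, step (iii) below uses a plain Pearson-correlation threshold as a stand-in, not the Bayesian Maximum A Posteriori criterion of the paper.

```python
from statistics import mean


def build_representations(series):
    """Step (i): derive simple representations of one series.

    Each representation plays the role of a secondary table of (t, value) rows.
    The choice of representations here is an assumption for illustration.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    cumsum, total = [], 0.0
    for v in series:
        total += v
        cumsum.append(total)
    return {
        "raw": list(enumerate(series)),
        "derivative": list(enumerate(diffs)),
        "cumsum": list(enumerate(cumsum)),
    }


# Classical aggregation functions applied to the selected rows.
AGGREGATES = {"Mean": mean, "Min": min, "Max": max}


def propositionalise(tables, windows):
    """Step (ii): flatten secondary tables into named scalar features.

    A selection (time window) is followed by an aggregation, and the
    feature name spells out both so that it stays readable.
    """
    features = {}
    for rep_name, rows in tables.items():
        for lo, hi in windows:
            selected = [v for t, v in rows if lo <= t < hi]
            if not selected:
                continue
            for agg_name, agg in AGGREGATES.items():
                name = f"{agg_name}({rep_name}.value) where t in [{lo}, {hi})"
                features[name] = agg(selected)
    return features


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def filter_features(flat_rows, targets, threshold=0.5):
    """Step (iii), stand-in only: keep features with |r| >= threshold.

    The paper filters features with a Bayesian MAP approach; this simple
    correlation filter is NOT that method.
    """
    kept = []
    for name in flat_rows[0]:
        column = [row[name] for row in flat_rows]
        if abs(pearson(column, targets)) >= threshold:
            kept.append(name)
    return kept


# Toy data (hypothetical): four short series whose target is their slope.
series_set = [
    [0, 1, 2, 3, 4, 5],
    [0, 2, 4, 6, 8, 10],
    [5, 5, 5, 5, 5, 5],
    [3, 2, 1, 0, -1, -2],
]
targets = [1.0, 2.0, 0.0, -1.0]
windows = [(0, 3), (3, 6)]

flat = [propositionalise(build_representations(s), windows) for s in series_set]
selected = filter_features(flat, targets)
```

The rows of `flat` restricted to the names in `selected` form an ordinary tabular data set, which is what allows any existing regressor to be applied afterwards.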
Notes
- 1.
- 2. http://www.khiops.com (available as shareware for research purposes).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Gay, D., Bondu, A., Lemaire, V., Boullé, M. (2021). Interpretable Feature Construction for Time Series Extrinsic Regression. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12712. Springer, Cham. https://doi.org/10.1007/978-3-030-75762-5_63
DOI: https://doi.org/10.1007/978-3-030-75762-5_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75761-8
Online ISBN: 978-3-030-75762-5
eBook Packages: Computer Science, Computer Science (R0)