Abstract
The search for patterns or motifs in data is a problem of key interest to financial and economic researchers. In this paper, we introduce the motif tracking algorithm (MTA), a novel immune-inspired (IS) pattern identification tool that identifies unknown motifs of unspecified length that repeat within time series data. The power of the algorithm comes from its use of a small number of parameters and its minimal assumptions about the data being examined or the underlying motifs. Our interest lies in applying the algorithm to financial time series data to identify previously unknown patterns. The algorithm is tested on three separate data sets, and its particular suitability to financial data is shown by applying it to oil price data. In all cases, the algorithm identifies the presence of a motif population quickly and efficiently, owing to its use of an intuitive symbolic representation. The resulting population of motifs is shown to have considerable potential value for other applications such as forecasting and algorithm seeding.
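To make the idea of a symbolic representation concrete, the following is a minimal sketch and not the MTA itself. It assumes, purely for illustration, that the series is first differenced and that each difference is mapped to a letter using equal-frequency bins; the abstract does not specify the scheme the algorithm uses. The function names `symbolize` and `repeated_words`, the three-letter alphabet, and the fixed word length are all hypothetical choices.

```python
from collections import defaultdict

def symbolize(series, alphabet="abc"):
    """Map each first difference to a letter using equal-frequency bins.

    Assumption for illustration only: differencing and rank-based bins
    are not taken from the paper.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    ranked = sorted(diffs)
    n = len(alphabet)
    # Breakpoints that split the sorted differences into n roughly equal bins.
    cuts = [ranked[(k * len(ranked)) // n] for k in range(1, n)]
    # A difference's letter index is the number of breakpoints it reaches.
    return "".join(alphabet[sum(d >= c for c in cuts)] for d in diffs)

def repeated_words(symbols, length):
    """Return {word: [start positions]} for every word seen more than once."""
    positions = defaultdict(list)
    for i in range(len(symbols) - length + 1):
        positions[symbols[i:i + length]].append(i)
    return {w: p for w, p in positions.items() if len(p) > 1}

# Toy price series with an obvious repeating shape (made-up numbers).
prices = [10, 11, 13, 12, 12, 13, 15, 14, 14, 15, 17, 16]
symbols = symbolize(prices)
motifs = repeated_words(symbols, 3)
print(symbols)
print(motifs)
```

Unlike this sketch, which fixes the word length in advance and matches words exactly, the abstract describes motifs of unspecified length, so the exact-match dictionary here should be read only as a picture of why a symbolic encoding makes repeated patterns cheap to look up.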
Author information
William Wilson graduated from the University of Wales, Aberystwyth, UK, in 1996 with a degree in economics and financial accounting. After a number of years working in industry, he returned to Aberystwyth University in 2002, where he received a Master’s degree in computer science.
He is currently a Ph.D. candidate in the Department of Computer Science at the University of Nottingham.
His research interests include artificial immune systems, immune memory, time series analysis, and motif detection.
Phil Birkin received a mathematics degree and a computer science degree from the Open University, UK, in 1993 and 1995, respectively. He has worked in the computing industry for over 30 years. He is currently a Ph.D. candidate researching robotics at the University of Nottingham.
His research interests include robot football, artificial immune systems, and fuzzy logic.
Uwe Aickelin received a management science degree from the University of Mannheim, Germany, in 1996 and a European Master and Ph.D. in management science from the University of Wales, Swansea, UK, in 1996 and 1999, respectively.
Following his Ph.D., he joined the University of the West of England in Bristol, where he worked for three years in the Mathematics Department as a lecturer in operational research. In 2002, he accepted a lectureship in computer science at the University of Bradford, mainly focusing on computer security. Since 2003, he has worked for the University of Nottingham in the School of Computer Science, where he is now a reader in computer science and director of the Interdisciplinary Optimization Laboratory. He currently holds an EPSRC advanced fellowship focusing on artificial immune systems, anomaly detection, and mathematical modelling. He has been awarded EPSRC research funding as principal investigator (including an adventure grant and two IDEAS factory projects) on topics including artificial immune systems, danger theory, computer security, robotics, and agent based simulation. He is an associate editor of the IEEE Transactions on Evolutionary Computation, the assistant editor of the Journal of the Operational Research Society, and an editorial board member of Evolutionary Intelligence. He is a member of IEEE.
His research interests include mathematical modelling, heuristic optimization, artificial immune systems, and innate immunology applied to computer security problems.
About this article
Cite this article
Wilson, W., Birkin, P. & Aickelin, U. The motif tracking algorithm. Int. J. Autom. Comput. 5, 32–44 (2008). https://doi.org/10.1007/s11633-008-0032-0
DOI: https://doi.org/10.1007/s11633-008-0032-0