
An Evolutive Frequent Pattern Tree-based Incremental Knowledge Discovery Algorithm

Published: 04 February 2022

Abstract

To understand the current situation in a specific scenario, valuable knowledge must be mined from both historical data and newly emerging data. However, most existing algorithms treat the historical and emerging data as a single dataset and periodically reanalyze all of it, which incurs heavy computational overhead. Discovering new knowledge accurately and promptly is also challenging, because the emerging data are usually small compared with the historical data. To address these challenges, we propose a novel knowledge discovery algorithm based on two evolving frequent pattern trees that trace dynamically evolving data through an incremental sliding window. One tree records frequent patterns from the historical data, and the other records incremental frequent items. The structures of the two trees and the relationships between them are updated periodically according to the emerging data and the sliding window. New frequent patterns are mined from the incremental data, and new knowledge is obtained from the changes in patterns. Evaluations show that the algorithm can discover new knowledge from evolving data with good performance and high accuracy.
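The abstract does not give the tree structures or update rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes plain itemset counters in place of the two frequent pattern trees, a window measured in batches, and a relative support threshold; the names `DualPatternStore`, `count_itemsets`, `window_size`, and `min_support` are hypothetical. It shows how a historical store and an incremental store can be updated per batch without rescanning all data, and how pattern changes can be read off by comparing the two.

```python
from collections import Counter, deque
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items across the transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


class DualPatternStore:
    """Hypothetical stand-in for the two trees (assumption: counters, not FP-trees)."""

    def __init__(self, window_size, min_support):
        self.window_size = window_size   # number of recent batches kept in the window
        self.min_support = min_support   # relative support threshold in (0, 1]
        self.window = deque()            # (itemset counts, batch length) per batch
        self.historical = Counter()      # counts from batches that left the window
        self.historical_size = 0
        self.incremental = Counter()     # counts from batches inside the window
        self.incremental_size = 0

    def _frequent(self, counts, total):
        """Return the itemsets whose relative support meets the threshold."""
        if total == 0:
            return set()
        return {p for p, c in counts.items() if c / total >= self.min_support}

    def add_batch(self, batch):
        """Slide the window over one new batch and report pattern changes."""
        batch = list(batch)
        batch_counts = count_itemsets(batch)
        if len(self.window) == self.window_size:
            # The oldest batch leaves the window and is folded into history.
            old_counts, old_len = self.window.popleft()
            self.incremental.subtract(old_counts)
            self.incremental_size -= old_len
            self.historical.update(old_counts)
            self.historical_size += old_len
        self.window.append((batch_counts, len(batch)))
        self.incremental.update(batch_counts)
        self.incremental_size += len(batch)

        recent = self._frequent(self.incremental, self.incremental_size)
        old = self._frequent(self.historical, self.historical_size)
        return {"emerging": recent - old, "fading": old - recent}
```

Only the counts of the arriving and expiring batches are touched on each update, which is the property the abstract contrasts with reanalyzing the full dataset. A real implementation would replace the counters with compact tree structures and would likely prune infrequent itemsets.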



    • Published in

      ACM Transactions on Management Information Systems, Volume 13, Issue 3
      September 2022
      312 pages
      ISSN: 2158-656X
      EISSN: 2158-6578
      DOI: 10.1145/3512349


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 February 2022
      • Accepted: 1 November 2021
      • Revised: 1 October 2021
      • Received: 1 January 2021


      Qualifiers

      • research-article
      • Refereed
