Generalized maximal utility for mining high average-utility itemsets

  • Regular Paper
Knowledge and Information Systems

Abstract

Mining high average-utility itemsets (HAUIs) is a promising research topic in data mining because, in contrast to high utility itemsets, they are not biased toward long itemsets. Regardless of what upper bounds and pruning strategies are used, most existing HAUI mining algorithms are founded on the concept of maximal utility, namely the highest utility of a single item in each transaction. In this paper, we study this problem by generalizing the typical maximal utility and average-utility upper bound from a single item to an itemset, and propose an efficient HAUI mining algorithm based on generalized maximal utility (HAUIM-GMU). For this algorithm, we first propose the concepts of generalized maximal utility and the generalized average-utility upper bound, and discuss how the proposed upper bound can be made tighter to generate fewer candidates. A new pruning strategy is then proposed based on the concept of support, and this is shown to be effective for filtering out unpromising itemsets. The final algorithm is described in detail. Extensive experimental results show that the HAUIM-GMU algorithm outperforms existing state-of-the-art algorithms.
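
To make the baseline notions in the abstract concrete, the following is a minimal Python sketch of the conventional (single-item) definitions that the paper generalizes: the average utility of an itemset and the average-utility upper bound built from each transaction's maximal utility. It is not the HAUIM-GMU algorithm, whose generalized bound is not specified in the abstract. The toy database, the dictionary representation of transactions, and the function names `average_utility` and `auub` are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List

# Assumed representation: each transaction maps an item to the
# utility of that item in the transaction (e.g. quantity * unit profit).
Transaction = Dict[str, int]


def average_utility(itemset: Iterable[str], db: List[Transaction]) -> float:
    """Sum of the itemset's utility over the transactions that contain it,
    divided by the number of items in the itemset."""
    items = frozenset(itemset)
    total = 0
    for t in db:
        if items.issubset(t):  # iterating a dict yields its keys
            total += sum(t[i] for i in items)
    return total / len(items)


def auub(itemset: Iterable[str], db: List[Transaction]) -> int:
    """Conventional average-utility upper bound: sum of the maximal
    single-item utility of every transaction that contains the itemset."""
    items = frozenset(itemset)
    return sum(max(t.values()) for t in db if items.issubset(t))


# Hypothetical toy database, not taken from the paper.
db = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 3, "c": 6},
    {"b": 4, "c": 2},
]

au = average_utility({"a", "c"}, db)
bound = auub({"a", "c"}, db)
print(au, bound)
```

In this sketch the bound never falls below the average utility, and it cannot increase when an itemset is extended, which is what allows it to prune candidates. Because it considers only the single highest utility per transaction, it can be loose; the paper's generalized maximal utility is proposed to tighten it.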

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions, which helped to improve the quality of this paper. This work was partially supported by the National Natural Science Foundation of China (61977001) and the Great Wall Scholar Program (CIT & TCD20190305).

Author information

Corresponding author

Correspondence to Wei Song.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Song, W., Liu, L. & Huang, C. Generalized maximal utility for mining high average-utility itemsets. Knowl Inf Syst 63, 2947–2967 (2021). https://doi.org/10.1007/s10115-021-01614-z

  • DOI: https://doi.org/10.1007/s10115-021-01614-z
