research-article

Profile Decomposition Based Hybrid Transfer Learning for Cold-Start Data Anomaly Detection

Published: 30 July 2022

Abstract

Anomaly detection is an essential task for quality management in smart manufacturing. An accurate data-driven detection method usually needs sufficient data and labels. In practice, however, manufacturing commonly involves newly set-up processes that have only limited data available for analysis. Borrowing the term from recommender systems, we call such a process a cold-start process. The sparsity of anomalies, the deviation of the profile, and noise all aggravate the detection difficulty.

Transfer learning can help detect anomalies in cold-start processes by transferring knowledge from more experienced processes to new ones. However, existing transfer learning and multi-task learning frameworks are built on task- or domain-level relatedness. We observe instead that, within a domain, some components (background and anomaly) share more commonality, while others (profile deviation and noise) do not. To this end, we propose a finer-grained, component-level transfer learning scheme, decomposition-based hybrid transfer learning (DHTL). It first decomposes a domain (e.g., a data source containing profiles) into different components (smooth background, profile deviation, anomaly, and noise); then, each component’s transferability is analyzed using expert knowledge; lastly, a different transfer learning technique is tailored to each component accordingly. We adopt a Bayesian probabilistic hierarchical model to formulate parameter transfer for the background, and an “L2,1+L1”-norm to formulate low-dimensional feature-representation transfer for the anomaly. An efficient algorithm based on Block Coordinate Descent is proposed to learn the parameters. A case study based on glass coating pressure profiles demonstrates the improved accuracy and completeness of the detected anomalies, and a simulation demonstrates the fidelity of the decomposition results.
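
To make the “L2,1+L1” penalty named above more concrete, the following is a minimal sketch of the standard proximal (shrinkage) operators for the L1 norm, the L2,1 norm, and their sum. It is not the DHTL algorithm itself, which this abstract does not spell out. The matrix layout (one row per hypothetical feature, one column per hypothetical process) and the function names `prox_l1`, `prox_l21`, and `prox_l21_plus_l1` are assumptions made for illustration only.

```python
import math


def prox_l1(rows, tau):
    """Element-wise soft-thresholding: proximal operator of tau * ||A||_1.

    Shrinks every entry toward zero by tau, which encourages entry-level sparsity.
    """
    return [[math.copysign(max(abs(x) - tau, 0.0), x) for x in row] for row in rows]


def prox_l21(rows, tau):
    """Row-wise group soft-thresholding: proximal operator of tau * ||A||_{2,1}.

    ||A||_{2,1} is the sum of the Euclidean norms of the rows, so whole rows
    (assumed here to be features shared across processes) are kept or zeroed together.
    """
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        # Rows whose norm is at most tau are set to zero; others are scaled down.
        scale = max(1.0 - tau / norm, 0.0) if norm > 0.0 else 0.0
        out.append([scale * x for x in row])
    return out


def prox_l21_plus_l1(rows, tau_group, tau_elem):
    """Proximal operator of tau_group * ||A||_{2,1} + tau_elem * ||A||_1.

    For this particular pair of norms, the operator is the composition of
    element-wise soft-thresholding followed by row-wise group shrinkage.
    """
    return prox_l21(prox_l1(rows, tau_elem), tau_group)


# Hypothetical 2 x 2 coefficient matrix: rows = features, columns = processes.
A = [[3.0, -4.0], [0.1, 0.0]]
shrunk = prox_l21_plus_l1(A, tau_group=1.0, tau_elem=1.0)
```

In a block-coordinate scheme, an operator of this kind would typically be applied to the anomaly block after a gradient step while the other blocks are held fixed. The row-wise term selects which features are active in all processes, and the element-wise term keeps the entries within those rows sparse.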


      • Published in

        ACM Transactions on Knowledge Discovery from Data  Volume 16, Issue 6
        December 2022
        631 pages
        ISSN: 1556-4681
        EISSN: 1556-472X
        DOI: 10.1145/3543989

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 30 July 2022
        • Online AM: 24 April 2022
        • Accepted: 1 April 2022
        • Revised: 1 February 2022
        • Received: 1 April 2021


        Qualifiers

        • research-article
        • Refereed