Abstract
Anomaly detection is an essential task for quality management in smart manufacturing. An accurate data-driven detection method usually needs sufficient data and labels. In practice, however, manufacturing commonly involves newly set-up processes that have only limited data available for analysis. Borrowing the term from recommender systems, we call such a process a cold-start process. The sparsity of anomalies, profile deviation, and noise further aggravate the detection difficulty.
Transfer learning can help detect anomalies in cold-start processes by transferring knowledge from more experienced processes to the new ones. However, existing transfer learning and multi-task learning frameworks are built on task- or domain-level relatedness. We observe instead that, within a domain, some components (background and anomaly) share more commonality than others (profile deviation and noise). To this end, we propose a finer-grained, component-level transfer learning scheme, decomposition-based hybrid transfer learning (DHTL). It first decomposes a domain (e.g., a data source containing profiles) into different components (smooth background, profile deviation, anomaly, and noise); then, each component’s transferability is analyzed using expert knowledge; lastly, a transfer learning technique is tailored to each component accordingly. We adopt a Bayesian probabilistic hierarchical model to formulate parameter transfer for the background, and an “L2,1+L1” norm to formulate low-dimensional feature-representation transfer for the anomaly. An efficient algorithm based on block coordinate descent is proposed to learn the parameters. A case study based on glass coating pressure profiles demonstrates the improved accuracy and completeness of the detected anomalies, and a simulation demonstrates the fidelity of the decomposition results.
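To make the decomposition idea concrete, the following is a minimal sketch and not the authors' DHTL algorithm. It assumes a simplified objective in which profiles stacked as columns of a hypothetical matrix `Y` (time points by processes) are split into a smooth background `B` and a sparse anomaly `A`. The background here is fitted with a plain second-difference smoothness penalty, which stands in for (and does not reproduce) the Bayesian hierarchical parameter transfer; profile deviation is not modeled. The anomaly is penalized with an L2,1 norm over rows plus an L1 norm, and the two blocks are updated alternately in block coordinate descent fashion. All function names and penalty weights are illustrative assumptions.

```python
import numpy as np


def soft_threshold(x, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def prox_l21_l1(R, lam_l1, lam_l21):
    """Proximal operator of lam_l1 * ||A||_1 + lam_l21 * ||A||_{2,1}.

    Rows of R are the groups (one row per time point, shared across
    processes). The operator is elementwise soft-thresholding followed
    by row-wise group shrinkage.
    """
    S = soft_threshold(R, lam_l1)
    row_norms = np.linalg.norm(S, axis=1, keepdims=True)
    scale = np.maximum(1.0 - lam_l21 / np.maximum(row_norms, 1e-12), 0.0)
    return scale * S


def second_difference(n):
    """(n - 2) x n second-order difference matrix with rows [1, -2, 1]."""
    return np.diff(np.eye(n), n=2, axis=0)


def objective(Y, B, A, D, lam_smooth, lam_l1, lam_l21):
    """Simplified objective assumed in this sketch (not the paper's model)."""
    fit = 0.5 * np.sum((Y - B - A) ** 2)
    smooth = 0.5 * lam_smooth * np.sum((D @ B) ** 2)
    sparse = lam_l1 * np.abs(A).sum() + lam_l21 * np.linalg.norm(A, axis=1).sum()
    return fit + smooth + sparse


def decompose(Y, lam_smooth=50.0, lam_l1=0.3, lam_l21=0.3, n_iter=50):
    """Alternate exact minimization over the background and anomaly blocks."""
    n = Y.shape[0]
    D = second_difference(n)
    # Closed-form background update: B = (I + lam_smooth * D^T D)^{-1} (Y - A).
    smoother = np.linalg.inv(np.eye(n) + lam_smooth * D.T @ D)
    B = np.zeros_like(Y)
    A = np.zeros_like(Y)
    history = []
    for _ in range(n_iter):
        B = smoother @ (Y - A)                    # background block
        A = prox_l21_l1(Y - B, lam_l1, lam_l21)   # anomaly block
        history.append(objective(Y, B, A, D, lam_smooth, lam_l1, lam_l21))
    return B, A, history


if __name__ == "__main__":
    # Synthetic data: three profiles sharing a smooth trend and one anomaly window.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 100)
    Y = np.tile(np.sin(2 * np.pi * t)[:, None], (1, 3))
    Y[40:45, :] += 3.0
    Y += 0.1 * rng.normal(size=Y.shape)
    B, A, history = decompose(Y)
    print("rows flagged as anomalous:", np.flatnonzero(np.linalg.norm(A, axis=1) > 0))
```

Because each block update is an exact minimizer of the assumed objective with the other block held fixed, the recorded objective values do not increase across iterations. The row-wise L2,1 term encourages anomalies to appear at the same time points across processes, while the L1 term keeps individual entries sparse.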
- [53] . 2011. Malsar: Multi-task learning via structural regularization. Arizona State University 21 (2011).Google Scholar
- [54] . 2011. A multi-task learning formulation for predicting disease progression. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 814–822.Google ScholarDigital Library
- [55] . 2014. Hybrid heterogeneous transfer learning through deep learning. In Proceedings of the National Conference on Artificial Intelligence.Google ScholarCross Ref
Index Terms
- Profile Decomposition Based Hybrid Transfer Learning for Cold-Start Data Anomaly Detection