Abstract
This paper discusses the main challenges of applying machine learning systems in practice, along with lessons learned from ongoing research, taking into account theoretical foundations, systems engineering, and human-centered AI postulates. The analysis outlines a fundamental theory-practice gap that superimposes the challenges of AI system engineering at the levels of data quality assurance, model building, software engineering, and deployment.
Special thanks go to A Min Tjoa, former Scientific Director of SCCH, for his encouraging support in bringing together data and software science to tackle the research problems discussed in this paper. The research reported in this paper has been funded by BMK, BMDW, and the Province of Upper Austria in the frame of the COMET Programme managed by FFG.
Notes
- 9. Platform supporting an integrated analysis of image and multiOMICs data based on liquid biopsies for tumor diagnostics – https://www.visiomics.at/.
- 10. Nuclear Segmentation Pipeline code available: https://github.com/SCCH-KVS/NuclearSegmentationPipeline.
- 11. BioStudies: https://www.ebi.ac.uk/biostudies/studies/S-BSST265.
- 12. DeepSNP code available: https://github.com/SCCH-KVS/deepsnp.
Copyright information
© 2020 IFIP International Federation for Information Processing
About this paper
Cite this paper
Fischer, L. et al. (2020). Applying AI in Practice: Key Challenges and Lessons Learned. In: Holzinger, A., Kieseberg, P., Tjoa, A., Weippl, E. (eds) Machine Learning and Knowledge Extraction. CD-MAKE 2020. Lecture Notes in Computer Science, vol 12279. Springer, Cham. https://doi.org/10.1007/978-3-030-57321-8_25
DOI: https://doi.org/10.1007/978-3-030-57321-8_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57320-1
Online ISBN: 978-3-030-57321-8
eBook Packages: Computer Science (R0)