Abstract
Machine learning (ML) is increasingly applied to electronic health records (EHR) to extract knowledge that can improve decision-making in healthcare. Such methods require training high-quality models on diverse and comprehensive datasets, which are hard to obtain because patient medical data is sensitive. In this context, federated learning (FL) is a methodology that enables distributed training of ML models on remotely hosted datasets without gathering the data in one place and thereby exposing it. FL is a promising way to improve ML-based systems by aligning them better with regulatory requirements and by improving trustworthiness and data sovereignty. However, many open questions must be addressed before the use of FL becomes widespread. This article presents a systematic literature review of current research on FL in the context of EHR data for healthcare applications. Our analysis highlights the main research topics, proposed solutions, case studies, and respective ML methods. Furthermore, the article discusses a general architecture for FL applied to healthcare data, based on the main insights obtained from the literature review. The collected literature corpus indicates that there is extensive research on the privacy and confidentiality aspects of training data and model sharing, which is expected given the sensitive nature of medical data. Studies also explore improvements to the aggregation mechanisms required to generate the learning model from distributed contributions, as well as case studies with different types of medical data.
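To make the idea of training on remotely hosted datasets and then aggregating distributed contributions more concrete, the following is a minimal sketch of one common aggregation scheme, weighted federated averaging. It is an illustration under stated assumptions and not the architecture proposed in the article: the function names (`local_update`, `federated_average`, `run_rounds`), the one-dimensional linear model, the learning rate, and the two toy "hospital" datasets are all hypothetical. Only model weights leave each site; the raw records stay where they are.

```python
from typing import List, Sequence, Tuple

Weights = List[float]              # [slope w, intercept b] of a toy model y = w*x + b
Dataset = Sequence[Tuple[float, float]]


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Train on one site's private data with plain gradient descent (MSE loss)."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Both gradients are computed from the current (w, b) before either is updated.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Aggregate site models, weighting each by its number of local samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(global_weights: Weights, clients: Sequence[Dataset],
               rounds: int = 100) -> Weights:
    """Repeat: broadcast the global model, train locally, aggregate the results."""
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Hypothetical data held at two sites; both follow y = 2x + 1 and are never pooled.
hospital_a: Dataset = [(0.0, 1.0), (0.5, 2.0)]
hospital_b: Dataset = [(1.0, 3.0), (1.5, 4.0), (2.0, 5.0)]

model = run_rounds([0.0, 0.0], [hospital_a, hospital_b])
print(model)
```

On this toy data the aggregated weights should approach a slope near 2 and an intercept near 1. A real deployment would add the privacy and security mechanisms the review surveys, such as secure aggregation or differential privacy, which this sketch deliberately omits.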
Index Terms
- Federated Learning for Healthcare: Systematic Review and Architecture Proposal