The global healthcare sector is facing a crisis: there are not enough healthcare practitioners to meet the demands of patients. This shortage will worsen as the population grows and demographics change. Artificial Intelligence (AI), through its ability to assist with diagnostics, can enhance the efficiency and quality of healthcare. However, before AI can be applied to patient data, it must be shown to be safe and reliable. The Focus Group on “AI for Health” (FG-AI4H), a collaboration between the World Health Organization (WHO) and the International Telecommunication Union (ITU), is developing a standardized framework in which AI can be tested. Fraunhofer HHI is contributing to the development of this framework.
AI can learn from digital health data (e.g., images, sensor measurements, and electronic health records) and assist healthcare practitioners with detection, diagnosis, and medical decision-making. This can enhance the efficiency of the healthcare system. However, AI algorithms are highly complex, and their performance depends on the quality of the training data. If an algorithm is poorly designed, or if its training data are biased or incomplete, errors or problematic results can occur. Therefore, before an AI system can be safely applied to digital health data, it must be rigorously tested. Currently, however, there is no universally agreed-upon method for conducting such tests. To address this Achilles' heel of AI systems, Fraunhofer HHI is contributing to FG-AI4H. According to Thomas Wiegand, Executive Director of Fraunhofer HHI, Professor at TU Berlin, and Chair of the Focus Group, the project is at the interface of “ethics, health, and technology.”
The procedure of FG-AI4H is multi-tiered. First, topics (i.e., specific health problems) and related data (e.g., annotated imagery) are selected. The selection criteria include whether the topic affects a large part of the population, whether the topic can benefit from the use of AI, and whether suitable data are available (e.g., whether there are sufficient data for training an AI, whether there are sufficient undisclosed data for testing an AI, whether the data are of high quality and ethically sourced, and whether the data come from varying sources). For the selected topics, communities of stakeholders are established. Thus far, ten of these topic groups have formed, covering ophthalmology (retinal imaging diagnostics), snakebite and snake identification, and beyond. These topic groups pool expertise and datasets, choose benchmarking tasks, and coordinate the benchmarking process. For benchmarking, training data are made available to AI developers. The resulting AI algorithms are then evaluated on the Focus Group’s online platform using undisclosed test data. In addition to providing organizational support to the Focus Group, Fraunhofer HHI is lending technical support: expertise in explainable AI (XAI), AI evaluation criteria and quality metrics, and data privacy preservation.
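The evaluation step described above, scoring a submitted algorithm on test data its developers never see, can be illustrated with a short Python sketch. It is not the Focus Group's platform code: the function name `evaluate_submission`, the restriction to binary labels, and the choice of metrics (sensitivity, specificity, accuracy) are all assumptions made for this example.

```python
from typing import Callable, Dict, Sequence


def evaluate_submission(
    predict: Callable[[Sequence[float]], int],
    test_inputs: Sequence[Sequence[float]],
    test_labels: Sequence[int],
) -> Dict[str, float]:
    """Score a submitted model on held-out test data (hypothetical helper).

    Assumes binary labels: 1 = condition present, 0 = condition absent.
    The developer supplies only `predict`; the test data stay with the
    benchmarking side and are never disclosed.
    """
    if len(test_inputs) != len(test_labels):
        raise ValueError("inputs and labels must have the same length")
    if len(test_labels) == 0:
        raise ValueError("test set must not be empty")

    # Count the four cells of the confusion matrix.
    tp = fp = tn = fn = 0
    for features, label in zip(test_inputs, test_labels):
        prediction = predict(features)
        if prediction == 1 and label == 1:
            tp += 1
        elif prediction == 1 and label == 0:
            fp += 1
        elif prediction == 0 and label == 0:
            tn += 1
        else:
            fn += 1

    # Guard against division by zero when one class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(test_labels)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }
```

Only aggregate scores are returned in this sketch, so the test cases themselves remain undisclosed. A real benchmark would likely report further metrics and break results down by data source.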
Although FG-AI4H has an official duration of two years (2018–20), momentum will likely carry the project beyond 2020. Furthermore, the resulting standards are expected to benefit medical practitioners and patients well into the future.