Mitigating the Risks of Artificial Intelligence in Medicine

 

Health systems and professionals are awash in predictions that artificial intelligence (AI) will create safer, higher-quality care. In medicine, AI can refer to the use of computers and software to do the work for, or complement the efforts of, physicians or staff engaged in tasks that are labor-intensive (e.g., review of all the medical literature on a topic), require high reliability (e.g., surgical procedures), or are complicated (e.g., review of radiologic images).

What makes these computers intelligent goes beyond human programming of fixed, complex algorithms and includes machine learning (ML), through which computers take in new data autonomously, learn, and adapt so that their outputs become more precise and valid. For example, an AI/ML software program designed to read mammograms constantly compares its detection (or, sometimes, lack of detection) of a breast tumor in a mammogram to actual clinical outcomes, adapting the software to improve the sensitivity and specificity of the program.
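
To make that adaptation loop concrete, here is a minimal, hypothetical Python sketch of how such a program might score itself against confirmed clinical outcomes before and after an update. The function, variable names, and data are illustrative assumptions, not taken from any actual SaMD product.

```python
# Hypothetical illustration: scoring a mammography classifier against
# biopsy-confirmed outcomes. Names and data are invented for this sketch.

def sensitivity_specificity(predictions, outcomes):
    """Compare the model's tumor calls to clinically confirmed outcomes.

    predictions: list of bools, True = model flagged a tumor
    outcomes:    list of bools, True = tumor confirmed clinically
    """
    tp = sum(p and o for p, o in zip(predictions, outcomes))              # true positives
    tn = sum((not p) and (not o) for p, o in zip(predictions, outcomes))  # true negatives
    fn = sum((not p) and o for p, o in zip(predictions, outcomes))        # missed tumors
    fp = sum(p and (not o) for p, o in zip(predictions, outcomes))        # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Example: comparing performance before and after an (assumed) autonomous update.
outcomes        = [True, True, False, False, True, False, False, True]
old_predictions = [True, False, False, True, True, False, False, False]
new_predictions = [True, True, False, False, True, False, True, True]

print("before update:", sensitivity_specificity(old_predictions, outcomes))
print("after update: ", sensitivity_specificity(new_predictions, outcomes))
```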

As exciting as breakthroughs in AI may appear, my experience informs my view that we should be cautious about our romance with artificial intelligence. Over-reliance on, or blind acceptance of, the benefits of AI could lead to the unintended consequence of degraded safety and quality in health care.

Recently, we learned how AI went wrong with the Boeing 737 Max airplane. According to the New York Times, Boeing faced a design challenge in building a larger-capacity airplane that required it to develop an automated system to stabilize the plane in flight. Initially, the system had limited use, and pilots were aware of it. However, when test pilots discovered new challenges in flying the plane, engineers learned from the test pilots and adapted the software program, changing how sensor data were used, in ways they assumed would make for better flights.

Engineers saw this modification as small and relatively unremarkable, so Boeing never notified commercial pilots of the changes. As a result, pilots were not adequately trained to manage complications associated with the automated system.

Given its life-threatening consequences for air travel, it is not hard to see how AI/ML could lead to unsafe, lower-quality care when implemented in health care. First, if the software is constantly learning and adapting, physicians might not be trained to consider the impact of those changes, potentially creating catalysts for sentinel events.

Next, there is the possibility that the computer could learn incorrectly. As with all predictive modeling, no matter how many data points or instances are observed, there can be bias in the output. For example, a program may be learning and adjusting its output based on a dataset that predominantly features white men, so the resulting output – in this case, a diagnosis – might frequently be invalid, even harmful, when the AI/ML is applied to another group, such as African American women.
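
A short, hypothetical Python sketch shows why an aggregate accuracy number can hide exactly this kind of subgroup harm. The group labels, record counts, and accuracy values are invented for illustration only.

```python
# Hypothetical illustration of aggregate accuracy hiding subgroup harm.
# Records and group labels are invented for this sketch.
from collections import defaultdict

def accuracy_by_group(records):
    """records: list of (group, prediction_correct) tuples."""
    totals, correct = defaultdict(int), defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        correct[group] += int(is_correct)
    return {g: correct[g] / totals[g] for g in totals}

# A model evaluated mostly on one group can look accurate overall...
records = [("group_a", True)] * 90 + [("group_a", False)] * 5 \
        + [("group_b", True)] * 2 + [("group_b", False)] * 3

overall = sum(c for _, c in records) / len(records)
print(f"overall accuracy: {overall:.2f}")                  # 0.92 overall
print("per-group accuracy:", accuracy_by_group(records))   # but only 0.40 for group_b
```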

The Food & Drug Administration (FDA) appears to fully recognize the problem with AI/ML, particularly as it concerns “software as a medical device,” also known as SaMD. SaMD is software intended to be used for one or more medical purposes that performs those purposes without being part of a hardware medical device, though it may use data collected by hardware medical devices.

The FDA expects SaMD to go through the same pre-market application and approval process as other medical devices. However, things get more complicated for AI/ML SaMD. Since the AI/ML algorithms and models upon which the SaMD is based are constantly learning and adapting to new data, the safety and efficacy profile of a SaMD initially approved by the FDA might differ from the current state of that SaMD.

So, what’s the solution? Perhaps it is to require a pre-market application and approval for changes in the software of AI/ML SaMD.

While the FDA has not yet approved regulations to manage post-market changes, it offers some guidance on when a new pre-market application needs to be submitted. Originally written in anticipation of regulations around AI/ML SaMD, this guidance assigns risks associated with the adaptation of the software. The higher the risk of the changes (e.g., when the SaMD is being used to treat and diagnose a critical or serious condition), the more essential it becomes to submit a pre-market application for the software modifications.

Until the FDA establishes true regulation around AI/ML SaMD, and perhaps even thereafter (see the Boeing case above), AI/ML SaMD remains a potential threat to the quality and safety of care. As a result, health care institutions that have adopted AI/ML SaMD devices should create an AI/ML management committee that would:

  • Keep an inventory of AI/ML SaMD;

  • Monitor the uses of AI/ML SaMD across the institution; and

  • Do regular performance testing against controls (a minimal sketch follows this list).
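
One way such a committee might operationalize the last item is to re-score each AI/ML SaMD periodically against a fixed, clinically confirmed control dataset and flag drift from the originally validated performance. The sketch below assumes a generic model interface, an illustrative tolerance, and a hypothetical inventory structure; none of these are FDA requirements.

```python
# Hypothetical sketch of periodic performance testing against a fixed control
# set. The model interface, threshold, and inventory structure are assumptions.

def check_against_controls(model, control_cases, baseline_accuracy, tolerance=0.05):
    """Re-score a held-out, clinically confirmed control set and flag drift.

    model:             any object with a .predict(features) -> bool method (assumed)
    control_cases:     list of (features, confirmed_outcome) pairs
    baseline_accuracy: accuracy recorded when the SaMD was first validated
    tolerance:         allowed drop before the committee is alerted
    """
    correct = sum(model.predict(features) == outcome for features, outcome in control_cases)
    current_accuracy = correct / len(control_cases)
    drifted = current_accuracy < baseline_accuracy - tolerance
    return current_accuracy, drifted

# Illustrative committee workflow (hypothetical inventory and alert hook):
# inventory = [{"name": "mammo-reader", "model": ..., "baseline": 0.94}, ...]
# for item in inventory:
#     accuracy, drifted = check_against_controls(item["model"], control_cases, item["baseline"])
#     if drifted:
#         alert_committee(item["name"], accuracy)
```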

AI/ML offers great hope for improving care, but to the extent that practitioners are ignorant of, or choose to ignore, the adaptations occurring over the lifecycle of the ‘black box’ of AI/ML SaMD, we run the risk of crashing the plane.

RELATED: EconTalk Podcast - Russ Roberts and Eric Topol on Deep Medicine
