Understand AI research in medicine in one slide

BOSTON - In the age of GPT, scientific journals have become something of a Mad Libs game. Artificial intelligence can now detect _____ and quickly tell the difference between ______ and _____. But which of these studies really matter? How should clinicians sort through them?

At a recent AI conference, Rahul Deo, chief medical officer at Atman Health and an associate physician at Brigham and Women’s Hospital, summarized the issue in a single slide. The riskiest and most impactful kinds of studies get far less attention these days than the rest.

The AI models with the greatest impact are those that figure out how to replace the most complex physician tasks with automation. Next comes research that takes steps toward that goal, including models for predicting patient risk, clinical decision support models, and language models for automating rote paperwork. And at the bottom, there’s an “everything else” category, for studies that might seem impressive but are unlikely to change clinical practice.

Deo said he thinks the field needs this kind of perspective. “Otherwise, you’ll just find yourself doing what I think is the bottom tier: fancy papers to impress your colleagues,” he told STAT with a small laugh.

“I think there’s a tendency [to do] things in a bubble that never have the ability to get out and really affect anything,” he added.

This is Deo’s hierarchy of medical AI research, ranked from riskiest and most impactful to least impactful.

Move complex provider tasks to automated systems

Replacing the core work performed by doctors with machine learning-powered systems would be a major transformation in healthcare. It could reduce healthcare costs or expand access to treatment in areas where doctors are in short supply. But “of course, the risk is very high,” Deo said, so high that it would require strict regulation by the Food and Drug Administration. “The model should be really good,” he added. Trusting machine learning models to output medical decisions and execute them with no human in the loop is a “completely high-stakes game.”

If AI replaced human doctors, healthcare would become much more scalable, potentially reaching people who currently go without it. This is the idea behind Martin Shkreli’s Dr. Gupta.AI, research that has tested ChatGPT on U.S. medical licensing exam questions, and many other attempts at applying language models to medicine. But for now, that dream faces major hurdles, especially in replicating human empathy and reading human cues such as body language. Physicians can pick up on these signals and rephrase questions or add personalized context, but it is difficult for AI to do this without prompting.

Current AI language models also struggle with reasoning and logic. “They most often don’t have an underlying first-principles model of what’s going on, so they can do things where people think, ‘No medical student would make that mistake.’ There is a lot of potential for that,” Deo said. “That’s a challenge, at least with the models and architectures that exist today, but there could always be some degree of risk.”

There are no good examples of research in this category yet, mainly because of the many technical hurdles that must be overcome first. But while empathy, reasoning, and decision-making are very difficult, Deo believes that everything else in medicine will be quite feasible if AI receives reliable input data. “Most things are either algorithmic or you want them to be algorithmic,” he said.

Rapid iterative learning of optimal care approaches

Training an AI to make a doctor’s decisions would require vast amounts of data, which Deo says is almost non-existent at this time.

“When you look at most of the evidence in most fields, and cardiology is probably one of the best-evidenced, a huge amount of it is just expert opinion,” Deo said, “because it’s very, very expensive to get data in a randomized clinical trial setting.” An AI can systematically learn from a stream of data, but health outcome data is much harder to obtain and much harder to train on.

“Why is this group not doing as well as this group?” Deo asked. “There are probably hundreds of thousands of questions like that.” But at the pace at which clinical trials can be conducted, “it will take a thousand years to get to that point,” he said.

Even where clinical trials of disease treatments have been conducted, there are gaps in the data. Because clinical trials and large collections of medical records only capture a limited number of possible variables, the exact cause of a particular outcome may not be known. Trials cannot collect an infinite amount of data to identify the ultimate cause of disparities, so it is very difficult to say why one subgroup of patients fared worse than another. Whether the cause is an unmeasured biomarker or a social determinant of health, such as where someone lives or their access to food or transportation, AI can only learn from inputs and outputs. It cannot learn the causes in between, which has led to bias in AI algorithms in the past.

Training an AI to make good decisions based on current health outcome data is “extremely difficult, if not impossible, because of the bias,” Deo said. “It’s not enough, [and that] lack is biased. All these statistical nightmares make this very complicated.” AI doctors will be missing an important part of their “med school” curriculum unless researchers find ways to fill this gap.

Slide: An overview of the different categories of AI models and their relevance to the medical AI field, by Rahul Deo, presented at the 2023 MIT-MGB AI Cures conference in Cambridge, Massachusetts. Courtesy: Rahul Deo

Another set of eyes: predictive and overreading diagnostic research

Research into clinical decision support algorithms falls into the next category for Deo. These AI tools do not make decisions by themselves but serve as an additional set of eyes. They are already gaining traction in clinical practice: computer-assisted mammography, for example, is now used in patient care.

“[It’s] this kind of thinking that doctors will continue to do exactly the same things they have been doing, but we have a machine out there that says, ‘Buddy, you might have missed this,’ or, ‘Hey, you might want to look at this one before that one,’” Deo said.

The reimbursement path for these tools is not clear. Moreover, it is difficult to trust an algorithm if the model is a black box and the doctor cannot know exactly what it sees. These tools can save time and potentially improve outcomes, but as Epic’s controversial sepsis algorithm showed, implementing this kind of algorithm in the real world brings additional financial costs and liability. “There’s less risk, but maybe less overall gain,” Deo said.

New risk markers

A side category of clinical decision support algorithms is models that flag which people are at higher risk for something, often using wearables and other devices to collect digital clues.

However, the direct-to-consumer nature of these technologies poses major workflow challenges that the healthcare system has not yet figured out, such as what to do when a patient reports, “My toilet seat told me this,” Deo said. Research on “digital biomarkers” and “hospital at home” monitoring using AI models is innovative and may require less inference than other types of AI research. But without the infrastructure to integrate this kind of data into the traditional healthcare system, it cannot have an impact on care.

Less monotonous work

Thanks to AI, doctors now have even more tools that take over tasks like responding to patient portal messages. These “low-stakes” tasks carry little risk in exchange for the time they save, but developing these capabilities won’t advance the state of AI. Still, these AI use cases are popular, and tech companies like Microsoft and Epic are getting into the business with pilot programs at large academic health systems.

But these new features raise questions about where the line falls between “low-stakes” and “high-stakes” activities. Patients may not mind AI helping them schedule appointments or reminding them of what they can and can’t take, but some feel uneasy when they hear that AI is writing their medical records.

According to Deo, tools to ease the tedium have existed for a long time. Machines automatically calculate the widths, axes, and angles on an electrocardiogram and spit out all the relevant statistics. No one does that by hand anymore. He noted that the healthcare system has always defined acceptable amounts of risk in different areas. Dictation contains errors whether it is transcribed by speech-to-text tools or by humans. And clinicians often let scribes and medical students take medical histories, write notes, and begin exams. Doctors do not redo all of that work from scratch. They just pick the most important parts and double-check them.

“There are many people with medical expertise who already contribute to some of this, and I’m sure not all of it is verified word for word,” Deo said. “People choose these low-risk areas because they know they’re more likely to be adopted, since there’s less concern about what kind of liability there will be. It’s not none, there’s just less.”

Impress journal editors, reviewers, and study section members

At the bottom of Deo’s risk-reward hierarchy is all the rest of AI research. People are excited to see AI introduced into their field and applied to familiar problems, but the novelty wears off after a while, Deo said. Many of these models start to fall apart when the rubber meets the road. What are people using now? Are they going to change what they are doing? What is the risk if the model produces false positives or false negatives? How will it be implemented? Who will pay for it? Will it save money?

Many of the models created out of academic curiosity do not take these questions into account and will die out because there are so many obstacles to introducing anything into clinical practice, Deo said.

“From an academic point of view, it’s just tedious, and it doesn’t make the paper rank any higher. It’s not considered academic. It’s a fundamental problem with how we fund and promote our research,” Deo said.

Deo’s call to action is for researchers to choose projects that can be integrated into clinical workflows, that have a measurable impact on patient outcomes, and that can be prospectively validated at partner institutions to prove the models are not overfitted to specific populations.

If researchers don’t start training expensive models with downstream validation in mind, “there’s really not much value because you have to start over,” says Deo. Without a way for others to use that model, “it’s just a proof of concept at best.”

This story is part of a series that explores the use of artificial intelligence in medicine and the practice of patient data exchange and analysis. This project is supported by funding from the Gordon & Betty Moore Foundation.
