Concern about OpenAI Whisper Medical Transcription Accuracy

Despite warnings from OpenAI against using its voice-to-text technology in high-stakes applications, over 30,000 medical professionals have used OpenAI's Whisper-based tools to transcribe patient visits. To maintain privacy, the original audio files are deleted, which is already a red flag in any healthcare audit trail.

On top of that, shockingly high error rates in these AI-generated notes are raising alarms about potential risks to patient safety and medical record integrity. While the tools are fine-tuned for medical terminology, OpenAI has acknowledged the AI may still "confabulate," or make up, information.

In some of the data reviews by outside consultants, error rates in the AI transcripts ranged anywhere from 10% to more than 60%. The width of that range is astounding, and I'm mystified that it isn't being discussed more loudly, especially by the patients and providers affected at the higher end of the error range.

As Whisper and similar AI tools become more prevalent in healthcare, it's vital to validate their accuracy, maintain human oversight, and ensure sensitive patient data is protected. The convenience of AI must be balanced against the highest standards of safety and reliability.
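To make "validate their accuracy" concrete: a minimal sketch of one way an organization could spot-check AI transcripts is to compare a sample of AI-generated notes against human-verified references using word error rate (WER). This assumes the open-source jiwer library; the sample notes and the 10% review threshold below are hypothetical illustrations, not figures from any reported audit.

```python
# Sketch: spot-check AI transcripts against human-verified references
# using word error rate (WER). Requires: pip install jiwer
from jiwer import wer

# Hypothetical pairs: (human-verified reference, AI-generated transcript)
samples = [
    ("patient reports mild chest pain after exercise",
     "patient reports mild chest pain after exercise"),
    ("prescribed 10 mg lisinopril once daily",
     "prescribed 100 mg lisinopril twice daily"),  # the kind of error that matters
]

for reference, hypothesis in samples:
    error_rate = wer(reference, hypothesis)
    print(f"WER: {error_rate:.0%} | {hypothesis}")
    if error_rate > 0.10:  # illustrative threshold; tune to your risk tolerance
        print("  -> flag transcript for human review")
```

Even a lightweight sampling process like this, run on a fraction of visits, would surface the 60% outliers long before they landed in a patient's permanent record.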

Building trust in these systems is essential, and misses like this stand to set back thinking about AI's possibilities in industries well beyond healthcare.

In an upcoming webinar I'm doing with Larry Trotter II, we're going to talk about the risks of rushing into AI deployments without securing them first. It takes significant work for healthcare companies to meet member and patient expectations while also complying with the letter of the law under HHS and FTC regulations.

In a recent prep conversation, we talked about the Whisper news, and Larry was quick to point out that OpenAI clearly states the tool is not intended for high-risk use cases like healthcare transcription.

Video Transcript:

Keith Boswell: So Larry, I've got a little bit of an audible question, but I think you've probably read about it because I know it was in the news this week: OpenAI and the Whisper transcription errors that are being uncovered.

So OpenAI's voice-to-text technology is called Whisper. Because it was so promising in its earliest days, it was adopted by thousands of healthcare organizations. And now they're coming back in and auditing those transcripts, and in some cases, offices have transcript error rates as high as 60 percent.

Notes are being added about things that were never discussed. I mean, hallucination of the worst kind; you'd imagine this can't still be happening.

It seems like it's really happening en masse with Whisper. And I have a feeling this is going to be a bit of an albatross around OpenAI's neck, because so many people have put so much trust in this technology. They're running it in the clinics. What do you make of it?

Larry Trotter: I wouldn't blame OpenAI for that, because they have a disclaimer on their website that essentially says their systems hallucinate and may not provide accurate information. I think this goes back to what I was saying: these systems are definitely going to revolutionize the industry, but we need to make sure they're safe when we're bringing them into the ecosystem.

And this is a clear example where providers and other types of institutions are adopting these devices but aren't really doing risk assessments on them, and that's what we're finding.

And with any type of technology, I mean, heck, we still have problems with the internet, right? And that's been going on for how long? At the very least, you have to verify the information that's provided. Even if you're not going to do a risk assessment, which I don't recommend skipping, with any type of technology you have to do your due diligence on the outcome.

Keith Boswell: Yeah, you're absolutely right, and that's a great point. OpenAI definitely lets people know where the technology stands; it's all in their disclaimers. But it's also a really powerful point that so many of these organizations, when they hear "automatic transcription," aren't even thinking, "let's check it."

They're just jumping to "that would save me so much time."

Larry Trotter: Right. Yes, that's it. It's pretty much about efficiency. Because of the burnout within healthcare, that's one of the biggest reasons for adopting AI, and the repercussions aren't fully considered.