Automatic Speech Recognition Error Correction on ICU Clinical Narration Dataset

CS 224N Final Project

Automatic speech recognition (ASR) is gaining popularity in clinical settings, where clear communication is critical to healthcare delivery. Intensive care units (ICUs) are especially challenging: narration involves obscure medical terminology and takes place in noisy environments. ASR output therefore needs domain-specific correction or adaptation before the transcribed narrations can be relied on.

In this paper, we build an ASR error corrector from a small dataset of ICU clinical narrations transcribed by Whisper and corrected by nurses. Because the data is limited, we augmented the MTSamples dataset, pretrained a ConstDecoder model on the augmented data, and then fine-tuned it on our nurse-annotated ICU narration correction dataset. Our model outperforms the baselines and reduces the word error rate (WER) by up to 16%, which suggests that it can serve as a reliable and effective error corrector in the ICU.
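For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The snippet below is a minimal sketch of that standard definition, not the code in eval.py; the function name `word_error_rate` and the example sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenization is plain
    whitespace splitting, which is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example sentences, not taken from the ICU dataset.
reference = "patient given five milligrams of morphine"
raw_asr = "patient given five milligrams of more fiend"
corrected = "patient given five milligrams of morphine"

wer_before = word_error_rate(reference, raw_asr)
wer_after = word_error_rate(reference, corrected)
```

Comparing `wer_before` with `wer_after` on a held-out set is one way an error corrector's effect on WER could be measured.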

README.md
augment_dataset.py
augment_dataset_video.py
data_reader.py
eval.py
train.py

adamsunn/medicalerrorcorrection

Folders and files

Latest commit

History

Repository files navigation

Automatic Speech Recognition Error Correction on ICU Clinical Narration Dataset

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages