MASA (Machine Learning Assisted Swallowing Assessment) is an innovative project designed to leverage machine learning (ML) to transform how swallowing assessments are performed, with the aim of improving quality of care. The primary goal of this project is to apply ML-based spectral analysis of the human voice during swallowing assessments to classify the assessment outcome according to the screening standard (at our site this is the TOR-BSST©, a recognized tool for assessing swallowing disorders, particularly after acute stroke; many other screening tests are used at other centers).
This project initially applies Convolutional Neural Networks (CNNs), with plans to expand to Vision Transformers (ViT) in the future. These techniques are intended to capture the intricate nuances of the human voice during swallowing assessments that are potentially missed in traditional assessments.
The algorithms developed under MASA aim to bring efficiency, accuracy, and scalability to the assessment process, potentially enabling clinicians to make more informed decisions regarding patient treatment and management.
Please also see the MASA supplementary methods material accompanying our publication in Frontiers in Neuroscience.
The objectives of this project are to:
- Develop a robust machine learning model capable of performing spectral analysis on the human voice during swallowing assessments.
- Validate the model's performance against screening test labels.
- Improve the quality and efficiency of swallowing assessments, particularly in the context of post-acute stroke care.
- Explore the incorporation of advanced architectures such as Vision Transformers to enhance the model's capabilities.

Getting Started: Refer to the Getting Started Guide for instructions on how to install, run, and use this project.
This repository uses Docker containers to run the machine learning notebooks. Before starting, we recommend installing Docker and updating it to the latest version. If you are using Linux, make sure you set up the NVIDIA Container Toolkit to enable GPU acceleration. This repository was tested on Windows 11 and Ubuntu 22.04.
You can use the provided Makefile to automate building and running the Docker images for the notebooks.
To build the TensorFlow Docker image:
make build-tensorflow
To run the TensorFlow Docker image:
make run-tensorflow
Please note that the Docker images automatically start Jupyter Lab servers, as this repository relies mainly on Python notebooks for the ML experiments. You can run the PyTorch Docker images in the same way.
After running the Docker image, open Jupyter Lab in your browser and navigate to the sample base notebook.
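Once Jupyter Lab is open, it can be worth confirming that the container actually sees the GPU before launching longer experiments. A minimal check, assuming you are in the TensorFlow image (for the PyTorch image, torch.cuda.is_available() plays the same role):

```python
# Quick sanity check that the container can see your GPU (TensorFlow image).
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print(f"TensorFlow {tf.__version__}, GPUs visible: {gpus}")
if not gpus:
    print("No GPU detected - check your NVIDIA Container Toolkit setup.")
```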
If you have prepared the dataset and placed it under the Audio Data folder (see the next section on preparing the dataset), you should be able to simply run the CNN notebook without modifications.
This repository provides preprocessing code to convert raw audio signals to spectrogram images.
The Audio Processing Toolkit is designed to facilitate the loading, processing, and analysis of .wav audio files. It can apply multiple audio transformations, such as Mel spectrograms and Superlets (an illustrative sketch follows the feature list below), and it includes utilities for batch processing of audio files, categorizing them into 'Pass' and 'Fail'.
- Load .wav audio files from a specified directory
- Apply Superlet and Mel transformations
- Export processed data as images
- Generate histograms for 'Pass' and 'Fail' sound file lengths
- Supports custom epoch durations, sample rates, and more
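As a point of reference, the sketch below shows roughly what the Mel transformation step produces for a single file, using librosa. It is an illustrative, hypothetical example (the file and output paths are made up), not the toolkit's actual implementation; the Superlet transform follows the same load/transform/export pattern.

```python
# Illustrative sketch: load one .wav file and export a Mel spectrogram image.
# Hypothetical example using librosa; the toolkit's own implementation may differ.
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

y, sr = librosa.load("Audio Data/For Processing/example.wav", sr=22050)
mel = librosa.feature.melspectrogram(y=y, sr=sr)
mel_db = librosa.power_to_db(mel, ref=np.max)

fig, ax = plt.subplots(figsize=(4, 4))
librosa.display.specshow(mel_db, sr=sr, ax=ax)   # time-frequency image of the clip
ax.set_axis_off()                                # no axes: the CNN only needs the pixels
fig.savefig("example_mel.png", bbox_inches="tight", pad_inches=0)
plt.close(fig)
```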
Install the required Python dependencies:
pip install -r requirements.txt
To use the toolkit, you will need to edit the main.py file. Here are the primary areas you might want to customize.

Edit the following lines to specify the directory containing the .wav files you wish to process and where the output should be saved:
path = "Audio Data/For Processing"
output_path = "Audio Data/Outputs/"
You can customize the epoch duration, overlap, and other parameters by editing these lines:
epoch_duration = 0.5
overlap = 0.5
min_power = 0
target_sampling_rate = 22050
output_type = 'mel'
After configuring, run main.py to start the processing. This will create the dataset that will be used in the deep learning notebooks included in this repository.
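For orientation, here is a rough, hypothetical sketch of how the parameters above could drive the processing: each recording is resampled to target_sampling_rate, split into epochs of epoch_duration seconds with the given overlap, and each epoch is exported as a spectrogram image. The repository's actual main.py may differ in its details.

```python
# Rough sketch of epoch-based splitting (illustrative; not the actual main.py).
import librosa

epoch_duration = 0.5          # epoch length (assumed to be in seconds)
overlap = 0.5                 # fraction of each epoch shared with the next
target_sampling_rate = 22050  # Hz

def split_into_epochs(wav_path):
    """Yield fixed-length, overlapping audio segments from one .wav file."""
    y, sr = librosa.load(wav_path, sr=target_sampling_rate)
    epoch_len = int(epoch_duration * sr)
    hop = max(1, int(epoch_len * (1 - overlap)))
    for start in range(0, len(y) - epoch_len + 1, hop):
        yield y[start:start + epoch_len]

# Each yielded segment would then be converted to a Mel (or Superlet)
# spectrogram image and written to the participant's Pass or Fail folder.
```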
The processed dataset should follow this directory structure:

- Dataset Parent Directory
  - Train
    - Pass
      - Participant_1
        - spectrogram1.png
        - spectrogram2.png
        - ...
      - Participant_2
        - spectrogram1.png
        - spectrogram2.png
    - Fail
      - Participant_3
        - spectrogram1.png
        - spectrogram2.png
  - Test
    - Pass
      - Participant_5
        - spectrogram1.png
        - spectrogram2.png
    - Fail
      - Participant_7
        - spectrogram1.png
        - spectrogram2.png
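Organized this way, the folders map directly onto a standard image-classification pipeline. Below is a minimal, hypothetical tf.keras sketch; the paths, image size, and architecture are assumptions for illustration, not the repository's actual CNN notebook.

```python
# Hypothetical sketch: load the Pass/Fail spectrogram folders and fit a small CNN.
import tensorflow as tf

# Paths and image size are assumptions; adjust to your dataset parent directory.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "Audio Data/Dataset/Train", image_size=(224, 224), batch_size=32)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "Audio Data/Dataset/Test", image_size=(224, 224), batch_size=32)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # Pass vs. Fail
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=test_ds, epochs=5)
```

Keeping each participant's spectrograms together, as the folder structure above does, makes it straightforward to split Train and Test at the participant level rather than leaking epochs from the same recording across splits.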
Please feel free to raise an issue for any queries, suggestions, or discussions.
This project is a stepping stone towards improving the quality of patient care by leveraging the potential of AI/machine learning in neurologic care. Specifically, we are starting with audio as a biomarker for swallowing in stroke patients. We welcome you to join us on this exciting journey.
- Lab members (continuing and joining; affiliations listed, see disclaimer):
- Rami Saab, Medicine, University of Toronto, AI4QI in Stroke, Sunnybrook Research Institute
- Dr. Arjun Balachandar, Neurology, University of Toronto, ML4QI in Stroke
- Eptehal Nashnoush, MSc, University of Toronto Data Science, Datathon co-founder, T-CAIREM, Health Quality Ontario, Sunnybrook Research Institute
- Hamza Mahdi, Western University, Medicine, Sunnybrook Research Institute
- Rishit Dagli, Computer Science, University of Toronto
This GitHub repository is for educational purposes only and does not represent expert medical judgment or assessment. The tools and information provided herein are not intended for clinical use and should not be relied upon for medical decision-making. No duty of care is assumed by the contributors, and all individuals associated with this project are absolved of any medical-legal burden. The views and work expressed in this repository do not reflect the official stance of any affiliated academic/other institutions or hospitals where we work or study. Additionally, any open-source software or code written by other authors is attributed to those respective authors - thank you. Use at your own risk.