With the emergence of the Omicron variant, the latest models predict a spike in cases soon (peaking around 15.01.2022), with more hospitalizations and a greater need for rapid antigen tests. New studies also show that current tests are less accurate at detecting Omicron, and the recommendation is to use two tests instead of one, which further increases demand for tests.
However, with cases doubling every 3.5 days, it is hard to imagine how the supply of tests can keep up with demand.
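To get a feeling for what a 3.5-day doubling time means, here is a minimal sketch. It assumes the doubling time stays constant, which a real wave will not do for long; the function name `growth_factor` is only illustrative.

```python
def growth_factor(days, doubling_time_days=3.5):
    """Factor by which case numbers multiply after `days` of steady doubling."""
    return 2 ** (days / doubling_time_days)


# Assumption: constant doubling time of 3.5 days over the whole period.
for days in (3.5, 7, 14):
    print(f"after {days} days: x{growth_factor(days):.0f}")
```

Under this assumption cases (and with them the need for tests) grow 4-fold in one week and 16-fold in two weeks.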
Research over the last year showed that detecting Covid-19 from cough sounds alone is possible, but so far no app making use of this technology has been made available to the public.
The latest research has also shown that Omicron affects the upper airways more than the lungs, which might be the reason for its lower severity and might reduce accuracy when relying on cough data alone. Therefore, other sounds such as voice and breathing should be a focus as well!
The development of a publicly available Covid test using only the microphone in widely available Android and iOS phones could help reduce the impact of the Omicron wave by detecting infections earlier.
A simple model (not for cough data yet) was trained and imported into an Android app. The model used in the app is just a proof of concept and needs to be replaced with a model able to detect Covid from real cough data. The app needs to be extended to record cough sounds and run them through the model, returning a positive/negative result.
-
Covid cough Classification on GitHub
- convolutional neural network (CNN)
- Uses MobileNet, a pretrained image classification network, here used for transfer learning
- I guess using a network trained on spectrograms instead of images would be better for transfer learning. TODO: find such a network and compare results
- Is trained on mel spectrograms
- Not sure why the spectrogram is transformed into a mel spectrogram, since the mel scale is only relevant for matching the sound to human hearing, which we don't need if only the computer is "hearing" and analysing the cough sound.
- A similar well documented instrument classification project named Musical Genre Classification on GitHub is available with an easy to understand article explaining the concept.
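For reference on the mel question above, here is a small sketch of what the mel scale does to the frequency axis. It uses the common formula m = 2595 * log10(1 + f / 700), which is one of several variants in use; the helper names and the 0 to 8000 Hz range are assumptions for illustration, not taken from the project.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mel (2595 * log10(1 + f / 700) variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Edges (in Hz) of n_bands bands that are equally wide on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / n_bands
    return [mel_to_hz(m_min + i * step) for i in range(n_bands + 1)]


# Assumed example range: 0 to 8000 Hz split into 8 mel bands.
edges = mel_band_edges(0.0, 8000.0, 8)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
for lo, hi in zip(edges, edges[1:]):
    print(f"{lo:7.0f} Hz to {hi:7.0f} Hz")
```

The bands get wider towards high frequencies, so a mel spectrogram keeps fine resolution at low frequencies and merges many high-frequency bins. Besides matching human hearing, this also shrinks the input to the network. Whether that helps or hurts cough detection would still need to be compared against plain spectrograms.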
-
CNN-Audio-Classifier-with-Keras-Tensorflow
- transfer learning done using the ESC-50 dataset containing 2000 environmental audio recordings
- mel spectrograms
- Dataset of sounds of symptoms associated with respiratory sickness; this is not a Covid cough dataset!
- Wiki page
-
COUGHVID: README and code for data pre-processing
- 67.7% of COVID-19 patients exhibit a “dry cough”
- COUGHVID
- public dataset
- over 25,000 crowdsourced cough recordings
- size of 1.3 GB
- more data may be available on request [email protected]
-
- Review Paper
-
AI4COVID-19: AI enabled preliminary diagnosis for COVID-19 from cough samples via an app
- Android app (AI4COVID-19) does not seem to be available to the public
- App sends data to the cloud for analysis, which has the advantage of keeping the model up to date for all users, but the disadvantage of requiring an internet connection.
- They use "transfer learning" to make up for missing cough sound data
-
COVID-19 Artificial Intelligence Diagnosis Using Only Cough Recordings
- asymptomatic detection
- "transfer learning" on an Alzheimer's dataset, showing improvements in accuracy
-
- "transfer learning" CNN trained on regular speech dataset
- three types of sounds used: Cough, digits from 0 to 9, word “Ommmmmmm”, with “m” sound extending for 12 seconds
- 0.99 second long raw audio files used
-
COUGHVID crowdsourcing dataset, a corpus for the study of large-scale cough analysis algorithms
-
- good review of other Papers (see the summary table 1) with the realization that: -> "No accurate model for diagnosing COVID-19 disease symptoms exists. Implementing a deep CNN model along with multi-feature channels (De-noising Auto Encoder, GFCC, and IMFCC) leads to better results"
- using voice, dry cough, and breath results in better accuracy (95.45%) and performance compared to cough only (see table 4)
- shows different data augmentation methods, such as shifting the pitch and adding background noise (see section 3.1.2)
- regularization techniques like dropout (see section 4)
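Two of the simpler augmentations can be sketched on a raw audio clip as follows. This is a minimal illustration, not the paper's method: it assumes the clip is a plain list of float samples, uses uniform random noise as a stand-in for real background noise, and leaves out pitch shifting (which normally needs an audio library). All function names and default values are hypothetical.

```python
import random


def add_noise(samples, noise_level, rng):
    """Add uniform random noise in [-noise_level, noise_level] to each sample."""
    return [s + rng.uniform(-noise_level, noise_level) for s in samples]


def time_shift(samples, shift):
    """Rotate the clip by `shift` samples (positive = later), wrapping around."""
    if not samples:
        return []
    shift %= len(samples)
    return samples[-shift:] + samples[:-shift] if shift else list(samples)


def augment(samples, n_variants, noise_level=0.005, max_shift=1600, seed=0):
    """Create n_variants randomly shifted and noised copies of one clip."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        shifted = time_shift(samples, rng.randint(-max_shift, max_shift))
        variants.append(add_noise(shifted, noise_level, rng))
    return variants


# Toy clip standing in for a real recording.
clip = [0.0, 0.1, 0.5, -0.3, 0.0, 0.0, 0.2, -0.1]
variants = augment(clip, n_variants=3, max_shift=3)
```

Each variant has the same length and label as the original clip, so the training set grows without new recordings.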
-
Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data
- the mobile app gathers data from single individuals up to every two days, allowing for potential tracking of disease progression
- data release only with one-to-one legal agreements. See website
- because of the limited data size they used shallow classifiers (e.g. logistic regression) instead of deep ones (e.g. CNN, RNN)
-
EXPLORING AUTOMATIC COVID-19 DIAGNOSIS VIA VOICE AND SYMPTOMS FROM CROWDSOURCED DATA
- covid19-sounds database, available on request, with 55,000 sound samples (2,000 of them Covid-positive). Ask for access here: [email protected] and set up a sharing agreement for the data
- taking symptoms like loss of smell/taste into account increases performance from 77% to 79%
- We start with a new Android Studio project, using the "Basic Activity" template, API level 23 (Marshmallow, for >95% device coverage) and Java as the programming language.
- Prepare the data. Often this step is the hardest, since building the model is easy with tools like Keras, which provide all the parts needed to train a model.
- Train the model with Keras in Google Colab, resulting in a .h5 and a .tflite file. See the SimpleExampleOfTrainigATesnsorflowModel.ipynb for details.
- Add the functionality of running pretrained models on Android following this guide and this GitHub repo
- Create an assets folder and add the .tflite file you trained with Google Colab and downloaded in the previous steps. https://stackoverflow.com/questions/18302603/where-to-place-the-assets-folder-in-android-studio
- ...
- transfer learning looks like a must
- the cough sounds must be cropped to have the same length for training and detection!
- 'Selective training': ideally we collect personalised cough data from the user before they get Covid, to reduce the false positive rate of the app. Alternatively, use gender, age, ... or just the user's recordings to classify the user and train a better personalized model with training data similar to the user.
- Show disclaimers stating the accuracy of the test, using graphics that compare it with rapid antigen and PCR tests
- Inform users which sound is best for detection, and discourage users with background noise or other respiratory diseases from using the app, since it's not clear whether it works well for them.
- Avoid text as much as possible and use only GIFs/animations, so the test can be used by everyone without a language barrier.
- Output should include the confidence of the model and the disclaimer that the results can be wrong even if confidence is high. It should also be very simple, presenting a probability of having Covid and giving the user the option to see more detailed data from the analysis of their recording.
- A combination of cloud-based analysis when an internet connection is available and an on-device analysis tool for offline use would be ideal.
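The point above about equal clip lengths could be handled with a small helper like the one below. It is only a sketch: it assumes a clip is a list of samples and simply center-crops or zero-pads, whereas a real pipeline would probably crop around the detected cough. The name `fix_length` is hypothetical.

```python
def fix_length(samples, target_len, pad_value=0.0):
    """Center-crop or pad `samples` so the result has exactly target_len values."""
    n = len(samples)
    if n >= target_len:
        # Too long: keep the middle part of the clip.
        start = (n - target_len) // 2
        return list(samples[start:start + target_len])
    # Too short: pad on both sides (the extra value goes to the right).
    pad_total = target_len - n
    pad_left = pad_total // 2
    pad_right = pad_total - pad_left
    return [pad_value] * pad_left + list(samples) + [pad_value] * pad_right


print(fix_length([1, 2, 3, 4, 5], 3))  # [2, 3, 4]
print(fix_length([1, 2], 5))           # [0.0, 1, 2, 0.0, 0.0]
```

The same function and the same `target_len` have to be used for training and for detection in the app, otherwise the model sees inputs of a shape it was not trained on.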
- According to Andrew Ng's famous ML lecture:
- CNNs are good for image detection but RNNs are better for sounds
- a larger network and more data are the two main factors for improving the network
- ReLU speeds up training compared to the sigmoid activation function, but sigmoid should be used for the last (output) layer since we only have 0 or 1 as an output
- Hyperparameters are alpha (learning rate), number of iterations of gradient descent, number of hidden layers, number of hidden units (nodes per layer), which activation function in which layer, momentum, mini-batch size, regularization parameters, ... -> use trial and error and iterate to find the optimum.
- train/dev/test sets should have a ratio of 60%/20%/20% when dealing with a limited amount of data, as in our case of Covid sounds.
- Make sure that the dev and test sets come from the same distribution, but it is OK if the training set comes from another distribution, e.g. for the sake of more data
- If the result has high bias (underfitting) and/or high variance (overfitting), try: a bigger network (until bias shrinks), training longer (never hurts), a different neural network architecture, more data and regularization (in case of high variance).
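The 60%/20%/20% split mentioned above can be sketched like this. It is a minimal example assuming the dataset is a list of items (e.g. file names). The function name and the fixed seed are illustrative. If one person contributes several recordings, it is probably safer to split by person rather than by recording, which this sketch does not do.

```python
import random


def split_dataset(items, ratios=(0.6, 0.2, 0.2), seed=42):
    """Shuffle items and split them into train/dev/test lists by `ratios`."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_train = round(len(shuffled) * ratios[0])
    n_dev = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # whatever is left goes to the test set
    return train, dev, test


# Hypothetical file names standing in for real recordings.
recordings = [f"cough_{i:03d}.wav" for i in range(10)]
train, dev, test = split_dataset(recordings)
print(len(train), len(dev), len(test))  # 6 2 2
```

Because dev and test are cut from the same shuffled list, they come from the same distribution, as the notes above require.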
- Data preparation has 3 main steps:
- Cleaning data to remove missing data, noise, ...
- Data transformation and normalization: normally we standardize so that the mean is 0 and the standard deviation is 1. Attention! If you later add new data, the normalization and standardization must be the same as for the previous data!
- Data reduction: remove duplicates, remove data you don't need for your analysis, correlation analysis (removes data that are so similar that removing them doesn't change the result we want), forward-backward attribute selection (train the ML model with and without the data and check whether prediction quality is affected; if it has no effect, the data can be removed), forward attribute selection (start with one attribute and add more until learning doesn't get better)
- To make a more informed decision on what to remove, use Principal Component Analysis (PCA). Remember to standardize the data first to avoid vastly different variances between the dimensions; variance should be 1 for all dimensions. The cutoff is normally set so that only the principal components that together explain 99% of the variance are used.
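The warning about using the same standardization for new data can be illustrated with a small sketch. It assumes a single feature column stored as a list of floats; the helper names and the toy numbers are made up for the example.

```python
import statistics


def fit_standardizer(column):
    """Compute mean and standard deviation on the training data only."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return mean, (std if std > 0 else 1.0)  # avoid division by zero


def standardize(column, mean, std):
    """Apply a previously fitted mean/std to any data, old or new."""
    return [(x - mean) / std for x in column]


# Toy feature values; mean is 5.0 and standard deviation is 2.0.
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = fit_standardizer(train)
train_z = standardize(train, mean, std)

# New data arriving later reuses the stored mean and std from training.
new_z = standardize([5.0, 11.0], mean, std)
print(new_z)  # [0.0, 3.0]
```

The key design choice is that `fit_standardizer` is called once on the training data and its result is stored with the model. Recomputing mean and standard deviation on the new data would put it on a different scale than the data the model was trained on.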