Benchmark For Audio Feature Extraction Libraries #22
liufeigit added the "discussion" (Open-ended discussion for developers and users) and "functionality" (New function) labels and removed the "good first issue" (Good for newcomers) label on Apr 23, 2023.
liufeigit changed the title from "Benchmark For Audio Feature Extraction Library" to "Benchmark For Audio Feature Extraction Libraries" on Apr 24, 2023.
Why are Essentia results not present in the comparison?
@LorenzoMonni I have tested it before. Is this what you need?
Benchmark
Introduction
In deep learning for audio, the mel spectrogram is the most commonly used feature. This benchmark compares the performance of mel spectrogram extraction across audio feature extraction libraries; the results below cover audioFlux, torchaudio, and librosa.
Many factors affect the results of a performance evaluation, including CPU architecture, operating system, compilation system, choice of basic linear algebra library, and usage of project APIs.
For mel features, the most common features in the audio field, the major performance bottlenecks are FFT computation, matrix computation, and multi-threaded parallel processing. Minor bottlenecks include the implementation of the algorithm's business logic and the Python packaging.
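To make the two main computational stages concrete, here is a minimal NumPy sketch of a mel spectrogram. It is not how audioFlux, torchaudio, or librosa implement the feature. The helper names, the Hann window, the HTK-style mel formula, and `n_mels=64` are all assumptions chosen for illustration. Only `fft_len`, `slide_len`, and `sampling_rate` come from the benchmark settings.

```python
import numpy as np


def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; libraries differ in the variant used).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_mels, fft_len, sampling_rate):
    """Triangular mel filters with shape (n_mels, fft_len // 2 + 1)."""
    n_bins = fft_len // 2 + 1
    fft_freqs = np.linspace(0.0, sampling_rate / 2.0, n_bins)
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sampling_rate / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    bank = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mel_spectrogram(signal, fft_len=2048, slide_len=512, sampling_rate=32000, n_mels=64):
    """Return a power mel spectrogram with shape (n_mels, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - fft_len) // slide_len
    # Framing: build an index matrix so every row is one windowed frame.
    idx = np.arange(fft_len)[None, :] + slide_len * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(fft_len)
    # Stage 1 (FFT bottleneck): one real FFT per frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Stage 2 (matrix bottleneck): project FFT bins onto the mel filters.
    return mel_filterbank(n_mels, fft_len, sampling_rate) @ power.T


# Usage: a signal long enough for exactly 10 frames.
rng = np.random.default_rng(0)
audio = rng.standard_normal(2048 + 9 * 512)
mel = mel_spectrogram(audio)
print(mel.shape)
```

In this sketch the per-frame FFT and the filterbank matrix product are each a single vectorized call, which is why the choice of FFT backend and linear algebra library can dominate the measured time.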
Scripts
If you want to compare and test multiple libraries, you can use:
run_xxx.py calls, number
If you want to test a single library, you can use:
If you want to see more usage instructions, you can execute
python run_xxx.py --help
Notice
Audio feature extraction libraries each have their own functional characteristics and provide different types of features. This evaluation does not attempt a detailed performance comparison of every feature they extract. However, the mel spectrum is one of the most important and fundamental features, and all of these libraries support it.
Because CPU architecture, operating system, compilation system, choice of basic linear algebra library, and usage of project APIs all affect the results, this evaluation is based on the following conditions, chosen to be as fair as possible and to reflect actual business needs:
Warn
Performance
Base benchmark
The base benchmark runs the audioFlux, torchaudio, and librosa scripts on AMD, Intel, and M1 CPUs under Linux and macOS.
It measures the time required to calculate the mel spectrogram for 1000 sample data at TimeStep values of 1, 5, 10, 100, 500, 1000, 2000, and 3000, with fft_len=2048, slide_len=512, and sampling_rate=32000.
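As a rough illustration of how these parameters relate, the sketch below converts a TimeStep count into an input length and times an arbitrary extractor. It is not the project's `run_xxx.py` code. It assumes that one TimeStep is one analysis frame with no padding, and `samples_for_time_steps` and `time_extractor` are hypothetical helper names.

```python
import time

FFT_LEN = 2048
SLIDE_LEN = 512
SAMPLING_RATE = 32000
TIME_STEPS = [1, 5, 10, 100, 500, 1000, 2000, 3000]


def samples_for_time_steps(time_steps, fft_len=FFT_LEN, slide_len=SLIDE_LEN):
    # Assumption: one TimeStep is one frame and the signal is not padded,
    # so the first frame needs fft_len samples and each later frame adds slide_len.
    return fft_len + (time_steps - 1) * slide_len


def time_extractor(extract, make_input, time_steps_list=TIME_STEPS, n_samples=1000):
    """Return {time_steps: total seconds spent in extract over n_samples inputs}."""
    results = {}
    for t in time_steps_list:
        length = samples_for_time_steps(t)
        elapsed = 0.0
        for _ in range(n_samples):
            x = make_input(length)  # input creation is excluded from the timing
            start = time.perf_counter()
            extract(x)
            elapsed += time.perf_counter() - start
        results[t] = elapsed
    return results


# Usage with a stand-in extractor; swap in a real mel spectrogram call to benchmark it.
timings = time_extractor(
    extract=sum,
    make_input=lambda n: [0.0] * n,
    time_steps_list=[1, 5],
    n_samples=3,
)
print(timings)
```

Under the same assumption, the largest setting of 3000 frames corresponds to 2048 + 2999 * 512 = 1,537,536 samples, or about 48 seconds of audio at 32000 Hz.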
Linux - AMD
Linux - Intel
macOS - Intel
macOS - M1
Summary
Across all platforms, librosa is the slowest of the three libraries, as expected.
On Linux with AMD processors, audioFlux is slightly faster than torchaudio; on Linux with Intel processors, it is slightly slower.
On macOS, audioFlux is faster than torchaudio for large sample data, and the gap is more pronounced on Intel than on M1. For small sample data, torchaudio is faster than audioFlux.