Proofs of concept (PoCs) for speech recognition and speaker diarization.
- rtsr_en.py: PoC using the AssemblyAI WebSocket API (English only)
- rtsr_de.py: PoC using OpenAI Whisper (German; probably works for other languages too)
Additionally, a handful of prototypes were created using various technologies:
- librosa
- NVIDIA NeMo
- TensorFlow + Keras model
- Mel spectrogram CNN
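
As background for the mel spectrogram prototype, here is a minimal sketch of the mel scale that such spectrograms are built on. It is not taken from the prototypes: the function names (`hz_to_mel`, `mel_to_hz`, `mel_band_centers`) and the parameter values are illustrative assumptions, and it uses the HTK-style formula, which may differ from the variant a given library applies by default.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, f_min: float, f_max: float) -> list[float]:
    """Return the center frequencies (Hz) of n_mels bands spaced evenly in mels.

    The n_mels centers are the interior points of n_mels + 2 evenly spaced
    mel values between f_min and f_max.
    """
    mel_min = hz_to_mel(f_min)
    mel_max = hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_mels + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    # Hypothetical settings: 8 bands covering 0 Hz to 8 kHz.
    centers = mel_band_centers(n_mels=8, f_min=0.0, f_max=8000.0)
    print([round(c) for c in centers])
```

The band centers sit closer together at low frequencies and further apart at high ones. That compression of the frequency axis is what a mel spectrogram adds over a linear one before it is fed to a CNN.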