Integrated Gradients for Frame-Level Classification Tasks #1282

david-gimeno · 2024-05-16T10:22:36Z

🚀 Feature

Integrated Gradients not only for single classification task, but also for those that are based on frame-level classification, such as speech recognition, active speaker detection, etc.

Additional context

In the case of automatic lipreading, we have a tensor of shape (T, H, W, C), where T, H, W, and C refer to the time (no. of frames), height of the image, width of the image and channels of the image, respectively. I would like to apply Integrated Gradients so the targets of the model can be a sequence, i.e., an integer for each frame of the sequence. Each integer would represent the corresponding text token composing the final transcription.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integrated Gradients for Frame-Level Classification Tasks #1282

Integrated Gradients for Frame-Level Classification Tasks #1282

david-gimeno commented May 16, 2024 •

edited

Loading

Integrated Gradients for Frame-Level Classification Tasks #1282

Integrated Gradients for Frame-Level Classification Tasks #1282

Comments

david-gimeno commented May 16, 2024 • edited Loading

🚀 Feature

Additional context

david-gimeno commented May 16, 2024 •

edited

Loading