
Chord decoder confidence #431

Open
hpx7 opened this issue May 13, 2019 · 7 comments

hpx7 commented May 13, 2019

The CRFChordRecognitionProcessor uses the Viterbi algorithm to compute the most likely chord sequence at the frame level. I'm wondering if there's a way, during decoding, to obtain per-frame confidence levels for the possible chords.

In particular, is there an easy way to obtain marginal probabilities per chord label from the CNNChordFeatureProcessor feature vectors? From a bit of research it looks like the forward-backward algorithm may be useful for this, but I'm not exactly sure how we would apply it here.

hpx7 commented May 25, 2019

So I guess we can achieve this in a frame-local way by using the learned emission probabilities in the CRF and applying a softmax function:

from madmom.models import CHORDS_CFCRF
from madmom.ml.crf import ConditionalRandomField
from madmom.features import CNNChordFeatureProcessor
import numpy as np

def softmax(x):
    # shift by the maximum for numerical stability before exponentiating
    e = np.exp(x - np.max(x))
    return e / e.sum()

# load the pre-trained CRF and extract chord features from the audio file
crf = ConditionalRandomField.load(CHORDS_CFCRF[0])
feats = CNNChordFeatureProcessor()('audio-file.wav')

# per-frame chord "probabilities" from the observation potentials alone
probabilities = [softmax(np.dot(feat, crf.W)) for feat in feats]

However, if we wanted to take the transition probabilities etc. into account, we would probably need to apply the forward-backward algorithm. Is this something that could be added to the ConditionalRandomField class?

superbock (Collaborator) commented

Ping @fdlm since he is the chord/CRF guy.

fdlm commented May 27, 2019

Yes, this is correct. If you want to consider transition probabilities, you need the forward-backward algorithm, and yes, it could be added to the ConditionalRandomField class, if anyone is willing to implement it 😃

hpx7 commented May 28, 2019

I took a stab at a partial implementation (master...hpx7:patch-1), but I'm stuck on a few things I haven't been able to figure out yet:

  1. I'm not sure how to properly apply the bias and final potentials.
  2. I couldn't get it to return reasonable results until I realized that the transition probabilities in the CRF (self.A) were not normalized. After applying a softmax it started behaving more normally. Am I doing something wrong by normalizing here? Maybe this is a difference between HMMs and CRFs that I'm missing?
  3. The algorithm appears to suffer from underflow problems for long sequences. Is there something you did in the Viterbi implementation to avoid this? (I noticed the use of addition instead of multiplication, which was confusing to me; is that related somehow?)

Sorry for all the questions; I'm posting them as I work through all of this, since it's a new area for me.

fdlm commented Jun 15, 2019

  1. Add the bias at every step, and add the final potential at the last step.
  2. self.A is not a transition probability matrix, but a transition potential matrix. Potentials are not normalized in any way. The model behaving "more normally" might be a coincidence, but there is probably another bug in the code.
  3. Yes, it does. When computing the Viterbi path, you can operate in log-space (hence + and not *). For the forward/backward algorithm, you usually normalize at every step and keep track of the normalization constant to get the correct final answer in the end.

You can check out my CRF code for Theano here: https://github.com/fdlm/Spaghetti/blob/master/spaghetti/layers.py (_get_forward_output_for)
Maybe it helps you.
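
For reference, here is a minimal NumPy sketch of such a forward-backward pass over the madmom CRF parameters, computing per-frame chord marginals. It works in log-space with logsumexp rather than rescaling at every step, and it assumes that pi (initial), tau (final), c (bias), A (transition) and W (observation weights) are all log-potentials, with A[i, j] the potential for moving from state i to state j. The chord_marginals helper is hypothetical and untested, so treat it as an outline rather than a verified implementation:

import numpy as np
from scipy.special import logsumexp

from madmom.models import CHORDS_CFCRF
from madmom.ml.crf import ConditionalRandomField
from madmom.features import CNNChordFeatureProcessor

def chord_marginals(crf, feats):
    # observation log-potentials with the bias added at every step, shape (T, S)
    obs = np.dot(feats, crf.W) + crf.c
    num_frames, num_states = obs.shape
    log_alpha = np.empty((num_frames, num_states))
    log_beta = np.empty((num_frames, num_states))
    # forward pass: sum over previous states in log-space
    log_alpha[0] = crf.pi + obs[0]
    for t in range(1, num_frames):
        log_alpha[t] = obs[t] + logsumexp(
            log_alpha[t - 1][:, np.newaxis] + crf.A, axis=0)
    # backward pass: the final potential enters at the last frame
    log_beta[-1] = crf.tau
    for t in range(num_frames - 2, -1, -1):
        log_beta[t] = logsumexp(crf.A + obs[t + 1] + log_beta[t + 1], axis=1)
    # combine forward and backward messages and normalise per frame
    log_gamma = log_alpha + log_beta
    log_gamma -= logsumexp(log_gamma, axis=1, keepdims=True)
    return np.exp(log_gamma)

crf = ConditionalRandomField.load(CHORDS_CFCRF[0])
feats = CNNChordFeatureProcessor()('audio-file.wav')
marginals = chord_marginals(crf, feats)  # shape: (num_frames, num_states)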

hpx7 commented Aug 6, 2019

Thanks for the pointers @fdlm. I got some time to look into this more.

The only way I could figure out how to normalize at each step was to apply the softmax function (since we have to account for negative potentials). However, I'm not sure how to calculate a normalization constant with softmax and apply it to correct the answer at the end.

I've gotten rid of softmaxing self.A and am incorporating the bias and final potentials. The results look mostly reasonable now (comparable to the Viterbi output), but I'm not sure of the best way to verify whether the implementation is correct.

You can see my changes here: master...hpx7:patch-1. Is there anything you'd like to see before I open a PR?
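
As a rough frame-level sanity check (not a substitute for proper verification), one idea would be to compare the argmax of the per-frame marginals against madmom's Viterbi decoding. This assumes a helper like the chord_marginals sketch above and that ConditionalRandomField.process returns the Viterbi state path:

# the marginal argmax and the Viterbi path optimise different criteria, so
# perfect agreement isn't expected, but they should coincide on most frames
viterbi_path = crf.process(feats)        # assumed: array of state indices
marginals = chord_marginals(crf, feats)  # hypothetical helper sketched above
agreement = np.mean(marginals.argmax(axis=1) == viterbi_path)
print('agreement with Viterbi path: {:.1%}'.format(agreement))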

fdlm commented Sep 17, 2019

Hey, thanks for your work. The best way to verify that your implementation is correct is to test it against an existing one. Since the chord recognition model you're using was actually trained with the code here: https://github.com/fdlm/Spaghetti/blob/master/spaghetti/layers.py and only later translated to a pure Python/NumPy implementation, that would be the way to go. I understand that it's probably a lot of work, especially getting Theano running again :), but to be honest, I'm not willing to merge code that I'm not 100% sure is correct. I still believe there's something wrong with the softmaxing, but I don't have time to debug it.
