Hi,
I am trying to understand the probability mechanism for a token.
In the Python whisper project, the probability is an average log-probability over the whole segment, and it should be greater than -1.0 to be considered good enough. In the whisper.cpp project, the probability is reported per token (usually one word) and ranges between 0.0 and 1.0, with higher values being better. The whisper.cpp values look much more consistent with the audio.
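To make the comparison concrete, here is a minimal sketch of how I currently read the two scales. This is my assumption, not code from either project: it takes Python whisper's avg_logprob to be the mean of the per-token log-probabilities over a segment, and whisper.cpp's value to be the raw per-token probability; the numbers are made up.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    // Hypothetical per-token probabilities on the whisper.cpp scale (0..1).
    const std::vector<double> token_probs = {0.92, 0.85, 0.60, 0.97};

    // Assumed Python-whisper aggregation: mean of log(p) over the segment.
    double sum_logprob = 0.0;
    for (const double p : token_probs) {
        sum_logprob += std::log(p);
    }
    const double avg_logprob = sum_logprob / token_probs.size();

    std::printf("avg_logprob         = %.3f\n", avg_logprob);           // ~ -0.197
    std::printf("geometric mean prob = %.3f\n", std::exp(avg_logprob)); // ~ 0.821

    // A threshold of -1.0 on avg_logprob corresponds to an average per-token
    // probability of exp(-1) ~ 0.37 on the 0..1 scale.
    return 0;
}
```

If that reading is right, both projects start from the same per-token probabilities and differ only in how they aggregate and scale them, but I would like to confirm it.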
How are these probabilities calculated, and why do the two projects differ?
Thanks
Ofer
-
Replies: 1 comment
Currently, the token probability is simply the softmax of the logits produced by the decoder (see whisper.cpp, lines 1825 to 1844 at commit 1d716d6).
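As a minimal sketch of that idea (an illustration, not the actual code at the cited lines), a numerically stable softmax over the decoder logits looks like this:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Map decoder logits to per-token probabilities. The value reported for a
// sampled token would then be probs[token_id], which always lies in [0, 1].
std::vector<float> softmax(const std::vector<float> & logits) {
    // Shift by the maximum logit so std::exp never overflows.
    const float max_logit = *std::max_element(logits.begin(), logits.end());

    std::vector<float> probs(logits.size());
    double sum = 0.0;
    for (size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float & p : probs) {
        p = static_cast<float>(p / sum);
    }
    return probs;
}
```

Because each value is normalized independently at every decoding step, it is directly interpretable per token, unlike the segment-level log-probability average used by the Python project.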