Using CR-CTC to train, decoder results are all start and end character. #1780

Answered by yaozengwei
masterjade7 asked this question in Q&A
If you use reduction="mean" in torch.nn.functional.ctc_loss, you can divide the final cr_loss by the total number of frames over the batch; you could use encoder_out_lens.sum().item() to get that count.
For the CTC loss, I would suggest using reduction="sum" in torch.nn.functional.ctc_loss and dividing the result by the total number of frames over the batch. This keeps the relative scale of the two losses consistent with our settings.

Like this (see the commented part):

    def forward_cr_ctc(
        self,
        encoder_out: torch.Tensor,
        encoder_out_lens: torch.Tensor,
        targets: torch.Tensor,
        target_lengths: torch.Tensor,
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        """Compute CTC loss with consis…
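
The normalization suggested above can be sketched as a standalone helper. This is a hypothetical illustration, not the actual icefall implementation: the function name and argument layout are made up, and only the frame-count normalization of the CTC term is shown.

```python
import torch
import torch.nn.functional as F


def normalized_ctc_loss(
    log_probs: torch.Tensor,        # (T, N, C), log-softmax over classes
    encoder_out_lens: torch.Tensor, # (N,), valid frames per utterance
    targets: torch.Tensor,          # (N, S) label indices
    target_lengths: torch.Tensor,   # (N,) valid labels per utterance
) -> torch.Tensor:
    # Sum the per-utterance CTC losses instead of averaging them ...
    ctc = F.ctc_loss(
        log_probs=log_probs,
        targets=targets,
        input_lengths=encoder_out_lens,
        target_lengths=target_lengths,
        reduction="sum",
    )
    # ... then divide by the total number of frames over the batch,
    # matching the scale used for cr_loss.
    return ctc / encoder_out_lens.sum().item()
```

The same divisor (encoder_out_lens.sum().item()) would be applied to cr_loss when it is computed with reduction="mean" over frames, so both terms are on a per-frame scale.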

Replies: 1 comment, 14 replies (between @masterjade7 and @yaozengwei)

Answer selected by yfyeung
Category: Q&A