Bug during finetuning training #104

asusdisciple · 2023-11-28T15:11:44Z

I am currently training on 4x V100 GPUs with 32 GB each, using a dataset in the style of LJSpeech and the
finetuning training script. With a batch size of 8, I get OOM errors at some point. When I reduce the batch size to 4,
the error below appears right at the beginning. Do you have an idea why changing the batch size to 4 could lead to this error?

Traceback (most recent call last):
  File "/raid/me/projects/StyleTTS2/train_finetune.py", line 714, in <module>
    main()
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/raid/me/projects/StyleTTS2/train_finetune.py", line 396, in main
    y_rec_gt_pred = model.decoder(en, F0_real, N_real, s)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 185, in forward
    outputs = self.parallel_apply(replicas, inputs, module_kwargs)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 200, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 110, in parallel_apply
    output.reraise()
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/_utils.py", line 694, in reraise
    raise exception
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in _worker
    output = module(*input, **kwargs)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/raid/me/projects/StyleTTS2/Modules/hifigan.py", line 458, in forward
    F0 = self.F0_conv(F0_curve.unsqueeze(1))
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1568, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 310, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/raid/me/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 306, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [1, 1, 3], expected input[1, 284, 1] to have 1 channels, but got 284 channels instead

Changing the batch size to 3 results in this error in the last line:

RuntimeError: Given groups=1, weight of size [1, 1, 3], expected input[1, 221, 1] to have 1 channels, but got 221 channels instead

Apparently there is an issue with the expected input?
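
The message can be reproduced in isolation. Below is a minimal sketch, assuming the F0 curve reached the decoder without its batch dimension. `f0_conv` is a stand-in layer with the same weight shape as in the traceback, not the actual `F0_conv` from `Modules/hifigan.py`, and the tensors are random placeholders.

```python
import torch
import torch.nn as nn

# Stand-in layer: its weight has shape [1, 1, 3], as in the traceback.
f0_conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, padding=1)

# Expected case: F0 curve of shape [batch, frames].
f0_batched = torch.randn(2, 284)
out = f0_conv(f0_batched.unsqueeze(1))  # input is [2, 1, 284]
ok_shape = tuple(out.shape)

# Failure case: the batch dimension is already gone, shape [frames].
f0_squeezed = torch.randn(284)
try:
    # [284, 1] is interpreted as an unbatched (channels=284, length=1) input.
    f0_conv(f0_squeezed.unsqueeze(1))
    error_message = ""
except RuntimeError as exc:
    error_message = str(exc)

print(ok_shape)
print(error_message)
```

The 284 (or 221) in the message would then be the number of F0 frames in the single sample, which explains why the number changes from run to run.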


devidw · 2023-11-28T17:04:08Z

I would double-check that all your dataset examples are at least one second long.
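
One way to run that check is sketched below, using only the standard library and assuming the dataset consists of uncompressed PCM `.wav` files in a single directory. The function names and the directory passed in are hypothetical.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def find_short_clips(wav_dir, min_seconds=1.0):
    """List (path, duration) pairs for WAV files shorter than min_seconds."""
    short = []
    for path in sorted(Path(wav_dir).glob("*.wav")):
        duration = wav_duration_seconds(path)
        if duration < min_seconds:
            short.append((path, duration))
    return short
```

Calling `find_short_clips("path/to/wavs")` returns an empty list when every clip meets the minimum length.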

stevenhillis · 2023-11-28T22:32:28Z

Your effective batch size of 4 gets spread across 4 GPUs, which gives a local batch size of 1. I'd bet there's an indiscriminate squeeze() somewhere that's eroding your batch dimension.
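
The arithmetic behind this can be sketched in plain Python. It assumes the data-parallel wrapper splits the batch along dimension 0 into roughly equal chunks, the way `torch.chunk` does. `per_gpu_batch_sizes` is a hypothetical helper, not part of the training script.

```python
import math


def per_gpu_batch_sizes(batch_size, num_gpus):
    """Mimic a chunk-style split of the batch dimension across GPUs."""
    chunk = math.ceil(batch_size / num_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        sizes.append(min(chunk, remaining))
        remaining -= sizes[-1]
    return sizes


for batch_size in (8, 4, 3):
    print(batch_size, per_gpu_batch_sizes(batch_size, num_gpus=4))
```

Under that assumption, a batch size of 8 gives two samples per GPU, while batch sizes of 4 and 3 both leave every active GPU with a single sample.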

yl4579 · 2023-11-28T23:15:57Z

@stevenhillis Yeah, the code is honestly very funky, with a bunch of hardcoding and brute-force solutions to dimensions, so it only supports a batch size of at least 2 per GPU.

@asusdisciple I’d recommend lowering max_len so that a batch size of 8 fits on 4 GPUs and each GPU has at least 2 samples; otherwise it won’t work.
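
To get a feel for what a given max_len means in audio time, here is a rough conversion. It assumes max_len counts mel frames and that each frame spans 300 samples (the hop implied by the `* 300` slicing in the slmadv.py code posted in this thread). The 24 kHz sample rate is an assumption, so adjust both constants to your own config.

```python
HOP_SAMPLES = 300      # samples per mel frame, from the `* 300` slicing in the posted code
SAMPLE_RATE = 24_000   # assumption: audio sampled at 24 kHz


def clip_seconds(max_len_frames, hop=HOP_SAMPLES, sample_rate=SAMPLE_RATE):
    """Length in seconds of a training clip that is max_len_frames mel frames long."""
    return max_len_frames * hop / sample_rate


for max_len in (400, 200, 100):
    print(max_len, clip_seconds(max_len))
```

Under these assumptions, halving max_len halves the audio each sample contributes to a step, which is the lever for fitting two samples per GPU.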

asusdisciple · 2023-11-29T08:21:47Z

I already decreased the hop size and max_len. I definitely have no samples below 3 seconds, but some are about 30 seconds long, so that is probably the cause. By the way, why is the memory consumption so high? With a batch size of 8, this means 2 files per GPU. I know WavLM has a memory complexity of n^4, but I am still wondering about this.

yl4579 · 2023-11-29T08:34:22Z

@asusdisciple If the problem happens after joint_epoch, you should also lower the SLM adversarial training min_len and max_len under slmadv_params.
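
How these two values bound the clip length can be read off the slmadv.py forward() posted later in this thread. The sketch below copies its two clamping lines into a standalone function. The function name and the numbers passed in are illustrative, not recommended settings.

```python
def slm_clip_len(output_lengths, min_len, max_len):
    """Clip length, in units of two mel frames, as computed in the posted forward()."""
    mel_len = max(int(min(output_lengths) / 2 - 1), min_len // 2)
    return min(mel_len, max_len // 2)


print(slm_clip_len([900, 1200], min_len=400, max_len=500))  # capped by max_len
print(slm_clip_len([900, 1200], min_len=400, max_len=300))  # lower cap, shorter clip
print(slm_clip_len([300, 1200], min_len=400, max_len=500))  # raised to the min_len floor
```

Lowering max_len shortens the generated clips whenever the predicted lengths exceed it, and min_len sets the floor that short predictions are raised to.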

dabsdamoon-h · 2023-11-30T06:08:28Z

@stevenhillis @yl4579 FYI, you may have figured this out already, and I am not sure this covers all of them, but the squeeze() calls occur in slmadv.py, so you might want to do something like the following:

def forward(
        self,
        iters,
        y_rec_gt,
        y_rec_gt_pred,
        waves,
        mel_input_length,
        ref_text,
        ref_lengths,
        use_ind,
        s_trg,
        ref_s=None
    ):

    text_mask = length_to_mask(ref_lengths).to(ref_text.device)
    bert_dur = self.model.bert(ref_text, attention_mask=(~text_mask).int())
    if self.multilingual:
        bert_dur = bert_dur.last_hidden_state
    d_en = self.model.bert_encoder(bert_dur).transpose(-1, -2) 
    
    if use_ind and np.random.rand() < 0.5:
        s_preds = s_trg
    else:
        num_steps = np.random.randint(3, 5)
        if ref_s is not None:
            s_preds = self.sampler(
                noise = torch.randn_like(s_trg).unsqueeze(1).to(ref_text.device), 
                embedding=bert_dur,
                embedding_scale=1,
                features=ref_s, # reference from the same speaker as the embedding
                embedding_mask_proba=0.1,
                num_steps=num_steps
            ).squeeze(1)
        else:
            s_preds = self.sampler(
                noise = torch.randn_like(s_trg).unsqueeze(1).to(ref_text.device), 
                embedding=bert_dur,
                embedding_scale=1,
                embedding_mask_proba=0.1,
                num_steps=num_steps
            ).squeeze(1)
        
    s_dur = s_preds[:, 128:]
    s = s_preds[:, :128]
    
    d, _ = self.model.predictor(
        d_en,
        s_dur, 
        ref_lengths, 
        torch.randn(ref_lengths.shape[0], ref_lengths.max(), 2).to(ref_text.device), 
        text_mask
    )
    
    bib = 0

    output_lengths = []
    attn_preds = []
    
    # differentiable duration modeling
    for _s2s_pred, _text_length in zip(d, ref_lengths):

        _s2s_pred_org = _s2s_pred[:_text_length, :]

        _s2s_pred = torch.sigmoid(_s2s_pred_org)
        _dur_pred = _s2s_pred.sum(axis=-1)

        l = int(torch.round(_s2s_pred.sum()).item())
        t = torch.arange(0, l).unsqueeze(0).expand((len(_s2s_pred), l)).to(ref_text.device)
        loc = torch.cumsum(_dur_pred, dim=0) - _dur_pred / 2

        h = torch.exp(-0.5 * torch.square(t - (l - loc.unsqueeze(-1))) / (self.sig)**2)

        out = torch.nn.functional.conv1d(
            _s2s_pred_org.unsqueeze(0), 
            h.unsqueeze(1), 
            padding=h.shape[-1] - 1, groups=int(_text_length)
        )[..., :l]
        attn_preds.append(F.softmax(out.squeeze(0), dim=0))  # drop only the leading dim added by unsqueeze(0)

        output_lengths.append(l)

    max_len = max(output_lengths)
    
    with torch.no_grad():
        t_en = self.model.text_encoder(ref_text, ref_lengths, text_mask)
        
    s2s_attn = torch.zeros(len(ref_lengths), int(ref_lengths.max()), max_len).to(ref_text.device)
    for bib in range(len(output_lengths)):
        s2s_attn[bib, :ref_lengths[bib], :output_lengths[bib]] = attn_preds[bib]

    asr_pred = t_en @ s2s_attn

    _, p_pred = self.model.predictor(
        d_en,
        s_dur, 
        ref_lengths, 
        s2s_attn, 
        text_mask
    )
    
    mel_len = max(int(min(output_lengths) / 2 - 1), self.min_len // 2)
    mel_len = min(mel_len, self.max_len // 2)
    
    # get clips
    
    en = []
    p_en = []
    sp = []
    
    F0_fakes = []
    N_fakes = []
    
    wav = []

    for bib in range(len(output_lengths)):
        mel_length_pred = output_lengths[bib]
        mel_length_gt = int(mel_input_length[bib].item() / 2)
        if mel_length_gt <= mel_len or mel_length_pred <= mel_len:
            continue

        sp.append(s_preds[bib])

        random_start = np.random.randint(0, mel_length_pred - mel_len)
        en.append(asr_pred[bib, :, random_start:random_start+mel_len])
        p_en.append(p_pred[bib, :, random_start:random_start+mel_len])

        # get ground truth clips
        random_start = np.random.randint(0, mel_length_gt - mel_len)
        y = waves[bib][(random_start * 2) * 300:((random_start+mel_len) * 2) * 300]
        wav.append(torch.from_numpy(y).to(ref_text.device))
        
        if len(wav) >= self.batch_percentage * len(waves): # prevent OOM due to longer lengths
            break

    if len(sp) < 1:
        print("No clips found")
        return None
        
    sp = torch.stack(sp)
    wav = torch.stack(wav).float()
    en = torch.stack(en)
    p_en = torch.stack(p_en)
    
    F0_fake, N_fake = self.model.predictor.F0Ntrain(p_en, sp[:, 128:])
    y_pred = self.model.decoder(en, F0_fake, N_fake, sp[:, :128])
    
    # discriminator loss
    if (iters + 1) % self.skip_update == 0:
        if np.random.randint(0, 2) == 0:
            wav = y_rec_gt_pred
            use_rec = True
        else:
            use_rec = False

        crop_size = min(wav.size(-1), y_pred.size(-1))
        if use_rec: # use reconstructed (shorter lengths), do length invariant regularization
            if wav.size(-1) > y_pred.size(-1):
                real_GP = wav[:, : , :crop_size]
                out_crop = self.wl.discriminator_forward(real_GP.detach().squeeze(1))
                out_org = self.wl.discriminator_forward(wav.detach().squeeze(1))
                loss_reg = F.l1_loss(out_crop, out_org[..., :out_crop.size(-1)])

                if np.random.randint(0, 2) == 0:
                    d_loss = self.wl.discriminator(real_GP.detach().squeeze(1), y_pred.detach().squeeze(1)).mean()
                else:
                    d_loss = self.wl.discriminator(wav.detach().squeeze(1), y_pred.detach().squeeze(1)).mean()
            else:
                real_GP = y_pred[:, : , :crop_size]
                out_crop = self.wl.discriminator_forward(real_GP.detach().squeeze(1))
                out_org = self.wl.discriminator_forward(y_pred.detach().squeeze(1))
                loss_reg = F.l1_loss(out_crop, out_org[..., :out_crop.size(-1)])

                if np.random.randint(0, 2) == 0:
                    d_loss = self.wl.discriminator(wav.detach().squeeze(1), real_GP.detach().squeeze(1)).mean()
                else:
                    d_loss = self.wl.discriminator(wav.detach().squeeze(1), y_pred.detach().squeeze(1)).mean()
            
            # regularization (ignore length variation)
            d_loss += loss_reg

            out_gt = self.wl.discriminator_forward(y_rec_gt.detach().squeeze(1))
            out_rec = self.wl.discriminator_forward(y_rec_gt_pred.detach().squeeze(1))

            # regularization (ignore reconstruction artifacts)
            d_loss += F.l1_loss(out_gt, out_rec)

        else:
            d_loss = self.wl.discriminator(wav.detach().squeeze(1), y_pred.detach().squeeze(1)).mean()
    else:
        d_loss = 0
        
    # generator loss
    gen_loss = self.wl.generator(y_pred.squeeze(1))
    
    gen_loss = gen_loss.mean()
    
    return d_loss, gen_loss, y_pred.detach().cpu().numpy()
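
The difference between the bare squeeze() and the squeeze(1) calls used above can be seen on a dummy tensor. This is a minimal sketch with a placeholder waveform of shape [batch, channel, samples], where the sample count is arbitrary.

```python
import torch

wav = torch.randn(1, 1, 24000)  # [batch=1, channel=1, samples]

# A bare squeeze() removes every size-1 dimension, including the batch.
print(tuple(wav.squeeze().shape))   # (24000,)

# squeeze(1) removes only the channel dimension and keeps the batch.
print(tuple(wav.squeeze(1).shape))  # (1, 24000)

# With two or more samples per GPU, both calls agree, which hides the problem.
batch = torch.randn(2, 1, 24000)
assert batch.squeeze().shape == batch.squeeze(1).shape
```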

yl4579 · 2023-11-30T23:31:25Z

@dabsdamoon-h It’d be great if you could make a PR that fixes the squeeze issue so it allows a batch size of 1 and gradient accumulation.
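
For reference, gradient accumulation with micro-batches of size 1 generally follows the pattern below. This is a generic PyTorch sketch rather than code from the training script: the linear model, the random data, and `accum_steps` are all placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)                   # hypothetical stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
accum_steps = 4                           # micro-batches per optimizer step

inputs = torch.randn(8, 1, 4)             # 8 micro-batches, each with batch size 1
targets = torch.randn(8, 1, 1)

optimizer_steps = 0
optimizer.zero_grad()
for step, (x, y) in enumerate(zip(inputs, targets), start=1):
    loss = nn.functional.mse_loss(model(x), y)
    (loss / accum_steps).backward()       # scale so gradients average over the window
    if step % accum_steps == 0:
        optimizer.step()                  # one update per accum_steps micro-batches
        optimizer.zero_grad()
        optimizer_steps += 1

print(optimizer_steps)
```

This only helps once every squeeze in the forward path keeps the batch dimension, since each micro-batch here has exactly one sample.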

effusiveperiscope · 2024-01-04T04:44:50Z

For me, the error occurs with F0_curve when a batch has size 1. I solved it by specifying the squeeze dimension, classifier_out.squeeze(2), in JDCNet's forward() in Utils/JDC/model.py, and by removing F0_real = F0_real.unsqueeze(0) at line 403 of train_first.py. (I am not sure why the unsqueeze is there, considering it is not applied to F0_real the first two times the pitch extractor and decoder are used.)


yl4579 closed this as completed Nov 28, 2023

Kreevoz mentioned this issue Nov 30, 2023

Fine tune exception RuntimeError: Given groups=1, weight of size [1, 1, 3], expected input[1, 100, 1] to have 1 channels, but got 100 channels instead #115

Closed

Sobsz mentioned this issue Dec 19, 2023

Fix batch size 1 by specifying squeeze dims #166

Open
