
[Bug] VITS gpu utilization #3710

Open
maryawwm opened this issue Apr 28, 2024 · 1 comment
Labels: bug (Something isn't working), wontfix (This will not be worked on but feel free to help)

Comments

@maryawwm

Describe the bug

I'm training a VITS model (Persian and English). My dataset consists of audio clips between 1 and 25 seconds long. I'm training on an A100 GPU, but most of the time GPU memory usage is below half, and utilization is lower than I expect.

[Screenshot 2024-04-28 091235]
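For triage, it may help to log utilization over time instead of sampling it by eye. A minimal sketch (not part of the original report; it assumes nvidia-smi is on PATH):

import subprocess
import time

# Poll nvidia-smi once per second and print GPU utilization and memory use,
# so dips in utilization can be lined up against training-step timestamps.
def poll_gpu(interval_s: float = 1.0) -> None:
    while True:
        out = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=utilization.gpu,memory.used,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        print(out)
        time.sleep(interval_s)

if __name__ == "__main__":
    poll_gpu()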

To Reproduce

I modified my code based on this script from the Coqui TTS library:

https://github.com/coqui-ai/TTS/blob/dev/recipes/multilingual/vits_tts/train_vits_tts_phonemes.py

These are the parameters that I set:
import os

from TTS.tts.configs.vits_config import VitsConfig
from TTS.tts.models.vits import VitsArgs, VitsAudioConfig

audio_config = VitsAudioConfig(
    sample_rate=16000,
    win_length=1024,
    hop_length=256,
    num_mels=80,
    mel_fmin=0,
    mel_fmax=None,
)

vitsArgs = VitsArgs(
    use_language_embedding=True,
    embedded_language_dim=2,
    use_speaker_embedding=True,
    use_sdp=False,
)

# output_path and dataset_config are defined elsewhere in the full script.
config = VitsConfig(
    model_args=vitsArgs,
    audio=audio_config,
    run_name="A6_vits_multi_language_10_spk_5_ordibehesht",
    use_speaker_embedding=True,
    batch_size=48,
    eval_batch_size=32,
    batch_group_size=128,
    num_loader_workers=12,
    num_eval_loader_workers=8,
    precompute_num_workers=12,
    run_eval=True,
    test_delay_epochs=-1,
    epochs=1000,
    text_cleaner="multilingual_cleaners",
    use_phonemes=True,
    phoneme_language=None,
    phonemizer="multi_phonemizer",
    phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
    compute_input_seq_cache=True,
    print_step=25,
    use_language_weighted_sampler=True,
    print_eval=False,
    mixed_precision=True,
    output_path=output_path,
    datasets=dataset_config,
    cudnn_enable=True,
    cudnn_benchmark=True,
    cudnn_deterministic=True,
)
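One factor that commonly depresses utilization with 1–25 s clips is padding variance inside a batch: when short and long clips are batched together, much of each step is spent on padded frames. A hedged sketch of bounding clip length (assuming the min_audio_len/max_audio_len fields of Coqui's BaseTTSConfig, which VitsConfig inherits; values are in samples and chosen here only for illustration):

# Hypothetical length window, not from the original report.
# At 16 kHz, 1 s = 16000 samples.
config.min_audio_len = 1 * 16000   # drop clips shorter than ~1 s
config.max_audio_len = 15 * 16000  # drop clips longer than ~15 s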

Expected behavior

Higher GPU utilization and faster training time.

Logs

A log from one of my training steps:

   --> TIME: 2024-04-27 09:15:52 -- STEP: 124/3006 -- GLOBAL_STEP: 1750125
     | > loss_disc: 2.7141058444976807  (2.7415779617524914)
     | > loss_disc_real_0: 0.2915174067020416  (0.22191733380238854)
     | > loss_disc_real_1: 0.2596714198589325  (0.2545961029827594)
     | > loss_disc_real_2: 0.25090914964675903  (0.2519173812601836)
     | > loss_disc_real_3: 0.2509034276008606  (0.2488831561659612)
     | > loss_disc_real_4: 0.2618330121040344  (0.24871416005396074)
     | > loss_disc_real_5: 0.23049794137477875  (0.2413994044726414)
     | > loss_0: 2.7141058444976807  (2.7415779617524914)
     | > grad_norm_0: tensor(2.3359, device='cuda:0')  (tensor(4.0910, device='cuda:0'))
     | > loss_gen: 1.8159717321395874  (1.9762149626208896)
     | > loss_kl: 5.008370399475098  (42.11719334894611)
     | > loss_feat: 1.7703579664230347  (2.0269679972721693)
     | > loss_mel: 30.50223731994629  (41.7430907526324)
     | > loss_duration: 9.647953033447266  (2.5745641668477357)
     | > amp_scaler: 256.0  (509.9354838709682)
     | > loss_1: 48.74489212036133  (90.438032304087)
     | > grad_norm_1: tensor(73.7072, device='cuda:0')  (tensor(215.5241, device='cuda:0'))
     | > current_lr_0: 0.0002 
     | > current_lr_1: 0.0002 
     | > step_time: 5.8922  (3.467874986510123)
     | > loader_time: 0.006  (0.005929248948251048)
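For context on the numbers above: the running averages show loader_time ≈ 0.006 s against step_time ≈ 3.47 s, so the data loader accounts for well under 1% of each step; at 3006 steps per epoch, that works out to roughly 3006 × 3.47 ≈ 10,400 s, or about 2.9 hours per epoch.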

Environment

- TTS version : 0.17.8
- python : 3.9.18
- pytorch : 2.1.1
- os : Linux
- gpu : A100

Additional context

No response

@maryawwm maryawwm added the bug Something isn't working label Apr 28, 2024

stale bot commented Jun 5, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also check our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Jun 5, 2024