Is your feature request related to a problem? Please describe.
I'm running A2T augmented training with code from QData/TextAttack-A2T. As mentioned in QData/TextAttack-A2T Issue #1, training the PLM is extremely slow.
After some checking, I noticed that in textattack.Trainer.training_step() the input text is padded to the maximum length the model supports, rather than to the maximum length in the batch. This adds a lot of computational overhead that is probably unnecessary.
I wrote a subclass of Trainer that overrides the training_step method and simply changes the tokenizer's padding parameter to True, and training becomes much faster.
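To make the difference concrete, here is a minimal, self-contained comparison using the Hugging Face tokenizer API (bert-base-uncased and the example sentences are placeholders I chose for illustration, not anything taken from TextAttack's code): padding="max_length" pads every example to the model's maximum supported length, while padding=True pads only to the longest example in the batch.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = ["a short premise", "a slightly longer hypothesis sentence for the batch"]

# Pads every example to the model's maximum supported length (512 for BERT).
fixed = tokenizer(batch, padding="max_length", truncation=True, return_tensors="pt")
print(fixed["input_ids"].shape)    # torch.Size([2, 512])

# Pads only to the longest example in this particular batch.
dynamic = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
print(dynamic["input_ids"].shape)  # e.g. torch.Size([2, 10])
```

Since attention cost grows with sequence length, padding short SNLI pairs out to 512 tokens wastes most of the computation in every forward/backward pass.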
Describe the solution you'd like
Improve the training speed of transformers models when using textattack.Trainer.
A previous test showed that on an RTX 3090, one clean epoch of training on the SNLI task took about 3 hours.
Describe alternatives you've considered
Allow the tokenizer to pad the input text only to the maximum length in the batch, instead of the maximum length supported by the corresponding model. With this change, a clean epoch took about 20 minutes under the same conditions.
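For reference, this per-batch ("dynamic") padding is also what transformers itself ships as DataCollatorWithPadding: leave the examples unpadded at tokenization time and pad when the batch is assembled. A small sketch, again with a placeholder model name and sentences rather than TextAttack's actual pipeline:

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorWithPadding(tokenizer)  # pads each batch to its longest example

# Tokenize without padding; the collator pads when the batch is built.
texts = ["a short premise", "a noticeably longer hypothesis sentence"]
features = [tokenizer(t, truncation=True) for t in texts]

batch = collator(features)
print(batch["input_ids"].shape)  # padded only to the longest sequence in this batch
```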
Additional context
Is there a specific reason the input text needs to be padded to the maximum length supported by the model?
A similar consideration may also apply to HuggingFaceModelWrapper.