Is your feature request related to a problem? Please describe.
I'm running A2T augmented training with code from QData/TextAttack-A2T. As mentioned in QData/TextAttack-A2T Issue #1, training the PLM is extremely slow.
After some checking, I noticed that in textattack.Trainer.training_step() the input text is padded to the maximum length the model supports, rather than to the maximum length in the batch. This adds a lot of computational overhead that is probably unnecessary.
I wrote a subclass of Trainer that overrides the training_step method and simply changes the tokenizer's padding parameter to True, and training becomes much faster.
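To make the difference concrete, here is a minimal, self-contained comparison using the Hugging Face tokenizer API (bert-base-uncased and the example sentences are placeholders I chose for illustration, not anything taken from TextAttack's code): padding="max_length" pads every example to the model's maximum supported length, while padding=True pads only to the longest example in the batch.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = ["a short premise", "a slightly longer hypothesis sentence for the batch"]

# Pads every example to the model's maximum supported length (512 for BERT).
fixed = tokenizer(batch, padding="max_length", truncation=True, return_tensors="pt")
print(fixed["input_ids"].shape)    # torch.Size([2, 512])

# Pads only to the longest example in this particular batch.
dynamic = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
print(dynamic["input_ids"].shape)  # e.g. torch.Size([2, 10])
```

Since attention cost grows with sequence length, padding short SNLI pairs out to 512 tokens wastes most of the computation in every forward/backward pass.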
Describe the solution you'd like
Improve the training speed of transformers models when using textattack.Trainer.
A previous test showed that on an RTX 3090, one clean epoch of training on the SNLI task took about 3 hours.
Describe alternatives you've considered
Allow the tokenizer to pad the input text only to the maximum length in the batch, instead of the maximum length supported by the corresponding model. With this change, a clean epoch took about 20 minutes under the same conditions.
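For reference, this per-batch ("dynamic") padding is also what transformers itself ships as DataCollatorWithPadding: leave the examples unpadded at tokenization time and pad when the batch is assembled. A small sketch, again with a placeholder model name and sentences rather than TextAttack's actual pipeline:

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorWithPadding(tokenizer)  # pads each batch to its longest example

# Tokenize without padding; the collator pads when the batch is built.
texts = ["a short premise", "a noticeably longer hypothesis sentence"]
features = [tokenizer(t, truncation=True) for t in texts]

batch = collator(features)
print(batch["input_ids"].shape)  # padded only to the longest sequence in this batch
```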
Additional context
Is there a specific reason the input text needs to be padded to the maximum length supported by the model?
A similar consideration may also apply to HuggingFaceModelWrapper.