
Training with XLM-RoBERTa #11

Open
ssdorsey opened this issue Feb 25, 2020 · 12 comments

@ssdorsey

Hi, has anybody looked into training a version of UDify with XLM-RoBERTa? It seems like it could help with the low-resource languages in multilingual BERT, so I'm planning on giving it a go if nobody else has already.

@Hyperparticle
Owner

That's a good idea. Now that I see that Hugging Face has added support for it, it should be straightforward to add support here. I might get around to it, but feel free to try it yourself.

Training on my single GPU might take a while. 🙂

@ssdorsey
Author

I've got a couple of spare 2080 Tis that should do the trick. I've never used AllenNLP before, so I'm a little unfamiliar with how all these config files work. If you could give me some general guidance on what I'd have to update in the code, I'm happy to take a crack at it and share my results.

@Hyperparticle
Owner

The first thing to do would be to add the latest transformers release, which now includes XLM-RoBERTa, to requirements.txt. Then import it in udify/modules/bert_pretrained.py and replace BertTokenizer/BertModel/BertConfig wherever necessary. Finally, copy config/ud/multilingual/udify_bert_finetune_multilingual.json and modify it to point to xlm-roberta-base instead of bert-base-multilingual-cased (and to a new vocab.txt file, which can be extracted from the pretrained model archive).

There might be a few details I missed, but I think that's most of it. I also highly recommend using a debugger inside udify/modules/bert_pretrained.py to see what the data looks like.
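For anyone following along, here is a minimal sketch of the class swap described above, assuming the transformers v2.x class names that were current at the time; UDify's actual wrapper code is more involved than this:

```python
# Hypothetical sketch of the swap in udify/modules/bert_pretrained.py,
# assuming transformers v2.x. Names here are illustrative only.
from transformers import XLMRobertaConfig, XLMRobertaModel, XLMRobertaTokenizer

MODEL_NAME = "xlm-roberta-base"

config = XLMRobertaConfig.from_pretrained(MODEL_NAME)
tokenizer = XLMRobertaTokenizer.from_pretrained(MODEL_NAME)
model = XLMRobertaModel.from_pretrained(MODEL_NAME)

# XLM-R uses SentencePiece with different special tokens than BERT
# (<s>/</s>/<pad> instead of [CLS]/[SEP]/[PAD]), so any hard-coded BERT
# tokens in the wordpiece indexer need updating as well.
print(tokenizer.cls_token, tokenizer.sep_token, tokenizer.pad_token)
```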

Thanks for offering your help!

@ssdorsey
Author

Thanks, I'll take a crack at it

@ssdorsey
Author

ssdorsey commented Mar 2, 2020

Followed the steps you outlined and modified a few other things as well (e.g. the special tokens and the tokenizer), but I keep running into AllenNLP errors that I can't quite sort out. I have plenty of compute available if anybody else manages to get this running, but I don't think I'll be able to.

@ssdorsey
Author

Update: came back to this and figured it out. I just had to handle the differences between how pytorch_pretrained_bert and transformers format their model outputs. Training now.
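For reference, the incompatibility is roughly the following; this is a sketch of the general API difference between the two libraries (assuming transformers v2.x), not necessarily the exact fix:

```python
import torch
from transformers import XLMRobertaModel, XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaModel.from_pretrained("xlm-roberta-base", output_hidden_states=True)

input_ids = torch.tensor([tokenizer.encode("Hello world")])

# pytorch_pretrained_bert's BertModel returned (encoded_layers, pooled_output),
# where encoded_layers was a *list* of per-layer tensors by default.
# transformers (v2.x) returns a tuple instead, and only includes the per-layer
# states when output_hidden_states=True.
outputs = model(input_ids)
last_hidden_state = outputs[0]  # (batch, seq_len, hidden)
pooler_output = outputs[1]      # (batch, hidden)
hidden_states = outputs[2]      # tuple of per-layer tensors, embeddings included
```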

@Hyperparticle
Owner

How's the training going? Any problems?

@ssdorsey
Author

I had a few issues with gradient explosion. I had to take a couple of days off, but I'm getting back at it now to see if I can get it going again.
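In case it helps anyone hitting the same thing: the standard mitigation is to clip the global gradient norm each step, which AllenNLP exposes as the trainer's grad_norm option. A generic PyTorch sketch, not necessarily the fix used here:

```python
import torch

model = torch.nn.Linear(8, 2)  # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

loss = model(torch.randn(4, 8)).sum()
loss.backward()
# Rescale gradients so their global norm is at most 1.0 before stepping;
# this corresponds to "grad_norm": 1.0 in an AllenNLP trainer config.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```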

@ArijRB

ArijRB commented Mar 17, 2021

Hey, I'm trying to train a version of UDify with a BERT-like model. I was wondering if you have any updates on the changes needed?
Thank you in advance.
@ssdorsey @Hyperparticle

@prashantkodali

@ssdorsey or anybody else,

were you able to train the model? I'm looking to do the same with XLM-R, so if you have any experience you can share, it would be really helpful. TIA.

@shaked571

Do any of you have updates regarding the training?

@guptabhinav49

@ssdorsey could you tell us how exactly you handled the outputs in the end? I'm getting this error even after changing the config files:
`RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)`

I'm trying to use "ai4bharat/indic-bert" as the pretrained model. The procedure should be very similar to what you did for XLM-RoBERTa.
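Not from this thread, but a general debugging tip: opaque cuBLAS errors like this often mask an index-out-of-range in an embedding lookup (e.g. a vocab-size mismatch between the config and the checkpoint). Forcing synchronous CUDA execution, or running one batch on CPU, usually surfaces the real stack trace:

```python
# Must be set before any CUDA work happens in the process.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # report CUDA errors at the failing op
```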
