
Training with XLM-RoBERTa #11

Open
ssdorsey opened this issue Feb 25, 2020 · 12 comments

@ssdorsey

Hi, has anybody looked into training a version of UDify with XLM-RoBERTa? It seems like it could help with the low-resource languages in multilingual BERT, so I'm planning on giving it a go if nobody else has already.

@Hyperparticle
Owner

That's a good idea. Now that I see that Hugging Face has added support for it, it should be straightforward to add support here. I might get around to it, but feel free to try it yourself.

Training on my single GPU might take a while. 🙂

@ssdorsey
Author

I've got a couple of spare 2080 Tis that should do the trick. I've never used AllenNLP before, so I'm a little unfamiliar with how all these config files work. If you could give me some general guidance on what I'd have to update in the code, I'm happy to take a crack at it and share my results.

@Hyperparticle
Owner

The first thing to do would be to add the latest transformers release, which now includes XLM-RoBERTa, to requirements.txt. Then import it in udify/modules/bert_pretrained.py and replace BertTokenizer/BertModel/BertConfig wherever necessary. Finally, copy config/ud/multilingual/udify_bert_finetune_multilingual.json and modify it to point to xlm-roberta-base instead of bert-base-multilingual-cased (and to a new vocab.txt file, which can be extracted from the pretrained model archive).

There might be a few details I missed, but I think that's most of it. I also highly recommend using a debugger inside udify/modules/bert_pretrained.py to see what the data looks like.
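For anyone following along, here is a minimal sketch of the class swap described above, assuming the transformers v2.x class names that were current at the time; UDify's actual wrapper code is more involved than this:

```python
# Hypothetical sketch of the swap in udify/modules/bert_pretrained.py,
# assuming transformers v2.x. Names here are illustrative only.
from transformers import XLMRobertaConfig, XLMRobertaModel, XLMRobertaTokenizer

MODEL_NAME = "xlm-roberta-base"

config = XLMRobertaConfig.from_pretrained(MODEL_NAME)
tokenizer = XLMRobertaTokenizer.from_pretrained(MODEL_NAME)
model = XLMRobertaModel.from_pretrained(MODEL_NAME)

# XLM-R uses SentencePiece with different special tokens than BERT
# (<s>/</s>/<pad> instead of [CLS]/[SEP]/[PAD]), so any hard-coded BERT
# tokens in the wordpiece indexer need updating as well.
print(tokenizer.cls_token, tokenizer.sep_token, tokenizer.pad_token)
```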

Thanks for offering your help!

@ssdorsey
Author

Thanks, I'll take a crack at it

@ssdorsey
Author

ssdorsey commented Mar 2, 2020

Followed the steps you outlined and modified a few other things as well (e.g. the special tokens and the tokenizer), but I keep running into AllenNLP errors that I can't quite sort out. I have plenty of compute available if anybody else manages to get this running, but I don't think I'll be able to.

@ssdorsey
Author

Update: came back to this and figured it out. I just had to handle the differences between how pytorch_pretrained_bert and transformers format their model outputs. Training now.
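For reference, the incompatibility is roughly the following; this is a sketch of the general API difference between the two libraries (assuming transformers v2.x), not necessarily the exact fix:

```python
import torch
from transformers import XLMRobertaModel, XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaModel.from_pretrained("xlm-roberta-base", output_hidden_states=True)

input_ids = torch.tensor([tokenizer.encode("Hello world")])

# pytorch_pretrained_bert's BertModel returned (encoded_layers, pooled_output),
# where encoded_layers was a *list* of per-layer tensors by default.
# transformers (v2.x) returns a tuple instead, and only includes the per-layer
# states when output_hidden_states=True.
outputs = model(input_ids)
last_hidden_state = outputs[0]  # (batch, seq_len, hidden)
pooler_output = outputs[1]      # (batch, hidden)
hidden_states = outputs[2]      # tuple of per-layer tensors, embeddings included
```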

@Hyperparticle
Owner

How's the training going? Any problems?

@ssdorsey
Author

I had a few issues with gradient explosion. I had to take a couple of days off, but I'm getting back at it now to see if I can get it going again.
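In case it helps anyone hitting the same thing: the standard mitigation is to clip the global gradient norm each step, which AllenNLP exposes as the trainer's grad_norm option. A generic PyTorch sketch, not necessarily the fix used here:

```python
import torch

model = torch.nn.Linear(8, 2)  # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

loss = model(torch.randn(4, 8)).sum()
loss.backward()
# Rescale gradients so their global norm is at most 1.0 before stepping;
# this corresponds to "grad_norm": 1.0 in an AllenNLP trainer config.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```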

@ArijRB

ArijRB commented Mar 17, 2021

Hey, I'm trying to train a version of UDify with a BERT-like model. I was wondering if you have any updates on the changes needed?
Thank you in advance.
@ssdorsey @Hyperparticle

@prashantkodali

@ssdorsey or anybody else,

were you able to train the model? I'm looking to do the same with XLM-R, so if you have any experience you can share, it would be really helpful. TIA.

@shaked571

Do any of you have updates regarding the training?

@guptabhinav49

@ssdorsey could you tell us how exactly you handled the outputs in the end? I'm getting this error even after changing the config files:
`RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)`

I'm trying to use "ai4bharat/indic-bert" as the pretrained model. The procedure should be very similar to what you did for XLM-RoBERTa.
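Not from this thread, but a general debugging tip: opaque cuBLAS errors like this often mask an index-out-of-range in an embedding lookup (e.g. a vocab-size mismatch between the config and the checkpoint). Forcing synchronous CUDA execution, or running one batch on CPU, usually surfaces the real stack trace:

```python
# Must be set before any CUDA work happens in the process.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # report CUDA errors at the failing op
```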
