How to implement prefix tuning with BartForConditionalGeneration? #29
Comments
I am rather confused too; please consider this as well.
Hi @CYBruce, from my understanding, the implementation here uses the kv cache mechanism to implement the prefix appending. This is very tricky, and it is tightly coupled with the specific model implementation.

Since this implementation is a little inconvenient in some cases, I've made another implementation of Li & Liang 2021 prefix tuning and THUDM/P-Tuning V2 based on opendelta that is model agnostic (opendelta's own implementation depends on the specific model). Since this elegant and useful project seems to be inactive, I am considering redistributing it as opendelta_plus or opendelta_v2, something like that, putting it under my lab to keep it up to date while respecting the Apache License 2.0. I will also add some newer SOTA delta tuning (PEFT) methods to extend it.

The insider version of opendelta_plus has been deployed in my lab and integrated into our WIP high-level PEFT library called BoGuan_YueQu (with some technologies extending the web-based selection and name-based addressing in this opendelta library, such as unifying many PEFT libraries under one interface). We plan to open-source BoGuan_YueQu and opendelta_plus later. Stay tuned for our repositories and PyPI projects! We are another THU lab focusing on AI; here's our link: https://github.com/THU-CVML.
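To illustrate the kv cache trick mentioned above, here is a minimal, hypothetical sketch (not this repo's actual `PrefixModel` API) of how learned prefixes can be reparameterized into tensors with the same layout as transformers' `past_key_values`, so each attention layer effectively prepends them to its keys and values. The class name, dimensions, and MLP sizes below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class PrefixEncoder(nn.Module):
    """Maps a small set of virtual prefix positions to per-layer key/value states
    shaped like transformers' past_key_values (Li & Liang 2021 style reparameterization)."""

    def __init__(self, prefix_len=20, n_layers=6, n_heads=12, d_model=768, hidden=512):
        super().__init__()
        self.prefix_len = prefix_len
        self.n_layers, self.n_heads = n_layers, n_heads
        self.head_dim = d_model // n_heads
        self.embed = nn.Embedding(prefix_len, d_model)
        # MLP reparameterization; its output holds keys and values for every layer.
        self.mlp = nn.Sequential(
            nn.Linear(d_model, hidden),
            nn.Tanh(),
            nn.Linear(hidden, n_layers * 2 * d_model),
        )

    def forward(self, batch_size):
        ids = torch.arange(self.prefix_len, device=self.embed.weight.device)
        x = self.embed(ids)                                   # (prefix_len, d_model)
        x = self.mlp(x)                                       # (prefix_len, n_layers * 2 * d_model)
        x = x.view(self.prefix_len, self.n_layers * 2, self.n_heads, self.head_dim)
        x = x.permute(1, 2, 0, 3)                             # (n_layers*2, n_heads, prefix_len, head_dim)
        x = x.unsqueeze(1).expand(-1, batch_size, -1, -1, -1) # add batch dimension
        # n_layers chunks, each (2, batch, n_heads, prefix_len, head_dim) -> (key, value) per layer,
        # i.e. the layout transformers expects for cached key/value states.
        return tuple(tuple(chunk.unbind(0)) for chunk in x.split(2, dim=0))
```

Feeding these tensors in as cached key/value states means every attention layer sees `cat([prefix_k, k])` and `cat([prefix_v, v])`, and the attention mask has to be extended by `prefix_len` ones; only the `PrefixEncoder` parameters are trained while the backbone stays frozen. Note that for BartForConditionalGeneration the real `past_key_values` holds four tensors per decoder layer (self-attention and cross-attention keys/values), so an actual implementation is more involved than this sketch.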
Thank you for the awesome work. Currently, I am trying to implement prefix-tuning experiments with BART. The original code provided by the author is a total mess.
Then I found your work here. However, I cannot find enough documentation on usage. For example, I don't know how to run an experiment with the
PrefixModel
you provided. I checked the source code, but I haven't figured out how it works.
Could you please give me more information about that?