
Controlled Text Generation Project

Our project addresses the challenge of controlled text generation in large language models, which can produce unwanted or biased content. Guided generation, in which a small model such as GPT2-Base steers a larger model such as GPT2-XL by acting as a reward model [1], has recently shown strong capability in generating controlled text.

Guided generation approaches fine-tune only the smaller model and keep the larger model's parameters frozen. However, fine-tuning the smaller model in isolation, without taking the larger model's parameters into account, can result in text degeneration over longer sequences. We intend to measure this gap and propose to meta-train a GPT2-Base reward model with RL algorithms such as PPO [2] and DPO [3], using the larger model's logits.

For a given larger model, which we call Large, the next token $x_t$ for a given prompt $x_{0:t-1}$ is: $x_t = \arg\max_z P_{\text{Large}}(z \mid x_{0:t-1})$

For a fine-tuned expert model, which we call Expert, the next token $x_t$ for a given prompt $x_{0:t-1}$ is: $x_t = \arg\max_z P_{\text{Expert}}(z \mid x_{0:t-1})$

We intend to make the generation the following, with $\beta$ as a hyper-parameter: $x_t = \arg\max_z \left[ P_{\text{Large}}(z \mid x_{0:t-1}) + \beta\, P_{\text{Expert}}(z \mid x_{0:t-1}) \right]$

Additionally, we can use a GPT2-Base model as an anti-expert, termed Base: $x_t = \arg\max_z P_{\text{Base}}(z \mid x_{0:t-1})$

Now, with $\alpha$ as an additional hyper-parameter, the final formulation becomes: $x_t = \arg\max_z \left[ P_{\text{Large}}(z \mid x_{0:t-1}) + \beta\, P_{\text{Expert}}(z \mid x_{0:t-1}) - \alpha\, P_{\text{Base}}(z \mid x_{0:t-1}) \right]$
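To make the decoding procedure concrete, the sketch below implements the combined formulation as a simple greedy loop with Hugging Face Transformers. It is a minimal illustration, not the project's actual code: the checkpoints ("gpt2-large" for Large, plain "gpt2" standing in for both the fine-tuned Expert and the anti-expert Base) and the values of $\alpha$ and $\beta$ are assumptions. All three models share the GPT-2 vocabulary, so their next-token distributions can be combined elementwise.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical checkpoints: "gpt2-large" as Large, plain "gpt2" standing in
# for the fine-tuned Expert and for the frozen anti-expert Base.
tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
large = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()
expert = AutoModelForCausalLM.from_pretrained("gpt2").eval()
base = AutoModelForCausalLM.from_pretrained("gpt2").eval()

alpha, beta = 0.5, 1.0  # hyper-parameters, to be searched over

@torch.no_grad()
def guided_generate(prompt, max_new_tokens=30):
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        # Next-token distributions from each model on the current prefix x_{0:t-1}.
        p_large = large(ids).logits[:, -1].softmax(-1)
        p_expert = expert(ids).logits[:, -1].softmax(-1)
        p_base = base(ids).logits[:, -1].softmax(-1)
        # x_t = argmax_z [ P_Large + beta * P_Expert - alpha * P_Base ]
        scores = p_large + beta * p_expert - alpha * p_base
        next_id = scores.argmax(-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
    return tokenizer.decode(ids[0], skip_special_tokens=True)

print(guided_generate("The weather today is"))
```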

We will meta-train the Expert model against the test-time generation procedure given by the formulations above, using PPO [2] and DPO [3]. This requires a hyper-parameter search over $\alpha$ and $\beta$. We will also vary the formulation above to test which combination works best. We will use GPT2-Large as our Large model and GPT2-Base for the Expert and Base models.
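The exact reward derived from the larger model's logits is left open in the proposal; one simple possibility, sketched below, scores an Expert-generated continuation by the mean log-probability the frozen Large model assigns to it, rewarding guidance that stays fluent under the larger model. The function name and this particular reward choice are hypothetical.

```python
import torch

@torch.no_grad()
def large_model_reward(large, prompt_ids, continuation_ids):
    """Hypothetical reward: mean log-prob the Large model assigns to the
    Expert's continuation, given the prompt."""
    full = torch.cat([prompt_ids, continuation_ids], dim=-1)
    logits = large(full).logits
    # Logits at position i predict token i+1; keep the positions that
    # predict the continuation tokens.
    logprobs = logits.log_softmax(-1)[:, prompt_ids.shape[-1] - 1 : -1]
    token_lp = logprobs.gather(-1, continuation_ids.unsqueeze(-1)).squeeze(-1)
    return token_lp.mean(-1)  # one scalar reward per sequence
```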

We use the TRL library together with the Transformers library for training.
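Below is a minimal sketch of what the PPO side could look like with TRL's classic PPOTrainer interface (PPOConfig, AutoModelForCausalLMWithValueHead, PPOTrainer.generate / step). Exact class and argument names differ across TRL versions, the prompts and batch sizes are placeholders, and the reward comes from the hypothetical large_model_reward in the previous sketch.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# GPT2-Base Expert (trainable, with a value head for PPO) plus a frozen
# reference copy; the Large model is used only to compute rewards.
expert = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ref_expert = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
large = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()

config = PPOConfig(batch_size=1, mini_batch_size=1)  # toy sizes for the sketch
ppo_trainer = PPOTrainer(config, expert, ref_expert, tokenizer)

prompts = ["The weather today is"]  # placeholder training prompts
for prompt in prompts:
    query = tokenizer(prompt, return_tensors="pt").input_ids[0]
    # Sample a continuation from the Expert (list-in / list-out API in TRL 0.x).
    response = ppo_trainer.generate(
        [query], return_prompt=False, max_new_tokens=20,
        pad_token_id=tokenizer.eos_token_id, do_sample=True,
    )[0]
    # Reward from the frozen Large model's logits (hypothetical reward above).
    reward = large_model_reward(large, query.unsqueeze(0), response.unsqueeze(0))[0]
    ppo_trainer.step([query], [response], [reward])
```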

References

[1] Haikang Deng and Colin Raffel. "Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model." EMNLP 2023.

[2] John Schulman et al. "Proximal Policy Optimization Algorithms." arXiv:1707.06347, 2017.

[3] Rafael Rafailov et al. "Direct Preference Optimization: Your Language Model Is Secretly a Reward Model." NeurIPS 2023.
