A reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction"


AlignInc/aligner-replication

 
 


Aligner-Reproduced

Align-Inc uses the Aligner technology developed by Peking University: we trained a lightweight Aligner based on Gemma-2B and applied it to our specific business practices. Notably, the Aligner we replicated achieved strong results on AlpacaEval; see below for details.
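The weak-to-strong correction step can be sketched as follows. This is an illustrative assumption, not the repo's exact code: the prompt template and function names are hypothetical, and `aligner_generate` stands in for any wrapper around the trained Gemma-2B Aligner (e.g. a Hugging Face `pipeline("text-generation", ...)` call).

```python
# Hedged sketch of Aligner-style weak-to-strong correction: the upstream
# (strong) model produces a draft answer, and the small Aligner model
# rewrites it. The prompt template below is an assumption for illustration.

def build_aligner_prompt(question: str, draft_answer: str) -> str:
    """Pack the user query and the upstream model's draft answer into a
    single prompt; the Aligner is trained to emit a corrected answer."""
    return (
        "BEGINNING OF CONVERSATION: USER: Edit the following Question-Answer "
        f"pair to make it more helpful and harmless: {question} | {draft_answer} "
        "ASSISTANT:"
    )

def correct(question: str, draft_answer: str, aligner_generate) -> str:
    """Run one correction pass. `aligner_generate` is any callable that maps
    a prompt string to the Aligner's generated continuation."""
    return aligner_generate(build_aligner_prompt(question, draft_answer))
```

In this setup the upstream model (Qwen-72B-Chat, Claude3-Opus, or GPT-4) is never fine-tuned; only its output passes through the lightweight corrector.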

Results

Using the techniques described in the paper, we trained an Aligner based on Gemma-2B and improved the AlpacaEval performance of Qwen-72B-Chat, Claude3-Opus, and GPT-4. After correction by our Aligner model, Qwen-72B-Chat's LC win rate rose to 36.7%, with responses averaging 1812 tokens, while Claude3-Opus's LC win rate rose to 41.8%, with an average response length of 1669 tokens.

Surprisingly, GPT-4's LC win rate increased to 58.3%, making it the top performer on AlpacaEval.

Citing Aligner

This repository is a reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction". If you find Aligner useful, please cite it in your publications:

@article{ji2024aligner,
  title={Aligner: Achieving efficient alignment through weak-to-strong correction},
  author={Ji, Jiaming and Chen, Boyuan and Lou, Hantao and Hong, Donghai and Zhang, Borong and Pan, Xuehai and Dai, Juntao and Yang, Yaodong},
  journal={arXiv preprint arXiv:2402.02416},
  year={2024}
}
