Unable to Reproduce Results for LongBench #27
Comments
Hi, for Llama-3 Instruct models, please add the prompt template as shown here. We've updated the code in pred_long_bench.py accordingly. Please give it a try, and feel free to ask if you have any questions! Thanks!
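For concreteness, here is a minimal sketch of what applying the Llama-3 Instruct prompt template can look like with the Hugging Face tokenizer. This is an illustration only, not the exact pred_long_bench.py code, and the build_prompt helper name is hypothetical:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

def build_prompt(raw_prompt: str) -> str:
    # Wrap the raw LongBench prompt as a single user turn and let the tokenizer
    # insert the Llama-3 chat special tokens (header/eot markers) for us.
    messages = [{"role": "user", "content": raw_prompt}]
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )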
In any case, I can confirm that our Table 8 results in KIVI are reproducible via the two scripts (Llama 3 baseline, Llama 3 with KIVI-2). For your convenience, here's the task summary:

Llama-3-8B-Instruct Baseline
Llama-3-8B-Instruct with KIVI-2bit

(Note we excluded …)
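For reference, a sketch of the two runs. The baseline command mirrors the one quoted below in this thread; the KIVI-2 line assumes the 2-bit setting is selected through the same --k_bits/--v_bits flags, which is an assumption, so check the repo scripts for the exact invocation:

# Full-precision baseline
python pred_long_bench.py --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --k_bits 16 --v_bits 16

# KIVI-2bit (assumed flag values)
python pred_long_bench.py --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --k_bits 2 --v_bits 2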
Hello,
I ran the code provided for LongBench using the Llama-3-8B-Instruct model but couldn't reproduce the results reported in Table 8 of your paper. Specifically, the full precision baseline model's score for Qasper in my run is 32.11, while the reported score is 44.24.
I used the following command to run the model:
python pred_long_bench.py --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --k_bits 16 --v_bits 16
Is there anything I might be missing?