Issues: ScandEval/ScandEval
[MODEL EVALUATION REQUEST] Qwen/Qwen2-72B-Instruct
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#460
opened Jun 8, 2024 by
saattrupdan
8 tasks done
[MODEL EVALUATION REQUEST] Qwen/Qwen2-72B
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#459
opened Jun 8, 2024 by
saattrupdan
8 tasks done
[BUG] circular dependency - outlines>=0.0.37
bug
Something isn't working
#457
opened Jun 8, 2024 by
farup
[MODEL EVALUATION REQUEST] AI-Sweden-Models/Llama-3-8B
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#456
opened Jun 6, 2024 by
saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B-instruct
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#455
opened Jun 5, 2024 by
saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#454
opened Jun 5, 2024 by
saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B-instruct
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#453
opened Jun 5, 2024 by
saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#452
opened Jun 5, 2024 by
saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B-pretrain
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#451
opened Jun 5, 2024 by
saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B-one-third-epoch
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#450
opened Jun 5, 2024 by
saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Llama2-7B
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#449
opened Jun 5, 2024 by
saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] Phi-3-mini-4k-instruct
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#446
opened Jun 3, 2024 by
Mikeriess
8 tasks done
[FEATURE REQUEST] Gemeni benchmarks
enhancement
New feature or request
#443
opened May 30, 2024 by
kim-borgen
[MODEL EVALUATION REQUEST] mistralai/Mistral-7B-v0.3
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#440
opened May 23, 2024 by
mhenrichsen
1 of 8 tasks
[FEATURE REQUEST] Benchmarking on private data
enhancement
New feature or request
#438
opened May 22, 2024 by
Mikeriess
[BENCHMARK DATASET REQUEST] NorBench
benchmark dataset request
Request to add a new benchmark dataset
#435
opened May 15, 2024 by
Mikeriess
1 of 8 tasks
[MODEL EVALUATION REQUEST] lightonai/alfred-40b-1023
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#434
opened May 9, 2024 by
TheLounger
4 of 8 tasks
[BUG] Generation does not terminate on single newline
bug
Something isn't working
#432
opened May 6, 2024 by
iPieter
[BENCHMARK DATASET REQUEST] dutch-cola
benchmark dataset request
Request to add a new benchmark dataset
#419
opened Apr 24, 2024 by
BramVanroy
1 of 8 tasks
[FEATURE REQUEST] Support seq-to-seq architectures
enhancement
New feature or request
#418
opened Apr 24, 2024 by
saattrupdan
[BUG] Outlines version clash with vLLM
bug
Something isn't working
#414
opened Apr 23, 2024 by
saattrupdan
[BUG] Memory leak when benchmarking multiple generative models with multiple GPUs
bug
Something isn't working
#413
opened Apr 23, 2024 by
saattrupdan
[MODEL EVALUATION REQUEST] allenai/OLMo-1.7-7B-hf
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#407
opened Apr 19, 2024 by
saattrupdan
8 tasks done
Add human evaluations
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#395
opened Apr 16, 2024 by
saattrupdan
1 of 57 tasks