Issues: ScandEval/ScandEval

Issues list

[MODEL EVALUATION REQUEST] Qwen/Qwen2-72B-Instruct model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#460 opened Jun 8, 2024 by saattrupdan
8 tasks done
[MODEL EVALUATION REQUEST] Qwen/Qwen2-72B model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#459 opened Jun 8, 2024 by saattrupdan
8 tasks done
[BUG] circular dependency - outlines>=0.0.37 bug Something isn't working
#457 opened Jun 8, 2024 by farup
[MODEL EVALUATION REQUEST] AI-Sweden-Models/Llama-3-8B large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#456 opened Jun 6, 2024 by saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B-instruct large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#455 opened Jun 5, 2024 by saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#454 opened Jun 5, 2024 by saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B-instruct model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#453 opened Jun 5, 2024 by saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#452 opened Jun 5, 2024 by saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B-pretrain model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#451 opened Jun 5, 2024 by saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B-one-third-epoch large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#450 opened Jun 5, 2024 by saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Llama2-7B model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#449 opened Jun 5, 2024 by saattrupdan
3 of 8 tasks
[MODEL EVALUATION REQUEST] Phi-3-mini-4k-instruct model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#446 opened Jun 3, 2024 by Mikeriess
8 tasks done
[FEATURE REQUEST] Gemeni benchmarks enhancement New feature or request
#443 opened May 30, 2024 by kim-borgen
[BUG] <Error with SweDN benchmark> bug Something isn't working
#442 opened May 28, 2024 by Knobi333
[MODEL EVALUATION REQUEST] mistralai/Mistral-7B-v0.3 model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#440 opened May 23, 2024 by mhenrichsen
1 of 8 tasks
[FEATURE REQUEST] Benchmarking on private data enhancement New feature or request
#438 opened May 22, 2024 by Mikeriess
[BENCHMARK DATASET REQUEST] NorBench benchmark dataset request Request to add a new benchmark dataset
#435 opened May 15, 2024 by Mikeriess
1 of 8 tasks
[MODEL EVALUATION REQUEST] lightonai/alfred-40b-1023 large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#434 opened May 9, 2024 by TheLounger
4 of 8 tasks
[BUG] Generation does not terminate on single newline bug Something isn't working
#432 opened May 6, 2024 by iPieter
[BENCHMARK DATASET REQUEST] dutch-cola benchmark dataset request Request to add a new benchmark dataset
#419 opened Apr 24, 2024 by BramVanroy
1 of 8 tasks
[BUG] Outlines version clash with vLLM bug Something isn't working
#414 opened Apr 23, 2024 by saattrupdan
[MODEL EVALUATION REQUEST] allenai/OLMo-1.7-7B-hf model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#407 opened Apr 19, 2024 by saattrupdan
8 tasks done
Add human evaluations model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#395 opened Apr 16, 2024 by saattrupdan
1 of 57 tasks