ScandEval / ScandEval Public


Issues: ScandEval/ScandEval


80 Open 234 Closed


Issues list

[MODEL EVALUATION REQUEST] Qwen/Qwen2-72B-Instruct model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

#460 opened Jun 8, 2024 by saattrupdan

8 tasks done

[MODEL EVALUATION REQUEST] Qwen/Qwen2-72B model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

#459 opened Jun 8, 2024 by saattrupdan

8 tasks done

[BUG] circular dependency - outlines>=0.0.37 bug

Something isn't working

#457 opened Jun 8, 2024 by farup

[MODEL EVALUATION REQUEST] AI-Sweden-Models/Llama-3-8B large model (>7B)

This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.

model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

#456 opened Jun 6, 2024 by saattrupdan

3 of 8 tasks

[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B-instruct large model (>7B)

This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.

model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

#455 opened Jun 5, 2024 by saattrupdan

3 of 8 tasks

[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B large model (>7B)

This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.

model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

#454 opened Jun 5, 2024 by saattrupdan

3 of 8 tasks

[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B-instruct model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

small model (<7B)

This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.

#453 opened Jun 5, 2024 by saattrupdan

3 of 8 tasks

[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

small model (<7B)

This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.

#452 opened Jun 5, 2024 by saattrupdan

3 of 8 tasks

[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mistral-7B-pretrain model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

small model (<7B)

This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.

#451 opened Jun 5, 2024 by saattrupdan

3 of 8 tasks

[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Mixtral-8x7B-one-third-epoch large model (>7B)

This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.

model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

#450 opened Jun 5, 2024 by saattrupdan

3 of 8 tasks

[MODEL EVALUATION REQUEST] NorwAI/NorwAI-Llama2-7B model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

small model (<7B)

This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.

#449 opened Jun 5, 2024 by saattrupdan

3 of 8 tasks

[MODEL EVALUATION REQUEST] Phi-3-mini-4k-instruct model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

small model (<7B)

This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.

#446 opened Jun 3, 2024 by Mikeriess

8 tasks done

[FEATURE REQUEST] Gemeni benchmarks enhancement

New feature or request

#443 opened May 30, 2024 by kim-borgen

[BUG] <Error with SweDN benchmark> bug

Something isn't working

#442 opened May 28, 2024 by Knobi333

[MODEL EVALUATION REQUEST] mistralai/Mistral-7B-v0.3 model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

small model (<7B)

This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.

#440 opened May 23, 2024 by mhenrichsen

1 of 8 tasks

[FEATURE REQUEST] Benchmarking on private data enhancement

New feature or request

#438 opened May 22, 2024 by Mikeriess

[BENCHMARK DATASET REQUEST] NorBench benchmark dataset request

Request to add a new benchmark dataset

#435 opened May 15, 2024 by Mikeriess

1 of 8 tasks

[MODEL EVALUATION REQUEST] lightonai/alfred-40b-1023 large model (>7B)

This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.

model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

#434 opened May 9, 2024 by TheLounger

4 of 8 tasks

[BUG] Generation does not terminate on single newline bug

Something isn't working

#432 opened May 6, 2024 by iPieter

[BENCHMARK DATASET REQUEST] dutch-cola benchmark dataset request

Request to add a new benchmark dataset

#419 opened Apr 24, 2024 by BramVanroy

1 of 8 tasks

[FEATURE REQUEST] Support seq-to-seq architectures enhancement

New feature or request

#418 opened Apr 24, 2024 by saattrupdan

[BUG] Outlines version clash with vLLM bug

Something isn't working

#414 opened Apr 23, 2024 by saattrupdan

[BUG] Memory leak when benchmarking multiple generative models with multiple GPUs bug

Something isn't working

#413 opened Apr 23, 2024 by saattrupdan

[MODEL EVALUATION REQUEST] allenai/OLMo-1.7-7B-hf model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

small model (<7B)

This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.

#407 opened Apr 19, 2024 by saattrupdan

8 tasks done

Add human evaluations model evaluation request

Request to evaluate a model and add it to the leaderboard(s)

#395 opened Apr 16, 2024 by saattrupdan

1 of 57 tasks
