Running 7b models on the benchmark #269
9 comments · 12 replies
-
Hi @wendlerc, it really depends on the implementation and hardware. I did the Scandinavian segment in less than an hour on a V100 (just to give you an approximation), and that was in 16-bit using the naive transformers implementation. @Muennighoff might have a better estimate or suggestions for how to do it fast.
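The "less than an hour on a V100" figure can be sanity-checked with a back-of-envelope calculation. Everything below (corpus size, average length, throughput) is an illustrative assumption, not a measurement:

```python
# Rough runtime estimate for embedding a benchmark with a 7B model,
# assuming one forward pass per text. All numbers are illustrative.

def estimate_hours(num_texts: int, avg_tokens: int, tokens_per_sec: float) -> float:
    """Estimate wall-clock hours to embed num_texts documents."""
    total_tokens = num_texts * avg_tokens
    return total_tokens / tokens_per_sec / 3600

# Hypothetical: ~100k texts, ~100 tokens each, ~5k tokens/s in fp16 on a V100.
hours = estimate_hours(num_texts=100_000, avg_tokens=100, tokens_per_sec=5_000)
print(f"{hours:.1f} h")  # roughly half an hour under these assumptions
```

Plugging in your own corpus size and measured throughput gives a quick feasibility check before committing GPU time.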
-
I think I might have a training-free method to turn any autoregressive model into a sentence embedding model and would like to do a quick evaluation. If there is a representative subset of tasks that people particularly care about, that would also be helpful, or if anybody wants to get on board and help with running evals.
-
I have specifically designed the Scandinavian embedding benchmark to be small. For testing purposes, I would probably go for that one, assuming you have your method wrapped in an encode interface. Alternatively, you can use the table on the front page to select ~10 random datasets with relatively few samples.
-
Will implement the encode method tomorrow & keep you posted.
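For reference, MTEB-style evaluators expect a model object exposing an `encode(sentences, **kwargs)` method that returns one embedding per input. A minimal sketch of that interface is below; the actual LLM forward pass is stubbed out with a placeholder, and the class and dimension are hypothetical:

```python
import numpy as np

class EmbeddingWrapper:
    """Minimal sketch of the encode() interface an MTEB-style
    evaluator expects. The real forward pass is stubbed out."""

    def __init__(self, dim: int = 4096, batch_size: int = 32):
        self.dim = dim
        self.batch_size = batch_size

    def _embed_batch(self, batch: list[str]) -> np.ndarray:
        # Placeholder: a real implementation would run the autoregressive
        # model here and pool its hidden states into one vector per text.
        return np.zeros((len(batch), self.dim), dtype=np.float32)

    def encode(self, sentences: list[str], **kwargs) -> np.ndarray:
        out = []
        for i in range(0, len(sentences), self.batch_size):
            out.append(self._embed_batch(sentences[i:i + self.batch_size]))
        return np.concatenate(out, axis=0)

model = EmbeddingWrapper(dim=8)
emb = model.encode(["hello", "world", "foo"])
print(emb.shape)  # (3, 8)
```

Once `_embed_batch` is backed by a real model, the same object can be passed straight into the benchmark's evaluation loop.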
-
As promised, here is a quick and dirty implementation: https://github.com/wendlerc/llama2-embeddings Batch size is currently hard-coded, as is the maximum sequence length in the tokenization step.
-
I guess maybe the method needs some more work.
-
How does it compare to raw Llama 2? You might want to add that approach as well. Btw, this seems more like a discussion than an issue, so I will just move it over.
-
I did some basic tests with echo embeddings, and without training they perform similarly to the method I proposed. I.e., taking the sum alone does not do the trick, though it gets slightly better.
-
How long does it take to run a Llama2-sized model over the benchmark?
Best,
Chris