Issue with Evaluation ArguAna #89

Open
yeliusf opened this issue Oct 3, 2023 · 5 comments

Comments

@yeliusf
yeliusf commented Oct 3, 2023

Thanks for publishing this nice work. I want to evaluate on ArguAna by following the command from your instructions:
python examples/evaluate_model.py --model_name hkunlp/instructor-large --output_dir outputs --task_name ArguAna --result_file results

But I face the following issue:

  File "[/export/home/Instructor/mteb/abstasks/AbsTaskRetrieval.py](https://github.com/xlang-ai/instructor-embedding/blob/654f34cbd777d0dcb0d401ea3d7ccdbeeb3b259c/evaluation/MTEB/mteb/abstasks/AbsTaskRetrieval.py#L745C26-L745C26)", line 746, in <listcomp>
    [instruction, (corpus["title"][i] + self.sep + corpus["text"][i]).strip()]
TypeError: 'list' object is not callable
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

I tried a fix by changing your code to:

            sentences = [
                [instruction,
                 (corpus["title"][i] + self.sep + corpus["text"][i]).strip()
                 if "title" in corpus else corpus["text"][i].strip()]
                for i in range(len(corpus["text"]))
            ]
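
The attempted fix above can be checked in isolation. The following is a minimal sketch, not the repository's code: `build_sentences`, the toy `corpus` dictionaries, and the `sep` value are hypothetical stand-ins for the method body, the dataset batch, and `self.sep`.

```python
def build_sentences(corpus, instruction, sep=" "):
    """Pair the instruction with each document, prepending the title when present."""
    return [
        [
            instruction,
            # The conditional expression is the second element of the inner list.
            (corpus["title"][i] + sep + corpus["text"][i]).strip()
            if "title" in corpus
            else corpus["text"][i].strip(),
        ]
        for i in range(len(corpus["text"]))
    ]


# Toy inputs (assumed shape: a dict of equal-length lists of strings).
with_title = {"title": ["Doc A"], "text": ["first body "]}
without_title = {"text": [" second body "]}

print(build_sentences(with_title, "Represent the document:"))
print(build_sentences(without_title, "Represent the document:"))
```

With this shape each entry is a two-element `[instruction, text]` list, whether or not the corpus has a `title` column.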

But now I face a new issue:

  File "/opt/conda/envs/instructor/lib/python3.8/site-packages/datasets/table.py", line 1059, in from_file
    table = _memory_mapped_arrow_table_from_file(filename)
  File "/opt/conda/envs/instructor/lib/python3.8/site-packages/datasets/table.py", line 66, in _memory_mapped_arrow_table_from_file
    pa_table = opened_stream.read_all()
  File "pyarrow/ipc.pxi", line 699, in pyarrow.lib.RecordBatchReader.read_all
  File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: Expected to be able to read 5197224 bytes for message body, got 5197216
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

Could you take a look into this? Thanks!

@hongjin-su
Collaborator

Hi, could you print out corpus["title"][i] and self.sep, along with their types? Both are expected to be strings.
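
One way to collect that information is a small helper like the one below. It is only a sketch: `describe_fields` and the toy `corpus` are hypothetical, and inside the real method `sep` would be `self.sep`.

```python
def describe_fields(corpus, i, sep):
    """Return the repr and type name of the i-th title and of the separator."""
    title = corpus["title"][i]
    return {
        "title": (repr(title), type(title).__name__),
        "sep": (repr(sep), type(sep).__name__),
    }


# Toy batch (assumed shape: a dict of lists of strings).
corpus = {"title": ["Doc A"], "text": ["first body"]}
print(describe_fields(corpus, 0, " "))
```

If both type names come back as `str`, the operands themselves are fine and the `TypeError` has to come from how the expression is parsed.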

@ParishadBehnam

Hello. I have the same problem.

sentences = [
                [instruction, (corpus["title"][i] + self.sep + corpus["text"][i]).strip()]
                (corpus["title"][i] + self.sep + corpus["text"][i]).strip()
                if "title" in corpus
                else corpus["text"][i].strip()
                for i in range(len(corpus["text"]))
            ]

This code snippet leads to an error. Could you please double-check if it is written as you intended?
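
For reference, the error can be reproduced outside the repository. Inside the outer brackets Python joins the lines, so the inner list `[instruction, ...]` followed by a parenthesized expression is parsed as a call on that list. The sketch below runs the snippet from a string with toy values (`instruction`, `corpus`, and `sep` are made up, with `sep` standing in for `self.sep`) and reports the exception message. The `SyntaxWarning` that recent Python versions emit for this pattern is silenced only to keep the output clean.

```python
import warnings

# The comprehension as quoted above, with self.sep replaced by a plain variable.
SNIPPET = '''
sentences = [
    [instruction, (corpus["title"][i] + sep + corpus["text"][i]).strip()]
    (corpus["title"][i] + sep + corpus["text"][i]).strip()
    if "title" in corpus
    else corpus["text"][i].strip()
    for i in range(len(corpus["text"]))
]
'''


def run_snippet():
    """Execute the snippet on toy data and return the TypeError message, if any."""
    namespace = {
        "instruction": "Represent the document:",
        "corpus": {"title": ["Doc A"], "text": ["first body"]},
        "sep": " ",
    }
    with warnings.catch_warnings():
        # The compiler may warn that a list literal is being called.
        warnings.simplefilter("ignore", SyntaxWarning)
        code = compile(SNIPPET, "<snippet>", "exec")
    try:
        exec(code, namespace)
    except TypeError as exc:
        return str(exc)
    return None


print(run_snippet())
```

The two strings are valid; the failure comes from the first element line being "called" with the second line as its argument.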

@wenhaoy-0428

@ParishadBehnam @yeliusf I encountered the same issue. The easy workaround is to NOT use pip install -e . to install the mteb package. Instead, use pip install mteb.
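
To see which copy of a package the interpreter actually imports after switching install methods, a quick check like this may help. It is a sketch using only the standard library; `module_origin` is a hypothetical helper name.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Prints a path inside the cloned repository for an editable install,
# a path under site-packages for a regular install, or None if mteb is absent.
print(module_origin("mteb"))
```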

@Xavier1999-Chen

@ParishadBehnam @yeliusf I encountered the same issue. The easy workaround is to NOT use pip install -e . to install the mteb package. Instead, use pip install mteb.

It works for me :)

@ashokrajab
Contributor

ashokrajab commented Nov 24, 2023

@all,
the following error is caused by a recent change from my contribution:

    [instruction, (corpus["title"][i] + self.sep + corpus["text"][i]).strip()]
TypeError: 'list' object is not callable
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

I have raised PR #92 to fix it.

I am waiting for the repo maintainers to respond and merge the PR. As a temporary fix until it is merged, please include the commit from the PR in your branch to resolve the issue.
