pair classification inconsistencies #582

SaitejaUtpala · 2024-04-26T13:53:20Z

class AbsTaskPairClassification(AbsTask):
    """Abstract class for PairClassificationTasks
    The similarity is computed between pairs and the results are ranked. Average precision
    is computed to measure how well the methods can be used for pairwise pair classification.

    self.load_data() must generate a huggingface dataset with a split matching self.metadata_dict["eval_splits"], and assign it to self.dataset. It must contain the following columns:
        sent1: list[str]
        sent2: list[str]
        labels: list[int]
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def _evaluate_monolingual(self, model, dataset, split="test", **kwargs):
        data_split = dataset[split][0]  # This causes error because it just gets first row of the split
        logging.getLogger(
            "sentence_transformers.evaluation.PairClassificationEvaluator"
        ).setLevel(logging.WARN)
        evaluator = PairClassificationEvaluator(
            data_split["sent1"], data_split["sent2"], data_split["labels"], **kwargs
        )

While working on the dataset in #581, I found a couple of potential issues in 'AbsTaskPairClassification':

1. data_split = dataset[split][0] in _evaluate_monolingual: this causes an error because it takes only the first row of the split and not the whole dataset.
2. PairClassificationEvaluator(data_split["sent1"], data_split["sent2"], data_split["labels"], **kwargs): it expects the columns 'sent1', 'sent2' and 'labels' instead of 'sentence1', 'sentence2' and 'label' (the standard followed in the STS and BiText Mining tasks).
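
To make the first point concrete, here is a minimal sketch of what indexing with [0] returns under the two layouts. The splits are hypothetical stand-ins built from plain Python lists of rows, not real Hugging Face datasets, and the sentences are made up for illustration.

```python
# Hypothetical stand-ins for a split: a plain list of rows (dicts).

# Legacy layout: a single row whose fields each hold a whole list.
legacy_split = [
    {
        "sent1": ["a cat sits", "it rains"],
        "sent2": ["a cat is sitting", "the sun shines"],
        "labels": [1, 0],
    }
]

# Per-pair layout (naming used by other tasks): one row per sentence pair.
per_pair_split = [
    {"sentence1": "a cat sits", "sentence2": "a cat is sitting", "label": 1},
    {"sentence1": "it rains", "sentence2": "the sun shines", "label": 0},
]

# With the legacy layout, [0] yields every pair at once.
legacy_first = legacy_split[0]
print(len(legacy_first["sent1"]))  # number of pairs in the whole split

# With the per-pair layout, [0] yields only the first pair, and the
# 'sent1' / 'sent2' / 'labels' keys do not exist.
per_pair_first = per_pair_split[0]
print(sorted(per_pair_first.keys()))
```

Under these assumptions, the same indexing expression covers the full split in one layout and a single pair in the other, which is why a dataset with one row per pair fails here.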


loicmagne · 2024-04-26T14:22:55Z

It's not an error; it is a legacy of how the initial pair classification datasets were formatted. For example, TwitterSemEval2015:

>>> d = load_dataset('mteb/twittersemeval2015-pairclassification')
Downloading data: 100%|██████████████████████████████████████████████████████████████████████████| 313k/313k [00:00<00:00, 1.35MB/s]
Generating test split: 100%|███████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.39 examples/s]
>>> d
DatasetDict({
    test: Dataset({
        features: ['sent1', 'sent2', 'labels'],
        num_rows: 1
    })
})

There is a single row, and each column of that row contains a full list (all first sentences, all second sentences, all labels). I agree this isn't a very good format, and the naming is inconsistent with other tasks, so it might make sense to change it.
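
One possible way to move from the legacy single-row layout to one row per pair is sketched below. The helper name flatten_legacy_row is hypothetical, the input is a plain dict rather than a real dataset row, and this is not necessarily how the library ended up handling it.

```python
def flatten_legacy_row(row):
    """Turn one legacy row (lists under 'sent1', 'sent2', 'labels')
    into a list of per-pair records using the 'sentence1', 'sentence2'
    and 'label' names followed by other tasks."""
    sent1, sent2, labels = row["sent1"], row["sent2"], row["labels"]
    if not (len(sent1) == len(sent2) == len(labels)):
        raise ValueError("sent1, sent2 and labels must have the same length")
    return [
        {"sentence1": s1, "sentence2": s2, "label": int(y)}
        for s1, s2, y in zip(sent1, sent2, labels)
    ]


# Made-up example row in the legacy layout.
legacy_row = {
    "sent1": ["a cat sits", "it rains"],
    "sent2": ["a cat is sitting", "the sun shines"],
    "labels": [1, 0],
}

records = flatten_legacy_row(legacy_row)
for record in records:
    print(record)
```

The length check is there so that a malformed row fails loudly instead of being silently truncated by zip.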

SaitejaUtpala mentioned this issue Apr 26, 2024

Add Indic xnli pair classification #581

Closed

10 tasks

imenelydiaker assigned loicmagne Apr 26, 2024

KennethEnevoldsen changed the title ~~pair classification inconsistencies and bugs~~ pair classification inconsistencies Apr 29, 2024

KennethEnevoldsen added the enhancement New feature or request label Apr 29, 2024

loicmagne added good first issue Good for newcomers help wanted Extra attention is needed and removed help wanted Extra attention is needed labels May 1, 2024

dokato mentioned this issue Jun 17, 2024

Fix pair classification inconsistency #945

Merged

10 tasks
