To Reproduce
Run the following code:
from textattack.augmentation import Augmenter
from textattack.transformations import WordSwapEmbedding
from textattack.constraints.semantics import WordEmbeddingDistance
from textattack.constraints.grammaticality import PartOfSpeech
from textattack.constraints.pre_transformation import RepeatModification, StopwordModification
from textattack.shared import AttackedText
text_sample = "woody , what happened ?"
# Subtract 1 because "what" is a stopword and is excluded by StopwordModification
num_words_to_swap = len(AttackedText(text_sample).words) - 1
max_candidates = 50
# Upper bound: each swappable word takes one of max_candidates replacements
num_samples = max_candidates ** num_words_to_swap
print('max num_samples:', num_samples)
# Define constraints to ensure quality of perturbations
constraints = [StopwordModification(), RepeatModification()]
constraints.append(WordEmbeddingDistance(min_cos_sim=0.5))
constraints.append(PartOfSpeech(allow_verb_noun_swap=True))
# Define the transformation method
transformation = WordSwapEmbedding(
    max_candidates=max_candidates  # Number of candidates to generate per word
)
# Combine transformation and constraints in an Augmenter
augmenter = Augmenter(
    transformation=transformation,
    constraints=constraints,
    pct_words_to_swap=1,  # Fraction of words to swap per perturbation (1 = all words)
    transformations_per_example=num_samples  # Number of perturbations to generate per input
)
perturbations = augmenter.augment(text_sample)
actual_num_samples = len(perturbations)
print('actual_num_samples:', actual_num_samples)
This gives me the output:
max num_samples: 2500
actual_num_samples: 532
But when I remove the RepeatModification constraint and leave the other constraints and the rest of the code unchanged:
constraints = [StopwordModification()]
I get the output:
max num_samples: 2500
actual_num_samples: 277
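For reference, the `max num_samples` value of 2500 printed in both runs is only an upper bound: it assumes each of the two swappable words ("woody" and "happened") can independently take any of the 50 candidates. Below is a minimal, self-contained sketch of that arithmetic. The `STOPWORDS` set and the `upper_bound` helper are hypothetical and for illustration only; they are not TextAttack's stopword list or tokenizer.

```python
# Hypothetical, hand-picked stopword set for illustration only;
# TextAttack's StopwordModification uses its own list.
STOPWORDS = {"what", "the", "a", "is"}


def upper_bound(text: str, max_candidates: int) -> int:
    """Return max_candidates ** (number of non-stopword words in text)."""
    # Keep only tokens that contain a letter or digit (drops "," and "?").
    words = [tok for tok in text.split() if any(ch.isalnum() for ch in tok)]
    # Words that a stopword constraint would still allow to be swapped.
    swappable = [w for w in words if w.lower() not in STOPWORDS]
    return max_candidates ** len(swappable)


print(upper_bound("woody , what happened ?", 50))  # 2500
```

The semantic and part-of-speech constraints can only remove candidates from this set, so an actual count below 2500 is expected in both configurations.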
Expected behavior
I expected that relaxing the constraints would increase the number of samples, but the opposite happens.
Did I misunderstand something, or is this a bug?
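One check that might help separate a bug from run-to-run noise is to repeat each configuration several times and compare the spread of counts rather than a single number. The sketch below assumes, without this being confirmed here, that the augmenter's output can vary between calls. `summarize_counts` and `toy_augment` are hypothetical helpers; `toy_augment` is only a stand-in so the snippet runs without TextAttack installed.

```python
import random
from statistics import mean
from typing import Callable, Dict, List


def summarize_counts(
    augment_fn: Callable[[str], List[str]], text: str, runs: int = 5
) -> Dict[str, float]:
    """Call augment_fn(text) `runs` times and summarize the unique-output counts."""
    counts = [len(set(augment_fn(text))) for _ in range(runs)]
    return {"min": min(counts), "max": max(counts), "mean": mean(counts)}


def toy_augment(text: str, pool_size: int = 2500, draws: int = 2500) -> List[str]:
    """Hypothetical stand-in for an augmenter: random draws from a pool, deduplicated."""
    picks = {random.randrange(pool_size) for _ in range(draws)}
    return [f"{text} #{i}" for i in sorted(picks)]


if __name__ == "__main__":
    random.seed(0)
    print(summarize_counts(toy_augment, "woody , what happened ?"))
```

With the code from this issue, `summarize_counts(augmenter.augment, text_sample)` could be run once per constraint list; if the two ranges overlap, the difference between 532 and 277 may not be meaningful on its own.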
System Information (please complete the following information):
torch==2.3.0, transformers==4.40.1