Releases: QData/TextAttack
v0.3.10
What's Changed
- Fix faster-alzantot recipe references by @marcorosa in #776
- Add back transcription augmentation method by @skorzewski in #767
- Polish dependencies and support python3.11 by @marcorosa in #780
- Update update_test_outputs.py by @qiyanjun in #781
- Add support for prompt augmentation by @k-ivey in #766
- Increase the swap file size of the GitHub actions runner by @k-ivey in #755
- Consistent word swap by @k-ivey in #752
- update docs with missing api by @qiyanjun in #757
- Typo corrections in installation docs by @dmlls in #759
- Do not use pipeline to achieve faster generation of Chinese mask repl… by @liuyuyan2717 in #778
- Rename BERT constraint to SBERT by @k-ivey in #763
- Word Swap Qwerty Failure Bug Fix by @jstzwj in #761
- disable tests while compute issues are resolved by @jxmorris12 in #779
New Contributors
- @dmlls made their first contribution in #759
- @marcorosa made their first contribution in #776
- @liuyuyan2717 made their first contribution in #778
- @jstzwj made their first contribution in #761
- @skorzewski made their first contribution in #767
Full Changelog: v0.3.9...v0.3.10
v0.3.9
This release is mainly about:
- #747 fixing CSVlogger missing df issue
- #748 reverting one goal_func change due to the "textattack attack" errors
- #719 extending textattack into Chinese language
What's Changed
- fix command help str :-) by @jxmorris12 in #703
- Clean up formatting in HTML tables by @Arrrlex in #707
- Extra quality metrics by @gmurro in #695
- format after #695 by @jxmorris12 in #710
- Extend Chinese Attack by @Hanyu-Liu-123 in #719
- add in tutorials and reference for Chinese Textattack by @qiyanjun in #744
- fix potential bug in the filter_by_labels_ method of the Dataset class by @wenh06 in #746
- Fixed a batch_size bug in attack_args.py by @Falanke21 in #735
- Fix the problem of text output from T5 model by @plasmashen in #709
- Bump transformers from 4.27.4 to 4.30.0 by @dependabot in #740
- Fixed syntax and import issues in the example of Attack API by @eldorabdukhamidov in #734
- hard label classification by @cogeid in #635
- fixing the csvlogger missing DF issues by @qiyanjun in #747
- Fix pytest errors - due to goal_func by @qiyanjun in #748
- format update by @qiyanjun in #749
- Stanza test and notebooks minor fix by @qiyanjun in #750
New Contributors
- @Arrrlex made their first contribution in #707
- @gmurro made their first contribution in #695
- @Falanke21 made their first contribution in #735
- @eldorabdukhamidov made their first contribution in #734
Full Changelog: v0.3.8...v0.3.9
v0.3.8
- #689: Add more type annotations and do some code cleanup in AttackedText. Notably, removed some code that did Chinese word segmentation because it did not properly support words_from_text, which caused issues with various transformations.
- #691: Optimize comparison between two AttackedText objects (thanks @plasmashen!)
- #693: Fix bug with writing parameters twice in AttackedText (thanks @89x98!)
- #700: Lots of miscellaneous bug fixes and some helper function implementation
- #701: Fix bugs with loading TedTalk translation dataset, using T5, seq2sick/text-to-text goal functions
v0.3.7
- Update dependency: transformers>=4.21.0
- Update dependency: datasets==2.4.0
- Update optional dependency: sentence_transformers==2.2.0
- Update optional dependency: gensim==4.1.2
- Update optional dependency: tensorflow==2.7.0 (Thanks @VijayKalmath !!!!)
- Miscellaneous fixes for new packages to update things and remove warning messages
- Fix logging attack args to W&B #647 (thanks @VijayKalmath)
- Fix bug with word_swap_masked_lm #649 (thanks @Hanyu-Liu-123)
- Fix small issues with textattack train #653 (thanks @VijayKalmath)
- Fix issue with PWWS #654 (thanks @VijayKalmath)
- Update recipe for FasterGeneticAlgorithm to match paper #656 (thanks @VijayKalmath)
- Update adversarial dataset generation logic #657 (thanks @VijayKalmath)
- Update dataset_args to correctly set dataset_split #659 (thanks @VijayKalmath)
- Add logic for loading SQUAD via HuggingFaceDataset class #660 (thanks @VijayKalmath)
- Fix ANSI color-printing #662
- Make GreedyWordSwapWIR and related search methods more query-efficient under the presence of pre-transformation constraints #665 and #674 (thanks @VijayKalmath)
- Save attack summary table as JSON (thanks @VijayKalmath -- great feature add!!)
- Fix typo and update numpy #671 and #672 (thanks @JohnGiorgi -- and welcome!)
- Finish CLARE attack #675 (thanks @Hanyu-Liu-123 and @VijayKalmath)
- Add repr for better user experience with GoalFunctionResult #676 (thanks @VijayKalmath)
- Better exception handling in WordSwapChangeNumber (thanks @dangne -- and welcome!!)
- Various other typo and bug fixes
Thanks to everyone who contributed to TextAttack this summer, and a special shoutout once more to @VijayKalmath for all the hard work and attention to detail. Glad to see TextAttack so healthy 🙂
v0.3.5
- #644:
  - Ability to specify device via TA_DEVICE env variable
  - New constraint, MaxNumWordsModified
  - Tracks previous AttackedText during attack to allow for reconstruction of chain of modifications
  - Change GreedyWordSwapWIR to allow passing of specific unk token
  - Formatting updates to new Black version
  - fix Universal Sentence Encoder from TF breakage
  - fix Flair to new API (thanks @VijayKalmath for the help!)
- Bump version to 0.3.5
- #623 Fix quotation bug, thanks @donggrant
- #613 and others, fix dependencies
- #609 Only initialize embeddings when needed :) thanks to @duesenfranz
- #591 fix a bug with CLARE
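As a minimal sketch of the TA_DEVICE option from #644, the variable can be set from Python before TextAttack is loaded. This assumes that TextAttack reads the variable when it is first imported and that the value is a PyTorch-style device string such as "cpu" or "cuda:1"; both points are assumptions to check against the installed version.

```python
import os

# Assumption: TextAttack reads TA_DEVICE once, at import time, so the variable
# has to be set before the first `import textattack` in the process.
# Assumption: the value is a PyTorch-style device string ("cpu", "cuda:1", ...).
os.environ["TA_DEVICE"] = "cpu"

# import textattack  # import only after the variable is set
```

The same effect can be had from a shell by exporting TA_DEVICE before running the textattack command.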
v0.3.4 New Metric module / Multiple Bug fix / New transformation and Update Augmentation
What's Changed
- [CODE] Keras parallel attack fix - Issue #499 by @sanchit97 in #515
- Bump tensorflow from 2.4.2 to 2.5.1 in /docs by @dependabot in #517
- Add a high level overview diagram to docs by @cogeid in #519
- readtheDoc fix by @qiyanjun in #522
- Add new attack recipe A2T by @jinyongyoo in #523
- Fix incorrect __eq__ method of AttackedText in textattack/shared/attacked_text.py by @wenh06 in #509
- Fix a bug when running textattack eval with --num-examples=-1 by @dangne in #521
- New metric module to improve flexibility and intuitiveness - moved from #475 by @sanchit97 in #514
- Update installation.md to add FAQ on installation by @qiyanjun in #535
- Fix dataset-split bug by @Hanyu-Liu-123 in #533
- Update by @Hanyu-Liu-123 in #541
- add custom dataset API use example in doc by @qiyanjun in #543
- Fix logger initiation bug by @Hanyu-Liu-123 in #539
- Updated Tutorial 0 to use the Rotten Tomatoes dataset instead of the … by @srujanjoshi in #542
- Back translation transformation by @cogeid in #534
- Fixed a bug in the allennlp tutorial by @donggrant in #546
- Logger bug fix by @ankitgv0 in #551
- add "textattack[tensorflow]" option in all tutorials by @qiyanjun in #559
- Fix CLARE Extra Character Bug by @Hanyu-Liu-123 in #556
- Fix metric-module Issue#532 by @sanchit97 in #540
- Add API docstrings for back translation by @cogeid in #563
- Fixed the "no attribute" error from #536 by @ankitgv0 in #552
- Enhance augment function by @Hanyu-Liu-123 in #531
- fix read-the-doc installation issue / clean up and add new docstrings for recently added classes/packages by @qiyanjun in #569
New Contributors
- @wenh06 made their first contribution in #509
- @dangne made their first contribution in #521
- @srujanjoshi made their first contribution in #542
- @donggrant made their first contribution in #546
- @ankitgv0 made their first contribution in #551
Full Changelog: v0.3.3...v0.3.4
v0.3.3 Multiple Bug fix
- Merge pull request #508 from QData/example_bug_fix
- Merge pull request #505 from QData/s3-model-fix
- Merge pull request #503 from QData/multilingual-doc
- Merge pull request #502 from QData/Notebook-10-bug-fix
- Merge pull request #500 from QData/docstring-rework-missing
- Merge pull request #497 from QData/dependabot/pip/docs/tensorflow-2.4.2
- Merge pull request #495 from QData/readthedoc-fix
v0.3.2 Bug Fixes
Multiple bug fixes:
- Merge pull request #473 from cogeid/file-redirection-fix
- Merge pull request #469 from xinzhel/allennlp_doc
- Merge pull request #477 from cogeid/Fix-RandomSwap-and-RandomSynonymI…
- Merge pull request #484 from QData/update-torch-version
- Merge pull request #490 from QData/scipy-version-plus-two-doc-updates
- Merge pull request #420 from QData/multilingual
- Merge pull request #495 from QData/readthedoc-fix
v0.3.0 Updated API and Bug Fixes
New Updated API
We have added two new classes called Attacker and Trainer that can be used to perform adversarial attacks and adversarial training with full logging support and multi-GPU parallelism. This is intended to provide an alternative way of performing attacks and training for custom models and datasets.
Attacker: Running Adversarial Attacks
Attacker can be used, for example, to attack a BERT model fine-tuned on the IMDB dataset using the TextFooler method. The AttackArgs class is used to set the parameters of the attack, including the number of examples to attack, the CSV file to log the results to, and the interval at which to save checkpoints.
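A minimal sketch of such an attack follows. It assumes the textattack and transformers packages are installed and that the textattack/bert-base-uncased-imdb checkpoint can be downloaded from the Hugging Face Hub. The function name attack_imdb_bert and the ATTACK_ARGS dictionary are illustrative names, not part of TextAttack, and the specific values (20 examples, a checkpoint every 5) are arbitrary.

```python
# Keyword arguments forwarded to textattack.AttackArgs: attack 20 examples,
# log results to a CSV file, and save a checkpoint every 5 examples.
# The values are illustrative, not recommended defaults.
ATTACK_ARGS = {
    "num_examples": 20,
    "log_to_csv": "log.csv",
    "checkpoint_interval": 5,
    "checkpoint_dir": "checkpoints",
    "disable_stdout": True,
}


def attack_imdb_bert(attack_kwargs=ATTACK_ARGS):
    """Attack a BERT model fine-tuned on IMDB with the TextFooler recipe."""
    # Imported lazily so that defining this function does not require the
    # heavy dependencies to be installed.
    import textattack
    import transformers

    # Assumed checkpoint name on the Hugging Face Hub.
    name = "textattack/bert-base-uncased-imdb"
    model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
    tokenizer = transformers.AutoTokenizer.from_pretrained(name)
    model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

    # Build the TextFooler attack and load the IMDB test split.
    attack = textattack.attack_recipes.TextFoolerJin2019.build(model_wrapper)
    dataset = textattack.datasets.HuggingFaceDataset("imdb", split="test")

    # AttackArgs carries the run settings; Attacker ties everything together.
    attack_args = textattack.AttackArgs(**attack_kwargs)
    attacker = textattack.Attacker(attack, dataset, attack_args)
    return attacker.attack_dataset()
```

Calling attack_imdb_bert() downloads the model and dataset and then runs the attack, so it is best tried on a machine with a GPU.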
More details about Attacker and AttackArgs can be found here.
Trainer: Running Adversarial Training
Previously, TextAttack supported adversarial training only in a limited manner. Users could train models only through the CLI command, and not every aspect of training was available for tuning.
The Trainer class introduces an easy way to train custom PyTorch/Transformers models on a custom dataset. For example, it can fine-tune BERT on the IMDB dataset with an adversarial attack called DeepWordBug.
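A minimal sketch of such a training run follows, assuming textattack and transformers are installed. The function name train_bert_on_imdb and the TRAINING_ARGS dictionary are illustrative names, not part of TextAttack, and the hyperparameter values are arbitrary.

```python
# Keyword arguments forwarded to textattack.TrainingArgs. Illustrative values:
# 3 epochs, the first of which is clean (no adversarial examples), then
# 1000 adversarial examples per epoch.
TRAINING_ARGS = {
    "num_epochs": 3,
    "num_clean_epochs": 1,
    "num_train_adv_examples": 1000,
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,
}

# Effective batch size per device: 8 examples x 4 accumulation steps = 32.
EFFECTIVE_BATCH_SIZE = (
    TRAINING_ARGS["per_device_train_batch_size"]
    * TRAINING_ARGS["gradient_accumulation_steps"]
)


def train_bert_on_imdb(training_kwargs=TRAINING_ARGS):
    """Adversarially fine-tune BERT on IMDB using the DeepWordBug recipe."""
    # Imported lazily so that defining this function does not require the
    # heavy dependencies to be installed.
    import textattack
    import transformers

    name = "bert-base-uncased"
    model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
    tokenizer = transformers.AutoTokenizer.from_pretrained(name)
    model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

    # DeepWordBug generates the adversarial examples used during training.
    attack = textattack.attack_recipes.DeepWordBugGao2018.build(model_wrapper)
    train_dataset = textattack.datasets.HuggingFaceDataset("imdb", split="train")
    eval_dataset = textattack.datasets.HuggingFaceDataset("imdb", split="test")

    training_args = textattack.TrainingArgs(**training_kwargs)
    trainer = textattack.Trainer(
        model_wrapper,
        task_type="classification",
        attack=attack,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        training_args=training_args,
    )
    trainer.train()
    return trainer
```

Gradient accumulation is used here to reach a larger effective batch size without increasing per-device memory use.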
Dataset
Previously, datasets passed to TextAttack were simply expected to be an iterable of (input, target) tuples. While this offers flexibility, it prevents users from passing key information about the dataset that TextAttack can use to provide a better experience (e.g. label names, label remapping, input column names used for printing).
We instead explicitly define a Dataset class that users can use or subclass for their own datasets.
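A minimal sketch of wrapping in-memory data with the Dataset class, assuming textattack is installed. The sample sentences, the label names, and the helper name build_datasets are made up for illustration.

```python
# Single-input examples: (text, label) pairs for sentiment classification.
SENTIMENT_DATA = [
    ("I enjoyed the movie a lot!", 1),
    ("Absolutely horrible film.", 0),
    ("Our family had a fun time!", 1),
]

# Multi-input examples: each input is a tuple whose order matches input_columns.
NLI_DATA = [
    (("A man inspects a uniform.", "The man is sleeping."), 1),
]


def build_datasets():
    """Wrap the raw lists in textattack.datasets.Dataset objects."""
    # Imported lazily so that defining this function does not require
    # textattack to be installed.
    import textattack

    sentiment = textattack.datasets.Dataset(
        SENTIMENT_DATA,
        label_names=["negative", "positive"],  # hypothetical label names
    )
    # input_columns names each element of the input tuple for printing.
    nli = textattack.datasets.Dataset(
        NLI_DATA,
        input_columns=("premise", "hypothesis"),
    )
    return sentiment, nli
```

The raw lists keep the earlier (input, target) shape, so data prepared for previous versions can be wrapped without restructuring it.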
Bug Fixes:
v0.2.15: CLARE Attack, Custom Word Embedding, and bug fixes!
CLARE Attack (#356, #392)
We have added a new attack proposed by "Contextualized Perturbation for Textual Adversarial Attack" (Li et al., 2020). There's also a corresponding augmenter recipe using CLARE. Thanks to @Hanyu-Liu-123, @cookielee77.
Custom Word Embedding (#333, #399)
We have added support for custom word embedding via AbstractWordEmbedding, WordEmbedding, GensimWordEmbedding from textattack.shared. These three classes allow users to use their own custom word embeddings for transformations and constraints that require custom word embeddings. Thanks @tsinggggg and @alexander-zap for contributing!
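To illustrate what a custom embedding has to provide to a word-swap transformation or an embedding-distance constraint, here is a standalone toy sketch. It does not use or subclass the TextAttack classes; ToyWordEmbedding and its method names are hypothetical stand-ins, and the real interface of AbstractWordEmbedding should be taken from the TextAttack documentation.

```python
import math


class ToyWordEmbedding:
    """Hypothetical stand-in for a custom embedding; not a TextAttack class."""

    def __init__(self, vectors):
        # vectors: dict mapping each word to a list of floats
        self._vectors = vectors

    def __getitem__(self, word):
        # Return the vector for a word, or None if the word is unknown.
        return self._vectors.get(word)

    def get_cos_sim(self, a, b):
        # Cosine similarity between the vectors of two known words.
        va, vb = self._vectors[a], self._vectors[b]
        dot = sum(x * y for x, y in zip(va, vb))
        norm_a = math.sqrt(sum(x * x for x in va))
        norm_b = math.sqrt(sum(y * y for y in vb))
        return dot / (norm_a * norm_b)

    def nearest_neighbours(self, word, topn):
        # Rank every other word by cosine similarity to `word`.
        others = [w for w in self._vectors if w != word]
        others.sort(key=lambda w: self.get_cos_sim(word, w), reverse=True)
        return others[:topn]


# Tiny made-up vocabulary with 2-dimensional vectors.
embedding = ToyWordEmbedding(
    {"good": [1.0, 0.1], "great": [0.9, 0.2], "bad": [-1.0, 0.1]}
)
closest = embedding.nearest_neighbours("good", topn=1)
```

A synonym-swap transformation needs the neighbour lookup to propose replacements, while a distance constraint needs the similarity score to reject replacements that drift too far.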
Bug Fixes and Changes
- We fixed a bug that caused TextAttack to report a lower average number of queries than it should (#350, thanks @a1noack).
- Update the dataset split used to evaluate robustness during adversarial training (#361, thanks @Opdoop).
- Updated default parameters for TextBugger recipe (#373)
- Fixed an issue with TextBugger by updating the default method used to segment text into words to work with homoglyphs. (#376, thanks @lethaiq!)
- Updated ModelWrapper to not require get_grad method to be defined. (#381)
- Fixed an issue with WordSwapMaskedLM that was causing words with lowest probability to be picked first. (#396)