This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Releases: microsoft/nlp-recipes

Text Summarization Models

30 Mar 15:09
21a6e09

Text Summarization

In this release, we support both abstractive and extractive text summarization.

New Model: UniLM

UniLM is a state-of-the-art model developed by Microsoft Research Asia (MSRA). It is pre-trained on a large unlabeled natural-language corpus (English Wikipedia and BookCorpus) and can be fine-tuned on labeled data for NLP tasks such as text classification and abstractive summarization.

Supported Models

  • unilm-large-cased
  • unilm-base-cased

For more info about UniLM, please refer to the following:

Thanks to the UniLM team (Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, and Hsiao-Wuen Hon) for their great work and their support for the integration.

New Model: BERTSum

BERTSum is an encoder architecture designed for text summarization. It can be used together with different decoders to support both extractive and abstractive summarization.
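To make the extractive case concrete, here is a minimal sketch of the sentence-selection step only. It is not the BERTSum implementation or this repository's API: the function name `select_top_sentences`, the sample document, and the scores are all hypothetical, with the scores standing in for whatever per-sentence relevance values a trained model would produce.

```python
from typing import List, Sequence


def select_top_sentences(
    sentences: Sequence[str], scores: Sequence[float], k: int = 2
) -> List[str]:
    """Return the k highest-scoring sentences, kept in document order."""
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    # Rank sentence indices by score, highest first.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Keep the top k, then restore document order so the summary reads naturally.
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


document = [
    "The storm reached the coast on Monday.",
    "Local cafes stayed open.",
    "Officials ordered evacuations for 10,000 residents.",
    "The weather had been mild all week.",
]
# Hypothetical per-sentence scores standing in for model output.
scores = [0.91, 0.12, 0.87, 0.20]

summary = select_top_sentences(document, scores, k=2)
print(" ".join(summary))
```

An extractive summary copies sentences verbatim from the source, as above, whereas an abstractive decoder generates new text token by token.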

Supported Models

  • bert-base-uncased (extractive and abstractive)
  • distilbert-base-uncased (extractive)

Thanks to the original authors Yang Liu and Mirella Lapata for their great contribution.

All model implementations support distributed training and multi-GPU inference. For abstractive summarization, we also support mixed-precision training and inference.

NLP 2020.01

25 Jan 15:48
fcd62f0
Merge pull request #540 from microsoft/staging

Staging to Master

NLP 2019.12

04 Dec 15:40
621ed95
Pre-release

This release integrates the Hugging Face Transformers library.

NLP 2019.10

04 Oct 16:53
Pre-release
Staging to master to add the latest fixes (#503)

* update mlflow version to match the other azureml versions

* Update generate_conda_file.py

* added temporary

* doc: update github url references

* docs: update nlp recipes references

* Minor bug fix for text classification of multi languages notebook

* remove bert and xlnet notebooks

* remove obsolete tests and links

* Add missing tmp directories.

* fix import error and max_nodes for the cluster

* Minor edits.

* Attempt to fix test device error.

* Temporarily pin transformers version

* Remove gpu tags temporarily

* Test whether device error also occurs for SequenceClassifier.

* Revert temporary changes.

* Revert temporary changes.

Initial release

19 Sep 19:00
8fb28e0
Pre-release
Merge pull request #413 from microsoft/staging

Staging