This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Commit

Staging to master to add the latest fixes (#503)
* update mlflow version to match the other azureml versions

* Update generate_conda_file.py

* added temporary

* doc: update github url references

* docs: update nlp recipes references

* Minor bug fix for the multi-language text classification notebook

* remove bert and xlnet notebooks

* remove obsolete tests and links

* Add missing tmp directories.

* fix import error and max_nodes for the cluster

* Minor edits.

* Attempt to fix test device error.

* Temporarily pin transformers version

* Remove gpu tags temporarily

* Test whether device error also occurs for SequenceClassifier.

* Revert temporary changes.

* Revert temporary changes.
miguelgfierro authored and saidbleik committed Nov 30, 2019
1 parent 967abcd commit ed04438
Showing 22 changed files with 125 additions and 3,199 deletions.

README.md: 2 additions & 0 deletions

@@ -85,6 +85,8 @@ The following is a list of related repositories that we like and think are useful
|[AzureML-BERT](https://github.com/Microsoft/AzureML-BERT)|End-to-end recipes for pre-training and fine-tuning BERT using Azure Machine Learning service.|
|[MASS](https://github.com/microsoft/MASS)|MASS: Masked Sequence to Sequence Pre-training for Language Generation.|
|[MT-DNN](https://github.com/namisan/mt-dnn)|Multi-Task Deep Neural Networks for Natural Language Understanding.|
+|[UniLM](https://github.com/microsoft/unilm)|Unified Language Model Pre-training.|



## Build Status
SETUP.md: 5 additions & 5 deletions

@@ -47,9 +47,9 @@ You can learn how to create a Notebook VM [here](https://docs.microsoft.com/en-u
We provide a script, [generate_conda_file.py](tools/generate_conda_file.py), that generates a conda environment YAML file you can use to create the target environment with Python 3.6 and all the correct dependencies.

-Assuming the repo is cloned as `nlp` in the system, to install **a default (Python CPU) environment**:
+Assuming the repo is cloned as `nlp-recipes` in the system, to install **a default (Python CPU) environment**:

-cd nlp
+cd nlp-recipes
python tools/generate_conda_file.py
conda env create -f nlp_cpu.yaml
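As a usage note beyond the diff itself, the newly created environment would then typically be activated with `conda activate nlp_cpu` (or `conda activate nlp_gpu` for the GPU variant below) before running any notebooks.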

@@ -62,7 +62,7 @@ Click on the following menus to see how to install the Python GPU environment:

Assuming that you have a GPU machine, to install the Python GPU environment, which by default installs the CPU environment:

-cd nlp
+cd nlp-recipes
python tools/generate_conda_file.py --gpu
conda env create -n nlp_gpu -f nlp_gpu.yaml

@@ -79,7 +79,7 @@ Assuming that you have an Azure GPU DSVM machine, here are the steps to setup th

2. Install the GPU environment.

-cd nlp
+cd nlp-recipes
python tools/generate_conda_file.py --gpu
conda env create -n nlp_gpu -f nlp_gpu.yaml

@@ -110,7 +110,7 @@ Running the command tells pip to install the `utils_nlp` package from source in

> It is also possible to install directly from GitHub, which is the best way to utilize the `utils_nlp` package in external projects (while still reflecting updates to the source, as it is installed as an editable `-e` package).
-> `pip install -e git+git@github.com:microsoft/nlp.git@master#egg=utils_nlp`
+> `pip install -e git+git@github.com:microsoft/nlp-recipes.git@master#egg=utils_nlp`
Either command above makes `utils_nlp` available in your conda virtual environment. You can verify it was properly installed by running:

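The verification snippet itself is truncated in this view. As a minimal sketch of such a check (assuming only that the editable install succeeded; the original command is not shown here), run from a Python prompt:

    import utils_nlp                  # should import without error
    print(utils_nlp.__file__)         # for an editable install, this resolves inside the cloned repo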
docs/source/conf.py: 1 addition & 1 deletion
@@ -34,7 +34,7 @@
# The full version, including alpha/beta/rc tags
release = VERSION

prefix = "NLP"
prefix = "NLPRecipes"

# -- General configuration ---------------------------------------------------

docs/source/index.rst: 2 additions & 2 deletions
@@ -2,9 +2,9 @@
NLP Utilities
===================================================

-The `NLP repository <https://github.com/Microsoft/NLP>`_ provides examples and best practices for building NLP systems, provided as Jupyter notebooks.
+The `NLP repository <https://github.com/microsoft/nlp-recipes>`_ provides examples and best practices for building NLP systems, provided as Jupyter notebooks.

-The module `utils_nlp <https://github.com/microsoft/nlp/tree/master/utils_nlp>`_ contains functions to simplify common tasks used when developing and
+The module `utils_nlp <https://github.com/microsoft/nlp-recipes/tree/master/utils_nlp>`_ contains functions to simplify common tasks used when developing and
evaluating NLP systems.

.. toctree::
examples/entailment/entailment_xnli_bert_azureml.ipynb: 4 additions & 4 deletions

@@ -45,7 +45,7 @@
"from azureml.core.runconfig import MpiConfiguration\n",
"from azureml.core import Experiment\n",
"from azureml.widgets import RunDetails\n",
"from azureml.core.compute import ComputeTarget\n",
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
"from azureml.exceptions import ComputeTargetException\n",
"from utils_nlp.azureml.azureml_utils import get_or_create_workspace, get_output_files"
]
@@ -169,7 +169,7 @@
"except ComputeTargetException:\n",
" print(\"Creating new compute target: {}\".format(cluster_name))\n",
" compute_config = AmlCompute.provisioning_configuration(\n",
" vm_size=\"STANDARD_NC6\", max_nodes=1\n",
" vm_size=\"STANDARD_NC6\", max_nodes=NODE_COUNT\n",
" )\n",
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
" compute_target.wait_for_completion(show_output=True)\n",
@@ -524,9 +524,9 @@
"metadata": {
"celltoolbar": "Tags",
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python (nlp_gpu_transformer_bug_bash)",
"language": "python",
"name": "python3"
"name": "nlp_gpu_transformer_bug_bash"
},
"language_info": {
"codemirror_mode": {
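As background (not part of the diff), the kernelspec's `name` field refers to a Jupyter kernel registered on the machine; such a kernel is typically created from an activated conda environment with `python -m ipykernel install --user --name nlp_gpu_transformer_bug_bash --display-name "Python (nlp_gpu_transformer_bug_bash)"`.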
@@ -175,7 +175,7 @@
"metadata": {},
"source": [
"This step downloads the pre-trained [AllenNLP](https://allennlp.org/models) pretrained model and registers the model in our Workspace. The pre-trained AllenNLP model we use is called Bidirectional Attention Flow for Machine Comprehension ([BiDAF](https://www.semanticscholar.org/paper/Bidirectional-Attention-Flow-for-Machine-Seo-Kembhavi/007ab5528b3bd310a80d553cccad4b78dc496b02\n",
")) It achieved state-of-the-art performance on the [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) dataset in 2017 and is a well-respected, performant baseline for QA. AllenNLP's pre-trained BIDAF model is trained on the SQuAD training set and achieves an EM score of 68.3 on the SQuAD development set. See the [BIDAF deep dive notebook](https://github.com/microsoft/nlp/examples/question_answering/bidaf_deep_dive.ipynb\n",
")) It achieved state-of-the-art performance on the [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) dataset in 2017 and is a well-respected, performant baseline for QA. AllenNLP's pre-trained BIDAF model is trained on the SQuAD training set and achieves an EM score of 68.3 on the SQuAD development set. See the [BIDAF deep dive notebook](https://github.com/microsoft/nlp-recipes/examples/question_answering/bidaf_deep_dive.ipynb\n",
") for more information on this algorithm and AllenNLP implementation."
]
},
examples/text_classification/README.md: 0 additions & 3 deletions

@@ -19,8 +19,5 @@ The following summarizes each notebook for Text Classification. Each notebook pr
|Notebook|Environment|Description|Dataset|
|---|---|---|---|
|[BERT for text classification on AzureML](tc_bert_azureml.ipynb) |Azure ML|A notebook which walks through fine-tuning and evaluating pre-trained BERT model on a distributed setup with AzureML. |[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
-|[XLNet for text classification with MNLI](tc_mnli_xlnet.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained XLNet model on a subset of the MultiNLI dataset|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
-|[BERT for text classification of Hindi BBC News](tc_bbc_bert_hi.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained BERT model on Hindi BBC news data|[BBC Hindi News](https://github.com/NirantK/hindi2vec/releases/tag/bbc-hindi-v0.1)|
-|[BERT for text classification of Arabic News](tc_dac_bert_ar.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained BERT model on Arabic news articles|[DAC](https://data.mendeley.com/datasets/v524p5dhpj/2)|
|[Text Classification of MultiNLI Sentences using Multiple Transformer Models](tc_mnli_transformers.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a number of pre-trained transformer models|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
|[Text Classification of Multi Language Datasets using Transformer Model](tc_multi_languages_transformers.ipynb)|Local|A notebook which walks through fine-tuning and evaluating a pre-trained transformer model for multiple datasets in different languages|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/) <br> [BBC Hindi News](https://github.com/NirantK/hindi2vec/releases/tag/bbc-hindi-v0.1) <br> [DAC](https://data.mendeley.com/datasets/v524p5dhpj/2)
