Models and utilities for Arabic NLP.
The code is organized as a Python module. We recommend using Docker for development.
Please note that if you need to install additional dependencies, you'll need to alter `requirements.txt` and rebuild the Docker image:

```bash
docker build -f Dockerfile -t "parsertongue/arabic-nlp:latest" .
```
If you're only using the existing dependencies, you can simply download the published Docker image:

```bash
docker pull "parsertongue/arabic-nlp:latest"
```
To run code interactively in the IPython interpreter:

```bash
docker run -it -v $PWD:/app parsertongue/arabic-nlp:latest ipython
```
To run code interactively in a Jupyter notebook, run the following command and open your browser to localhost:8889:

```bash
docker run -it -v $PWD:/app -p 8889:9999 parsertongue/arabic-nlp:latest launch-notebook
```
Tests are written by extending the `TestCase` class from the `unittest` module in the Python standard library. All tests can be found in the `tests` directory.
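A new test module dropped into the `tests` directory might look like the sketch below; the whitespace tokenizer is a stand-in written solely to keep the example self-contained and is not part of this package:

```python
import unittest


def whitespace_tokenize(text: str) -> list:
    """Stand-in tokenizer used only to keep this example self-contained."""
    return text.split()


class TestWhitespaceTokenize(unittest.TestCase):
    """Example test written by extending unittest.TestCase."""

    def test_returns_expected_tokens(self):
        tokens = whitespace_tokenize("مرحبا بالعالم")
        self.assertEqual(tokens, ["مرحبا", "بالعالم"])

    def test_empty_string_yields_no_tokens(self):
        self.assertEqual(whitespace_tokenize(""), [])


if __name__ == "__main__":
    unittest.main()
```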
All tests can be run using the following command:

```bash
docker run -it -v $PWD:/app "parsertongue/arabic-nlp:latest" test-all
```
To run just the unit tests, run the following command:

```bash
docker run -it -v $PWD:/app "parsertongue/arabic-nlp:latest" green -vvv
```
The code makes use of Python type hints. To perform type checking, run the following command:

```bash
docker run -it -v $PWD:/app "parsertongue/arabic-nlp:latest" mypy --ignore-missing-imports --follow-imports=skip --strict-optional /app
```
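For reference, the kind of annotations mypy checks under these flags looks like the sketch below; the functions are illustrative and not taken from this codebase:

```python
from typing import List, Optional


def normalize_token(token: str) -> Optional[str]:
    """Illustrative annotated function (not from this codebase).

    mypy flags callers that pass non-string arguments or that ignore
    the possibility of a None return (--strict-optional).
    """
    stripped = token.strip()
    return stripped if stripped else None


def normalize_all(tokens: List[str]) -> List[str]:
    """Normalize a list of tokens, dropping any that normalize to None."""
    return [t for t in (normalize_token(tok) for tok in tokens) if t is not None]
```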
The snippet below will run an experiment using toy data:

```bash
docker run -it -v $PWD:/app parsertongue/arabic-nlp:latest cdd-base-model --config /app/tests/toy-data/test-experiment-config.yml
```
To build a Singularity image from a published Docker image, run the following command:

```bash
singularity pull cdd.sif docker://parsertongue/arabic-nlp:latest
```
## HPC
To test the image on the UA HPC with GPU acceleration, first request an interactive session:
```bash
qsub -I \
-N interactive-gpu \
-W group_list=mygroupnamehere \
-q standard \
-l select=1:ncpus=2:mem=16gb:ngpus=1 \
-l cput=3:0:0 \
-l walltime=1:0:0
```
The following modules are necessary:

```bash
module load singularity
module load cuda11
```
NOTE: not all clusters have the `cuda11` module installed.
Then launch a shell inside the Singularity image:

```bash
singularity shell --nv --no-home /path/to/your.sif
```
To check that the GPU is found, run the following command:

```bash
nvidia-smi
```
Finally, confirm that PyTorch can locate the GPU:

```bash
python -c "import torch;print(f'CUDA?:\t{torch.cuda.is_available()}');print(f'GPU:\t{torch.cuda.get_device_name(0)}')"
```
## Building a Singularity image locally

NOTE: These instructions assume you've already installed Singularity (>= v3.7.0) locally on a Linux system.
First, we'll build a Docker image:

```bash
docker build -f Dockerfile -t "parsertongue/arabic-nlp:latest" .
```
Next, we'll use the `docker-daemon` bootstrap agent to build a Singularity image from this Docker image:

```bash
sudo singularity build arabic-nlp.sif docker-daemon://parsertongue/arabic-nlp:latest
```
Finally, we'll test that the image works:

```bash
singularity exec --nv arabic-nlp.sif train-arabert --help
```