Penta GPU training scripts on Antarctic Research Centre servers

Wei Ji edited this page Jul 30, 2020 · 1 revision

Step 1 - Activate the conda environment and pipenv shell

need cuda && cd ~/Documents/github/deepbedmap/ && conda activate deepbedmap && pipenv shell
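Before launching long training runs, it can help to confirm that both environments are actually active. A minimal sketch, assuming conda exports CONDA_DEFAULT_ENV and pipenv exports PIPENV_ACTIVE=1 inside its shell; the check_env helper is hypothetical and not part of the deepbedmap repository:

```shell
# Hypothetical pre-flight check (not part of deepbedmap).
# Returns 0 only if the 'deepbedmap' conda env and a pipenv shell are active.
check_env() {
    rc=0
    if [ "${CONDA_DEFAULT_ENV:-}" != "deepbedmap" ]; then
        echo "conda env 'deepbedmap' is not active" >&2
        rc=1
    fi
    if [ "${PIPENV_ACTIVE:-}" != "1" ]; then
        echo "pipenv shell is not active" >&2
        rc=1
    fi
    return "$rc"
}

# Usage: print a reminder instead of starting jobs in the wrong environment.
check_env || echo "Run Step 1 first"
```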

Step 2 - Hyperparameter training frenzy

# On Tara (2 x Tesla V100s)
CUDA_VISIBLE_DEVICES=0 jupyter nbconvert --ExecutePreprocessor.timeout=None --execute srgan_train.ipynb --to notebook --output model/logs/srgan_train_device0.ipynb &
CUDA_VISIBLE_DEVICES=1 jupyter nbconvert --ExecutePreprocessor.timeout=None --execute srgan_train.ipynb --to notebook --output model/logs/srgan_train_device1.ipynb &

# On Kahutea (2 x Tesla P100s)
CUDA_VISIBLE_DEVICES=0 jupyter nbconvert --ExecutePreprocessor.timeout=None --execute srgan_train.ipynb --to notebook --output model/logs/srgan_train_device2.ipynb &
CUDA_VISIBLE_DEVICES=1 jupyter nbconvert --ExecutePreprocessor.timeout=None --execute srgan_train.ipynb --to notebook --output model/logs/srgan_train_device3.ipynb &

# On SGEES001 (1 x Tesla V100)
CUDA_VISIBLE_DEVICES=0 jupyter nbconvert --ExecutePreprocessor.timeout=None --execute srgan_train.ipynb --to notebook --output model/logs/srgan_train_device4.ipynb &
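The per-server commands above differ only in the GPU index and the log file number, so they could be generated by a small loop. A rough sketch, assuming the same notebook and log paths as above; the launch_jobs helper and the DRY_RUN switch are hypothetical. By default it only prints the commands, so nothing starts by accident:

```shell
# Hypothetical helper: one nbconvert training job per local GPU.
# Usage: launch_jobs NUM_GPUS FIRST_LOG_INDEX
# With DRY_RUN=1 (the default) the commands are printed, not executed.
launch_jobs() {
    lj_num_gpus="$1"
    lj_first_index="$2"
    lj_gpu=0
    while [ "$lj_gpu" -lt "$lj_num_gpus" ]; do
        lj_log="model/logs/srgan_train_device$((lj_first_index + lj_gpu)).ipynb"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "CUDA_VISIBLE_DEVICES=${lj_gpu} jupyter nbconvert --ExecutePreprocessor.timeout=None --execute srgan_train.ipynb --to notebook --output ${lj_log} &"
        else
            # Each job sees only its own GPU and runs in the background.
            CUDA_VISIBLE_DEVICES="${lj_gpu}" jupyter nbconvert \
                --ExecutePreprocessor.timeout=None \
                --execute srgan_train.ipynb \
                --to notebook --output "${lj_log}" &
        fi
        lj_gpu=$((lj_gpu + 1))
    done
}

# Example: Kahutea has 2 GPUs and writes logs 2 and 3.
launch_jobs 2 2
```

Running it with DRY_RUN=0 would start the jobs for real; the first log index has to be chosen per server so that log files from different machines do not collide.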

Step 3 - Monitoring

watch -n 0.5 nvidia-smi
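For scripted checks, nvidia-smi can also emit machine-readable CSV, which makes it easy to spot a GPU whose training run has finished. A minimal sketch; the idle_gpus helper and the 10% threshold are assumptions for illustration, and low utilization is only a rough hint that a job has ended:

```shell
# Hypothetical helper: print indices of GPUs below a utilization threshold.
# Reads lines like "0, 97" (GPU index, utilization in percent) on stdin.
# Usage: ... | idle_gpus [THRESHOLD_PERCENT]   (default threshold: 10)
idle_gpus() {
    awk -F', *' -v limit="${1:-10}" '$2 + 0 < limit { print $1 }'
}

# Query the GPUs only if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,utilization.gpu \
        --format=csv,noheader,nounits | idle_gpus 10
fi
```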

See also full experiment logs at https://www.comet.ml/weiji14/deepbedmap