Code for the paper: Mehta, S. V., Patil, D., Chandar, S., & Strubell, E. (2023). An Empirical Investigation of the Role of Pre-training in Lifelong Learning. Journal of Machine Learning Research, 24(214):1-50.
Requirements: Python 3.6, PyTorch 1.7.0, transformers 2.9.0
Conda can be used to set up a virtual environment with Python 3.6 in which you can sandbox dependencies required for our implementation:
- Create a Conda environment with Python 3.6:
conda create -n lll python=3.6
- Activate the Conda environment. (You will need to activate the Conda environment in each terminal in which you want to run our implementation.)
conda activate lll
- Visit http://pytorch.org/ and install the PyTorch 1.7.0 package for your system:
conda install pytorch==1.7.0 cudatoolkit=11.0 -c pytorch
- Install the other requirements:
pip install -r requirements.txt
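Optionally, verify that the environment picked up the expected versions (a quick sanity check, not part of the original setup steps):
python -c "import torch, transformers; print(torch.__version__, transformers.__version__)"
# should print 1.7.0 and 2.9.0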
That's it! You're now ready to reproduce our results.
- First create the data directory:
mkdir data
- To download the data for Split CIFAR-100 and Split CIFAR-50 experiments, run:
./scripts/download_vision_data.sh cifar100
- To download data for 5-dataset experiments, run:
./scripts/download_vision_data.sh 5data
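Both downloads can also be scripted in one pass; a minimal sketch, assuming the script takes exactly one dataset name per invocation as shown above:
for name in cifar100 5data; do
  ./scripts/download_vision_data.sh "$name"   # assumes data lands in the ./data directory created above
done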
To run the vision experiments and create the necessary model checkpoints for random initialization, run:
./scripts/run_vision.sh \
{DATASET} \
{METHOD} \
./data \
./output/{DATASET}/{METHOD}/random/run_1 \
1 \
random \
5 \
{CUDA_DEVICE} \
rndminit_ckpt \
no_sam
where {DATASET} is one of "5data", "cifar50", "cifar100", and {METHOD} is one of "sgd", "er", "ewc".
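As one concrete instantiation of the template above, running ER on Split CIFAR-100 with random initialization on GPU 0 would be:
./scripts/run_vision.sh \
cifar100 \
er \
./data \
./output/cifar100/er/random/run_1 \
1 \
random \
5 \
0 \
rndminit_ckpt \
no_sam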
Similarly, to run and create the necessary model checkpoints for pre-trained initialization, run:
./scripts/run_vision.sh \
{DATASET} \
{METHOD} \
./data \
./output/{DATASET}/{METHOD}/pt/run_1 \
1 \
pt \
5 \
{CUDA_DEVICE} \
imagenetinit_ckpt \
no_sam
where {DATASET} is one of "5data", "cifar50", "cifar100", and {METHOD} is one of "sgd", "er", "ewc".
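Since we report results averaged over 5 random seeds, it can be convenient to wrap this in a loop. A sketch for the pre-trained setting, assuming the fifth positional argument (the "1" in the template above) is the seed/run index; please check scripts/run_vision.sh if that assumption does not hold:
for seed in 1 2 3 4 5; do
  ./scripts/run_vision.sh \
    cifar100 \
    er \
    ./data \
    ./output/cifar100/er/pt/run_${seed} \
    ${seed} \
    pt \
    5 \
    0 \
    imagenetinit_ckpt \
    no_sam
done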
The above run commands will create a folder called output with all of the relevant data for the run as well as the model checkpoints. In our experiments, we run this with 5 different random seeds. The data in Table 5 for the vision experiments is generated based on the log.json files in each run folder.
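For example, to confirm that every seed run for one configuration produced a log before aggregating (paths follow the output-directory convention used in the run commands above):
ls ./output/cifar100/er/random/run_*/log.json ./output/cifar100/er/pt/run_*/log.json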
To prepare for the sharpness analysis, first create the results folders:
mkdir -p results/analysis/sharpness/{DATASET}/random
mkdir -p results/analysis/sharpness/{DATASET}/pt
for each dataset of interest (5data, cifar50, cifar100).
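Equivalently, all of these folders can be created in one go:
for d in 5data cifar50 cifar100; do
  mkdir -p results/analysis/sharpness/${d}/random results/analysis/sharpness/${d}/pt
done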
To run the sharpness analysis, use the following command:
./scripts/run_vision_analysis.sh \
{DATASET} \
./data \
./output/{DATASET}/random/run_1 \
./results/analysis/sharpness/{DATASET}/random/run_1.json \
sharpness \
0
Run a similar command for the pre-trained models.
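For example, for Split CIFAR-100 the pre-trained counterpart directly mirrors the template above (only the random/pt components of the paths change):
./scripts/run_vision_analysis.sh \
cifar100 \
./data \
./output/cifar100/pt/run_1 \
./results/analysis/sharpness/cifar100/pt/run_1.json \
sharpness \
0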
If you use our code in your research, please cite the paper "An Empirical Investigation of the Role of Pre-training in Lifelong Learning":
@article{JMLR:v24:22-0496,
author = {Sanket Vaibhav Mehta and Darshan Patil and Sarath Chandar and Emma Strubell},
title = {An Empirical Investigation of the Role of Pre-training in Lifelong Learning},
journal = {Journal of Machine Learning Research},
year = {2023},
volume = {24},
number = {214},
pages = {1--50},
url = {http://jmlr.org/papers/v24/22-0496.html}
}