Prompt Generation Networks for Input-Space Adaptation of Frozen Vision Transformers. Jochem Loedeman, Maarten C. Stol, Tengda Han, Yuki M. Asano. Tech Report. 2022

jochemloedeman/PGN

Prompt Generation Networks for Input-Space Adaptation of Frozen Vision Transformers

This repository is the official implementation of the BMVC 2024 paper Prompt Generation Networks for Input-Space Adaptation of Frozen Vision Transformers by Jochem Loedeman, Maarten Stol, Tengda Han, and Yuki M. Asano.

Requirements

To install the Python dependencies, make sure that Poetry is installed and run the following in the project root directory:

poetry install

Data

See data/README.md

DINO

Download the full checkpoint for DINO ViT-S/16 from here and insert it as pgn/pgn_models/dino/dino_deitsmall16_pretrain_full_checkpoint.pth.
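
As a quick check that the checkpoint landed where the instructions above say it should, a sketch like the following can be run from the project root. The path is the one given above; the variable name DINO_CKPT is only illustrative and not part of the repository.

```shell
# Destination path taken from the instructions above (run from the project root).
DINO_CKPT="pgn/pgn_models/dino/dino_deitsmall16_pretrain_full_checkpoint.pth"

# Make sure the target directory exists before moving the download into it.
mkdir -p "$(dirname "$DINO_CKPT")"

# Report whether the checkpoint file is in place.
if [ -f "$DINO_CKPT" ]; then
    echo "found: $DINO_CKPT"
else
    echo "missing: $DINO_CKPT" >&2
fi
```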

Training

To train/test with the CLIP backbone, run

poetry run train_clip
poetry run test_clip

To train/test with either DINO or supervised ViT, specify the backbone with --vision_model_type and run

poetry run train_visionmodel
poetry run test_visionmodel
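
To make the flag usage concrete, here is a minimal sketch of a helper that trains and then tests with one backbone. The function name train_and_test_visionmodel is hypothetical, and the sketch assumes both scripts accept --vision_model_type as described above; the accepted values are not listed here, so check pgn/scripts before choosing one.

```shell
# Hypothetical helper, not part of the repository.
# $1: value for --vision_model_type (see pgn/scripts for the accepted choices)
train_and_test_visionmodel() {
    backbone="${1:-}"
    if [ -z "$backbone" ]; then
        echo "usage: train_and_test_visionmodel <vision_model_type>" >&2
        return 1
    fi
    # Run the test script only if training finished successfully.
    poetry run train_visionmodel --vision_model_type "$backbone" &&
        poetry run test_visionmodel --vision_model_type "$backbone"
}
```

The helper would be called with a single argument, the backbone identifier, and returns a non-zero status if no argument is given or if either script fails.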

For all available command line arguments, see pgn/scripts.

Pretrained PGNs

Pretrained PGNs are supplied in pretrained_pgns/. To use one with this repository, pass its path via the --pgn_path argument of the test scripts.
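
A minimal sketch of such an invocation is shown below. The wrapper name test_with_pretrained_pgn is hypothetical, and the file to pass is whichever one you pick from pretrained_pgns/; no specific file name is assumed here. Other arguments the test scripts may need are documented in pgn/scripts.

```shell
# Hypothetical wrapper, not part of the repository.
# $1: test script to run (test_clip or test_visionmodel)
# $2: path to a pretrained PGN file inside pretrained_pgns/
test_with_pretrained_pgn() {
    script="${1:-}"
    pgn_path="${2:-}"
    if [ -z "$script" ] || [ ! -e "$pgn_path" ]; then
        echo "usage: test_with_pretrained_pgn <test_script> <existing_pgn_path>" >&2
        return 1
    fi
    poetry run "$script" --pgn_path "$pgn_path"
}
```

Checking that the path exists before launching avoids starting the test script with a mistyped file name.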

Reference

If you find this repository useful for your project, please consider citing our paper:

@article{Loedeman2022prompt,
    author       = "Jochem Loedeman and Maarten Stol and Tengda Han and Yuki M Asano",
    title        = "Prompt Generation Networks for Input-based Adaptation of Frozen Vision Transformers",
    journal      = "arXiv preprint arXiv:2210.06466",
    year         = "2022",
}
