nashory/rtic-gcn-pytorch

RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network

This is the official code of RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network. The code only supports training and evaluation on FashionIQ. We release the implementations for the other baselines together.

Updates

  • (2021.10.26) Updated model checkpoints, training configs, and TensorBoard logs.
  • (2021.09.10) The official code is released.

Requirements

Prepare your environment with virtualenv.

python3 -m virtualenv --python=python3 venv # create virtualenv.
. venv/bin/activate # activate environment.
pip3 install -r requirements.txt # install required packages.

Download Data

We provide a script for downloading the FashionIQ images. Note that it cannot guarantee that every image is downloaded, because some of the source URLs are broken.

sh script/download_fiq.sh
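
Because some URLs are broken, it can help to check which images actually arrived before training. The following is a minimal sketch, not part of this repository: the function name `find_missing_images`, the `<id>.jpg` file naming, and the directory layout are all assumptions, so adapt them to wherever the download script stores its files.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_missing_images(image_dir: Path, expected_ids: Iterable[str], ext: str = ".jpg") -> List[str]:
    """Return the ids whose image file is absent or empty in image_dir."""
    missing = []
    for image_id in expected_ids:
        path = image_dir / f"{image_id}{ext}"
        # A zero-byte file usually means the download failed partway.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(image_id)
    return missing


# Self-contained demo: a temporary directory stands in for the download folder.
with tempfile.TemporaryDirectory() as tmp:
    image_dir = Path(tmp)
    (image_dir / "a.jpg").write_bytes(b"\xff\xd8 placeholder bytes")
    (image_dir / "b.jpg").write_bytes(b"")  # simulates a failed download
    missing = find_missing_images(image_dir, ["a", "b", "c"])
    print(missing)
```

In real use, `expected_ids` would come from the dataset's annotation files, and the returned list tells you which queries or targets to drop or re-download.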

Model Zoo

We provide pretrained checkpoints for RTIC / RTIC-GCN trained on FashionIQ.

Model                 Recall   Checkpoint   Config   Training Log
RTIC                  39.22    ckpt         config   tensorboard_log
RTIC-GCN (scratch)    39.55    ckpt         config   tensorboard_log
RTIC-GCN (finetune)   40.64    ckpt         config   tensorboard_log

Benchmark Score on FashionIQ Dataset

Method                          (R@10 + R@50) / 2   Paper
JVSM                            19.26               pdf
TRACE w/ BERT                   34.38               pdf
VAL w/ GloVe                    35.38               pdf
CIRPLANT w/ OSCAR               30.20               pdf
MAAF                            36.60               pdf
CurlingNet                      38.45               pdf
CoSMo                           39.45               pdf
RTIC w/ GloVe                   39.22               -
RTIC-GCN w/ GloVe (scratch)     39.55               -
RTIC-GCN w/ GloVe (fine-tune)   40.64               -
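
The metric above is the mean of Recall@10 and Recall@50. As a rough illustration of how such a score can be computed from ranked retrieval results, here is a minimal sketch. It assumes each query has exactly one correct target, and the function names `recall_at_k` and `average_recall` are hypothetical, not taken from this repository's evaluation code.

```python
from typing import Sequence


def recall_at_k(ranked_ids: Sequence[Sequence[str]], target_ids: Sequence[str], k: int) -> float:
    """Percentage of queries whose target appears among the top-k ranked results."""
    if not target_ids or len(ranked_ids) != len(target_ids):
        raise ValueError("ranked_ids and target_ids must be non-empty and the same length")
    hits = sum(target in ranked[:k] for ranked, target in zip(ranked_ids, target_ids))
    return 100.0 * hits / len(target_ids)


def average_recall(ranked_ids, target_ids, ks=(10, 50)) -> float:
    """Mean of Recall@k over the given cutoffs, e.g. (R@10 + R@50) / 2."""
    return sum(recall_at_k(ranked_ids, target_ids, k) for k in ks) / len(ks)


# Toy data (hypothetical): 2 queries over a gallery of 60 items.
gallery = [f"img{i}" for i in range(60)]
ranked = [gallery, gallery]        # both queries return the same ranking
targets = ["img5", "img30"]        # found at 1-based ranks 6 and 31

r10 = recall_at_k(ranked, targets, 10)   # only img5 is within the top 10
r50 = recall_at_k(ranked, targets, 50)   # both targets are within the top 50
score = average_recall(ranked, targets)
print(r10, r50, score)
```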

Quick Start

We provide sample training scripts for running different configurations. The default configuration is stored in cfg/default.yaml and corresponds to the "unified environment" in our paper. To try the "optimal environment", use the +optimize=<something> option.

(1) RTIC (unified env)

# Export first: a same-line "VAR=value cmd ${VAR}" expands ${VAR} before VAR is set.
export EXPR_NAME=testrun
python main.py \
    config.EXPR_NAME=${EXPR_NAME}

(2) RTIC (optimal env)

export EXPR_NAME=testrun
python main.py \
    +optimize=rtic \
    config.EXPR_NAME=${EXPR_NAME}

(3) RTIC-GCN (optimal env, scratch)

export EXPR_NAME=testrun_gcn LOAD_FROM=testrun
python main.py \
    +optimize=rtic_gcn_scratch \
    +gcn=enabled \
    config.LOAD_FROM=${LOAD_FROM} \
    config.EXPR_NAME=${EXPR_NAME}

(4) RTIC-GCN (optimal env, finetune)

export EXPR_NAME=testrun_gcn LOAD_FROM=testrun
python main.py \
    +optimize=rtic_gcn_finetune \
    +gcn=enabled \
    config.LOAD_FROM=${LOAD_FROM} \
    config.EXPR_NAME=${EXPR_NAME}

(5) Other Baselines

You can train any of the other baselines by changing config.TRAIN.MODEL.composer_model.name.

(w/o GCN)
export EXPR_NAME=testrun
python main.py \
    config.TRAIN.MODEL.composer_model.name=<any-composer-method-you-want-to-try> \
    config.EXPR_NAME=${EXPR_NAME}
(w/ GCN)
export EXPR_NAME=testrun_gcn LOAD_FROM=testrun
python main.py \
    +gcn=enabled \
    config.TRAIN.MODEL.composer_model.name=<any-composer-method-you-want-to-try> \
    config.LOAD_FROM=${LOAD_FROM} \
    config.EXPR_NAME=${EXPR_NAME}
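
The commands above change settings through dotted keys such as config.TRAIN.MODEL.composer_model.name. To make that mapping concrete, the sketch below applies a dotted override to a nested dictionary. It is plain Python for illustration only and is not this repository's config loader: `apply_override` is a hypothetical helper, and `method_a` / `method_b` are placeholder composer names.

```python
from typing import Any, Dict


def apply_override(cfg: Dict[str, Any], override: str) -> None:
    """Apply one 'a.b.c=value' override to a nested dict in place."""
    dotted_key, _, value = override.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for name in parents:
        node = node.setdefault(name, {})  # create intermediate levels if missing
    node[leaf] = value


# Placeholder config mirroring the key path used in the commands above.
cfg = {
    "config": {
        "EXPR_NAME": "default",
        "TRAIN": {"MODEL": {"composer_model": {"name": "method_a"}}},
    }
}
apply_override(cfg, "config.TRAIN.MODEL.composer_model.name=method_b")
apply_override(cfg, "config.EXPR_NAME=testrun")
print(cfg["config"]["TRAIN"]["MODEL"]["composer_model"]["name"])
print(cfg["config"]["EXPR_NAME"])
```

Each dot descends one level in the configuration tree, so a single command-line argument can reach a deeply nested setting without editing cfg/default.yaml.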

Citation

If you find this work useful for your research, please cite our paper:

@article{shin2021rtic,
  title={RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network},
  author={Shin, Minchul and Cho, Yoonjae and Ko, Byungsoo and Gu, Geonmo},
  journal={arXiv preprint arXiv:2104.03015},
  year={2021}
}

License

MIT License
