Official implementation of Multimodal Color Recommendation in Vector Graphic Documents, ACM MM 2023.
In this study, we propose a multimodal masked color model that integrates both color and textual contexts to provide text-aware color recommendation for graphic documents. Our proposed model comprises self-attention networks to capture the relationships between colors in multiple palettes, and cross-attention networks that incorporate both color and CLIP-based text representations.
This approach applies to two color recommendation tasks: color palette completion, which recommends colors based on the given colors and text, and full palette generation, which generates a complete color palette corresponding to the given text. The code for these two tasks is organized in two separate folders.
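The difference between the two tasks can be illustrated as a difference in what is masked. The following is a minimal sketch, not the repository's code: the `[MASK]` token name, the `mask_palette` helper, and the hex colors are assumptions made only for illustration.

```python
MASK = "[MASK]"  # hypothetical placeholder token, not taken from the repository


def mask_palette(colors, mask_indices):
    """Return a copy of `colors` with the given positions replaced by MASK."""
    masked = list(colors)
    for i in mask_indices:
        masked[i] = MASK
    return masked


# Illustrative palette; the values are made up.
palette = ["#1f3a5f", "#f2a65a", "#ffffff", "#2e2e2e"]

# Palette completion: some colors are given, the masked ones are to be recommended.
completion_input = mask_palette(palette, [1, 3])

# Full palette generation: every color is masked; only the text is given.
generation_input = mask_palette(palette, range(len(palette)))

print(completion_input)
print(generation_input)
```

In both cases the text accompanying the design would be supplied alongside the color sequence; the sketch only shows the color side of the input.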
This project was developed and tested in a Google Cloud Platform (GCP) notebook instance. Note: we use google-cloud-vision to detect image labels. An updated version based on open-source software (OSS) will be released soon.
- Machine type: n1-standard-4
- Environment: TensorFlow Enterprise 2.3
- GPU: 1 x NVIDIA T4
- Python: 3.7.12
- CUDA: 11.6
Install PyTorch and the requirements:
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu116
pip3 install -r requirement.txt
Task: Recommend colors based on the given colors and text
Target: multiple palettes in a graphic document
notebooks/palette_compl.ipynb: recommend colors for multiple palettes in a design
- Trained color prediction models are in data/trained_models/.
- JSON files for testing are pre-created in data/samples/crello_samples/.
- New input image files for testing are in data/samples/image_samples/.
You can also create a JSON test file from the Crello dataset with the notebook notebooks/create_json_file.ipynb.
Step1: Create the color and text embedding files for train, validation, and test with the notebook notebooks/preprocess.ipynb
- data/data_t2p/color: color corpus of the train, validation, and test datasets, and the color vocabulary from the train dataset
- data/data_t2p/text: text contents and image labels of the train, validation, and test datasets
- data/data_t2p/text/emb_clip_imagemust_seq: pre-created text embeddings of text contents and image labels for train, validation, and test
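To make the idea of a color vocabulary built from the train split more concrete, here is a minimal sketch. It is not the logic of notebooks/preprocess.ipynb: the special tokens, the `build_color_vocab` and `encode` helpers, and the color names are all hypothetical.

```python
from collections import Counter


def build_color_vocab(train_palettes, min_count=1):
    """Map each color seen in the train split to an integer id.

    The special tokens below are assumptions for illustration only.
    """
    counts = Counter(color for palette in train_palettes for color in palette)
    vocab = {"[PAD]": 0, "[MASK]": 1, "[UNK]": 2}
    # Most frequent colors first; ties are broken alphabetically for determinism.
    for color, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[color] = len(vocab)
    return vocab


def encode(palette, vocab):
    """Convert a palette to ids, sending unseen colors to [UNK]."""
    return [vocab.get(color, vocab["[UNK]"]) for color in palette]


# Toy train corpus: one list of color tokens per palette.
train = [["red", "white"], ["red", "navy"], ["white", "red"]]
vocab = build_color_vocab(train)

print(vocab)
print(encode(["red", "teal"], vocab))
```

Building the vocabulary from the train split only means that colors appearing solely in validation or test data fall back to the unknown token, as `teal` does here.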
Step2: Train a color model with the notebook notebooks/train_model.ipynb.
- data/data_colors/data_colors_labels/: extracted color palettes for Image-SVG-Text elements, text contents, and image labels from Crello-dataset-v2 (the latest Crello dataset).
  - Raw data in data/crello-dataset-v2: download and unzip the Crello dataset v2 to get the .tfrecord files for train/val/test.
  - Data filter: high-frequency image labels, English contents
- data/trained_model: trained model for text-aware color completion
- data/samples: JSON sample files for testing the results of color completion
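The data filter listed above (high-frequency image labels, English contents) could look roughly like the sketch below. This is an assumption-laden illustration, not the repository's filter: the record layout, the `min_count` threshold, and the ASCII-only check used as a stand-in for English detection are all hypothetical.

```python
from collections import Counter


def frequent_labels(records, min_count):
    """Return the set of image labels that occur at least `min_count` times."""
    counts = Counter(label for r in records for label in r["labels"])
    return {label for label, n in counts.items() if n >= min_count}


def looks_english(text):
    """Crude stand-in for language detection: accept ASCII-only text."""
    return text.isascii()


def filter_records(records, min_count=2):
    """Drop non-English records and keep only high-frequency labels."""
    keep = frequent_labels(records, min_count)  # counted over all records
    filtered = []
    for r in records:
        if not looks_english(r["text"]):
            continue
        labels = [label for label in r["labels"] if label in keep]
        filtered.append({"text": r["text"], "labels": labels})
    return filtered


# Toy records; the field names "text" and "labels" are made up.
records = [
    {"text": "Summer sale", "labels": ["beach", "sky"]},
    {"text": "夏のセール", "labels": ["beach"]},
    {"text": "Big discount", "labels": ["sky", "poster"]},
]
result = filter_records(records)
print(result)
```

A real pipeline would likely use a proper language detector instead of the ASCII check, which rejects English text containing accented characters or typographic quotes.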
Task: Generate a complete color palette corresponding to the given text
Target: A single palette for an image
Step1: Create the color and text embedding files for train, validation, and test with the notebook notebooks/preprocess.ipynb
- data/data_t2p/color: color corpus of the train, validation, and test datasets, and the color vocabulary from the train dataset
- data/data_t2p/text: text input of the train, validation, and test datasets
- data/data_t2p/text/emb_clip: pre-created text embeddings of text contents and image labels for train, validation, and test
Step2: Generate colors based on the given text with the notebook notebooks/palette_gen.ipynb
- Trained color generation models are in data/trained_models/.
Train a color model with the notebook notebooks/train_model.ipynb.
- data/data_colors/: palette-text pairs for train, validation, and test from the Palette-And-Text dataset
- data/trained_model: trained model for text-aware palette generation
@inproceedings{qiu2023multimodal,
title = {Multimodal Color Recommendation in Vector Graphic Documents},
author = {Qiu, Qianru and Wang, Xueting and Otani, Mayu},
booktitle = {Proceedings of the 31st ACM International Conference on Multimedia},
pages = {4003--4011},
year = {2023}
}