Variational Gradient Flow for Deep Generative Learning

A TensorFlow implementation of VGrow using the progressive growing method described in the following paper:

System requirements

  • We have only tested the model on Linux.
  • 64-bit Python 3.6 and TensorFlow 1.12.0.
  • To generate images at resolutions higher than 128x128, we recommend a GPU with at least 16 GB of memory.
  • NVIDIA driver 384.145 or newer, CUDA toolkit 9.0 or newer, cuDNN 7.1.2 or newer. We have tested the code on the following two configurations:
    • NVIDIA driver 384.145, CUDA V9.0.176, Tesla V100
    • NVIDIA driver 410.93, CUDA V10.0.130, RTX 2080 Ti

Results

We train VGrow-Pg models with four different f-divergences: KL-divergence, JS-divergence, Jeffreys-divergence, and our newly proposed logD-divergence. Here we show the complete progressive growing process only for the KL-divergence.

[Image grids: samples at each stage of progressive growing. MNIST, Fashion-MNIST and CIFAR-10 at resolutions 4x4, 8x8, 16x16 and 32x32; CelebA at resolutions 4x4, 8x8, 16x16, 32x32, 64x64 and 128x128.]

Other-divergence

We show the final-resolution results on every dataset for each f-divergence.

[Image grids: final-resolution samples for KL-divergence, JS-divergence, Jeffreys-divergence and logD-divergence on MNIST, Fashion-MNIST, CIFAR-10, LSUN-Bedroom, CelebA, LSUN-Church and Portrait.]
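
For reference, three of the divergences named above have standard f-divergence forms. The block below is a sketch using one common convention, in which P and Q have densities p and q; the paper may use a different argument order or scaling. The logD-divergence is defined in the paper and is not reproduced here.

```latex
\begin{align*}
D_f(P \,\|\, Q) &= \int q(x)\, f\!\left(\frac{p(x)}{q(x)}\right) dx, \\
\text{KL:}\quad f(u) &= u \log u, \\
\text{JS:}\quad f(u) &= \tfrac{1}{2}\left[\, u \log u - (u+1)\log\frac{u+1}{2} \,\right], \\
\text{Jeffreys:}\quad f(u) &= (u-1)\log u.
\end{align*}
```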

Latent space manipulation

We first generate 10,000 faces using the network trained on the CelebA dataset with the KL-divergence. We label these generated faces with the age and gender classification networks provided in https://github.com/dpressel/rude-carnie, and then find a latent direction that controls each of these semantics. For example, we fit a logistic regression for gender on the latent vectors and take the normal of its decision boundary as the direction.
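
The direction-finding step can be sketched as follows. This is a toy illustration, not the repository's code: it uses NumPy only, a small latent dimension, and synthetic labels in place of the classifier outputs. The function names `fit_logistic_direction` and `move_along` are hypothetical.

```python
import numpy as np

def fit_logistic_direction(z, labels, lr=0.1, steps=300):
    """Fit logistic regression by gradient descent and return the
    unit normal of its decision boundary."""
    n, d = z.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = np.clip(z @ w + b, -30.0, 30.0)  # avoid overflow in exp
        probs = 1.0 / (1.0 + np.exp(-logits))
        err = probs - labels                      # gradient of the log-loss w.r.t. logits
        w -= lr * (z.T @ err) / n
        b -= lr * err.mean()
    return w / np.linalg.norm(w)

def move_along(z, direction, alphas):
    """Return copies of latent vector z shifted by alpha * direction."""
    return np.stack([z + a * direction for a in alphas])

# Toy data (assumption): 2,000 latents of dimension 64 instead of
# 10,000 latents of dimension 512, to keep the example fast.
rng = np.random.RandomState(1234)
z_dim = 64
latents = rng.randn(2000, z_dim)

# Stand-in for the classifier: labels come from a hidden direction.
true_dir = rng.randn(z_dim)
true_dir /= np.linalg.norm(true_dir)
labels = (latents @ true_dir > 0).astype(np.float64)

direction = fit_logistic_direction(latents, labels)
edited = move_along(latents[0], direction, np.linspace(-3.0, 3.0, 6))
print("cosine with hidden direction:", float(direction @ true_dir))
```

In practice, each row of `edited` would be passed through the trained generator to produce one image of the manipulation sequence.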

Manipulating gender

[Six images: gender manipulation results.]

Manipulating age

[Six images: age manipulation results.]

Usage

Command

All arguments have default values. To train on the CIFAR-10 dataset, run bash cifar10.sh. Training on other datasets is similar.

Arguments in train.py

  • --gpu: Specific GPU to use. Default: 0
  • --dataset: Training dataset. Default: mnist
  • --divergence: f-divergence. Default: KL
  • --path: Output path. Default: ./results
  • --seed: Random seed. Default: 1234
  • --init_resolution: Initial resolution of images. Default: 4
  • --z_dim: Dimension of latent vector. Default: 512
  • --dur_nimg: Number of images used for a phase. Default: 600000
  • --total_nimg: Total number of images used for training. Default: 18000000
  • --pool_size: Number of batches of a pool. Default: 1
  • --T: Number of loops for moving particles. Default: 1
  • --U: Number of loops for training D. Default: 1
  • --L: Number of loops for training G. Default: 1
  • --num_row: Number of images in a line of the image grid. Default: 10
  • --num_line: Number of images in a row of the image grid. Default: 10
  • --use_gp: Whether to use the gradient penalty. Default: True
  • --coef_gp: Coefficient of gradient penalty. Default: 1
  • --target_gp: Target of gradient penalty. Default: 1
  • --coef_smoothing: Coefficient of generator moving average. Default: 0.99
  • --resume_training: Whether to resume training. Default: False
  • --resume_num: Number of images to resume from. Default: 0
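
As an illustration of how these options fit together, here is a minimal argparse sketch that mirrors the list above. It is not the actual parser in train.py: the argument types, the `str2bool` helper, and the example values `cifar10` and `JS` are assumptions, and only a subset of the arguments is shown.

```python
import argparse

def str2bool(value):
    """Hypothetical helper: parse 'True'/'False'-style strings into booleans."""
    if isinstance(value, bool):
        return value
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)

def build_parser():
    """Build a parser with the names and defaults listed in this README."""
    p = argparse.ArgumentParser(description="VGrow training options (sketch)")
    p.add_argument("--gpu", type=int, default=0, help="specific GPU to use")
    p.add_argument("--dataset", type=str, default="mnist", help="training dataset")
    p.add_argument("--divergence", type=str, default="KL", help="f-divergence")
    p.add_argument("--path", type=str, default="./results", help="output path")
    p.add_argument("--seed", type=int, default=1234, help="random seed")
    p.add_argument("--init_resolution", type=int, default=4)
    p.add_argument("--z_dim", type=int, default=512)
    p.add_argument("--dur_nimg", type=int, default=600000)
    p.add_argument("--total_nimg", type=int, default=18000000)
    p.add_argument("--use_gp", type=str2bool, default=True)
    p.add_argument("--coef_gp", type=float, default=1.0)
    p.add_argument("--resume_training", type=str2bool, default=False)
    p.add_argument("--resume_num", type=int, default=0)
    return p

# Example: override the dataset and divergence, keep the other defaults.
args = build_parser().parse_args(["--dataset", "cifar10", "--divergence", "JS"])
print(args.dataset, args.divergence, args.z_dim, args.use_gp)
```

The accepted values for --dataset and --divergence are defined by train.py itself, so check the script or cifar10.sh for the exact spellings.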

Link

The Portrait dataset is available at https://drive.google.com/file/d/1j_a2OXB_2rhaVqojzSPJLv_bDrSjHguR/view?usp=sharing

Developer and Maintainer

Gefei WANG, HKUST

Contact Information

Please feel free to contact Gefei WANG [email protected] or Prof. Can YANG [email protected] if you have any questions.