# TensorFlow Implementation of Stein Variational Gradient Descent (SVGD)

## References

- Qiang Liu and Dilin Wang. "Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm." NIPS 2016. [arXiv:1608.04471](https://arxiv.org/abs/1608.04471)

## Usage

1. Define the network, and get its gradients and variables, e.g.,

```python
def network():
    '''
    Define the target density and return its gradients and variables.
    '''
    return gradients, variables
```
2. Define a gradient descent optimizer, e.g.,

```python
def make_gradient_optimizer():
    return tf.train.GradientDescentOptimizer(learning_rate=0.01)
```
3. Build multiple networks (particles) using `network()` and collect all of their gradients and variables into `grads_list` and `vars_list`; a minimal sketch of this step follows below.
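For illustration, one way this step might look; the per-particle `tf.variable_scope` naming and `num_particles` are my assumptions, not necessarily how this repo organizes it:

```python
grads_list, vars_list = [], []
for i in range(num_particles):
    # Give each particle its own variable scope so the copies do not share weights.
    with tf.variable_scope('particle_{}'.format(i)):
        gradients, variables = network()
    grads_list.append(gradients)
    vars_list.append(variables)
```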

4. Make the SVGD optimizer, e.g.,

```python
optimizer = SVGD(grads_list, vars_list, make_gradient_optimizer)
```
5. In the training phase, `optimizer.update_op` performs a single SVGD update, e.g.,

```python
sess = tf.Session()
sess.run(optimizer.update_op, feed_dict={X: x, Y: y})
```
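For intuition about what the `SVGD` optimizer computes, here is a condensed sketch of the update direction from the paper, using an RBF kernel with the median-heuristic bandwidth. `svgd_direction` is an illustrative helper written for this README, not the repo's actual code, which may organize things differently:

```python
import tensorflow as tf

def svgd_direction(theta, grad_logp):
    """SVGD update direction phi for a set of particles.

    theta:     [n, d] tensor, one flattened parameter vector per particle.
    grad_logp: [n, d] tensor, per-particle gradients of log p(theta).
    """
    n = tf.cast(tf.shape(theta)[0], tf.float32)
    # Pairwise squared distances ||theta_i - theta_j||^2.
    norms = tf.reduce_sum(tf.square(theta), axis=1, keepdims=True)
    sq_dist = norms - 2.0 * tf.matmul(theta, theta, transpose_b=True) + tf.transpose(norms)
    # Median heuristic for the RBF bandwidth, h = med^2 / log(n + 1).
    h = tf.contrib.distributions.percentile(sq_dist, q=50.0) / tf.log(n + 1.0)
    kernel = tf.exp(-sq_dist / h)  # K_ij = k(theta_i, theta_j)
    # Closed-form sum_j grad_{theta_j} k(theta_j, theta_i) for the RBF kernel.
    ksum = tf.reduce_sum(kernel, axis=1, keepdims=True)
    grad_kernel = (2.0 / h) * (theta * ksum - tf.matmul(kernel, theta))
    # phi_i = (1/n) sum_j [ K_ij * grad_logp_j + grad_{theta_j} K_ij ]
    return (tf.matmul(kernel, grad_logp) + grad_kernel) / n
```

Presumably the optimizer built by `make_gradient_optimizer` then applies the negated direction to each particle's variables, since TensorFlow optimizers minimize.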

## Examples

### 1D Gaussian mixture

- The goal of this example is to match the target density p(x), a mixture of two Gaussians, by moving particles that are initially sampled from a different distribution q(x). For details, see the experiment section of the authors' paper. A self-contained sketch of the experiment follows this list.

- I got the following result:

- Note that I compared my implementation with the authors' implementation and verified that the results are the same.
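For readers who want to reproduce the experiment without the full code, here is a minimal NumPy sketch of SVGD on the two-Gaussian mixture from the paper's toy experiment; the step size, particle count, and initialization are my assumptions:

```python
import numpy as np

def dlnp(x):
    """d/dx log p(x) for p(x) = 1/3 N(-2, 1) + 2/3 N(2, 1)."""
    p1 = (1.0 / 3.0) * np.exp(-0.5 * (x + 2.0) ** 2)
    p2 = (2.0 / 3.0) * np.exp(-0.5 * (x - 2.0) ** 2)
    return (-(x + 2.0) * p1 - (x - 2.0) * p2) / (p1 + p2)

def svgd_step(x, stepsize=0.1):
    """One SVGD update with an RBF kernel and median-heuristic bandwidth."""
    n = x.shape[0]
    diff = x[:, None] - x[None, :]              # pairwise x_i - x_j
    sq = diff ** 2
    h = np.median(sq) / np.log(n + 1.0) + 1e-8  # bandwidth (median heuristic)
    k = np.exp(-sq / h)                         # kernel matrix K_ij
    # phi_i = (1/n) sum_j [ K_ij * dlnp(x_j) + 2 (x_i - x_j) / h * K_ij ]
    phi = (k @ dlnp(x) + (2.0 * diff / h * k).sum(axis=1)) / n
    return x + stepsize * phi

x = np.random.normal(-10.0, 1.0, size=100)      # particles start far from p(x)
for _ in range(2000):
    x = svgd_step(x)
# The particle histogram should now approximate the bimodal target density.
```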

### Bayesian Binary Classification

- In this example, we classify binary data with multiple neural network classifiers and examine how SVGD differs from a plain ensemble method. I made a PDF file with the detailed mathematical derivations. A hedged sketch of a per-particle `network()` for this setting appears after this list.

- I got the following results:

  - Thus, ensemble methods push the particles to classify samples with high confidence, whereas SVGD draws particles that characterize the posterior distribution.
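To connect this example back to the Usage section, here is a hypothetical per-particle `network()` for binary classification; the placeholders, layer sizes, Gaussian prior, and sign convention are all my assumptions rather than the repo's code:

```python
import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 2])   # 2-D inputs (assumed)
Y = tf.placeholder(tf.float32, [None, 1])   # binary labels in {0, 1}

def network():
    h = tf.layers.dense(X, 32, activation=tf.nn.relu)
    logits = tf.layers.dense(h, 1)
    log_lik = -tf.reduce_sum(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=Y, logits=logits))
    variables = tf.trainable_variables()  # scope-filter these per particle
    # A Gaussian prior on the weights gives the log-posterior up to a constant.
    log_prior = -0.5 * tf.add_n([tf.reduce_sum(tf.square(v)) for v in variables])
    # Assuming the SVGD class expects gradients of log p(theta | data);
    # if it expects loss gradients instead, negate before returning.
    gradients = tf.gradients(log_lik + log_prior, variables)
    return gradients, variables
```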