Regularization in Tensorflow Scala models #88

Open
mandar2812 opened this issue Mar 7, 2018 · 8 comments

@mandar2812

This is a bit of an open-ended question, but from a practitioner's perspective an important one, I guess. How do I access or find out the regularization function being used in my model's learning process? I can see from the tensorflow_scala examples that the loss function can be explicitly specified and custom loss functions can be implemented, but how do we do the same for tweaking or extending regularizers?

@eaplatanios
Owner

@mandar2812 The learn API does not currently have an explicit notion of a regularization function. We could add support for that if we think it's important to distinguish, but I believe that being able to define arbitrary loss functions (which you can do currently) and having easy access to the model variables (which you can currently sort of do, but we can look into making it easier) lets you create arbitrary regularizers.

Another option would be to tie regularizers to variables directly. The Python API has some support for something like that.

What do you think? :)
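As a rough illustration of that "arbitrary loss function as regularizer" idea, a penalty could be expressed as a loss layer over the model's trainable variables. This is a hypothetical sketch only: the tf.currentGraph.trainableVariables accessor used below is an assumption, not a confirmed tensorflow_scala call.

import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.learn.Mode
import org.platanios.tensorflow.api.learn.layers.Layer

// Hypothetical sketch: add 0.5 * reg * sum of squared trainable variables to the
// incoming loss value. The graph accessor below is an assumption for illustration.
case class WithL2Penalty(reg: Double = 0.01) extends Layer[Output, Output]("") {

  override val layerType: String = s"WithL2Penalty[gamma:$reg]"

  override protected def _forward(input: Output, mode: Mode): Output = {
    // Assumed accessor: reach every trainable variable in the current graph.
    val penalty = tf.currentGraph.trainableVariables.toSeq
      .map(_.value.square.sum())
      .reduce(_.add(_))
      .multiply(0.5 * reg)

    // Add the penalty onto whatever loss value flows into this layer.
    input.add(penalty)
  }
}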

@mandar2812
Author

@eaplatanios I was going through the code-base and saw that you already have org.platanios.tensorflow.api.ops.variables.Regularizer, which is also an input to tf.variable().

So I have extended it with L2Regularizer and L1Regularizer in DynaML.

But if you prefer, I think we can just copy these two files into the org.platanios.tensorflow.api.ops.variables package. I can create a pull request for that; what do you think? It would be a nice little first contribution to tf_scala on my part!
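For reference, a rough sketch of what such extensions might look like. This assumes the Regularizer trait in org.platanios.tensorflow.api.ops.variables boils down to a function from a variable's value to a scalar penalty Output; the exact trait signature is an assumption here.

import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.ops.variables.Regularizer

// Hypothetical sketch: 0.5 * scale * sum(value^2) as the penalty on a variable's value.
// Assumes Regularizer exposes an apply(value: Output): Output style method.
case class L2Regularizer(scale: Double = 0.01) extends Regularizer {
  override def apply(value: Output): Output =
    value.square.sum().multiply(0.5 * scale)
}

// Hypothetical sketch: scale * sum(|value|) as the penalty.
case class L1Regularizer(scale: Double = 0.01) extends Regularizer {
  override def apply(value: Output): Output =
    value.abs.sum().multiply(scale)
}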

@eaplatanios
Owner

@mandar2812 That raises an issue: I don't think these regularizers are currently being applied. I had some concerns about the right way to go about it and never got to it. I can make some edits to support them, but my concern is that they can only be used from within the learn API (that's the only place where the library gets to build and manage a loss function). This means that they should probably be part of that API somehow, and not of the variables, in order to avoid confusion (e.g., you might add them, then create your own train op, and never realize that they are actually not being used). So, we should probably find a nice way to go about that.

To clarify: when you add such a regularizer currently, it gets added to a graph collection. Some other part of the library then needs to obtain the ops in that collection and add them to the loss function. That is not currently happening, and the question is where it should happen.
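A rough illustration of that missing wiring follows. The collection key and accessor names are modeled on the Python API and are assumptions, not existing tensorflow_scala calls.

import org.platanios.tensorflow.api._

// Hypothetical sketch: gather the penalty ops that variable regularizers placed in a
// graph collection and fold them into the model loss. Graph.Keys.REGULARIZATION_LOSSES
// and getCollection are assumed names mirroring the Python API.
def totalLoss(modelLoss: Output, graph: Graph): Output = {
  val regLosses = graph.getCollection(Graph.Keys.REGULARIZATION_LOSSES)
  if (regLosses.isEmpty) modelLoss
  else regLosses.foldLeft(modelLoss)((acc, penalty) => acc.add(penalty))
}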

As for contributing, that would be great! :) Let's hold off until we decide where to put the regularizer logic, and you can then add them there. How does that sound?

@mandar2812
Author

Sounds good!

@mandar2812
Author

@eaplatanios I guess that no matter which package holds the code that builds the regularizer ops, as long as one can specify the desired regularizer instance at the point of layer/network construction, it is fine from a usability point of view. And if we want to keep the code for constructing layers/networks succinct (from an end-user perspective), we can provide sensible defaults in places like the layer constructor api.learn.layers.Linear().

What do you think? Sorry to push the discussion, but I think this could be a crucial addition: when using tf-scala for research applications, applying some form of regularization is paramount.
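For illustration, the kind of user-facing API described above could look roughly like this. It is a purely hypothetical sketch: neither the regularizer parameter on layer constructors nor the L2Regularizer shown here exists in tensorflow_scala at the time of writing.

import org.platanios.tensorflow.api._

// Hypothetical usage sketch: each layer constructor accepts an optional regularizer
// with a sensible default, so end-user code stays succinct.
val model = tf.learn.Linear("Linear_1", units = 64, regularizer = L2Regularizer(0.01)) >>
  tf.learn.ReLU("ReLU_1") >>
  tf.learn.Linear("Linear_2", units = 1)  // falls back to the default regularizer (or none)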

@mandar2812
Author

@eaplatanios As a temporary workaround and an alternative idea, I have implemented L2 and L1 regularization as follows.

import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.learn.Mode
import org.platanios.tensorflow.api.learn.layers.Layer
import org.platanios.tensorflow.api.ops.variables.ReuseOrCreateNew
import org.platanios.tensorflow.api.types.DataType

case class L2Regularization(names: Seq[String], dataTypes: Seq[String], reg: Double = 0.01) extends
  Layer[Output, Output]("") {

  override val layerType: String = s"L2Reg[gamma:$reg]"

  override protected def _forward(input: Output, mode: Mode): Output = {

    // Look up the weight variables by name and data type.
    val weights = names.zip(dataTypes).map(n =>
      tf.variable(n._1, dataType = DataType.fromName(n._2), reuse = ReuseOrCreateNew)
    )

    // Penalty: 0.5 * reg * sum of squared weights, summed over all listed variables.
    val regTerm = weights.map(_.square.sum()).reduce(_.add(_)).multiply(0.5 * reg)

    input.add(regTerm)
  }
}

case class L1Regularization(names: Seq[String], dataTypes: Seq[String], reg: Double = 0.01) extends
  Layer[Output, Output]("") {

  override val layerType: String = s"L1Reg[gamma:$reg]"

  override protected def _forward(input: Output, mode: Mode): Output = {

    // Look up the weight variables by name and data type.
    val weights = names.zip(dataTypes).map(n =>
      tf.variable(n._1, dataType = DataType.fromName(n._2), reuse = ReuseOrCreateNew)
    )

    // Penalty: reg * sum of absolute weights, summed over all listed variables.
    val regTerm = weights.map(_.abs.sum()).reduce(_.add(_)).multiply(reg)

    input.add(regTerm)
  }
}

So when using it in code, it takes the following form:

// `reg` is the regularization strength, assumed to be defined elsewhere.
val layer_parameter_names = Seq("Linear_1/Weights", "Linear_2/Weights")

val loss = tf.learn.L2Loss("L2") >>
  L2Regularization(layer_parameter_names, Seq("FLOAT64", "FLOAT64"), reg) >>
  tf.learn.ScalarSummary("Loss", "ModelLoss")

If we want a more transparent API, in which the user specifies regularization as part of the loss function, I think it could take a form similar to this. What do you think?

@eaplatanios
Owner

@mandar2812 The main reason I don't like this is that the "wiring" to the model parameters is done through their names. Names are generally generated automatically and may differ from what the user expects, so this approach is very bug-prone. I was leaning more towards an easy way to obtain a reference to a layer's parameters (before it is instantiated) and defining the regularizers in terms of that.
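For comparison with the name-based workaround above, such a name-free design could look roughly like this. It is purely illustrative: the parameterReference accessor and the reference-taking overload of L2Regularization are hypothetical and do not exist in the library.

// Hypothetical sketch: the layer hands out a typed reference to its (yet-to-be-created)
// weights, and the regularizer is defined against that reference instead of a string name.
val linear = tf.learn.Linear("Linear_1", units = 64)
val weightsRef = linear.parameterReference("Weights")  // hypothetical accessor

val loss = tf.learn.L2Loss("L2") >>
  L2Regularization(Seq(weightsRef), reg = 0.01) >>  // hypothetical overload taking references
  tf.learn.ScalarSummary("Loss", "ModelLoss")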

@eaplatanios
Owner

I'm sorry, I've been focusing on updates to support TensorFlow 1.7.0 and on sorting out some bugs. I'll work on a solution for this right after. Thanks for the suggestions, though; it really helps hearing your opinion on design choices. :)
