Regularization in Tensorflow Scala models #88

Open
mandar2812 opened this issue Mar 7, 2018 · 8 comments

@mandar2812

This is a bit of an open-ended question, but from a practitioner's perspective an important one, I guess. How do I access or find out the regularization function being used in my model's learning process? I can see from the tensorflow_scala examples that the loss function can be explicitly specified and custom loss functions can be implemented, but how do we do the same for tweaking or extending regularizers?

@eaplatanios
Owner

@mandar2812 The learn API does not currently have an explicit notion of a regularization function. We could add support for that if we think it's important to distinguish, but I believe that being able to define arbitrary loss functions (which you can do currently) and having easy access to the model variables (which you can currently sort of do, but we can look into making it easier) lets you create arbitrary regularizers.

Another option would be to tie regularizers to variables directly. The Python API has some support for something like that.

What do you think? :)
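As a rough illustration of that "arbitrary loss function as regularizer" idea, a penalty could be expressed as a loss layer over the model's trainable variables. This is a hypothetical sketch only: the tf.currentGraph.trainableVariables accessor used below is an assumption, not a confirmed tensorflow_scala call.

import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.learn.Mode
import org.platanios.tensorflow.api.learn.layers.Layer

// Hypothetical sketch: add 0.5 * reg * sum of squared trainable variables to the
// incoming loss value. The graph accessor below is an assumption for illustration.
case class WithL2Penalty(reg: Double = 0.01) extends Layer[Output, Output]("") {

  override val layerType: String = s"WithL2Penalty[gamma:$reg]"

  override protected def _forward(input: Output, mode: Mode): Output = {
    // Assumed accessor: reach every trainable variable in the current graph.
    val penalty = tf.currentGraph.trainableVariables.toSeq
      .map(_.value.square.sum())
      .reduce(_.add(_))
      .multiply(0.5 * reg)

    // Add the penalty onto whatever loss value flows into this layer.
    input.add(penalty)
  }
}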

@mandar2812
Author

@eaplatanios I was going through the code-base and saw that you already have org.platanios.tensorflow.api.ops.variables.Regularizer, which is also an input to tf.variable().

So I have extended it with L2Regularizer and L1Regularizer in DynaML.

But if you prefer, I think we can just copy these two files into the org.platanios.tensorflow.api.ops.variables package. I can create a pull request for that; what do you think? It would be a nice little first contribution to tf_scala on my part!
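For reference, a rough sketch of what such extensions might look like. This assumes the Regularizer trait in org.platanios.tensorflow.api.ops.variables boils down to a function from a variable's value to a scalar penalty Output; the exact trait signature is an assumption here.

import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.ops.variables.Regularizer

// Hypothetical sketch: 0.5 * scale * sum(value^2) as the penalty on a variable's value.
// Assumes Regularizer exposes an apply(value: Output): Output style method.
case class L2Regularizer(scale: Double = 0.01) extends Regularizer {
  override def apply(value: Output): Output =
    value.square.sum().multiply(0.5 * scale)
}

// Hypothetical sketch: scale * sum(|value|) as the penalty.
case class L1Regularizer(scale: Double = 0.01) extends Regularizer {
  override def apply(value: Output): Output =
    value.abs.sum().multiply(scale)
}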

@eaplatanios
Owner

@mandar2812 That raises an issue: I don't think these regularizers are currently being applied. I had some concerns about the right way to go about it and never got to it. I can make some edits to support them, but my concern is that they can only be used from within the learn API (that's the only place where the library gets to build and manage a loss function). This means that they should probably be part of that API somehow, and not of the variables, in order to avoid confusion (e.g., you might add them, then create your own train op, and never realize that they are actually not being used). So, we should probably find a nice way to go about that.

To clarify: when you add such a regularizer currently, it gets added to a graph collection. Some other part of the library then needs to obtain the ops in that collection and add them to the loss function. That is not currently happening, and the question is where it should happen.
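A rough illustration of that missing wiring follows. The collection key and accessor names are modeled on the Python API and are assumptions, not existing tensorflow_scala calls.

import org.platanios.tensorflow.api._

// Hypothetical sketch: gather the penalty ops that variable regularizers placed in a
// graph collection and fold them into the model loss. Graph.Keys.REGULARIZATION_LOSSES
// and getCollection are assumed names mirroring the Python API.
def totalLoss(modelLoss: Output, graph: Graph): Output = {
  val regLosses = graph.getCollection(Graph.Keys.REGULARIZATION_LOSSES)
  if (regLosses.isEmpty) modelLoss
  else regLosses.foldLeft(modelLoss)((acc, penalty) => acc.add(penalty))
}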

As for contributing, that would be great! :) Let's hold off until we decide where to put the regularizer logic, and you can then add them there. How does that sound?

@mandar2812
Author

Sounds good!

@mandar2812
Author

@eaplatanios I guess that no matter which package holds the code that builds the regularizer ops, as long as one can specify the desired regularizer instance at the point of layer/network construction, it is fine from a usability point of view. And if we want to keep the code for constructing layers/networks succinct (from an end-user perspective), we can provide sensible defaults in places like the layer constructor api.learn.layers.Linear().

What do you think? Sorry to push the discussion, but I think this could be a crucial addition: when using tf-scala for research applications, applying some form of regularization is paramount.
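For illustration, the kind of user-facing API described above could look roughly like this. It is a purely hypothetical sketch: neither the regularizer parameter on layer constructors nor the L2Regularizer shown here exists in tensorflow_scala at the time of writing.

import org.platanios.tensorflow.api._

// Hypothetical usage sketch: each layer constructor accepts an optional regularizer
// with a sensible default, so end-user code stays succinct.
val model = tf.learn.Linear("Linear_1", units = 64, regularizer = L2Regularizer(0.01)) >>
  tf.learn.ReLU("ReLU_1") >>
  tf.learn.Linear("Linear_2", units = 1)  // falls back to the default regularizer (or none)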

@mandar2812
Author

@eaplatanios As a temporary workaround and an alternative idea, I have implemented L2 and L1 regularization as follows.

import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.learn.Mode
import org.platanios.tensorflow.api.learn.layers.Layer
import org.platanios.tensorflow.api.ops.variables.ReuseOrCreateNew
import org.platanios.tensorflow.api.types.DataType

case class L2Regularization(names: Seq[String], dataTypes: Seq[String], reg: Double = 0.01) extends
  Layer[Output, Output]("") {

  override val layerType: String = s"L2Reg[gamma:$reg]"

  override protected def _forward(input: Output, mode: Mode): Output = {

    // Look up the weight variables by name and data type.
    val weights = names.zip(dataTypes).map(n =>
      tf.variable(n._1, dataType = DataType.fromName(n._2), reuse = ReuseOrCreateNew)
    )

    // Penalty: 0.5 * reg * sum of squared weights, summed over all listed variables.
    val regTerm = weights.map(_.square.sum()).reduce(_.add(_)).multiply(0.5 * reg)

    input.add(regTerm)
  }
}

case class L1Regularization(names: Seq[String], dataTypes: Seq[String], reg: Double = 0.01) extends
  Layer[Output, Output]("") {

  override val layerType: String = s"L1Reg[gamma:$reg]"

  override protected def _forward(input: Output, mode: Mode): Output = {

    // Look up the weight variables by name and data type.
    val weights = names.zip(dataTypes).map(n =>
      tf.variable(n._1, dataType = DataType.fromName(n._2), reuse = ReuseOrCreateNew)
    )

    // Penalty: reg * sum of absolute weights, summed over all listed variables.
    val regTerm = weights.map(_.abs.sum()).reduce(_.add(_)).multiply(reg)

    input.add(regTerm)
  }
}

So when using it in code, it takes the following form:

// `reg` is the regularization strength, assumed to be defined elsewhere.
val layer_parameter_names = Seq("Linear_1/Weights", "Linear_2/Weights")

val loss = tf.learn.L2Loss("L2") >>
  L2Regularization(layer_parameter_names, Seq("FLOAT64", "FLOAT64"), reg) >>
  tf.learn.ScalarSummary("Loss", "ModelLoss")

If we want a more transparent API, in which the user specifies regularization as part of the loss function, I think it could take a form similar to this. What do you think?

@eaplatanios
Owner

@mandar2812 The main reason I don't like this is that the "wiring" to the model parameters is done through their names. Names are generally generated automatically and may differ from what the user expects, so this approach is very bug-prone. I was leaning more towards an easy way to obtain a reference to a layer's parameters (before it is instantiated) and defining the regularizers in terms of that.
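For comparison with the name-based workaround above, such a name-free design could look roughly like this. It is purely illustrative: the parameterReference accessor and the reference-taking overload of L2Regularization are hypothetical and do not exist in the library.

// Hypothetical sketch: the layer hands out a typed reference to its (yet-to-be-created)
// weights, and the regularizer is defined against that reference instead of a string name.
val linear = tf.learn.Linear("Linear_1", units = 64)
val weightsRef = linear.parameterReference("Weights")  // hypothetical accessor

val loss = tf.learn.L2Loss("L2") >>
  L2Regularization(Seq(weightsRef), reg = 0.01) >>  // hypothetical overload taking references
  tf.learn.ScalarSummary("Loss", "ModelLoss")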

@eaplatanios
Owner

I'm sorry, I've been focusing on updates to support TensorFlow 1.7.0 and on sorting out some bugs. I'll work on a solution for this right after. Thanks for the suggestions, though; it really helps hearing your opinion on design choices. :)
