Optimization not working as soon as Dense layer gets replaced with others (ex. RNN) #897
Using `sensealg = SciMLSensitivity.ForwardDiffSensitivity()` in the first two. UPDATE: only if the direct call works and the problem is in AD.
Forward mode would likely be faster for a model of this size, yes. That said, the real issue is that what's being passed doesn't match the interface. Look at the error message: `(::RNNCell{true})(::Tuple{AbstractMatrix, Tuple{AbstractMatrix}}, ::Any, ::NamedTuple)`. This is saying what the input of an `RNNCell` has to be. Now look at what's passed in:
It says it wants the input format documented for `RNNCell`: https://lux.csail.mit.edu/dev/api/Lux/layers#Lux.RNNCell
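Not part of the original thread, but a minimal sketch of the call interface the error message describes may help. The dimensions (`3 => 5`, batch of 4) are made up for illustration; the key point is that the first call takes a matrix of shape `(features, batch)`, and subsequent calls take a tuple `(input, carry)` matching the `Tuple{AbstractMatrix, Tuple{AbstractMatrix}}` in the signature:

```julia
using Lux, Random

rng = Random.default_rng()
cell = RNNCell(3 => 5)           # 3 input features -> 5 hidden units
ps, st = Lux.setup(rng, cell)

x = rand(Float32, 3, 4)          # (features, batch) — a matrix, not a vector

# First call: plain matrix input initializes the hidden state.
(y, carry), st = cell(x, ps, st)

# Subsequent calls: pass (input, carry), i.e. Tuple{AbstractMatrix, Tuple{AbstractMatrix}}.
(y2, carry), st = cell((x, carry), ps, st)
```

Passing anything else (e.g. a bare vector, or the wrong nesting of the carry) produces exactly the `MethodError` quoted above.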
So this title doesn't really make sense. Other layers work, and there are lots of examples of this: https://docs.sciml.ai/DiffEqFlux/dev/examples/mnist_conv_neural_ode/ uses convolutional layers, for instance, so "nothing other than Dense works" is simply false. What is true is that when you change the layer type, the neural network library may require a slightly different input (as is the case with `RNNCell` here).
Thank you.
You need to have a batch dimension for the layer to work. Also, note that your current code doesn't do what you want: it always treats the input as the first element of the sequence, which makes using an `RNNCell` quite pointless. You might want to look at https://lux.csail.mit.edu/dev/api/Lux/layers#Lux.StatefulRecurrentCell. That said, you need to turn off the adaptivity of the solver, else stateful RNNs don't make sense, since time would not be monotonically increasing.
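The suggestion above can be sketched as follows. This is not code from the thread, just a hedged illustration with made-up dimensions: `StatefulRecurrentCell` keeps the carry inside the layer state `st`, so repeated calls advance the sequence instead of restarting it, and the input always needs an explicit batch dimension.

```julia
using Lux, Random

rng = Random.default_rng()
model = StatefulRecurrentCell(RNNCell(3 => 5))  # wraps the cell; carry lives in `st`
ps, st = Lux.setup(rng, model)

x = rand(Float32, 3, 1)      # (features, batch) — the batch dimension is required
y1, st = model(x, ps, st)    # first step of the sequence
y2, st = model(x, ps, st)    # second step; hidden state carried over via `st`
```

If this sits inside an ODE right-hand side, the solver should be run with `adaptive = false` and an explicit `dt`, so the cell only ever sees monotonically increasing times.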
Hi everyone,
I have a problem with my code, in particular while training the neural network. I've run into a significant issue when trying to replace a dense layer with other types of layers. Specifically, when I introduce different layers, such as scalar layers or recurrent layers, my code starts producing errors related to automatic differentiation (AD). These errors make it impossible to optimize the model and cause failures during gradient backpropagation. Despite my efforts, AD does not seem to be compatible with the new layers, which has brought training to a standstill.
I'm sure I'm missing something, but I don't understand what. I would really appreciate some help.
I'm attaching the code, the CSVs, and the error.
Thank you!
Internal gains.csv
phi heating tutto.csv
phisun totale.csv
Testerna totale.csv
phi heating modificato.csv
Here is my code:
and the related error: