
Keras 2.14 optimizer format changed causing simple models to not import #10042

Open
YusifCodes opened this issue Nov 3, 2023 · 19 comments
Labels
DL4J Keras Issues related to Keras import

Comments

@YusifCodes

YusifCodes commented Nov 3, 2023

Hey everybody.

I am running into an issue when loading a simple python keras model.

Python keras model:

```python
# Imports added for completeness (the original snippet omitted them).
from tensorflow import keras
from tensorflow.keras.optimizers import SGD

model = keras.Sequential([
    keras.layers.Dense(32, activation='relu', input_shape=(132,)),
    keras.layers.Dropout(.2),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(41, activation='softmax')
])
optimizer = SGD(learning_rate=0.1)
model.compile(optimizer=optimizer, loss=keras.losses.sparse_categorical_crossentropy, metrics=['accuracy'])

model.fit(train_x, train_y, epochs=10, batch_size=32, validation_split=0.1)

test_loss, test_accuracy = model.evaluate(test_x, test_y)
print(f"Test accuracy: {test_accuracy}")

model.save("test_model.h5")
```

Java code that gives me the error:

```java
MultiLayerNetwork model = KerasModelImport.importKerasSequentialModelAndWeights("path/to/model.h5");
```

The error itself:

```
org.deeplearning4j.nn.modelimport.keras.exceptions.UnsupportedKerasConfigurationException:
Optimizer with name Custom>SGDcan not bematched to a DL4J optimizer. Note that custom TFOptimizers are not supported by model import.
```
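As a side note for anyone debugging this: the odd name comes from the `class_name` field in the `training_config` JSON that Keras embeds in the saved model. A minimal sketch of how that name surfaces, using a hypothetical, trimmed-down config (real configs carry more fields):

```python
import json

# Hypothetical, trimmed-down stand-in for the training_config JSON that
# Keras 2.14 writes into a saved .h5 model (real configs have more fields).
training_config = json.dumps({
    "optimizer_config": {
        "class_name": "Custom>SGD",  # the "Custom>" prefix is new behavior
        "config": {"learning_rate": 0.1, "momentum": 0.0},
    }
})

# DL4J's importer switches on this class_name; "Custom>SGD" matches no
# case label, so it falls through to the "unsupported optimizer" exception.
name = json.loads(training_config)["optimizer_config"]["class_name"]
print(name)  # Custom>SGD
```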

Version Information

Windows 10 x64
JDK 21

My dependencies look like this:
https://gist.github.com/YusifCodes/1c275a810b5c966c50fb4303ae3143a7

Note: I previously tried dl4j 1.0.0-M2.1, 1.0.0-beta6, and 1.0.0-beta7 and got the same error. I also replaced Adam with SGD, still no luck.

I really hope we can resolve this issue, thanks!

@agibsonccc
Contributor

agibsonccc commented Nov 3, 2023

@YusifCodes it tells you what the error is. It looks like it thinks this is a custom optimizer. I'm not even sure where that's coming from; it might be a newer optimizer in Keras. I'll need more context than what you're telling me here. You're saying you replaced it, but the error message contradicts you: it's named "Custom>SGD"? For anything with model import, don't just assume that it blanket doesn't work. Always pay attention to both versions.

@YusifCodes
Author

I am pretty sure Adam and SGD are not custom, am I wrong?

@agibsonccc
Contributor

agibsonccc commented Nov 3, 2023

@YusifCodes yeah, they are. I'm guessing it's a new version issue. That error message is strange. Are you using Keras 3.0 or something?

@YusifCodes
Author

YusifCodes commented Nov 3, 2023

```
Name: keras
Version: 2.14.0
Summary: Deep learning for humans.
Home-page: https://keras.io/
Author: Keras team
Author-email: [email protected]
License: Apache 2.0
Location: c:\users\user\appdata\local\programs\python\python310\lib\site-packages
Requires:
Required-by: tensorflow-intel
```

My keras package.

Do you have any suggestions?

I was using the Adam optimizer before this, and the error was exactly the same, except the optimizer name read Custom>Adam instead.

@agibsonccc
Contributor

Yeah, that's definitely odd... that's new. I'll treat that as the main issue. For now, try 2.12 or something. It looks like this is the Intel fork?

@YusifCodes
Author

I'll try 2.12 now and tell you how it goes, thanks.
Not sure what you mean about it being an Intel fork.

@agibsonccc
Contributor

@YusifCodes oh sorry, that's the package requirement, I see now. Either way, the hardware vendors tend to publish forks of the Python frameworks; that's what I thought this was. Try a few different older versions and see what happens. There's no reason one of them shouldn't work.

@agibsonccc agibsonccc changed the title Can't import a simple keras model into my Java app Keras 2.14 optimizer format changed causing simple models to not import Nov 3, 2023
@agibsonccc agibsonccc added the DL4J Keras Issues related to Keras import label Nov 3, 2023
@YusifCodes
Author

YusifCodes commented Nov 6, 2023

Hello there, sorry for the late reply. I got it working with 2.12 last Friday. But after making some minor alterations to my model, saving it, and trying to run it in Java, I am getting the same error again. I tried 2.11 and 2.13.1, still no luck. Thanks.

@agibsonccc
Contributor

Did you resave it with keras 2.14 again?

@YusifCodes
Author

YusifCodes commented Nov 7, 2023

Nope, I resaved it with 2.11, 2.12, and 2.13.1; it is the same everywhere.

@ParshinAlex

ParshinAlex commented Nov 20, 2023

Guys, I have the same problem. I was debugging it, and it looks like the problem is in the dl4j 1.0.0-M2.1 module, in particular in the class KerasOptimizerUtils (package org.deeplearning4j.nn.modelimport.keras.utils).

I guess at some point Keras changed the serialized optimizer names from plain Adam, SGD, and so on to Custom>Adam and so on. The current code on dl4j master handles this change, BUT the decompiled KerasOptimizerUtils class inside the 1.0.0-M2.1 jar (possibly) still contains old code that does not handle it, and that produces the exception mentioned by YusifCodes.

Here is the code:

```java
public KerasOptimizerUtils() {
}

public static IUpdater mapOptimizer(Map<String, Object> optimizerConfig) throws UnsupportedKerasConfigurationException, InvalidKerasConfigurationException {
    if (!optimizerConfig.containsKey("class_name")) {
        throw new InvalidKerasConfigurationException("Optimizer config does not contain a name field.");
    } else {
        String optimizerName = (String)optimizerConfig.get("class_name");
        if (!optimizerConfig.containsKey("config")) {
            throw new InvalidKerasConfigurationException("Field config missing from layer config");
        } else {
            Map<String, Object> optimizerParameters = (Map)optimizerConfig.get("config");
            Object dl4jOptimizer;
            double lr;
            double rho;
            double epsilon;
            double decay;
            double scheduleDecay;
            switch (optimizerName) {
                case "Adam":
                    lr = (Double)(optimizerParameters.containsKey("lr") ? optimizerParameters.get("lr") : optimizerParameters.get("learning_rate"));
                    rho = (Double)optimizerParameters.get("beta_1");
                    epsilon = (Double)optimizerParameters.get("beta_2");
                    decay = (Double)optimizerParameters.get("epsilon");
                    scheduleDecay = (Double)optimizerParameters.get("decay");
                    dl4jOptimizer = (new Adam.Builder()).beta1(rho).beta2(epsilon).epsilon(decay).learningRate(lr).learningRateSchedule(scheduleDecay == 0.0 ? null : new InverseSchedule(ScheduleType.ITERATION, lr, scheduleDecay, 1.0)).build();
                    break;
                case "Adadelta":
                    lr = (Double)optimizerParameters.get("rho");
                    rho = (Double)optimizerParameters.get("epsilon");
                    dl4jOptimizer = (new AdaDelta.Builder()).epsilon(rho).rho(lr).build();
                    break;
                case "Adgrad":
                    lr = (Double)(optimizerParameters.containsKey("lr") ? optimizerParameters.get("lr") : optimizerParameters.get("learning_rate"));
                    rho = (Double)optimizerParameters.get("epsilon");
                    epsilon = (Double)optimizerParameters.get("decay");
                    dl4jOptimizer = (new AdaGrad.Builder()).epsilon(rho).learningRate(lr).learningRateSchedule(epsilon == 0.0 ? null : new InverseSchedule(ScheduleType.ITERATION, lr, epsilon, 1.0)).build();
                    break;
                case "Adamax":
                    lr = (Double)(optimizerParameters.containsKey("lr") ? optimizerParameters.get("lr") : optimizerParameters.get("learning_rate"));
                    rho = (Double)optimizerParameters.get("beta_1");
                    epsilon = (Double)optimizerParameters.get("beta_2");
                    decay = (Double)optimizerParameters.get("epsilon");
                    dl4jOptimizer = new AdaMax(lr, rho, epsilon, decay);
                    break;
                case "Nadam":
                    lr = (Double)(optimizerParameters.containsKey("lr") ? optimizerParameters.get("lr") : optimizerParameters.get("learning_rate"));
                    rho = (Double)optimizerParameters.get("beta_1");
                    epsilon = (Double)optimizerParameters.get("beta_2");
                    decay = (Double)optimizerParameters.get("epsilon");
                    scheduleDecay = (Double)optimizerParameters.getOrDefault("schedule_decay", 0.0);
                    dl4jOptimizer = (new Nadam.Builder()).beta1(rho).beta2(epsilon).epsilon(decay).learningRate(lr).learningRateSchedule(scheduleDecay == 0.0 ? null : new InverseSchedule(ScheduleType.ITERATION, lr, scheduleDecay, 1.0)).build();
                    break;
                case "SGD":
                    lr = (Double)(optimizerParameters.containsKey("lr") ? optimizerParameters.get("lr") : optimizerParameters.get("learning_rate"));
                    rho = (Double)(optimizerParameters.containsKey("epsilon") ? optimizerParameters.get("epsilon") : optimizerParameters.get("momentum"));
                    epsilon = (Double)optimizerParameters.get("decay");
                    dl4jOptimizer = (new Nesterovs.Builder()).momentum(rho).learningRate(lr).learningRateSchedule(epsilon == 0.0 ? null : new InverseSchedule(ScheduleType.ITERATION, lr, epsilon, 1.0)).build();
                    break;
                case "RMSprop":
                    lr = (Double)(optimizerParameters.containsKey("lr") ? optimizerParameters.get("lr") : optimizerParameters.get("learning_rate"));
                    rho = (Double)optimizerParameters.get("rho");
                    epsilon = (Double)optimizerParameters.get("epsilon");
                    decay = (Double)optimizerParameters.get("decay");
                    dl4jOptimizer = (new RmsProp.Builder()).epsilon(epsilon).rmsDecay(rho).learningRate(lr).learningRateSchedule(decay == 0.0 ? null : new InverseSchedule(ScheduleType.ITERATION, lr, decay, 1.0)).build();
                    break;
                default:
                    throw new UnsupportedKerasConfigurationException("Optimizer with name " + optimizerName + "can not bematched to a DL4J optimizer. Note that custom TFOptimizers are not supported by model import");
            }

            return (IUpdater)dl4jOptimizer;
        }
    }
}
```

As you can see, it expects plain optimizer names; there is no handling of the Custom>... prefix here, which is different from the code on master.
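To illustrate what handling the new names would take: the importer only needs to strip the `Custom>` prefix before the switch. A rough sketch of that normalization, written in Python for brevity (the actual DL4J fix is Java, and its exact details may differ):

```python
# Optimizer names the M2.1 switch statement knows about (taken from the
# decompiled code above; note "Adgrad" is spelled that way in that source).
SUPPORTED = {"Adam", "Adadelta", "Adgrad", "Adamax", "Nadam", "SGD", "RMSprop"}

def normalize_optimizer_name(class_name: str) -> str:
    """Strip the 'Custom>' prefix that newer Keras adds to serialized names."""
    prefix = "Custom>"
    return class_name[len(prefix):] if class_name.startswith(prefix) else class_name

print(normalize_optimizer_name("Custom>SGD"))               # SGD
print(normalize_optimizer_name("Adam"))                     # Adam
print(normalize_optimizer_name("Custom>SGD") in SUPPORTED)  # True
```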

agibsonccc, please can you check this and say whether it's only a version problem on my side, or whether the 1.0.0-M2.1 release really ships this old code in KerasOptimizerUtils?

@treo
Member

treo commented Nov 20, 2023

No need to go and decompile anything. You can literally see the code as it was when M2.1 was released: https://github.com/deeplearning4j/deeplearning4j/blob/1.0.0-M2.1/deeplearning4j/deeplearning4j-modelimport/src/main/java/org/deeplearning4j/nn/modelimport/keras/utils/KerasOptimizerUtils.java

This is the PR that addressed the changes in Keras: #9939

@ParshinAlex

treo, will this change go into the next release, or can we use it somehow already?

@agibsonccc
Contributor

agibsonccc commented Nov 20, 2023

@ParshinAlex I'll fix whatever is going on here in the next release. I'm on the tail end of more important testing (the underlying CUDA internals) that's unfortunately gone on longer than I'd like, but I've hit 90% of the milestones I need for that and will turn my attention to minor issues like this next. Unfortunately, unless you're willing to be part of the solution, either by being a paying customer of mine or by writing the code yourself, you'll just have to wait and downgrade for now.

@ParshinAlex

@agibsonccc @treo Thank you for the explanations and effort, now it's clear.

@Kali-Zoidberg

Kali-Zoidberg commented Nov 20, 2023

I was also running into this problem, but resolved it by setting the enforceTrainingConfig parameter in importKerasSequentialModelAndWeights to false (as I'm only using the model for inference):

```java
KerasModelImport.importKerasSequentialModelAndWeights(configPath + "model.h5", false);
```

Edit: It solves the runtime issue but the model gets hung up on loading.

@agibsonccc
Contributor

That doesn't happen for random reasons. Whatever is going on there might be model specific. Code doesn't just mysteriously "hang"; it has a reason. Can you look into this using jstack, or post the model somewhere? I don't need anything secret, just something to reproduce it.

@Kali-Zoidberg

Kali-Zoidberg commented Nov 21, 2023

> That doesn't happen for random reasons. Whatever is going on there might be model specific. Code doesn't just mysteriously "hang". It has a reason. Can you look in to this using jstack or post the model somewhere? I don't need anything secret just something to reproduce it.

Sure thing, here is the stack trace. I noticed the stack trace showed Nd4jCpu.execCustomOp2, so I changed the ND4J backend to use the GPU, and the model loaded after that. I am using an LSTM network and have changed the file extension from .h5 to .txt to upload it here: stacked_lstm - Copy.txt

```
2023-11-20 16:25:42
Full thread dump Java HotSpot(TM) 64-Bit Server VM (18.0.1+10-24 mixed mode, sharing):

"main" #1 prio=5 os_prio=0 cpu=150531.25ms elapsed=154.88s tid=0x000001fba9e95630 nid=9376 runnable  [0x0000005b5ecfe000]
   java.lang.Thread.State: RUNNABLE
        at org.nd4j.linalg.cpu.nativecpu.bindings.Nd4jCpu.execCustomOp2(Native Method)
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1900)
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1540)
        at org.nd4j.linalg.factory.Nd4j.exec(Nd4j.java:6545)
        at org.nd4j.linalg.api.rng.distribution.impl.OrthogonalDistribution.sample(OrthogonalDistribution.java:240)
        at org.nd4j.linalg.api.rng.distribution.impl.OrthogonalDistribution.sample(OrthogonalDistribution.java:255)
        at org.deeplearning4j.nn.weights.WeightInitDistribution.init(WeightInitDistribution.java:48)
        at org.deeplearning4j.nn.params.LSTMParamInitializer.init(LSTMParamInitializer.java:143)
        at org.deeplearning4j.nn.conf.layers.LSTM.instantiate(LSTM.java:82)
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.init(MultiLayerNetwork.java:720)
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.init(MultiLayerNetwork.java:605)
        at org.deeplearning4j.nn.modelimport.keras.KerasSequentialModel.getMultiLayerNetwork(KerasSequentialModel.java:266)
        at org.deeplearning4j.nn.modelimport.keras.KerasSequentialModel.getMultiLayerNetwork(KerasSequentialModel.java:255)
        at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasSequentialModelAndWeights(KerasModelImport.java:204)
        at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasSequentialModelAndWeights(KerasModelImport.java:89)
        at server.ServerMain.loadKerasModel(ServerMain.java:155)
        at server.ServerMain.main(ServerMain.java:55)

   Locked ownable synchronizers:
        - None
```

@trevorrovert

So I may have discovered something related to this issue. I was running into the same issues described above, read through the comments, and saw the latest one from @Kali-Zoidberg, specifically how the issue was resolved by switching to a GPU backend. That got me thinking about the number of connections between the hidden layers of my model. I dropped my LSTM units from 100 down to 8, and that seemed to resolve the issue. I was able to bump the units up to 15 and still get the model to load on CPU, but could not get it to load with anything above 15. Hope this helps.
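One plausible reading of that, given the stack trace earlier in the thread: the import was stuck inside OrthogonalDistribution.sample while initializing LSTM weights, and the work that initializer does grows with the size of the weight matrices, which grows roughly quadratically in the unit count. A back-of-the-envelope sketch of the standard Keras LSTM parameter count (input_dim=32 is a made-up example value):

```python
def lstm_param_count(units: int, input_dim: int) -> int:
    """Standard Keras LSTM parameter count: 4 gates, each with an input
    kernel (input_dim x units), a recurrent kernel (units x units),
    and a bias vector (units)."""
    return 4 * (units * (input_dim + units) + units)

# The recurrent kernel alone is units x units, so going from 8 to 100
# units multiplies the number of weights to initialize by roughly 40x.
for units in (8, 15, 100):
    print(units, lstm_param_count(units, input_dim=32))
```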
