Concatenate layer requires inputs with matching shapes #22

Open
oonisim opened this issue Jun 11, 2022 · 1 comment
oonisim commented Jun 11, 2022

Problem

03-training-formalization.ipynb

ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concatenation axis. Received: input_shape=[(None, 2), (None, 4), (7,), (None, 3), (None, 1), (None, 1), (6,), (None, 3), (None, 3), (None, 1), (None, 10)]

The error is raised when running the Trainer component:

_train_module_file = 'src/model_training/runner.py'

trainer = tfx.components.Trainer(
    module_file=_train_module_file,
    examples=transform.outputs['transformed_examples'],
    schema=schema_importer.outputs['result'],
    base_model=latest_model_resolver.outputs['latest_model'],
    transform_graph=transform.outputs['transform_graph'],
    hyperparameters=hyperparams_gen.outputs['hyperparameters'],
)

context.run(trainer, enable_cache=False)

Output:

running bdist_wheel
running build
running build_py
creating build
creating build/lib
copying trainer.py -> build/lib
copying runner.py -> build/lib
copying defaults.py -> build/lib
copying task.py -> build/lib
copying model.py -> build/lib
copying exporter.py -> build/lib
copying data.py -> build/lib
installing to /tmp/tmpaet8vft8
running install
running install_lib
copying build/lib/task.py -> /tmp/tmpaet8vft8
copying build/lib/model.py -> /tmp/tmpaet8vft8
copying build/lib/data.py -> /tmp/tmpaet8vft8
copying build/lib/runner.py -> /tmp/tmpaet8vft8
copying build/lib/defaults.py -> /tmp/tmpaet8vft8
copying build/lib/trainer.py -> /tmp/tmpaet8vft8
copying build/lib/exporter.py -> /tmp/tmpaet8vft8
running install_egg_info
running egg_info
creating tfx_user_code_Trainer.egg-info
writing tfx_user_code_Trainer.egg-info/PKG-INFO
writing dependency_links to tfx_user_code_Trainer.egg-info/dependency_links.txt
writing top-level names to tfx_user_code_Trainer.egg-info/top_level.txt
writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
reading manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
Copying tfx_user_code_Trainer.egg-info to /tmp/tmpaet8vft8/tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12-py3.7.egg-info
running install_scripts
creating /tmp/tmpaet8vft8/tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/WHEEL
creating '/tmp/tmpv9boslhs/tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12-py3-none-any.whl' and adding '/tmp/tmpaet8vft8' to it
adding 'data.py'
adding 'defaults.py'
adding 'exporter.py'
adding 'model.py'
adding 'runner.py'
adding 'task.py'
adding 'trainer.py'
adding 'tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/METADATA'
adding 'tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/WHEEL'
adding 'tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/top_level.txt'
adding 'tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/RECORD'
removing /tmp/tmpaet8vft8
/opt/conda/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  setuptools.SetuptoolsDeprecationWarning,
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
Processing /tmp/tmp31cqmx79/tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12-py3-none-any.whl
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
Installing collected packages: tfx-user-code-Trainer
Successfully installed tfx-user-code-Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_10107/3433963696.py in <module>
     10 )
     11 
---> 12 context.run(trainer, enable_cache=False)

~/.local/lib/python3.7/site-packages/tfx/orchestration/experimental/interactive/notebook_utils.py in run_if_ipython(*args, **kwargs)
     29       # __IPYTHON__ variable is set by IPython, see
     30       # https://ipython.org/ipython-doc/rel-0.10.2/html/interactive/reference.html#embedding-ipython.
---> 31       return fn(*args, **kwargs)
     32     else:
     33       logging.warning(

~/.local/lib/python3.7/site-packages/tfx/orchestration/experimental/interactive/interactive_context.py in run(self, component, enable_cache, beam_pipeline_args)
    162         telemetry_utils.LABEL_TFX_RUNNER: runner_label,
    163     }):
--> 164       execution_id = launcher.launch().execution_id
    165 
    166     return execution_result.ExecutionResult(

~/.local/lib/python3.7/site-packages/tfx/orchestration/launcher/base_component_launcher.py in launch(self)
    201                          copy.deepcopy(execution_decision.input_dict),
    202                          execution_decision.output_dict,
--> 203                          copy.deepcopy(execution_decision.exec_properties))
    204 
    205     absl.logging.info('Running publisher for %s',

~/.local/lib/python3.7/site-packages/tfx/orchestration/launcher/in_process_component_launcher.py in _run_executor(self, execution_id, input_dict, output_dict, exec_properties)
     72     # output_dict can still be changed, specifically properties.
     73     executor.Do(
---> 74         copy.deepcopy(input_dict), output_dict, copy.deepcopy(exec_properties))

~/.local/lib/python3.7/site-packages/tfx/components/trainer/executor.py in Do(self, input_dict, output_dict, exec_properties)
    176     # Train the model
    177     absl.logging.info('Training model.')
--> 178     run_fn(fn_args)
    179 
    180     # Note: If trained with multi-node distribution workers, it is the user

/tmp/tmp7ilh_lit/runner.py in run_fn(fn_args)
     51         hyperparams=hyperparams,
     52         log_dir=log_dir,
---> 53         base_model_dir=fn_args.base_model,
     54     )
     55 

~/home/repositories/git/oonisim/python-programs/courses/mlops-with-vertex-ai/src/model_training/trainer.py in train(train_data_dir, eval_data_dir, tft_output_dir, hyperparams, log_dir, base_model_dir)
     57     tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
     58 
---> 59     classifier = model.create_binary_classifier(tft_output, hyperparams)
     60     if base_model_dir:
     61         try:

~/home/repositories/git/oonisim/python-programs/courses/mlops-with-vertex-ai/src/model_training/model.py in create_binary_classifier(tft_output, hyperparams)
     83         )
     84 
---> 85     return _create_binary_classifier(feature_vocab_sizes, hyperparams)

~/home/repositories/git/oonisim/python-programs/courses/mlops-with-vertex-ai/src/model_training/model.py in _create_binary_classifier(feature_vocab_sizes, hyperparams)
     62             pass
     63 
---> 64     joined = keras.layers.Concatenate(name="combines_inputs")(layers)
     65     feedforward_output = keras.Sequential(
     66         [

/opt/conda/lib/python3.7/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/opt/conda/lib/python3.7/site-packages/keras/layers/merge.py in build(self, input_shape)
    517       ranks = set(len(shape) for shape in shape_set)
    518       if len(ranks) != 1:
--> 519         raise ValueError(err_msg)
    520       # Get the only rank for the set.
    521       (rank,) = ranks

ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concatenation axis. Received: input_shape=[(None, 2), (None, 4), (7,), (None, 3), (None, 1), (None, 1), (6,), (None, 3), (None, 3), (None, 1), (None, 10)]

Environment

tfx                                   1.8.0

Solution

import tfx.v1

oonisim commented Jun 13, 2022

Cause

keras.layers.experimental.preprocessing.CategoryEncoding produces an output of shape (N,), with no batch dimension, so it cannot be concatenated with the output of keras.layers.Embedding, which has shape (None, M).

Use tf.keras.layers.CategoryEncoding instead, which produces an output of shape (None, N).
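
A minimal sketch of the mismatch, independent of the notebook code (the tensor shapes and num_tokens below are illustrative only):

import tensorflow as tf
from tensorflow import keras

# A rank-1 tensor like the (N,) output of the experimental CategoryEncoding,
# next to a rank-2 batched tensor like the (None, M) Embedding output.
unbatched = tf.zeros((7,))
batched = tf.zeros((1, 4))

try:
    keras.layers.Concatenate()([batched, unbatched])
except ValueError as e:
    print(e)  # requires inputs with matching shapes except for the concat axis

# The non-experimental layer keeps the batch dimension: (batch, num_tokens).
onehot = tf.keras.layers.CategoryEncoding(num_tokens=5, output_mode="one_hot")
print(onehot(tf.constant([1, 3])).shape)  # (2, 5)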

Fix

Reference: keras - cannot concatenate Embedding and CategoryEncoding layers

src/model.py

def _create_binary_classifier(feature_vocab_sizes, hyperparams):
    input_layers = create_model_inputs()

    layers = []
    for key in input_layers:
        feature_name = features.original_name(key)
        if feature_name in features.EMBEDDING_CATEGORICAL_FEATURES:
            vocab_size = feature_vocab_sizes[feature_name]
            embedding_size = features.EMBEDDING_CATEGORICAL_FEATURES[feature_name]

            # Embedding output keeps the batch dimension: shape (None, embedding_size).
            embedding_output = keras.layers.Embedding(
                input_dim=vocab_size + 1,
                output_dim=embedding_size,
                name=f"{key}_embedding",
            )(input_layers[key])
            print(f"Embedding layer [{key}] output shape {embedding_output.shape}, output_dim {embedding_size}")

            layers.append(embedding_output)
        elif feature_name in features.ONEHOT_CATEGORICAL_FEATURE_NAMES:
            vocab_size = feature_vocab_sizes[feature_name]
            # The experimental layer dropped the batch dimension, producing shape
            # (vocab_size,), which broke the Concatenate below:
            #onehot_layer = keras.layers.experimental.preprocessing.CategoryEncoding(
            #    max_tokens=vocab_size,
            #    output_mode="binary",
            #    name=f"{key}_onehot",
            #)(input_layers[key])

            # tf.keras.layers.CategoryEncoding keeps the batch dimension: (None, vocab_size).
            onehot_layer = tf.keras.layers.CategoryEncoding(
                num_tokens=vocab_size,
                output_mode="one_hot",
                name=f"{key}_onehot",
            )(input_layers[key])

            #print(f"One-hot layer [{key}] output shape {onehot_layer.shape}")

            layers.append(onehot_layer)
        elif feature_name in features.NUMERICAL_FEATURE_NAMES:
            # Expand scalar numeric features to shape (None, 1) so they are rank 2.
            numeric_layer = tf.expand_dims(input_layers[key], -1)

            #print(f"Numeric layer [{key}] output shape {numeric_layer.shape}")

            layers.append(numeric_layer)
        else:
            pass
        
    joined = keras.layers.Concatenate(name="combines_inputs")(layers)
    feedforward_output = keras.Sequential(
        [
            keras.layers.Dense(units, activation="relu")
            for units in hyperparams["hidden_units"]
        ],
        name="feedforward_network",
    )(joined)
    logits = keras.layers.Dense(units=1, name="logits")(feedforward_output)

    model = keras.Model(inputs=input_layers, outputs=[logits])
    return model
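
As a quick sanity check (a sketch assuming the notebook's model, tft_output, and hyperparams objects are in scope), every input to the Concatenate layer should now be rank 2 with a leading batch dimension:

# Hypothetical check: inspect the shapes fed into the "combines_inputs" layer.
classifier = model.create_binary_classifier(tft_output, hyperparams)
concat_layer = classifier.get_layer("combines_inputs")
for t in concat_layer.input:
    print(t.shape)  # every shape should look like (None, k)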

Result

The model now builds successfully.

classifier = model.create_binary_classifier(tft_output, hyperparams)
classifier.summary()

keras.utils.plot_model(
    classifier, 
    show_shapes=True, 
    show_dtype=True,
    to_file='model.png'
)

(attached image: plot of the generated model)

oonisim pushed a commit to oonisim/python-programs that referenced this issue Jun 13, 2022