Support for Streaming Conformer Transducer #178

andreselizondo-adestech · 2021-04-15T18:47:24Z

This PR is an attempt at adding support for the Streaming Conformer Transducer network.

The changes that have been identified are:

1. DepthwiseConv2D inside ConvModule needs to have padding='causal'.
2. MHSA layer must receive a mask that indicates which chunks to use for each timestep.
2.1. A parameter for history_window_size needs to be added to config and dataset preprocessing.
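
To illustrate the mask in the second item, here is a minimal sketch in plain Python. It assumes frames are grouped into fixed-size chunks and that each frame may attend to its own chunk plus a fixed number of preceding chunks. The function name, `chunk_size` and `history_chunks` (a stand-in for history_window_size, assumed here to be counted in chunks) are hypothetical and not taken from this PR.

```python
def chunk_attention_mask(num_frames, chunk_size, history_chunks):
    """Return mask[t][s] == True when frame t may attend to frame s.

    Assumption: a frame sees every frame in its own chunk and in the
    `history_chunks` chunks immediately before it, and nothing later.
    """
    mask = []
    for t in range(num_frames):
        t_chunk = t // chunk_size
        row = []
        for s in range(num_frames):
            s_chunk = s // chunk_size
            row.append(t_chunk - history_chunks <= s_chunk <= t_chunk)
        mask.append(row)
    return mask


mask = chunk_attention_mask(num_frames=6, chunk_size=2, history_chunks=1)
for row in mask:
    # One line per query frame, one column per key frame.
    print("".join("1" if allowed else "0" for allowed in row))
```

With these values the first chunk only sees itself, while later chunks see themselves and one chunk of history.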

Deferred:

Create MaskedTransducerTrainerGA for working with gradient accumulation. (GA is no longer supported)

All comments and edits are welcome.

andreselizondo-adestech · 2021-04-15T18:48:18Z

This PR aims to advance the TODO list in #14.

andreselizondo-adestech · 2021-04-16T16:01:28Z

@usimarit Hello!
I've run into a problem.
The Streaming Conformer Transducer (SCT) paper states that we need to convert the depthwise convolution inside the "ConvModule" to a "causal" depthwise convolution. However, this requires it to be a Conv1D, not a Conv2D.
Take a look at: tensorflow_asr/models/conformer.py#L158

My question is... why is a Conv2D being used? I've double checked with the original Conformer paper and it's supposed to be a Conv1D.
Maybe there's something that I'm missing.

andreselizondo-adestech · 2021-04-16T16:12:07Z

I noticed you're using a DepthwiseConv2D with kernel_size=(32,1).
Would you consider replacing this with a SeparableConv1D with kernel_size=(32)?
We would also need to specify a value for filters, I'd guess filters=input_dim would be good enough, no? 🤷‍♂️

Here's what that would look like:

# Proposed replacement for the DepthwiseConv2D inside ConvModule.
self.dw_conv = tf.keras.layers.SeparableConv1D(
    filters=input_dim,
    kernel_size=kernel_size,
    strides=1,
    # "causal" pads only on the left, so no future frames are used when streaming.
    padding="causal" if streaming else "same",
    name=f"{name}_dw_conv",
    depth_multiplier=depth_multiplier,
    depthwise_regularizer=kernel_regularizer,
    bias_regularizer=bias_regularizer,
)

andreselizondo-adestech · 2021-04-16T21:33:00Z

Good news @usimarit, most of the initial work is done. The model is now trainable, though a few things are still missing.
Please take a look and let me know what you think 😃

I had to modify the base Conformer class, but the changes should not affect existing behaviour.

nglehuy · 2021-04-17T04:02:21Z

> I noticed you're using a DepthwiseConv2D with kernel_size=(32,1).
> Would you consider replacing this with a SeparableConv1D with kernel_size=(32)?
> We would also need to specify a value for filters, I'd guess filters=input_dim would be good enough, no? 🤷‍♂️
>
> Here's what that would look like:
>
> self.dw_conv = tf.keras.layers.SeparableConv1D(
>     filters=input_dim,
>     kernel_size=(kernel_size), strides=1,
>     padding="same" if not streaming else "causal",
>     name=f"{name}_dw_conv",
>     depth_multiplier=depth_multiplier,
>     depthwise_regularizer=kernel_regularizer,
>     bias_regularizer=bias_regularizer
> )

Sorry for the late reply. SeparableConv1D is the DepthwiseConv2D combined with the Conv1D that comes after it, so the architecture would be wrong if you applied SeparableConv1D.
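
One way to see the difference is to count weights: a separable convolution is a depthwise convolution followed by a pointwise (1x1) convolution that mixes channels. The sketch below is a rough count that ignores biases. The helper names are hypothetical, and `channels=144` is an illustrative value, not one taken from this PR's config.

```python
def depthwise_params(kernel_size, channels, depth_multiplier=1):
    # One kernel of length kernel_size per input channel (per multiplier).
    return kernel_size * channels * depth_multiplier


def separable_params(kernel_size, channels, filters, depth_multiplier=1):
    # Depthwise stage plus a pointwise (1x1) stage that mixes channels.
    depthwise = depthwise_params(kernel_size, channels, depth_multiplier)
    pointwise = channels * depth_multiplier * filters
    return depthwise + pointwise


dw = depthwise_params(kernel_size=32, channels=144)
sep = separable_params(kernel_size=32, channels=144, filters=144)
print("depthwise only:", dw)
print("separable:", sep)
print("extra pointwise weights:", sep - dw)
```

The extra pointwise weights correspond to a channel-mixing step that a plain depthwise convolution does not perform.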

nglehuy · 2021-04-17T04:09:08Z

@andreselizondo-adestech There will be a big change in the repo structure in PR #177. Please be aware of that 😄 These changes will split the conformer file into an encoder file and a model file like this. I'm about to finish that PR, so you will have to pull main, create a new branch and cherry-pick what you've done into the new structure 😄

andreselizondo-adestech · 2021-04-17T09:24:11Z

> @andreselizondo-adestech There will be a big change in the repo structure in PR #177. Please be aware of that 😄 These changes will split the conformer file into an encoder file and a model file like this. I'm about to finish that PR, so you will have to pull main, create a new branch and cherry-pick what you've done into the new structure 😄

Understood, I'll look into the new format :)

Regarding SeparableConv1D: I now see what you mean. It seems odd to me that DepthwiseConv1D only exists as a combination of both layers. This means the implementation is supported internally, it's just not exposed for us to use.

I found this issue/PR (tensorflow/tensorflow#48557) on the TensorFlow repository. They intend to add support for the layer we need. However, the issue was opened less than 24hrs ago, so we'll have to wait and see how long it takes to be released into tf-nightly.

nglehuy · 2021-04-17T09:30:24Z

@andreselizondo-adestech We can build our own DepthwiseConv1D 😄 no need to wait until TensorFlow supports it.
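
A minimal sketch of what such a layer computes, in plain Python rather than TensorFlow so it runs anywhere. It is not the layer added in this PR. The function name and the list-based layout (`inputs[time][channel]`, `kernels[tap][channel]`) are assumptions for illustration, with stride 1 and a depth multiplier of 1.

```python
def depthwise_conv1d(inputs, kernels, causal=True):
    """Filter each channel along time with its own kernel (no channel mixing).

    causal=True pads only on the left, so output[t] depends on inputs[:t + 1].
    causal=False mimics "same" padding, which also looks at future frames.
    """
    num_frames = len(inputs)
    num_channels = len(inputs[0])
    kernel_size = len(kernels)
    left = kernel_size - 1 if causal else (kernel_size - 1) // 2
    outputs = []
    for t in range(num_frames):
        frame = []
        for c in range(num_channels):
            total = 0.0
            for k in range(kernel_size):
                src = t - left + k
                if 0 <= src < num_frames:  # out-of-range taps are zero padding
                    total += kernels[k][c] * inputs[src][c]
            frame.append(total)
        outputs.append(frame)
    return outputs


inputs = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
kernels = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]  # 3 taps, 2 channels
causal_out = depthwise_conv1d(inputs, kernels, causal=True)
same_out = depthwise_conv1d(inputs, kernels, causal=False)
print("causal:", causal_out)
print("same:  ", same_out)
```

In the causal case, changing a later frame leaves all earlier outputs unchanged, which is the property streaming needs.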

nglehuy · 2021-04-17T17:39:37Z

@andreselizondo-adestech The refactor PR is merged 😄

andreselizondo-adestech · 2021-04-19T15:37:01Z

@usimarit I'm merging my changes into the refactored code. However, there appears to be an issue using SentencePiece for training.
Specifically at line tensorflow_asr/featurizers/text_featurizers.py#L342.
Seems like this function is nowhere to be found, but at the same time, the default value for model in that function is None. So when being called from examples/conformer/train.py#L68, model is not specified and is therefore None.

nglehuy · 2021-04-19T15:45:12Z

@andreselizondo-adestech Ah yeah, I missed that part, I'll update it.

andreselizondo-adestech · 2021-04-19T18:16:16Z

@usimarit I've adapted my changes to the refactored repo and everything seems to be working. Next step is to create our own implementation of DepthwiseConv1D.

I've been digging into how TF implements SeparableConv1D, but it just calls SeparableConv2D (similar to how you did it).
So I looked into SeparableConv2D and DepthwiseConv2D, but I couldn't find the implementation of the underlying TF operation.

Could you help me out with this?

nglehuy · 2021-04-21T03:40:28Z

> @usimarit I've adapted my changes to the refactored repo and everything seems to be working. Next step is to create our own implementation of DepthwiseConv1D.
>
> I've been digging into how TF implements SeparableConv1D, but it just calls SeparableConv2D (similar to how you did it).
> So I looked into SeparableConv2D and DepthwiseConv2D, but I couldn't find the implementation of the underlying TF operation.
>
> Could you help me out with this?

Seems like it's from the TF C/C++ library 😄

andreselizondo-adestech · 2021-04-21T18:57:54Z

I'm currently running a test on two VMs: Regular Conformer vs DepthwiseConv1D Conformer.
We'll see the results in maybe ~30hrs. (I am training on the CommonVoice2 dataset though, so WER results won't be directly comparable to the paper.)

@usimarit In the meantime, I'm not sure how inference should work for the Streaming Conformer.
Can you guide me? Do you see something that's missing in the PR?

andreselizondo-adestech · 2021-04-23T17:36:45Z

@usimarit Good news! The two Conformer models converge to the same CER, meaning performance was not negatively impacted by the custom DepthwiseConv1D layer.
I trained on chars and the best CER I got was ~5.2.
I'll be training on subwords shortly.

In the meantime, I think we should look at how to do streaming inference on the Streaming Conformer Transducer.

andreselizondo-adestech · 2021-04-29T16:44:36Z

@usimarit The next step is to look at the StreamingConformer class.
I based the class on the StreamingTransducer class, so I don't know if there are methods that should be different.
Mind requesting any changes necessary? Or you could also explain to me how it should work.

nglehuy · 2021-05-16T07:52:24Z

> @usimarit The next step is to look at the StreamingConformer class.
> I based the class on the StreamingTransducer class, so I don't know if there are methods that should be different.
> Mind requesting any changes necessary? Or you could also explain to me how it should work.

I haven't had time to dive into how the StreamingConformer works in inference mode, but I think it's quite different from the RnnTransducer (previously named StreamingTransducer). I'll try to make time for this.

But anyway we should complete the whole pipeline (training, inference, testing, tflite) before merging 😄

andreselizondo-adestech · 2021-07-07T20:34:09Z

@usimarit Hey there, this is just a gentle ping.
Do you have anything that might guide me on implementing inference for Streaming Conformer? :)

nglehuy · 2021-07-23T15:57:01Z

@andreselizondo-adestech Sorry, I'm currently a bit busy until the end of July. So after that, I can go back to support this feature 😄

andreselizondo-adestech · 2021-10-28T16:10:04Z

Hello @usimarit
Are you still interested in implementing this?
Let me know if you need my help 😄

nglehuy · 2021-10-28T17:36:14Z

@andreselizondo-adestech Of course, I'll find some free time to help implement the inference for this.
In the meantime, can you help me resolve the conflicts? They are just conflicts in code formatting and imports: I changed from autopep8 to black and switched from relative imports to absolute imports (which is the recommended style).

nglehuy · 2024-05-06T15:15:06Z

@andreselizondo-adestech Hi, are you still working on this?
I think we should compute the attention mask in the MHA layer instead of in the dataset. This is kinda similar to the causal attention masking in the v2.x version of tfasr, but with history truncation and limited future context.
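
A minimal sketch of that idea in plain Python, assuming the mask is derived inside the attention computation from two frame counts. The names `history_size` and `future_size` and both helper functions are hypothetical and are not the tfasr v2.x API.

```python
import math


def limited_context_mask(num_frames, history_size, future_size):
    """mask[t][s] is True when key frame s lies in [t - history_size, t + future_size]."""
    return [
        [-history_size <= s - t <= future_size for s in range(num_frames)]
        for t in range(num_frames)
    ]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to masked-out positions."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        # Subtract the largest allowed score for numerical stability.
        peak = max(s for s, m in zip(score_row, mask_row) if m)
        exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(score_row, mask_row)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


num_frames = 5
mask = limited_context_mask(num_frames, history_size=2, future_size=1)
scores = [[0.0] * num_frames for _ in range(num_frames)]  # dummy attention logits
weights = masked_softmax(scores, mask)
for row in weights:
    print([round(w, 3) for w in row])
```

Because the mask depends only on the sequence length and the two context sizes, the dataset would not need to emit it.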

andreselizondo-adestech added 2 commits April 15, 2021 13:40

Adds parameter to specify model behaviour.

1356ceb

Initial untested commit. Training script for Streaming Conformer Transducer.

47eba7d

Renames variables.

8d70d9b

andreselizondo-adestech added 5 commits April 16, 2021 16:17

Adds ASRMaskedSliceDataset, for generating rolling masks for Streaming Conformer.

ac216b2

Changes DepthwiseConv2D for SeparableConv1D.

f4988fb

Adds StreamingConformer class. Cleanup pending.

a6bc25f

Adds MaskedTransducerTrainer for training Streaming Conformer.

a4a2b18

Configures for training Streaming Conformer.

f8491e6

andreselizondo-adestech and others added 3 commits April 19, 2021 10:07

Merge pull request #1 from TensorSpeech/main

cabbd43

Updates fork with refactoring on base repo

Adds ASRMaskedSliceDataset to refactored repo.

64a354f

Adds streaming and changes Conv2D for Conv1D to refactored repo.

52ad653

andreselizondo-adestech added 5 commits April 19, 2021 12:48

Adapts ASRMaskedSliceDataset to refactored repo.

9989dc4

Bugfix uses shape_list from shape_util.

c2e3ec6

Adds mask compatibilty for create_inputs.

e3da465

Adapts StreamingConformer to refactored repo.

0c53bc7

Adapts StreamingConformer training script to refactored repo.

27ef8dd

andreselizondo-adestech added 2 commits April 19, 2021 16:45

Adds eval_batch_size and default value.

5a9aeb0

Adds DepthwiseConv1D layer from github.

50def47

Removes problem causing imports.

3ded7e1

nglehuy requested changes Apr 24, 2021

andreselizondo-adestech added 6 commits April 26, 2021 16:49

Removes unnecessary argument.

00032a7

Removes unused lines from DepthwiseConv2D.

a0223ec

Renames DepthwiseConv1D definition script.

cfbc29e

Bufgix, typo

aadcf91

Renames model _build() to make().

d593105

Adds ASRMaskedTFRecordDataset. Fixes ASRTFRecordDataset.

9d66c2d

nglehuy reviewed Apr 28, 2021

tensorflow_asr/models/layers/depthwise_conv1d.py (resolved)

andreselizondo-adestech added 3 commits April 28, 2021 11:10

Fixes pep8 formatting.

34525a1

Adds _create_mask_tf for pure TF mask creation.

6d4bfac

Adds mask pre-compute when input max_length is defined.

Adds use of ASRMaskedTFRecordDataset.

b4f3d72

andreselizondo-adestech and others added 2 commits April 29, 2021 11:46

Merge pull request #2 from TensorSpeech/main

6a6d5a3

Updates fork

Merge branch 'main' into tmp_merge

3254450

andreselizondo-adestech marked this pull request as ready for review May 4, 2021 14:25

andreselizondo-adestech requested a review from nglehuy May 4, 2021 14:26

nglehuy reviewed May 16, 2021

tensorflow_asr/datasets/asr_dataset.py (outdated, resolved)

tensorflow_asr/datasets/asr_dataset.py (outdated, resolved)

andreselizondo-adestech added 2 commits May 18, 2021 12:01

Change request: Use math_util.get_reduced_length.

73959dd

Change request: ASRMaskedTFRecordDataset inherits from two classes.

7d743ee

andreselizondo-adestech marked this pull request as draft May 18, 2021 19:20
