Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Milan multiple compression #11083

Merged
merged 7 commits into from
Aug 9, 2022

Conversation

milanro
Copy link
Contributor

@milanro milanro commented Jul 13, 2022

Description

Facility for adding extensions to the SharedPropertyTree without duplicating the code. Added LZ4 compression algorithm.

Does this introduce a breaking change?

No

@milanro milanro requested a review from msfluid-bot as a code owner July 13, 2022 14:30
@github-actions github-actions bot added the base: main PRs targeted against main branch label Jul 13, 2022
@milanro
Copy link
Contributor Author

milanro commented Jul 13, 2022

@ruiterr @DLehenbauer This PR is an extension of already implementedcompression at PropertyDDS. This allows us to add various algorithms without code duplication.

@milanro
Copy link
Contributor Author

milanro commented Jul 19, 2022

@DLehenbauer I checked backward compatibility and now this PR can be reviewed and merged.


export class LZ4PropertyTree extends SharedPropertyTree {
public static create(runtime: IFluidDataStoreRuntime, id?: string, queryString?: string) {
return runtime.createChannel(id, DeflatedPropertyTreeFactory.Type) as LZ4PropertyTree;
Copy link
Contributor

@DLehenbauer DLehenbauer Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this should be 'LZ4PropertyTreeFactory.Type'.

When loading a document, the type string is used to find the right factory to instantiate the DDS.

The way the unit test is written, the LZ4 and Deflate factories are never both registered in the same container instance, so the bug doesn't surface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DLehenbauer thank you for finding this

@DLehenbauer
Copy link
Contributor

@milanro, @dstanesc - I wanted to introduce you to @justus-camp, who has started work on op compression at the Fluid framework layer.

@justus-camp - Milan has been adding experimental compression support to the Property DDS, and is ramping up on tackling compression at the fluid runtime level. @dstanesc has done work to synthetically create real-world data for benchmarking.

* This class contains builders of the compression methods used to compress
* of summaries and messages with the plugable compression algorithm.
*/
class CompressionMethods {
Copy link
Contributor

@DLehenbauer DLehenbauer Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most JS developers would find a function more natural than a class here:

function createCompressionMethods(encoder, decoder) {
        return {
            messageEncoder: {
                encode: (...) => { ... },
                decode: (...) => { ... },
            },
            summaryEncoder: {
                encode: (...) => { ... },
                decode: (...) => { ... },
            },
        };
    };

Possibly this function should be part of CompressedPropertyTreeFactory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DLehenbauer The idea here is complete separation of compression methods from the CompressedPropertyTreeFactory. The only relevant method here is getEncDec and how it is implemented is the responsibility of the extended Factory class. CompressionMethods is only a helper, just a utility class which helps the extended factories to reduce the code if they want this behavior but that is only one way of many. Puristic approach would require here the method getEncDec to be abstract. I have chosen the default behavior using the CompressionMethods but did not want to bind them too closely with CompressedPropertyTreeFactory so I kept them outside in another util class.

Concerning the method usage instead of class usage, I can do it but it looks to me that it might produce messy code putting various responsibilities in one place (one method). I was trying to follow single responsibility pattern and separate the functionality in dedicated methods referencing them in the returned object. I could create 5 functions instead of the class for building encoders, decoders and createCompressionMethods but it looked to me more self explaining if I put them to one class.

Nevertheless I will follow your suggestions if you still think that it is more reasonable to do it by function within the CompressedPropertyTreeFactory.

@milanro
Copy link
Contributor Author

milanro commented Aug 2, 2022

@DLehenbauer I implemented the createCompressionMethods method and fixed the Typo you suggested above.

// eslint-disable-next-line @typescript-eslint/dot-notation
change["isZipped"] = "1";
change.changeSet = zippedStr;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "zipped" implies deflate to most people. Perhaps substitute "compressed"?

decode: (transferChange: IPropertyTreeMessage) => {
// eslint-disable-next-line @typescript-eslint/dot-notation
if (transferChange["isZipped"]) {
const zipped = new Uint8Array(stringToBuffer(transferChange.changeSet, "base64"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI - In the new SharedTree, we're using "changeset" rather than "changeSet". (i.e., "changeset" is a single word, like "hashtable".)

However, 'changeSet' is okay if you prefer to be consistent with PropertyTree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is dedicated to Property DDS only and will not be used in any other context so I would suggest to stay consistent with Property DDS.

public abstract getDecodeFce();
private createCompressionMethods(encodeFn, decodeFn): ISharedPropertyTreeEncDec {
return {
messageEncoder: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: @justus-camp has landed runtime-level per-op compression yesterday:
#11208

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing to this. I will go through. At first time I do not see any reusable code which we can use here or in general summary compression (such which would determine the compression algorithm based on configuration, environment or any further conditions). The API is dedicated to the fluid operation message : unpackRuntimeMessage(message: ISequencedDocumentMessage) and then hard-coding usage of lz4. So I still need to keep this implementation as is. I would like to point you to #11366 where we could discuss the reusability.

@DLehenbauer DLehenbauer merged commit 685c9fb into microsoft:main Aug 9, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Aug 9, 2022

This commit is queued for merging with the next branch! Please ignore this PR for now. Contact @microsoft/fluid-cr-infra for help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants