Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(tree): Add the ability to associate metadata with Node Schema #22554

Closed
wants to merge 82 commits into from

Conversation

Josmithr
Copy link
Contributor

@Josmithr Josmithr commented Sep 18, 2024

Introduces a props parameter like we support for Field Schema, but for Node Schema. Metadata is propagated to the resulting Node Schema, and is available via a readonly metadata property.

Updates the simple schema and json schema domains and associated transformations to respect the built-in description property.

Follow-up to #22538

* E.g., when converting a Node Schema to {@link https://json-schema.org/ | JSON Schema}, this description will be
* used as the `description` property.
*/
readonly description?: string | undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if using this in error messages would make sense. Either way, good data to have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hadn't thought of that. That's a good idea! Out of scope for this PR though :)

public array<const T extends TreeNodeSchema | readonly TreeNodeSchema[]>(
public array<
const T extends TreeNodeSchema | readonly TreeNodeSchema[],
const TCustomMetadata = unknown,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think adding "props" to the structurally typed APIs are a good idea.

If two people call this with different props but the same types, will they get back the same schema, discarding the props who ever was second gave, or get back differed schema with colliding names? Or do we expect the props to be somehow converted into strings and included in the structurally generated name? Given the type constraint is just "unknown" that does not seem viable.

I see no possible way this API could produce reasonable results, and its undocumented (and I assume untested since don't think there is a correct thing the implementation could do here)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting point. I admit that I haven't thought about the case where you define a schema twice with the same name. I assume that our current behavior (and guarantee?) is that you get back the "same thing" in both cases? In which case, yea - I agree this is weird that you can do it twice with difference metadata. I can't think of anything to do in that case except throw.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

props isn't intended to influence the schema identity and should only serve as a supplementary parameter to the schema factory right? Is there no way for the schema factory to disregard the props parameter when creating the schema instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

props isn't intended to influence the schema identity and should only serve as a supplementary parameter to the schema factory right? Is there no way for the schema factory to disregard the props parameter when creating the schema instance?

The data inside of the props parameter can end up in the actual schema objects. The metadata property added here is such an example. So this is a problem.

It's probably fine to only support additional options for named schemas. We can follow up on the structurally typed side of things in the future as needed. I'll take a stab at refactoring this soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think we should do if the user does sf.object("Foo", { metadata: X }) and then later does sf.object("Foo", { metadata: Y })? We could:

  1. Fail unconditionally if you try to make a schema with the same name twice
  2. Fail if you try to make a schema with the same name twice and either creation includes metadata
  3. Allow the first creation to establish metadata, but no others
  4. If there's metadata, diff it (how?), and fail if it's different
  5. Something else?

Copy link
Contributor Author

@Josmithr Josmithr Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, was heads down on the website update for the last month+. Getting back to this now.

On the surface, I like #1 above best. But, I don't think we can reasonably depend on tree shaking, etc. to ensure that the same schema definition code doesn't get run multiple times in an application, so presumably we need to be more tolerant.

For the same reason as above, I think we ultimately have to do something like what you suggested in #4, if we want to expose the metadata the way I prototyped here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, since this is only true (correct me if I am wrong) for structurally typed schemas, maybe we can get away with only supporting this (at least to start) for class-based schema?

This case seems simpler, and generally we (again, correct me if I'm wrong) would prefer for users to leverage class-based schemas for most scenarios. We could always come back to this (i.e., support node metadata for structurally based schemas) if/when there is a user need.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I've since checked in areMetadataEqual - might be what we want for the comparison

public map<const T extends TreeNodeSchema | readonly TreeNodeSchema[]>(
public map<
const T extends TreeNodeSchema | readonly TreeNodeSchema[],
const TCustomMetadata = unknown,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I missed something, this seems impossible to define a good behavior for, see my comment on the structurally named array overload.

@noencke noencke self-requested a review October 1, 2024 17:43
>(
name: Name,
fields: T,
props?: NodeSchemaProps<TCustomMetadata>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't this the names here are good.

Our schema system does define JavaScript properties on nodes based on the schema, and those properties are defined based on the fields parameter before this one. Adding a parameter called "props" could make someone thing this is specifies what properties should exist, and NodeSchemaProps sounds like it has something to do with nodes, and this really has nothing to do with nodes since it has no impact on node instance at all: it only impacts the schema.

Additionally, the name "props" is commonly used in frameworks like React for a specific purpose, and users of our library with such tools likely will have a function to convert nodes of a given schema into a props object: this sounds like its related to that, but is not.

If we are not trying to align with react "props", then we probably should avoid the term "props". As noted above I also think "properties" is misleading since this adds schema custom metadata, not properties on nodes.

Additionally this parameter is missing documentation.

While we could change the method name in the future, the TypeName NodeSchemaProps is something we will be stuck with. I think we should pick a better name.

Also, if we ever want to use some of the non-custom parts of the metadata in a strongly typed way (ex: use metadata to control API options), this API design won't work since it only captures strong types for the custom data.

Maybe for now it would make sense to go with something like:
metadata?: SchemaMetadata<TCustomMetadata>,

Then if we need to add strongly types non-custom member for specific schema, we can do something like:

metadata?: ObjectSchemaMetadata<TCustomMetadata, TApiVersion, TWhatever>,

The other option (Which avoids lots of generic parameters as more options are added) is just capturing the passed in type, and updating the extends clause to something like TMetadata extends SchemaMetadata. Then we retain all type information about the parameter passed in, and can filter out TMetadata["custom"] (or any other field) later if needed. This would be my preferred approach from a implementation perspective, though it might permit extra/unrecognized parameters on the non custom part, which we could make a runtime error but that might be a reason to prefer the the other approach.

*
* @public
*/
export interface NodeSchemaProps<out TCustomMetadata = unknown> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is only really used as a type constraint for input parameters, there is a different design pattern we can take around strong typing. Rather than requiring inferring type parameters and extending a parameterized version of this type, you could remove the type parameters, then infer those types from the members.

Ex:

export interface NodeSchemaOptions {
readonly metadata?: NodeSchemaMetadata;
readonly optionAddedInTheFutureSomeday: Foo;
}

Then the schema factory APIs can use these like TOptions["metadata"] or TOptions["optionAddedInTheFutureSomeday"]

This avoids the order dependent annoyances of getting lists of generic parameter options, and results in much simpler types (this type gets to be non-generic, and when its used, there is a single type parameter constrained to it, no matter no mater many options it captures)

Copy link
Contributor

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> [email protected] ci:linkcheck /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test ci:start 1313 linkcheck:full

1: starting server using command "npm run ci:start"
and when url "[ 'http://127.0.0.1:1313' ]" is responding with HTTP status code 200
running tests using command "npm run linkcheck:full"


> [email protected] ci:start
> http-server ./public --port 1313 --silent


> [email protected] linkcheck:full
> npm run linkcheck:fast -- --external


> [email protected] linkcheck:fast
> linkcheck http://localhost:1313 --skip-file skipped-urls.txt --external

Crawling...

Stats:
  428360 links
    3310 destination URLs
       2 URLs ignored
       0 warnings
       0 errors


@Josmithr
Copy link
Contributor Author

Replaced by #23219

@Josmithr Josmithr closed this Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: dds: tree area: dds Issues related to distributed data structures area: framework Framework is a tag for issues involving the developer framework. Eg Aqueduct base: main PRs targeted against main branch changeset-present public api change Changes to a public API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants