Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(tree): Add the ability to associate metadata with Node Schema #23219

Closed
wants to merge 44 commits into from

Conversation

Josmithr
Copy link
Contributor

@Josmithr Josmithr commented Nov 27, 2024

Introduces an options parameter like we support for Field Schema, but for Node Schema. Metadata is propagated to the resulting Node Schema and is available via a readonly metadata property.

Example:

class Point extends withMetadata(
	schemaFactory.object("Point", {
		x: schemaFactory.required(schemaFactory.number),
		y: schemaFactory.required(schemaFactory.number),
	}),
	{
		description: "A point in 2D space",
	},
) {}

Note: the above is only available for class-based schema, and not structural node schema. There are technical reasons why the latter is more difficult to implement, and it isn't clear that the latter scenario is particularly valuable given that we generally recommend the former pattern when possible. We can always revisit this later as needed based on user need.

Also updates the simple schema and json schema domains and associated transformations to respect the built-in description property.

Follow-up to #22538

@github-actions github-actions bot added area: dds Issues related to distributed data structures area: dds: tree area: framework Framework is a tag for issues involving the developer framework. Eg Aqueduct changeset-present public api change Changes to a public API base: main PRs targeted against main branch labels Nov 27, 2024
packages/dds/tree/src/simple-tree/api/simpleSchema.ts Outdated Show resolved Hide resolved
packages/dds/tree/src/simple-tree/leafNodeSchema.ts Outdated Show resolved Hide resolved
export interface NodeSchemaOptions<out TMetadata = NodeSchemaMetadata> {
/**
* Optional metadata to associate with the Node Schema.
* @remarks Note: this metadata is not persisted in the document.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth explaining what 'the document' here is. I imagine this might confuse end users. I'm also wondering if we should explain why you'd want to include metadata in the first place in this comment as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this to include a bit more detail, but have intentionally removed the term "document", as I don't think it's actually relevant to this. The important thing to note is that the data does not get persisted and is application-local only. As a result, changes to the data will not affect other collaborators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@noencke interested in your thoughts on the specific terminology used here. Would probably be useful to ensure it aligns with how we document these concepts elsewhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. My first instinct is to call it "the document". That's definitely been the terminology used in SharedTree over the years. But I'm not sure if it agrees with broader FF. Saying "the file" is overly prescriptive (and possibly wrong). Saying "the summary" sounds a little too technical to me. "Container" isn't right either. "Document" lines right up with what we want IMO - it means that some data has been "documented" (i.e. written down) - nothing more and nothing less.

@Josmithr Josmithr requested a review from noencke December 2, 2024 19:00
@Josmithr Josmithr force-pushed the tree/node-schema-metadata-2 branch from 6de5309 to 40e8bcf Compare December 2, 2024 19:04
@Josmithr Josmithr marked this pull request as ready for review December 2, 2024 19:41
@Josmithr Josmithr requested review from a team as code owners December 2, 2024 19:41
@Josmithr Josmithr requested review from CraigMacomber and a team December 2, 2024 19:41
* @remarks Note: this metadata is not persisted in the document.
*
* @remarks
* Note: this metadata is not persisted; it is strictly local to the application.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is even clearer? "Local to the application" could be a bit confusing since Fluid applications are built on top of client-server interactions. Would even consider something like "local to a given instance of a Fluid container / instance of SharedTree" or something along those lines.

Suggested change
* Note: this metadata is not persisted; it is strictly local to the application.
* Note: this metadata is not persisted; it is strictly local to the client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except that this data never even makes it to Fluid. It really is local to the instance of the node schema. I'm not sure what would make this the most clear, unless we already have documentation for similar "ephemeral" concepts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It gets verbose, but how about something like this: "Note: this metadata is not persisted nor made part of the collaborative state. Multiple instances of the application (e.g. multiple users across several computers) connected to the same Fluid container could all be using the same metadata if they're all running the same version of the code, but they could as easily be using different metadata if they're running different versions of the code (e.g. during a rollout of a new version)."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, something like that sounds good. I'll tweak the wording a bit, but something to the same effect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. I went with something more verbose than what I had, but slightly less verbose than your suggestion. Let me know what you think.

public map<
Name extends TName,
const T extends ImplicitAllowedTypes,
const TMetadata extends NodeSchemaMetadata,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this API was added requires modifying the signatures of all the schema production APIs.
This doesn't scale great (one feature, many APi changes, lots of parameters that need docs).

An API like

class Foo extends sf.object(....).withMetadata({...}) {}

Should be possible, and seems more orthogonal.

Specifically a withMetadata method on TreeNodeSchemaClass which returns a subclass of the "this" parameter with the metadata added.

Copy link
Contributor Author

@Josmithr Josmithr Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I like how this scales. I can play around with this.

Questions:

  1. Are any caching behaviors / identifier invariants broken by this? E.g., what happens if a developer does something like...

    class Foo extends sf.object("Foo", {...}) { }
    class Foo2 extends Foo.withMetadata({...}) { }

    Both schemas have the same identifier ("Foo"), so mixing and matching them seems unsafe. Is that right?

  2. Would it be safe to introduce the same pattern for field schema? Same questions as #1 for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to explore the "fluent-style" (i.e. chaining methods that return this) syntax for our schema APIs. In the past we've recognized that it could also be useful for parameterizing fields (e.g. sf.optional(...).withDefault(...).withEtc(...)). It might be a convenient way to decompose the giant bag of properties that currently lives in the options parameter. However, we'd need to think it through and review it, since it'd be a significant departure from the current API. In the meantime, this PR's syntax matches the way we do options for field schema, so it is consistent. If we want to add a fluent API, we can update both at once (node options and field options) in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For classes, we allow subclassing as long as there is only a single leaf used as the schema. Since we can't tell what that leaf is until its actually used, nothing should be broken by this pattern caching wise. See TreeNodeValid.constructorCached for details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For field schema, we don't care about object identity, so no issue there. However we can't put a method on ImplicitFieldSchema (which can be an array), so I think having it work via the FieldSchema constructor makes sense, though we could also add a method to FieldSchema's to replace the metadata (copy the schema with new metadata), Given our comparison semantics for field schema, that should work fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to have time to explore options before stabilizing something, you can take the fluent/method approach, but for now make it a free function so it can be @alpha and not forced to be public (it can be implemented by calling a non publicly exposed method, which could be made public to stabilize the API).

Generally I'd like to avoid arguments after the fields list since any parameters after one which is wrapped onto multiple lines makes for hard to read code. As this new APi violates that, I consider it a significant departure from our existing patterns, and impacting a bunch of methods when doing so. I'd rather have a small single new API which might be a bit odd: if we decide the method is bad and want to do it some other way, its easy to keep supported and doesn't clutter up the factory API methods with even harder to read type signatures and less clear overloading (when we have parameter count based overloads, adding optional parameters can be problematic).

Thus I don't really like adding a bunch of optional arguments to these methods, and if we do it this way, then decide we don't like it, we are left with a worse situation since deprecating several method parameters is harder/messier (and more customer impacting) than one method. In short the cost of making the wrong call on how we want the API is a lot lower if we go with the method on TreeNodeSchemaClass. Thus even if we might ultimately want the extra parameter to the factory methods, I'd err on the side of the much smaller and easier to document and deprecate APi surface of the TreeNodeSchemaClass method. Also if we decide we want a different version of this API, we can always add more methods: adding more optional parameters doesn't really work.

.changeset/grey-triangles-shout.md Outdated Show resolved Hide resolved
.changeset/grey-triangles-shout.md Outdated Show resolved Hide resolved
.changeset/grey-triangles-shout.md Outdated Show resolved Hide resolved
.changeset/grey-triangles-shout.md Outdated Show resolved Hide resolved
.changeset/grey-triangles-shout.md Outdated Show resolved Hide resolved

```

Functionality like the experimental conversion of Tree Schema to [JSON Schema](https://json-schema.org/) (`getJsonSchema`) leverages such system-understood metadata to generate useful information.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Functionality like the experimental conversion of Tree Schema to [JSON Schema](https://json-schema.org/) (`getJsonSchema`) leverages such system-understood metadata to generate useful information.
Functionality like the experimental conversion of Tree Schema to [JSON Schema](https://json-schema.org/) (`getJsonSchema`) uses such system-understood metadata to generate useful information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm not sure why the latter is better here. Both are "correct", where my wording is technically a bit more specific. I don't really have a strong opinion either way, but I don't see the benefit in the change.

@@ -460,7 +475,7 @@ describe("schemaFactory", () => {
);
const stuff = view.root.stuff;
assert(stuff instanceof NodeList);
const item = stuff[0];
const item = stuff[0] ?? oob();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to my changes, but fixes a linter warning.

Josmithr and others added 12 commits December 12, 2024 21:25
Co-authored-by: Alex Villarreal <[email protected]>
…RLs (microsoft#23286)

Ensures consistent relative file path link behavior by ensuring resolved
site URLs never include a trailing slash. See
<https://learn.microsoft.com/en-us/azure/static-web-apps/configuration#never>

Also makes some small fixes to config files for local development.


[AB#25895](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/25895)
Partial reversion of microsoft#23286.

SWA's implementation of `"trailingSlash": "never"` is currently flawed.
The resulting redirect ends up exposing the underlying SWA URL of the
site, rather than our configured front door URL. E.g.
`https://fluidframework.com/docs/concepts/signals/` ends up redirecting
to
`https://salmon-sand-0b7fa7c1e.1.azurestaticapps.net/docs/concepts/signals`.
- See Azure/static-web-apps#1036

This PR removes that configuration flag, and instead leverages
Docusaurus's
[trailingSlash](https://docusaurus.io/docs/api/docusaurus-config#trailingSlash)
configuration option to not emit trailing slashes.

This required updating a handful of relative URL links to use file path
links. The guidance documentation in the README has been updated to
offer new guidance around relative links in light of new learnings.

[AB#25895](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/25895)
@Josmithr Josmithr force-pushed the tree/node-schema-metadata-2 branch from 3f68d10 to d4e8232 Compare December 12, 2024 21:29
@github-actions github-actions bot removed area: build Build related issues area: dev experience Improving the experience of devs building on top of fluid area: driver Driver related issues area: runtime Runtime related issues area: server Server related issues (routerlicious) area: loader Loader related issues area: dds: propertydds area: website dependencies Pull requests that update a dependency file area: tests Tests to add, test infrastructure improvements, etc area: odsp-driver area: examples Changes that focus on our examples area: dds: sharedstring labels Dec 12, 2024
Copy link
Contributor

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> [email protected] ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> [email protected] serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> [email protected] check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  170006 links
    1595 destination URLs
    1825 URLs ignored
       0 warnings
       0 errors


1 similar comment
Copy link
Contributor

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> [email protected] ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> [email protected] serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> [email protected] check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  170006 links
    1595 destination URLs
    1825 URLs ignored
       0 warnings
       0 errors


@Josmithr
Copy link
Contributor Author

Replaced with #23321

@Josmithr Josmithr closed this Dec 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: contributor experience area: dds: tree area: dds Issues related to distributed data structures area: framework Framework is a tag for issues involving the developer framework. Eg Aqueduct base: main PRs targeted against main branch changeset-present public api change Changes to a public API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants