-
-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: initial relay integration support #65
feat: initial relay integration support #65
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #65 +/- ##
==========================================
+ Coverage 78.45% 84.71% +6.26%
==========================================
Files 10 14 +4
Lines 789 1590 +801
Branches 118 260 +142
==========================================
+ Hits 619 1347 +728
- Misses 141 191 +50
- Partials 29 52 +23 |
CodSpeed Performance ReportMerging #65 will not alter performanceComparing Summary
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still reviewing, but this is definitely over my head so far. Trying to focus on the API since I'm still wrapping my head around the implementation. Going to relay on @erikwrede and @patrick91 heavily :)
mapper = StrawberrySQLAlchemyMapper() | ||
|
||
@mapper.type(fruit_table) | ||
class Fruit(relay.Node): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think there's any way to create this type by default in the library through primary key abstraction? Doesn't have to be in this PR, but I really would love us to get to zero boilerplate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably there is a way, indeed! It goes beyond the scope of this PR, but it can be done in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There definitely is a way, but the zero boilerplate should be optional. Explicitly selecting what to include in the schema on a type level has advantages too, so I'm all for supporting both ways.
hey everyone, I still need some time to fully review this. Will get back when I have done some more testing |
Hey @erikwrede , @mattalbr I just got back from djangocon and I'll get back to this PR during the weekend now. One thing I was wondering here is regarding the fact that you need to have a session to query sqlalchemy objects. Because of that I had to add a I was thinking about a way to avoid that and I have a proposal. Let me know what you think and I can implement that in this PR: The idea is to use contextvars to provide a session that will be used during the execution of a query. Not only the relay fields I'm defining here can use it, but the dataloaders can also use it instead of requiring a Something like: current_session = contextvars.ContextVar("current-session")
class SessionProviderExtension(SchemaExtension):
def __init__(self, sessionmaker):
self.sessionmaker = sessionmaker
def on_execute(self):
token = current_session.set(self.sessionmaker())
try:
yield
finally:
current_session.reset(token) Of course we can also make the current_session be started lazely (create it when first accessing it) What do you think? |
05bf3e6
to
df6ff84
Compare
df6ff84
to
c337376
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few API clarifications and concerns, but this is AWESOME and a huge step in the right direction for this library.
Lots of things we can improve eventually, but I love the direction this is going. Thank you so much Bellini!
Patrick and Erik, I'm relying on you two to review the implementation since it's way out of my expertise, and I'm strictly keeping a user-focused lens while reviewing.
assert result.errors is None | ||
assert result.data == { | ||
"fruit": { | ||
"id": relay.to_base64("Fruit", f.id), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to understand how this relay API works. Having this returned in id
seems like it would obfuscate id
to the caller, which for a lot of situations is totally fine, but what if we wanted to do this without having an opaque id
? Is there a way to put the relay ID in a different response field, but still tell the mapper that it should use id
for mapping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is the resolved query to the frontend, the frontend clients should only know the real relay ID. In your mutations you can use relay.from_base64
and then the id to node function to get the corresponding DB entity. Where else would you need the raw ID where it is difficult to access by this paradigm?
Not sure I got the question right, so please feel free to specify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more a matter of, what if someone already has a field named ID that the client is already using, but they want to add support using this new relay functionality. It's nice to think all ids should be opaque, but that's far too opinionated for a middleware library like this to make that decision IMO.
So my question is, what does someone do if they already have a field named "id" that the client is depending on? Can they put this relay ID in a different field so as not to clobber "id"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattalbr it is part of the Node
interface and its spec to have a id
attribute which is serialized something that can globally identify the object (i.e. without even having to tell the type).
Having said that, there's a proposal to make it easier to support custom implementations (in case you want to make it opaque in some other way, or even skip it at all) which might be useful once it gets merged: strawberry-graphql/strawberry#3057
Another possibility is to expose the original id in another field.
But I would say that anyone wanting to use this is not only aware of the whole GlobalID thing, but also expects that it works like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I would say that anyone wanting to use this is not only aware of the whole GlobalID thing, but also expects that it works like that.
I personally disagree with this one :D I think the global id is mostly useful when using relay as a client, which might not be the case most of the times 😊
I think it might be worth trying to solve 3057 in Strawberry, but I haven't had the right amount of time to think about a good solution yet
if root is not None: | ||
# root won't be None when resolving nested connections. | ||
# TODO: Maybe we want to send this to a dataloader? | ||
query = getattr(root, field.python_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I almost want to log a warning here because this will be so bad for performance. But, I think we'd only want to log it once per query, rather than per resolver, which maybe we don't have an easy way to do. So maybe no action, just thinking out loud.
... ) | ||
... | ||
... @relay.connection(description="ABC") | ||
... def get_some_nodes(self, age: int) -> Iterable[SomeType]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would someone do something like this were they extend the resolver to implement custom filtering without pulling all the data in from the database? Feels like either an abstraction is missing that would allow this to modify the query instead of returning results, or it would need to have access to the pagination params
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By returning a Query
object. This is actually exactly what the default_resolver
does when you don't define one, but you could do something like:
@relay.connection(relay.ListConnection[SomeType])
def some_connection(self) -> Iterable[SomeType]:
return session.query(SomeModel)
In this case session.query(SomeModel)
will only access the database when being iterated, and what you return in here will be modified by the connection later: for ListConnection
it will slice it, producing a limit
/offset
pagination. For the KeysetConnection
I created it will do its thing
An example from strawberry-django: https://github.com/bellini666/strawberry-django-demo/blob/main/demo/order/schema.py#L28
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing, thanks!
def __init__( | ||
self, | ||
sessionmaker: _SessionMaker | None = None, | ||
keyset: Keyset | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is at field definition time, correct? Keyset I believe should come not at definition time but at query time. The Keyset gets passed to the caller as the pagination token (well technically the Marker gets passed, which includes the Keyset), which gets passed back when fetching the next page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattalbr not sure if I understood whast you mean, and maybe the name of the variable here is not the best one =P
This variable defines the fields that will be used for the pagination, i.e., that will be used to order_by
the query when you are relying on the default resolver.
You can see an example in this test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that link isn't working for me, can you point me to the test by name?
I just want to make sure that which fields will be used to order the query is defined per request. E.g. imagine we had a resolver:
def fruits(order: list[string], info: Info):
return info.context.session.query(models.Fruit).order(*order).all()
The ordering, and thus the keyset, is not fixed for the resolver -- it's determined by the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattalbr oh I see. In the way I implemented it, the ordering is defined in the field
The old link is not working anymore, here is a new one: https://github.com/strawberry-graphql/strawberry-sqlalchemy/pull/65/files#diff-e466e4061d3ec646290e8d41bfb1ccfb9f3d7f36113d7c8c9196d2e1ae9de05aR509
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having said that
Sorry that link isn't working for me, can you point me to the test by name? I just want to make sure that which fields will be used to order the query is defined per request. E.g. imagine we had a resolver:
def fruits(order: list[string], info: Info): return info.context.session.query(models.Fruit).order(*order).all()The ordering, and thus the keyset, is not fixed for the resolver -- it's determined by the client.
Having said that, your example here would actually work. keyset is only used for the automatic resolver, not for manual ones. So if you define a resolver like you did it will work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Minor revisions attached, let's figure those points out and work on shaping the rest iteratively in future releases.
About the context vars for sessions: I'd prefer to leave things in context for now, this way we don't force async on everyone and we have a clear entry point for field extensions to modify the context and thus the session.
while isinstance(field, StrawberryContainer): | ||
field = field.of_type | ||
|
||
edge_class = cast(Edge[NodeType], field) | ||
|
||
return cls( | ||
page_info=relay.PageInfo( | ||
has_next_page=page.paging.has_next, | ||
has_previous_page=page.paging.has_previous, | ||
start_cursor=page.paging.get_bookmark_at(0) if page else None, | ||
end_cursor=page.paging.get_bookmark_at(-1) if page else None, | ||
), | ||
edges=[ | ||
edge_class.resolve_edge(n, cursor=page.paging.get_bookmark_at(i)) | ||
for i, n in enumerate(page) | ||
], | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a later PR: we should optimize this to use dataloader if there is no limit or filter on the pagination here.
if isinstance(session, AsyncSession): | ||
|
||
async def resolve_async(nodes=nodes): | ||
return await session.run_sync(lambda s: resolve_nodes(s)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we falling back to sync here? If this is a limitation of sqlakeyset, we need to properly warn users about it and this is a major drawback for anyone using async sqla.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to use the sqlakeyset.get_page
API instead of the sqlakeyset.select_page
because of the way this is implemented.
sqlakeyset doesn't have a get_page
async counterpart, only a select_page
TBH I don't know much of the sqlalchemy internals and neither of sqlakeyset
, so open to try any suggestions here =P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run_sync should be fine here, right, since sqlalchemy does all the magic of making sure this still hooks into the async dbapi?
If we really need to keep this async, we would have to basically clone sqlakeyset.paging.orm_get_page (very short, so not a huge deal) to get the query and then execute it and process results.
I've tried to augment sqlakeyset but got stuck on code reviews unfortunately.
) | ||
|
||
try: | ||
return session.get(model, filter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, we should be clear about where async is supported. Not a blocker for first version, we can add async everywhere later.
raise | ||
|
||
|
||
def resolve_model_id_attr(source: Type) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we handle this via str | tuple[str]
? Parsing a string everytime seems slow and hard to understand without knowing the source. Also easier to not miss the standard for users providing custom implementations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would not work very well as the relay implementation on strawberry will do a str()
on anything that is not already a string, meaning (1, 2, 3)
would become "(1, 2, 3)"
, which is harder to parse
But TBF, multiple primary keys are not very common, so anyone using them would probably know that there must be some special handling for this.
But I can/will add a comment here about this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Composite keys are very common in some of the codebases I am familiar with. Maybe we can adjust that behavior in strawberry to also allow tuples (which are later stringified)? cc @patrick91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good to me 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bellini666 @mattalbr are you fine with merging this in and later causing a soft breaking change in the internal api by switching to tuples after we added support in strawberrry? I'd like to avoid blocking the PR for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with that, yeah! I'm already way too slow on this PR and gotta get out of the way.
mapper = StrawberrySQLAlchemyMapper() | ||
|
||
@mapper.type(fruit_table) | ||
class Fruit(relay.Node): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There definitely is a way, but the zero boilerplate should be optional. Explicitly selecting what to include in the schema on a type level has advantages too, so I'm all for supporting both ways.
assert result.errors is None | ||
assert result.data == { | ||
"fruit": { | ||
"id": relay.to_base64("Fruit", f.id), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is the resolved query to the frontend, the frontend clients should only know the real relay ID. In your mutations you can use relay.from_base64
and then the id to node function to get the corresponding DB entity. Where else would you need the raw ID where it is difficult to access by this paradigm?
Not sure I got the question right, so please feel free to specify.
@mattalbr @erikwrede think I addressed all of the comments in this PR Let me know if there's anything else missing to be able to merge this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm from my side, pending agreement on this from you:
@bellini666 @mattalbr are you fine with merging this in and later causing a soft breaking change in the internal api by switching to tuples after we added support in strawberrry? I'd like to avoid blocking the PR for that.
Thank you! @mattalbr let me know if you want any extra changes in this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last of the open questions, but looks good for a first version! Thanks so much for your help!
order_by = [] | ||
if order := kwargs.get("order"): | ||
order_by.extend([getattr(model, o) for o in order]) | ||
if field.keyset: | ||
order_by.extend(field.keyset) | ||
|
||
if order_by: | ||
query = query.order_by(*order_by) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! No need to add this "order" column yet since we don't have a fully fleshed API yet. I think you can just do:
if after := kwargs.get("after"):
order_by.extend(sqlakeyset.unserialize_bookmark(after).place)
And then specify where field.keyset is defined that it's just the default ordering, and later pages grab the order from the cursor.
d86b98f
to
9ef9785
Compare
Hey @bellini666 , have you started working on something to avoid the duplicate session setup? |
TODO:
node()