-
Notifications
You must be signed in to change notification settings - Fork 362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pain-points of the new Operation hooks #482
Comments
Regarding 2. I disagree - this is not always possible. However, whenever operating on an instance, it should definitely be available. This PR fixes the most glaring issues: |
Nice now we are going in the right direction. Could you go a bit into more detail on which cases where it is not possible to have the instance? |
Any operation that operates on multiple records at the same time, would not have the instance(s) available. While it would be possible to fetch from the db to provide the instance in all cases, it would not be efficient and very wasteful. See the last comment here: #481 This is what I propose to handle such scenario's when needed. |
IMHO, I feel SL has huge potential and I like where it's going, although I get a sense that some of the work, especially on design/discussion of events on model changes, we are re-creating the wheel. ORMs have been around for decades now and there are many very popular ORMs for different platforms that should be used to have great influence over LB-juggler's design. Taking a couple other ORMs couple for example, see their implementation of model events. /** @PostRemove */
public function postRemoveHandler(User $user, LifecycleEventArgs $event) { // ... }
public function postRemove(\Doctrine\ODM\MongoDB\Event\LifecycleEventArgs $eventArgs)
{
}
User::deleted(function($user)
afterDestroy: function(deleted_record, next){
Cache.sync(next);
}
User.beforeCreate(function(user) {
if (user.accessLevel > 10 && user.username !== "Boss") {
throw new Error("You can't grant this user an access level above 10!")
}
}) Bulk Events for @fabien: Model.beforeBulkCreate(function(records, fields, fn) {
// records = the first argument sent to .bulkCreate
// fields = the second argument sent to .bulkCreate
})
Model.beforeBulkDestroy(function(whereClause, fn) {
// whereClause = first argument sent to Model.destroy
}) NHibernate Intercepter Interface public void OnDelete(object entity,
object id,
object[] state,
string[] propertyNames,
IType[] types) /// <summary>
/// Method to validate the containing entity right before the save sequence for the entity will start. LLBLGen Pro will call this method right after the
/// containing entity is selected from the save queue to be saved.
/// </summary>
/// <param name="involvedEntity">The involved entity.</param>
public override void ValidateEntityBeforeSave( IEntityCore involvedEntity )
{
CustomerEntity toValidate = (CustomerEntity)involvedEntity;
if( toValidate.BillingAddress == null )
{
throw new ORMEntityValidationException( "BillingAddress can't be null", toValidate );
}
if( toValidate.VisitingAddress == null )
{
throw new ORMEntityValidationException( "VisitingAddress can't be null", toValidate );
}
base.ValidateEntityBeforeSave( involvedEntity );
} from sqlalchemy import event
# standard decorator style
@event.listens_for(SomeClass, 'after_delete')
def receive_after_delete(mapper, connection, target):
"listen for the 'after_delete' event"
# ... (event handling logic) ... Standing on the shoulders of giants
Last thoughtThe ORM becomes the foundation for the entire application. Design of its interfaces and features even at this early stage should not be taken lightly. I apologize for ending with a negative fear inducing thought; I feel the power of the SL Process Manager won't matter much if developers are walking away from the ORM. |
Thanks for gathering all this info. However, to me it only illustrates that every ORM is doing it differently, and rightly so - each has its own way of interacting with the underlying datasource and such (not to mention the different platforms and languages you've gathered). Quite some time ago we first discussed this, and did look around for existing solutions - but those were already Node/JS based event/publish/subscribe-like solutions. I think SL is on the right track with this, it just needs to be fleshed out some more. |
The similarity in some of these examples is that the Model is passed into the eventhandler on the post-delete event, which adds some color to @JonnyBGod's point. The one exception is sequelize's beforeBulkDestroy where it passes in the where clause which LB does do it appears. Its been common for other orms to separate operations that do not work on Models in memory as 'Batch' or 'Bulk' operations with their own event handlers for more clarity of purpose of the api for the developer. It appears the first cut of the new event hooks push both in-memory model and database record/doc change events into the same hooks. Is that correct? Btw, I just re-read the docs on Operation Hooks and I think it looks great for single model event instance event and observers. |
I am not an expert on the matter of implementing an ORM, but from a user perspective I have to press the point that the returned data must be consistent. The problem with the "where" clause approach is that it is not consistent. ie: if you update by id you get and id but if you update by "path" you get a path. To account for this inconsistency we would have to create very static logics. A possible solution that I thought of would be to return an array of "ids" for all manipulated documents. Of course for "after delete" events the full documents must always be passed. |
@JonnyBGod the Also, I think having a single handler function adapt to single as well as batch operations makes the API in itself, the declaration of the hooks, very consistent. The availability of |
@fabien If having only a single handler for both single and bulk operations is the way it's going to be, then can we please update the docs to specifically state that certain bulk operations will not pass in an instance? It probably would have saved @JonnyBGod some frustration and many other developers who come from other ORMs. Something along the lines of ActiveRecord's docs section #5 on "Skipping Callbacks" |
@fabien ok I realise the problems you mention and believe we can cope with that. Will wait for #483 to check all our use cases. @Jeff-Lewis Yes better explicit documentation would help in times like this. |
The most common use cases I've seen for model events are for validation, caching and auditing. If any app needed cache consistency or confidence in auditing of all changes and deletes to all docs, what model operations would an app need to avoid or be limited to use? For reference and to get a sense of demand on the topic, here are some MongoDB/Mongoose plugins that were built to audit and track changes to docs, although I'm not familiar with all of their features. Also, this topic and the developer attitudes towards it is not unique. Please read this and this. limbo written specifically to get around this issue stated above in Mongoose. There's a lot of material out there regarding this topic and plenty of examples of it's implementation. SL-LB has an opportunity to learn from them and be as good or better given that it's a green-field app. |
@Jeff-Lewis http://docs.strongloop.com/display/public/LB/Operation+hooks already states exactly what the context will be, for each of the two scenarios (single/multiple). Unlike the Rails docs you mention (perhaps I missed something obvious there). Finally, I really don't see where all the complaints are coming from - we've already came a long way from having limited, simplified hooks that you had to implement, override and/or monkey-patch into a chain yourself. Also, did you see #483 ? Based on roughly 25 real-world demands (in plugins ~ mixins) I developed internally over the past half-year, I've set out to resolve any remaining issues. This includes versioning, auditing, i18n/translations, cascading deletes, file uploads ... I was able to pull these off with the old, very bare-bones hooks, and with the new operational hooks they will be much more reliable, because they can also act upon batch operations. Before, we had to mix I'm in the middle of updating these mixins (see strongloop/loopback-boot#79 for a related discussion - way overdue, so hopefully we'll merge this soon), and I'll aim to improve the operation hooks along the way. To be clear, I don't think it makes sense to have actual instances or the affected ids available to any of the hooks in batch mode. You'll have to implement this yourself, if you need this. Check this proposal for an example of how this would work: (the So yes, I agree with you and @JonnyBGod - the first implementation of operation hooks does not fully adhere to what's needed in real-world scenarios, but the basis of the implementation ( |
The current design is the result of several iterations on how to keep the hooks simple while supporting all DAO methods provided by LoopBack, see #403 for more details. The Operations hooks are not panacea, there are situations where a different tool makes more sense -see #397.
In short, because the database drivers don't return the list of deleted ids, it's not possible to provide this information in a generic way without a significant performance overhead. Please join the discussion in #474.
Consider the following call: There is one case where we have the instance but are passing the pair where+data to the hook instead: updateAttributes of a single model instance. Please read and join the discussion in #481.
If you can make your suggestion specific and actionable then I am happy to consider it. A pull request would be even better.
This is a chicken-and-egg problem. If we don't deprecate the old code, then people don't have any reason to migrate to the new code and thus nobody is going to test the new code. Deprecations is not removal, just a notice. You can continue using deprecated hooks, they are not going away at least until 3.x is released. You can suppress the warnings by setting |
I wrote my comment above based on the issue description, I did not notice the discussion already started while I was thinking and writing my response. Thank you all for taking the time to write the comments, your feedback is important to us. Also thank you @fabien for clarifying the common misunderstandings. We all want to make the new operation hooks work well for as many common scenarios as possible. I am supper happy that you @fabien managed to convert almost all of your mixins to use these new hooks and that it allowed you to simplify your implementation (if I read your comments correctly). For me, it is confirming that the new hook design is going in the right direction. Now let's focus on fixing any specific use cases which are not supported by the new hooks. We already have #481 for updateAttributes, #474 for passing the list of deleted ids/instances to "after delete" hook. Is there anything else? Please open a new github issue for each use-case that's not supported well by the current implementation and leave a comment here linking to the issue created. |
@bajtos to clarify, I am in the middle of converting a massive list of mixins. Fixing the first ones lead me to provide the PR, however incomplete it now seems. I am determined to fix this ASAP. Because I need this done .... today! And yes, I really like the new implementation - thanks for providing this! |
Another enhancement that was not captured by a standalone issue yet: #484 Shared state for hooks invoked for a single operation |
I'll do my best to review your changes to the point when we both agree the patch is ready to be merged. However, considering how opinionated this issue and the discussion is, I really need to hear from @raymondfeng and @ritch before we can land the changes. |
Added: strongloop/loopback#838 Add ctx. isNewInstance to "before save" and "after save" hooks |
Added #480 Embedded relations don't invoke operation hooks |
@JonnyBGod check #488 |
Added #487 updateOrCreate/upsert does not set ctx.instance |
Let's add our use cases to https://github.com/strongloop/loopback-datasource-juggler/wiki/Use-cases-for-intent-based-CRUD-hooks. |
I have a plea to anybody following this issue: please check the wiki page linked by Raymond and do add your use-case if it's not described yet. We would like to make hooks easy to use for as many users as possible, which we can't do hearing from you, the users of this feature. Please be as specific as possible, include a code snippet of your current implementation where possible. |
IMO there should be a different way how to implement this feature that won't involve hooks, see #441 (comment) |
Guys, thanks for opening it up like this. Hopefully we'll get some useful feedback. I'll try to add some cases. Although I have a small app already in production using SL, the very first iteration does not use hooks. I have other apps that use hooks of other ORMs which I would like to port to SL which is part of the reason why I've been following these changes. As mentioned earlier, the most common use cases I've used and relied on hooks from ORMs for the last 15+ years fall into 3 buckets:
Auditing is one that I feel can often be under appreciated by developers of B2C apps. For B2B and internally built enterprise applications, well instrumented auditing is usually the gem that keeps an application living in the enterprise past the typical 5 year life span of enterprise apps. It provides clarity and transparency to the data and data is always king. One simple use case is a CRM, many contacts are shared by many users. Any good CRM is going to track and report on all changes to contact data when, by whom, etc. Without going into specifics, I feel some of the pain in determining optimal design and with the PR @fabien is working on, is because we are trying to push to much functionality and different behaviors into a over-simplified ORM API interface. Creating a simple API is a noble exercise as we all know KISS is best, but in this case, I feel simplicity of function purpose and intent is more important and would provide easier adoption and usage of LB ORM API. For example, many other ORMs have in-memory model instance operations and bulk-operations (ones that operate directly on the database). Many developers are used to working with instance Models/Entities with a single, simple interface, regardless if the operation is taking place on one or many model instances. Yes, CRUD with thousands of in memory instance models in one operation is going to perform like a dog, but that's what experienced ORM developers a used to. In these cases, they usually switch then to the bulk operations knowing for full well the difference in behavior and how the bulk operations need to be handled differently in their application. In some way, it's analogous to a 'basic' and 'advanced' version of the ORM API. Separating in this fashion allows developers to adapt and use the in memory model instances for 80% maybe 90% of their app. Then the remaining 10-20% the developer can use bulk (DB Direct) operations to optimize performance when needed. Thus I feel separating the CRUD and Operation Hooks into (In-Memory) Instance vs (DB Direct) Bulk operations makes the most sense. |
The following issues have been fixed, though not all of the changes were released yet:
|
I appreciate the progress being made with the hooks. I want to throw something small into this discussion regarding The database won't don't give us much information after a delete, but it typically at least gives us the count of records removed. It seems natural that this information would make its way to the Beyond that, I'd like to give a hearty +1 to what @Jeff-Lewis wrote. It feels like too much is being crammed into one place, and perhaps in a bit of a rush. I also concur, for a large majority of the code being written, performance is less important than data integrity and avoiding unlikely but easy to overlook race conditions. A separate, lower level interface for bulk or special operations would be a reasonable solution, accompanied by clear documentation regarding the limits of the general purpose interface. |
Few thoughts to shed more light on the current design and issues in the juggler implementation. 1) SQL transactions versus NoSQL Most ORMs are depending on SQL transactions to provide atomicity of operations. For example, an update operation may be implemented like this:
Because all steps 1-5 are running in the same transaction, the database guarantees that no other client will modify the row while our operation is in progress. The situation is very different in the NoSQL world. There are no transactions and each database provide different set of tools to deal with that, for example MongoDB has atomic field operations like LoopBack and loopback-datasource-juggler provide a single ORM-like API for both SQL and NoSQL databases, thus many compromises must be made to cater to the needs of these different worlds. 2) Richness of the data-manipulation API provided by default By default, each LoopBack model provides (and exposes via REST) a rich API mixing both pure-ORM-like methods like "create", NoSQL-like partial updates like "prototype.updateAttributes", complex atomic operations like "updateOrCreate" and batch operations like "updateAll". The old hooks like beforeCreate and beforeUpdate were called only by the pure-ORM-like methods, to a big confusion of loopback users, who were expecting that "updateOrCreate" or "updateAttributes" should call these hooks too. And I agree with that, a generic hook like "beforeCreate" or "before save" should be called by all public methods that create new models and that are exposed by default. That's why I came with the design of more generic intention/operation hooks that are guaranteed to be called whenever the intention/operation is executed. The complexity of these hook is mirroring the complexity of the data-access API we are providing out of the box. 3) The ORM-like subset of the API is incomplete Ideally, if LoopBack was providing a full ORM API, applications that don't care about partial/batch updates could write hooks in the following way, effectively disabling these non-ORM methods: if (ctx.instance) { /* run the hook */ }
else return next(new Error('Partial/batch mode not supported.')); Unfortunately, we don't have a method that would update a whole model instance, therefore this approach is not feasible at the moment. Both If there were methods like |
|
@Jeff-Lewis I haven't been following this debate closely, but just FYI re: the mongoose issues you referenced, mongoose will NOT pass the affected document to |
Added few more items to the list:
Note that |
Closing this as we've created other issues to track the work. |
I would like to point out that the new operation hooks implementation is completely useless for several reasons.
Really... Please study long existing implementations for this kind of behavior (try mongoose) and just copy them.
DO NOT DEPRECATE old code before the new one is COMPLETELY tested! And I mean MONTHS of production test...
One last advice that has come to my mind on many occasions with strongloop. Do proper real life use case testing on all your products. If you keep using simpleton use cases you will continue making this kind of mistakes over and over again.
We are gathering list of uses cases to get a better understanding of what LoopBack users need from hooks, please add your use case to the list if it's not there yet.
https://github.com/strongloop/loopback-datasource-juggler/wiki/Use-cases-for-intent-based-CRUD-hooks.
List of specific known issues:
upsert does not set ctx.instance #487 updateOrCreate/upsert does not set ctx.instancebefore/after validate - Implement before/after validation hook #545, afterValidate operation hook strongloop/loopback#1226The text was updated successfully, but these errors were encountered: