There are multiple ways of achieving results similar to what this gems does using either only built-in ActiveRecord functionalities or other gems.
This is a list of some of those alternatives, explaining what issues they have or reasons to prefer this gem over them.
Use this gem, you will avoid problems and save time
- No more having to choose, case by case, which way has the less problems.
Just use
#where_assoc_*
each time and avoid every problems. - Need less raw SQL, which means less code, more clarity and less maintenance.
- Generates a single
#where
. No weird side-effects things like#eager_load
or#join
This makes well-behaved scopes, you can even have multiple conditions on the same association - Handles recursive associations correctly.
- Handles has_one correctly (Except MySQL has a limitation).
- Handles polymorphic belongs_to
Summary of the problems of the alternatives that the activerecord_where_assoc
gem solves. The following sections go in more details.
- every alternatives (except raw SQL):
- treat
#has_one
like a#has_many
. - can't handle recursive associations nicely. (ex: parent/children)
- no simple way of checking for more complex counts. (such as
less than 5
)
- treat
#joins
/#includes
:- doing
not exists
with conditions requires aLEFT JOIN
with the conditions as part of theON
, which requires raw SQL. - checking for 2 sets of conditions on different records of the same association won't work without extra things. (so your scopes can be incompatible)
- can't be used with Rails 5's
#or
. - doesn't work for polymorphic belongs_to.
- doesn't compose well.
- doing
joins
:has_many
may return duplicate records.- using
uniq
/distinct
to solve duplicate rows is an unexpected side-effect when this is in a scope.
#includes
:- triggers eagerloading, which makes your
scope
have unexpected bad performances if it's not necessary. - when using a condition, the eagerloaded records are also filtered, which is very bug-prone when in a scope.
- triggers eagerloading, which makes your
- raw SQL:
- verbose, less clear on the goal of the queries (you don't even name the association the query is about).
- need to repeat conditions from the association / default_scope.
#where_exists
gem:- No other problems than the ones common to every alternatives written above.
These are problems that affect most alternatives. Details are written in this section and just referred to by a one liner when they apply to an alternative.
Every alternative treats a has_one just like a has_many. So if any of the records (instead of only the first) matches your condition, you will get a match.
And example to clarify:
class Person < ActiveRecord::Base
has_many :addresses
has_one :current_address, -> { order("effective_date DESC") }, class_name: 'Address'
end
# This correctly matches only those whose current_address is in Montreal
Person.where_assoc_exists(:current_address, city: 'Montreal')
# In every alternatives (except raw SQL), doing a joins or anything else on :current_address will
# actually do the exact same thing as doing it on :addresses. So their effect will be identical to:
Person.where_assoc_exists(:addresses, city: 'Montreal')
The general version of this problem is to handle #limit
and #offset
on associations and in default_scopes.
#where_assoc_*
methods handle #limit
, #offset
and #has_one
correctly and checks that the records that
match the limit and the offset also match the condition.
Note: MySQL has a limitation, this makes handling has_one correctly not possible with MySQL.
Having to write the joins and conditions in raw SQL is more painful and more error prone than having a method do it for you. It hides the important details of what you are doing in a lot of verbosity.
If there are conditions set on either the association or a default_scope of the model, then you must rewrite those conditions in your manual joins and your manual sub-selects. Worst, if you add/change those conditions on the association / default_scope, then you must find every raw SQL that apply and do the same operation.
class Post < ActiveRecord::Base
# Any raw SQL doing a join or sub-select on public_comments, if it want to be representative,
# must repeat "public = true".
has_many :public_comments, -> { where(public: true) }, class_name: 'Comment'
end
class Comment < ActiveRecord::Base
# Any raw SQL doing a join or sub-select to this model, if it want to be representative,
# must repeat "deleted_at IS NULL".
default_scope -> { where(deleted_at: nil) }
end
All of this is done for you by the #where_assoc_*
methods.
Using #joins
, #includes
+ #references
, #eager_load
affects the query as a whole. This means you can't
use tools that only interact with the conditions.
For example, you can't use #or
, because that just deals with the conditions (the #where
), but if one of
your relation has a #joins
, the implicit "condition" that "there must be a record" is not part of the #where
.
Actually, the #or
will refuse to mix queries that mismatch structurally:
# Posts by an admin or that have comments.
# This will raise an exception because the joined tables of each queries are different.
Post.by_admin.or(Post.joins(:comments))
It can work, when both side use the same joins and only have different conditions.
Since the #where_assoc_*
methods only add a single #where
, they are compatible with #or
and other similar tools.
When you have recursive associations such as parent/children, you are interacting with the same table twice.
Using #joins
, #includes
+ #references
, #eager_load
automatically create an alias for you. But if you
want to have conditions, this alias can be arcane and will change as you do more such joins. Overall, this
feels complicated and won't work too well in scopes.
The last option is to use raw SQL, which has problems.
#where_assoc_*
methods handle this seamlessly. The conditions can use the real table name, so any scope can be used.
When you have a polymorphic belongs_to, you can't use #joins
or #includes
in order to do queries on it.
You have to use manual SQL (raw SQL joins) or a gem that provides the
feature, such as activerecord_where_assoc
.
#where_assoc_*
methods can handle this in 3 ways based on the
:poly_belongs_to option:
- The default will raise an exception
- You can have the gem do a
#pluck
to auto detect which models to search in, but this can be expensive - You can specify which models to search in, this has the added benefit of allowing to search for a subset only
Let's say Posts that have a comment that was reported and a comment that was made by an admin. You have
to be careful, because using #includes
and #joins
will instead return Posts what have a comment that is
both reported and made by an admin at the same time.
If you are looking for it to possibly be 2 different comments, then you either need to write the #joins
manually or to use simply use the #where_assoc_*
of this gem.
Those are the common ways given in stack overflow answers.
Post.where_assoc_exists(:comments, is_spam: true)
Post.joins(:comments).where(comments: {is_spam: true})
-
If the association maps to multiple records (such as with a has_many), then the the relation will return one record for each matching association record. In this example, you would get the same post twice if it has 2 comments that are marked as spam. Using
uniq
can solve this issue, but if you do that in a scope, then that scope unexpectedly adds a DISTINCT to your query, which can lead to unexpected results if you actually wanted duplicates for a different reason. -
Doing the opposite is a lot more complicated, as seen below. You have to include your conditions directly in the join and use a LEFT JOIN, this means writing the whole thing in raw SQL, and then you must check for the id of the association to be empty.
Post.where_assoc_not_exists(:comments, is_spam: true)
Post.joins("LEFT JOIN comments ON posts.id = comments.post_id AND comments.is_spam = true").where(comments: {id: nil})
Writing a raw join like that has yet more problems: raw SQL joins
- If you want to have another condition referring to the same association (or just the same table), then you need to write out the SQL for the second join using an alias. Therefore, your scopes are not even compatible unless each of them has a join with a unique alias.
# We want to be able to match either different or the same records
Post.where_assoc_exists(:comments, is_spam: true)
.where_assoc_exists(:comments, is_reported: true)
# Please don't ever do this, this just shows how painful it would be
# If you reach the need to do this but won't use where_assoc_exists,
# go for a regular #where("EXISTS( SELECT ...)")
Post.joins(:comments).where(comments: {is_spam: true})
.joins("JOIN comments comments_for_reported ON posts.id = comments_for_reported.post_id")
.where(comments_for_reported: {is_reported: true})
- Cannot be used with Rails 5's
#or
- Treats has_one like a has_many
- Can't handle recursive associations
- Can't handle polymorphic belongs_to
- Doesn't compose well
This solution is similar to the joins
one above, but avoids the need for uniq
. Every other problems of the
joins
remain. You also add other potential issues.
Post.where_assoc_exists(:comments, is_spam: true)
Post.eager_load(:comments).where(comments: {is_spam: true})
-
You are triggering the loading of potentially lots of records that you might not need. You don't expect a scope like
have_reported_comments
to trigger eager loading. This is a performance degradation. -
The eager loaded records of the association are actually also filtered by the conditions. All of the posts returned will only have the comments that are spam. This means if you iterate on
Post.have_reported_comments
to display each of the comments of the posts that have at least one reported comment, you are actually only going to display the reported comments. This may be what you wanted to do, but it clearly isn't intuitive. -
Simply cannot be used for complex cases.
Note: using #includes
(or #eager_load
) already does a LEFT JOIN, so it is pretty easy to do a "not exists",
but only if you don't need any condition on the association (which would normally need to be in the JOIN clause):
Post.where_assoc_exists(:comments)
Post.eager_load(:comments).where(comments: {id: nil})
This is what is gem does behind the scene, but doing it manually can lead to troubles:
-
Problems with writing raw SQL sub-selects
-
Unless you do a quite complex nested sub-selects, you will treat has_one like a has_many
https://github.com/EugZol/where_exists
An interesting gem that also does EXISTS (SELECT ... )
behind the scene. Solves most issues from ActiveRecord
only alternatives, but appears less powerful than where_assoc_exists.
-
where_exists supports polymorphic belongs_to only by always doing a
#pluck
everytime. In some situation could be a slow query if there is a lots of rows to scan. where_assoc also allows directly specifying the classes manually, avoiding the pluck and possibly filtering the choices. -
No shortcut to dig deeper in the associations
# with activerecord_where_assoc
User.where_assoc_exists([:posts, :comments])
# with where_exists
User.where_exists(:posts) { |posts| posts.where_exists(comments) }
- Has no equivalent to
#where_assoc_count
# There is no equivalent for this (posts with more than 5 comments)
Post.where_assoc_count(5, :<, :comments)
-
#where_exists
is shorter than#where_assoc_exists
, but it is also less obvious about what it does. In any case, it is trivial to alias one name to the other one. -
where_exists supports Rails 4.2 and up, while activerecord_where_assoc supports Rails 4.1 and up.