Quist (pronounced [kʊɪst], like "quick") is a small query language for JSON data. It is designed to be simple, easy to learn, and free of syntax errors. It achieves this by falling back to plain-text matching as soon as possible and accepting query syntax only at its strictest interpretation.
Quist is used on CourseTable to query course data.
At the highest level, you write a query like this:
query1 OR query2 OR query3
This query will return the union of the results of query1, query2, and query3.
AND is the default operator, so you can also write:
query1 query2 query3
query1 AND query2 AND query3
Both return the intersection of the results of query1, query2, and query3.
Note that you can only use one type of operator in one level. In the following:
query1 OR query2 AND query3
OR is used as the operator, and AND is simply treated as plain text!
To mix AND and OR, you can use parentheses:
query1 OR (query2 AND query3)
NOT is also supported. You can prefix any query with NOT to negate it:
NOT query1 OR NOT query2
# Equivalent to:
NOT (query1 AND query2)
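To make the equivalence above concrete, here is a minimal sketch that models queries as predicates. The combinators and, or, and not are hypothetical helpers written for this illustration; they are not part of Quist's API, and query1 and query2 are stand-ins evaluated against plain numbers.

```typescript
// Hypothetical combinators for illustration only; not Quist's API.
type Predicate<T> = (target: T) => boolean;

const and = <T>(...ps: Predicate<T>[]): Predicate<T> => (t) => ps.every((p) => p(t));
const or = <T>(...ps: Predicate<T>[]): Predicate<T> => (t) => ps.some((p) => p(t));
const not = <T>(p: Predicate<T>): Predicate<T> => (t) => !p(t);

// Stand-ins for query1 and query2.
const query1: Predicate<number> = (n) => n % 2 === 0;
const query2: Predicate<number> = (n) => n > 10;

// NOT query1 OR NOT query2
const negatedEach = or(not(query1), not(query2));
// NOT (query1 AND query2)
const negatedGroup = not(and(query1, query2));

// Both forms agree on every sample input.
const formsAgree = [1, 4, 11, 12].every((n) => negatedEach(n) === negatedGroup(n));
console.log(formsAgree); // true
```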
To discuss queries, we first need to talk about data types. Quist supports the following JSON data types:
- Categorical: typically a string with a fixed set of values.
- Numerical: a number.
- Boolean: true or false.
- Text: arbitrary string.
- Set: an array of categorical values.
In the context of CourseTable, here are some example fields:
- Categorical: school, season, type, subject
- Numerical: rating, workload, professor-rating, number, enrollment, credits
- Boolean: cancelled, conflicting, grad, fysem, colsem, discussion
- Set: skills, areas, days, info-attributes, subjects, professor-names
- Text: title, description, location
Each type corresponds to its own set of operators. If during parsing, we encounter an operator on a field that's not of the right type, we stop treating it as an operator and treat it as plain text instead!
All value below should be string literals. In Quist, a string is either space-delimited or double-quoted. For example, days:has Monday and professor-names:has "Jay Lim" are both valid.
Categorical fields support the following operators:
- field:is value: the field is exactly value.
  - This is like field = value.
- field:in value1, value2, ..., valueN: the field is one of the values. We stop looking for values when a value is followed by another string without a comma in between.
  - This is like field IN (value1, value2, ..., valueN).
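The rule for where a value list ends can be sketched as follows. This is a simplified illustration, not Quist's parser: readValueList is a hypothetical helper, it assumes the input is already split on whitespace, that each comma is attached to the value before it, and it ignores double-quoted strings.

```typescript
// Hypothetical helper for illustration; not Quist's actual parser.
// Collects values starting at `start` and returns the index of the first
// token after the list.
function readValueList(
  tokens: string[],
  start: number,
): { values: string[]; next: number } {
  const values: string[] = [];
  let next = start;
  for (const token of tokens.slice(start)) {
    next++;
    if (token.endsWith(",")) {
      // A trailing comma means another value follows.
      values.push(token.slice(0, -1));
      continue;
    }
    // No trailing comma: this is the last value in the list.
    values.push(token);
    break;
  }
  return { values, next };
}

const listTokens = "MATH, CPSC, S&DS AND is:fysem".split(/\s+/);
const parsedList = readValueList(listTokens, 0);
console.log(parsedList.values.join(" | ")); // MATH | CPSC | S&DS
console.log(listTokens[parsedList.next]); // AND
```

Here S&DS is the last value because the next token, AND, follows it without a comma.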
Querying numerical fields differs the most from the other types because you write mathematical expressions. All num below should be number literals.
- field < num: the field is less than num.
- field <= num: the field is less than or equal to num.
- field > num: the field is greater than num.
- field >= num: the field is greater than or equal to num.
- field = num: the field is equal to num.
- field != num: the field is not equal to num.
The field must appear on the left-hand side of the operator. We also support a compound expression syntax.
- num < field < num (either < could be <=)
- num > field > num (either > could be >=)
Note that you can't write other compound expressions like num < field > num. If we encounter any kind of invalid expression, we treat the whole thing as plain text, so the example above is read as just five plain-text words.
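One way to picture the compound-expression rule is the sketch below. It is an illustration under assumptions, not Quist's implementation: parseCompound and the regular expression are hypothetical, number literals are assumed to be plain decimals, and a null result stands for "treat as plain text".

```typescript
// Hypothetical sketch; the pattern and names are illustrative only.
const COMPOUND_PATTERN =
  /^(-?\d+(?:\.\d+)?)\s*(<=|<|>=|>)\s*([^\s(),:=<>!"]+)\s*(<=|<|>=|>)\s*(-?\d+(?:\.\d+)?)$/;

function compareNumbers(a: number, op: string, b: number): boolean {
  switch (op) {
    case "<":
      return a < b;
    case "<=":
      return a <= b;
    case ">":
      return a > b;
    default:
      return a >= b;
  }
}

function parseCompound(
  expr: string,
): { field: string; test: (value: number) => boolean } | null {
  const m = COMPOUND_PATTERN.exec(expr);
  if (m === null) return null;
  const first = Number(m[1]);
  const op1 = m[2] ?? "";
  const field = m[3] ?? "";
  const op2 = m[4] ?? "";
  const second = Number(m[5]);
  // Both operators must point the same way; otherwise fall back to plain text.
  if (op1.charAt(0) !== op2.charAt(0)) return null;
  return {
    field,
    test: (value) =>
      compareNumbers(first, op1, value) && compareNumbers(value, op2, second),
  };
}

const numberRange = parseCompound("300<=number<500");
console.log(numberRange?.field); // number
console.log(numberRange?.test(350)); // true
console.log(parseCompound("300 < number > 500")); // null
```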
Boolean fields support the following operators:
- is:field: the field is true.
- not:field: the field is false. The same as NOT is:field.
Set fields support the following operators:
- field:has value: the field contains value.
- field:has-all-of value1, value2, ..., valueN: the field contains all of the values (i.e. the field is a superset of the given values).
- field:has-any-of value1, value2, ..., valueN: the field contains any of the values (i.e. the field has an intersection with the given values).
- field:all-in value1, value2, ..., valueN: the field is a subset of the given values.
- field:equals value1, value2, ..., valueN: the field is exactly the same as the given values.
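The set operators above correspond to ordinary set relations. The sketch below spells them out on string arrays; the function names are hypothetical and only mirror the operators, and treating equals as order-insensitive set equality is an assumption of this sketch rather than documented behavior.

```typescript
// Hypothetical helpers mirroring the set operators; not Quist's internals.
const setHas = (field: string[], value: string): boolean => field.includes(value);

// Superset: every given value is in the field.
const setHasAllOf = (field: string[], values: string[]): boolean =>
  values.every((v) => field.includes(v));

// Non-empty intersection: at least one given value is in the field.
const setHasAnyOf = (field: string[], values: string[]): boolean =>
  values.some((v) => field.includes(v));

// Subset: every element of the field is among the given values.
const setAllIn = (field: string[], values: string[]): boolean =>
  field.every((v) => values.includes(v));

// Assumption: equality ignores order and duplicates.
const setEquals = (field: string[], values: string[]): boolean =>
  setHasAllOf(field, values) && setAllIn(field, values);

const meetingDays = ["Monday", "Wednesday"];
console.log(setHas(meetingDays, "Monday")); // true
console.log(setHasAllOf(meetingDays, ["Monday", "Friday"])); // false
console.log(setHasAnyOf(meetingDays, ["Monday", "Friday"])); // true
console.log(setAllIn(meetingDays, ["Monday", "Tuesday", "Wednesday"])); // true
console.log(setEquals(meetingDays, ["Wednesday", "Monday"])); // true
```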
Text fields support the following operators:
- field:contains value: the field contains value as any substring. Case-insensitive by normalizing both the field value and the value to lower case.
- field:contains-words value: the field contains value as whole words. For example, "photography" contains "graph" but doesn't contain it as a whole word. Case-insensitive by normalizing both the field value and the value to lower case.
  - TODO: this doesn't support multi-word values yet.
- field:matches value: the field matches value, where value is a regex pattern. The regex is compiled with the u and i flags.
  - TODO: support regex flags?
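As a rough illustration of the three text operators, the sketch below implements each on plain strings. The function names are hypothetical, and using the regex word boundary \b to define "whole word" is an assumption of this sketch; Quist's actual word splitting may differ.

```typescript
// Hypothetical helpers for illustration; not Quist's internals.

// Escape regex metacharacters so a value can be embedded in a pattern literally.
const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Substring match after lower-casing both sides.
const textContains = (field: string, value: string): boolean =>
  field.toLowerCase().includes(value.toLowerCase());

// Whole-word match for a single word (assumption: \b marks word boundaries).
const textContainsWords = (field: string, value: string): boolean =>
  new RegExp(`\\b${escapeRegExp(value.toLowerCase())}\\b`, "u").test(field.toLowerCase());

// Regex match compiled with the u and i flags, as described above.
const textMatches = (field: string, pattern: string): boolean =>
  new RegExp(pattern, "ui").test(field);

console.log(textContains("Intro to Photography", "GRAPH")); // true
console.log(textContainsWords("Intro to Photography", "graph")); // false
console.log(textContainsWords("Graph Theory", "graph")); // true
console.log(textMatches("Graph Theory", "^graph\\s")); // true
```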
For text operations specifically, we have a special field name *, which is meant to match all fields that contain text.
Also, any token that does not belong to a query is implicitly part of *:contains. For example, Hello world as a query is the same as *:contains Hello *:contains world.
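The plain-text fallback can be pictured with the small sketch below. It is a simplification: matchesPlainText is a hypothetical helper, it assumes the wildcard text has already been merged into one string, and it relies on AND being the default operator so that every token must be contained.

```typescript
// Hypothetical helper for illustration; not Quist's internals.
// Treats each whitespace-separated token as an implicit *:contains term,
// joined by the default AND operator.
function matchesPlainText(wildcardText: string, query: string): boolean {
  const haystack = wildcardText.toLowerCase();
  return query
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .every((token) => haystack.includes(token.toLowerCase()));
}

const mergedText = "Hello, World: An Introduction to Programming";
console.log(matchesPlainText(mergedText, "Hello world")); // true
console.log(matchesPlainText(mergedText, "Hello physics")); // false
```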
We have a single API: buildEvaluator. It takes two parameters describing your intended JSON shape:
- targetTypes: Defines the type of each field. Queries containing unrecognized fields or fields with wrong types will become plain text. It can have the following keys:
  - categorical
  - numeric
  - boolean
  - set
  - text

  Each key should have a Set of field names that are of that type. The field names can contain any character except whitespace and the following: ( ) , : = < > ! "
- targetGetter: Defines how to map field names in queries to actual values in the JSON. By default, it uses (data, field) => data[field], but you can customize this logic. By default, it does not support the * query. You must explicitly define what it means in targetGetter.
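If you build targetTypes from dynamic input, a quick check against the forbidden characters listed above may help. This is a sketch only: isValidFieldName is a hypothetical helper that is not exported by Quist, and rejecting the empty string is an extra assumption.

```typescript
// Hypothetical helper for illustration; not part of Quist's API.
// Matches any character that the field-name rule above forbids.
const FORBIDDEN_FIELD_CHARS = /[\s(),:=<>!"]/u;

function isValidFieldName(fieldName: string): boolean {
  // Assumption: an empty name is also rejected.
  return fieldName.length > 0 && !FORBIDDEN_FIELD_CHARS.test(fieldName);
}

console.log(isValidFieldName("professor-names")); // true
console.log(isValidFieldName("info attributes")); // false
console.log(isValidFieldName("rating>=")); // false
```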
buildEvaluator returns a query evaluator, which is a function that takes a query string and returns a predicate. This predicate is a function that takes a target value and returns a boolean indicating whether the target value satisfies the query.
For the exact signature, refer to our TypeScript definitions.
import { buildEvaluator } from 'quist';

const targetTypes = {
  boolean: new Set(['fysem', 'grad']),
  set: new Set(['professor-names']),
  categorical: new Set(['subject']),
  numeric: new Set(['number']),
  text: new Set(['title', 'description']),
};
const targetGetter = (data, field, expr) => {
  // Return the value of the field in the target.
  if (field === 'professor-names') {
    return data.professors.map((p) => p.name);
  } else if (field === '*') {
    // Wildcard field; merge all fields that should be matched based on the operation
    return `${data.title} ${data.description}`;
  }
  return data[field];
};
const evaluator = buildEvaluator(targetTypes, targetGetter);
const predicate = evaluator(
  '(subject:in MATH, CPSC, S&DS AND 300<=number<500 AND NOT professor-names:has-any-of "Bruce Wayne", "Tony Stark") OR is:fysem',
);
console.log(
  predicate({
    subject: 'MATH',
    number: 350,
    professors: [{ name: 'Peter Parker' }],
    fysem: true,
  }),
); // true
For type inference, we strongly recommend you declare your targetTypes and targetGetter inline.
const evaluator = buildEvaluator(
  {
    boolean: new Set(['fysem', 'grad']),
    set: new Set(['professor-names']),
    categorical: new Set(['subject']),
    numeric: new Set(['number']),
  },
  (data: CourseType, field, expr) => {
    if (field === 'professor-names') {
      return data.professors.map((p) => p.name);
    } else if (field === '*') {
      return `${data.title} ${data.description}`;
    }
    return data[field];
  },
);
const predicate = evaluator(
  '(subject:in MATH, CPSC, S&DS AND 300<=number<500 AND NOT professor-names:has-any-of "Bruce Wayne", "Tony Stark") OR is:fysem',
); // predicate only accepts CourseType
console.log(predicate(course));
You will notice that everything is automatically strongly typed!
Note: we are still improving type safety as much as we can.