How would I do this in PRQL? - reservation database #2498
Replies: 4 comments 4 replies
-
This was a great exercise! These are good examples.
I think PRQL could be quite helpful here at composability, since we can apply the window function to both the current occupancy relation and the one for the prior year. let stays_prior = (
from stays
derive dt = stays.year - 1years
# derive occupancy = occupancy_prior — edit: removed
)
func rolling_occupancy tbl -> (
tbl
group year (
window expanding:true (
select [dt, occupancy = (sum occupancy)]
)
)
)
from stays_r=(from stays | rolling_occupancy)
join stays_prior_r=(from stays_prior | rolling_occupancy) [==dt]
select [
stays_r.dt,
occupancy_diff = stays_r.occupancy - stays_prior_r.occupancy
] which compiles to: WITH table_3 AS (
SELECT
dt,
SUM(occupancy) OVER (
PARTITION BY year ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS _expr_0
FROM
stays
),
stays_prior AS (
SELECT
*,
year - INTERVAL 1 YEAR AS date,
occupancy_prior AS occupancy
FROM
stays
),
table_0 AS (
SELECT
dt,
SUM(occupancy) OVER (
PARTITION BY year ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS occupancy
FROM
stays_prior
)
SELECT
table_2.dt,
table_2._expr_0 - table_1.occupancy AS occupancy_diff
FROM
table_3 AS table_2
JOIN table_0 AS table_1 ON table_2.dt = table_1.dt
-- Generated by PRQL compiler version:0.8.0 (https://prql-lang.org) Example 2 Converting between rows & columns is admittedly not SQL's strongest point. Some DBs have built things to do this — e.g. https://duckdb.org/docs/sql/statements/pivot.html — but there isn't a standard way. What DB are you using? SQLite? It can be done manually with a bunch of So I would suggest checking out what the DB offers and using that... Happy to spend more time on this one if helpful |
Beta Was this translation helpful? Give feedback.
-
Sample Data... 2014-2015.csv |
Beta Was this translation helpful? Give feedback.
-
Thanks again for your suggestions about queries for counting reservations. I have a few questions:
Thanks again. |
Beta Was this translation helpful? Give feedback.
-
Hmmm... I think my data is different shape from what you describe. My table is two dates: As I work through the suggestion above, it looks as if that solution relies on summing the Update: For clarity, my example above and the sample data have columns of |
Beta Was this translation helpful? Give feedback.
-
Here's a pretty complicated use case for PRQL. (And I imagine that it's a complicated SQL query as well). I ask your indulgence for helping me learn about this stuff.
I have a database that has information about several years of reservations at a hospitality business. The interesting table has two columns:
staydate - yyyy-mm-dd indicating the night a cabin was/will be occupied
resdate - yyyy-mm-dd indicating the date the reservation was made
The table is many-to-one: a guest who stays for a week has seven rows in the table (one for each night) with successive dates in the staydate column, and all seven resdate columns contain the date the reservation was made.
Example 1: When managing a hospitality business, it's interesting to know "What is this year's occupancy compared to the occupancy for previous years as of a certain date?" For example, what's the occupancy (total nights reserved) on February 1 or April 15 compared the same date in previous years? The query would get a count of all the rows where resdate <= the date in question, for each year that we have data, and the output would be something like:
Example 2: Rather than run the query "on-demand" (that is, for any particular date), it would be helpful to print a single table showing occupancy on the first and fifteenth of each month in the form:
My question: How could I structure a PRQL (or even SQL) query for either of these examples? I assume that I might need to provide a new table defining the interesting dates -"01-01", "01-15", "02-01", "02-15", ... and concatenate the relevant years. I also realize that the second example would wind up with a variable number of columns. I would be content with a solution that gives 24 columns (first and 15th of each month) for each year.
I assume that this could turn into an extremely complicated query, so I thank everyone in advance for their thoughts here.
I'm mentioning @max-sixty @snth @eitsupi @aljazerzen to see if they have thoughts. Please feel free to ignore if you have important work. Thanks again
Update: I changed the column names above to
staydate
andresdate
to be consistent with the remainder of this threadBeta Was this translation helpful? Give feedback.
All reactions