Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MIMIC-III Concepts for Sqlite3 or DuckDB (probably now DuckDB) #1495

Open
1 task done
SphtKr opened this issue Feb 27, 2023 · 5 comments · May be fixed by #1529
Open
1 task done

MIMIC-III Concepts for Sqlite3 or DuckDB (probably now DuckDB) #1495

SphtKr opened this issue Feb 27, 2023 · 5 comments · May be fixed by #1529

Comments

@SphtKr
Copy link
Contributor

SphtKr commented Feb 27, 2023

Prerequisites

Description

Filing this because I am considering adding it: Create a set of scripts that will add the MIMIC-III concepts views to the Sqlite3 version of the database.

Has anyone tried this before, and if so any encountered difficulties? I'm having some initial success just with regexp-ing the Postgres scripts, though that may or may not be a practical solution for the whole task. Sqlite does not have functions/procedures, so some things might have to be done with python functions--which would preclude embedding them into materialized views (pretty sure), however, simply building the concepts entities as static tables may be an acceptable solution for the kinds of use cases you'd want to use this data and Sqlite for. Or maybe regexping the Postgres/BigQuery SQL will get me to views in Sqlite3 dialect.

@alistairewj
Copy link
Member

I'm not aware of a port personally. One thing to note is that the PostgreSQL concepts in this repo are actually generated from the BigQuery ones using regex. To make this simpler, I opted to create a few PostgreSQL functions which replicate the syntax of the BigQuery functions. You may find it easier to move from the BigQuery syntax to SQLite if the functions are a blocker.

You can see the script I use to convert from BigQuery to Postgres here (this is for MIMIC-IV, but the MIMIC-III one is similar): https://github.com/MIT-LCP/mimic-code/blob/main/mimic-iv/concepts/convert_bigquery_to_postgres.sh

Converting to sqlite should be possible since the queries avoided to many BigQuery specific constructs to allow conversion to psql.

@SphtKr
Copy link
Contributor Author

SphtKr commented Mar 25, 2023

Thanks. I noted those! I was hoping to do something similar, either generating things from the BigQuery or from the Postgres ones... I was making progress and got a little hung up on GENERATE_SERIES-based views, and I wasn't finding an sqlite equivalent for table-valued functions... or at least not an easy one, or one that didn't require adding additional dependencies (peewee)... I might come back to that, but have shifted for now to looking at duckdb instead, which has what looks like richer SQL support with a similar embedded design. I noticed MIMIC-IV has a duckdb builder, but MIMIC-III doesn't... if I can get the concepts built in one or the other I think that'll be useful.

On the topic of generating the SQL for one dialect from another... I also discovered sqlglot...which is super-interesting but definitely not a panacea...yet. Trying to make it do my bidding for translating the BigQuery stuff to duckdb right now... we'll see how that goes.

@alistairewj
Copy link
Member

IIRC there's only one query that uses generate series and I ended up hard coding it in psql because the conversion was indeed painful.

Build scripts for MIMIC-III in duckdb should exist in the repo now. One of the PRs from today or yesterday brought it in.

Good luck!

@SphtKr
Copy link
Contributor Author

SphtKr commented Apr 25, 2023

Build scripts for MIMIC-III in duckdb should exist in the repo now. One of the PRs from today or yesterday brought it in.

🎉

Random question on the concepts... I have now gotten as far as the pivot concept views (I think this might actually succeed!) and have an issue with pivoted_oasis.... but then I noticed that the pivot concept views don't seem to be translated to the Postgres version... am I missing them somewhere? Or are they kind of "bonus" and not included in the Postgres version? (Bottom line... would DuckDB version that excluded these be acceptable if I can't figure out how to fix it?)

@alistairewj
Copy link
Member

That's probably an oversight actually. The directory currently is not included in the list of subdirs to convert:

for d in durations comorbidity demographics firstday fluid_balance sepsis diagnosis organfailure severityscores;

There's nothing intrinsic about the pivot concepts that makes them difficult to convert to PostgreSQL. The only keywords they might use that aren't common elsewhere are LEAD and LAG, but I'd be surprised if duckdb doesn't already support those.

I wouldn't let the lack of completeness stop a contribution! I find the pivot queries extremely useful - which is why most of the MIMIC-IV concepts adopt the style out of the box (almost all the MIMIC-IV queries are id, time, <data specific features).

@SphtKr SphtKr changed the title MIMIC-III Concepts for Sqlite3 MIMIC-III Concepts for Sqlite3 or DuckDB (probably now DuckDB) Apr 26, 2023
@SphtKr SphtKr linked a pull request Apr 28, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants