Column names are not set correctly (MIMIC III) #1738

1990PACO · 2024-04-20T09:49:50Z

Prerequisites

[X] Put an X between the brackets on this line if you have done all of the following:
- Checked the online documentation: https://mimic.mit.edu/
- Checked that your issue isn't already addressed: https://github.com/MIT-LCP/mimic-code/issues?utf8=%E2%9C%93&q=

Description

I have set up MIMIC-III locally as a PostgreSQL database.
When I fetch the column names in Python with pyodbc, some of them do not match the corresponding .csv files.

def query(table, sql):
    """
    table: input tablename to query \n
    sql: input SQL query (select * from mimiciii.table) \n
    return --> Pandas Dataframe
    """
    cnstring = f'DRIVER={{PostgreSQL ODBC Driver(UNICODE)}};SERVER={SERVER};DATABASE={DATABASE};UID={USERNAME};PWD={PASSWORD}'
    cnxn = pyodbc.connect(cnstring)
    cursor = cnxn.cursor()
    colnames = cursor.execute(f"SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '{table}';").fetchall()
    rows = cursor.execute(sql).fetchall()
    df = pd.DataFrame.from_records(data=rows, columns=[colname[0] for colname in colnames])
    df.columns = [str(i).upper() for i in df.columns]
    cursor.close()
    return df

Example:
Wrong --> ['ROW_ID', 'SUBJECT_ID', 'HADM_ID', 'ADMITTIME', 'DISCHTIME',
 'DEATHTIME', 'EDREGTIME', 'EDOUTTIME', 'HOSPITAL_EXPIRE_FLAG',
 'HAS_CHARTEVENTS_DATA', 'LANGUAGE', 'RELIGION', 'MARITAL_STATUS',
 'ETHNICITY', 'DIAGNOSIS', 'ADMISSION_TYPE', 'ADMISSION_LOCATION',
 'DISCHARGE_LOCATION', 'INSURANCE']

Correct --> ['ROW_ID', 'SUBJECT_ID', 'HADM_ID', 'ADMITTIME', 'DISCHTIME',
 'DEATHTIME', 'EDREGTIME', 'EDOUTTIME', 'DISCHARGE_LOCATION',
 'INSURANCE', 'LANGUAGE', 'RELIGION', 'MARITAL_STATUS',
 'ETHNICITY', 'DIAGNOSIS', 'ADMISSION_TYPE', 'ADMISSION_LOCATION',
 'HOSPITAL_EXPIRE_FLAG', 'HAS_CHARTEVENTS_DATA']

alistairewj · 2024-05-06T12:11:27Z

The query:

colnames = cursor.execute(f"SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '{table}';").fetchall()

returns column names in no guaranteed order, since it has no ORDER BY clause, so you wouldn't expect the order to match exactly. The two lists appear to contain the same names, just in a different order. I would verify that the two sets of columns are equal; if they are, this is expected and you just need to reorder your columns as necessary.
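A minimal sketch of that check, plus one way to sidestep the mismatch. The helper names `fetch_with_columns` and `compare_column_names` are hypothetical and not part of mimic-code. An in-memory `sqlite3` table stands in for the PostgreSQL database so the snippet runs on its own. The sketch assumes a cursor that follows the Python DB-API, which pyodbc cursors do, and reads the column names from `cursor.description` so that they come from the same result set as the rows.

```python
import sqlite3


def fetch_with_columns(cursor, sql):
    """Run sql and return (column_names, rows), names in result-set order."""
    cursor.execute(sql)
    # DB-API cursors describe each result column as a 7-item sequence;
    # item 0 is the column name.
    columns = [desc[0].upper() for desc in cursor.description]
    return columns, cursor.fetchall()


def compare_column_names(from_db, from_csv):
    """Report whether two column lists hold the same names, ignoring case."""
    db = [name.upper() for name in from_db]
    csv_cols = [name.upper() for name in from_csv]
    return {
        "same_set": set(db) == set(csv_cols),
        "same_order": db == csv_cols,
        "only_in_db": sorted(set(db) - set(csv_cols)),
        "only_in_csv": sorted(set(csv_cols) - set(db)),
    }


# Stand-in table (assumption: a tiny subset of ADMISSIONS, not real MIMIC data).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE admissions (row_id INTEGER, subject_id INTEGER, hadm_id INTEGER)")
cur.execute("INSERT INTO admissions VALUES (1, 22, 165315)")

columns, rows = fetch_with_columns(cur, "SELECT * FROM admissions")
report = compare_column_names(columns, ["ROW_ID", "HADM_ID", "SUBJECT_ID"])
print(columns)
print(report)
conn.close()
```

In the original `query` function, the resulting pair could be passed on as `pd.DataFrame.from_records(rows, columns=columns)`. If the names must still come from `INFORMATION_SCHEMA.COLUMNS`, adding `ORDER BY ordinal_position` and also filtering on `table_schema` should give the table's defined column order, though that has not been verified against this particular setup.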
