Replace visualization example #9

Open
ryanlovett opened this issue Apr 12, 2024 · 1 comment

Comments

@ryanlovett
Member

@minrk Should I just replace the visualization example with the one you created, demo-ch-data.ipynb? Yours demonstrates the deep link, but also flattens the dict, uses the dataframe as input to plotting rather than JSON, and has visualizations too. I don't think it is long enough to warrant splitting up into smaller example notebooks either (link, parsing, plotting).

@minrk
Member

minrk commented Apr 15, 2024

My example notebook in the repo uses real data that we can't publish a chart with, so having an example with synthetic data is good, and I think we still want that. If you wanted to work on a synthetic data function that satisfies the CHCS data format (same as OMH in the 'body', but with the header added as well), that would help us make example charts in public.
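A possible starting point for such a synthetic data function, as a sketch only. The function name `synthetic_bp_records` is hypothetical, the body follows my understanding of the OMH blood-pressure schema, and the header fields (`id`, `creation_date_time`, `schema_id`) and the schema version string are assumptions that would need checking against what CHCS actually returns:

```python
import random
import uuid
from datetime import datetime, timedelta, timezone


def synthetic_bp_records(n, start=None, seed=0):
    """Return n synthetic blood-pressure records shaped as {"header", "body"}.

    Hypothetical helper: the header layout is an assumption, not the
    confirmed CHCS format.
    """
    rng = random.Random(seed)  # seeded so example charts are reproducible
    start = start or datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
    records = []
    for i in range(n):
        when = start + timedelta(days=i)  # one reading per day
        records.append(
            {
                # ASSUMPTION: header field names and version are placeholders
                "header": {
                    "id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
                    "creation_date_time": when.isoformat(),
                    "schema_id": {
                        "namespace": "omh",
                        "name": "blood-pressure",
                        "version": "placeholder",
                    },
                },
                # Body modeled on the OMH blood-pressure schema
                "body": {
                    "systolic_blood_pressure": {
                        "value": rng.randint(100, 140),
                        "unit": "mmHg",
                    },
                    "diastolic_blood_pressure": {
                        "value": rng.randint(60, 90),
                        "unit": "mmHg",
                    },
                    "effective_time_frame": {"date_time": when.isoformat()},
                },
            }
        )
    return records
```

Taking `record["body"]` from each item would give a list of OMH-style records that a plotting function could consume without touching real data.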

I think it's still an open question how best to transform the list of CHCS ResourceHolder objects returned from get_data into a DataFrame or something else to plot. In your example, you have a list of OMH schema-satisfying records to pass to a plotting function. I think this is good and general, regardless of where the data might come from. Producing that from the ResourceHolder objects would look like:

```python
records = ch_client.fetch_data(patient_id)
omh_bp_records = [
    record.json_content["body"]
    for record in records
    if record.resource_type == "BLOOD_PRESSURE"
]
```

In my example, I decided to preserve all of the structure of the CHCS record with something generic, which includes the header and doesn't understand any of the schemas. This produces column names that aren't very nice, but it works in general (as long as nothing interesting is in a list...). I think understanding the schema is probably a good idea, though maybe there's something to be said for column names being deducible from the schema, rather than something nice for humans.
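To illustrate what a generic, schema-agnostic flattening looks like, here is a minimal sketch. It is not the code from the notebook; `flatten_record` and the dotted key separator are hypothetical choices made for illustration:

```python
def flatten_record(record, parent_key="", sep="."):
    """Flatten nested dicts into one dict with dotted keys.

    Hypothetical helper: knows nothing about any schema. Lists are kept
    as single values, which is why anything interesting inside a list
    would not get its own columns.
    """
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts, extending the key prefix
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat
```

Applied to a record with a header and a body, this yields keys such as `body.systolic_blood_pressure.value`: deducible from the schema, but not especially friendly as column names.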

I think it's worth asking around about how folks tend to receive OMH data and if they have any existing visualizations or plotting tools, so we can get data from CHCS into the most useful shape for people already experienced working with this data.

I think something that would be useful in general is a function like my tidy_record, but one that actually understands schemas and produces nice, standard dictionaries that will work with pd.DataFrame.from_records, if you want to work on that.
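One shape such a schema-aware function could take, sketched under assumptions: the names `tidy_blood_pressure`, `TIDY_FUNCTIONS`, and `tidy_body` are hypothetical, the output column names are one arbitrary choice, and only the blood-pressure body (as I understand the OMH schema) is handled:

```python
def tidy_blood_pressure(body):
    """Turn one OMH-style blood-pressure body into a flat dict.

    ASSUMPTION: values are reported in mmHg, so the unit is folded into
    the column name instead of being kept as a separate column.
    """
    time_frame = body.get("effective_time_frame", {})
    return {
        "date_time": time_frame.get("date_time"),
        "systolic_mmHg": body["systolic_blood_pressure"]["value"],
        "diastolic_mmHg": body["diastolic_blood_pressure"]["value"],
    }


# Hypothetical registry: one tidy function per resource type
TIDY_FUNCTIONS = {"BLOOD_PRESSURE": tidy_blood_pressure}


def tidy_body(resource_type, body):
    """Dispatch to the schema-aware tidy function for resource_type."""
    try:
        tidy = TIDY_FUNCTIONS[resource_type]
    except KeyError:
        raise ValueError(
            f"no tidy function registered for {resource_type!r}"
        ) from None
    return tidy(body)
```

A list of these dictionaries has uniform, flat keys, so it should be suitable as input to `pd.DataFrame.from_records`. Supporting another schema would mean adding one function and one registry entry.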
