Any thoughts on how to expand the metacore data model to other dimensions #51
Comments
I think for now we want to keep metacore specific to datasets. But we would like to increase the number of metadata packages so more automation can be driven by metadata. I am currently working on a separate package called tfrmt, which creates an object to store display metadata and applies that metadata when a dataset is available. Additionally, I would love to build out the derivations table of metacore to make it easier to apply simple derivations like BMI or other straightforward calculations. Out of curiosity, what tasks are you thinking about?
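To make the derivations idea concrete, here is a minimal sketch in base R of a metadata table whose derivation text is evaluated against a dataset. The table layout, its column names (`variable`, `derivation`), and the `advs` example data are assumptions made up for illustration; this is not metacore's actual derivations API.

```r
# Hypothetical derivations table: one row per derived variable, with the
# derivation stored as R code in a character column.
derivations <- data.frame(
  variable   = "BMI",
  derivation = "WEIGHT / (HEIGHT / 100)^2",
  stringsAsFactors = FALSE
)

# Made-up input data: weight in kg, height in cm.
advs <- data.frame(WEIGHT = c(70, 85), HEIGHT = c(175, 180))

# Evaluate each derivation using the dataset's columns as the environment
# and store the result under the target variable name.
for (i in seq_len(nrow(derivations))) {
  expr <- parse(text = derivations$derivation[i])
  advs[[derivations$variable[i]]] <- eval(expr, envir = advs)
}
```

Evaluating free-text code from a spec is only reasonable when the spec is trusted; a real implementation would probably want to validate or restrict what can appear in the derivation column.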
Hi Christina, thanks for the quick reply. I was also thinking of metadata for TLF / output generation (which I understand you are currently developing with tfrmt). But in general other possibilities come to mind, like specifying metadata for recurrent tasks (oversight, safety reports, DSUR, integrating data from multiple studies, etc). In the end, you could also store this information in a separate table, which would be linked to the existing metacore structure (this task depends on datasets abc, produces outputs xyz, the output contains variables efg, has formatting..., etc). That's more or less the direction I was thinking. I think "metadata" is a broad term and I agree with you, maybe it makes more sense to split different metadata types into separate packages. But the scaling question cannot be completely dismissed. What do you do if you have 30 different studies for one asset? Do you create 30 independent metacore objects?
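The "separate table linked to the existing structure" idea could look roughly like the base R sketch below. The `tasks` table and all of its values are hypothetical. The `ds_vars` data frame here is a hand-built stand-in with `dataset` and `variable` columns, assumed to resemble the dataset-to-variable table of a metacore object; it is not read from a real one.

```r
# Hypothetical task table: which task depends on which dataset and
# produces which output.
tasks <- data.frame(
  task    = c("DSUR", "DSUR", "DMC"),
  dataset = c("ADSL", "ADAE", "ADSL"),
  output  = c("t_demog", "t_ae", "t_demog"),
  stringsAsFactors = FALSE
)

# Stand-in for a dataset-to-variable table (values made up).
ds_vars <- data.frame(
  dataset  = c("ADSL", "ADSL", "ADAE"),
  variable = c("USUBJID", "AGE", "AETERM"),
  stringsAsFactors = FALSE
)

# Join on the shared key to see which variables each task touches.
task_vars <- merge(tasks, ds_vars, by = "dataset")

# Variables needed for one task.
dsur_vars <- unique(task_vars$variable[task_vars$task == "DSUR"])
```

Keeping the task table outside the metacore object means the dataset metadata stays untouched, and the link is just a key (`dataset`) shared between the two tables.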
That would be my gut reaction. In order to build out some of the other things, like safety reports etc., I wonder if we could build a different object that would interact with metacore objects. But I think I need to understand a bit more about the particular changes before building anything.
This is an interesting problem - but @feigs, the metacore object itself is basically a single slice of the specifications for a particular deliverable. This is driven by existing data that a company would have on hand to support that CDISC deliverable. You're essentially asking for a versioning mechanism, which would kind of act like a layer on top of a metacore object. For a DMC, DSUR, etc., you'd have a metacore object for each of those that contains the SDTM or ADaM metadata for each deliverable. This kind of scales into a larger database structure of protocol -> deliverable -> type (i.e. SDTM, ADaM, and then there's realistically more TFL metadata) -> which leads down to the metacore object. At a larger company scale, this is kind of the import of data from an MDR into an R session. So we'd definitely need input on how to scale this. That said, I could see value in a higher-level object that talks with different metacore-type objects to query out the metadata that you need in a program.
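One way to picture the protocol -> deliverable -> type hierarchy is a small registry with one row per slice and a helper that looks up the slice you need. The sketch below uses base R only. The registry layout, the `get_spec()` helper, and the study and dataset names are all hypothetical, and plain lists stand in for real metacore objects.

```r
# Hypothetical registry: one row per metadata slice.
registry <- data.frame(
  protocol    = c("STUDY01", "STUDY01", "STUDY02"),
  deliverable = c("DSUR", "DMC", "DSUR"),
  type        = c("ADaM", "SDTM", "ADaM"),
  stringsAsFactors = FALSE
)

# List-column of placeholder specs; in practice each element would be
# a metacore object for that slice.
registry$spec <- list(
  list(datasets = c("ADSL", "ADAE")),
  list(datasets = c("DM", "AE")),
  list(datasets = c("ADSL"))
)

# Hypothetical helper: return the slice(s) matching the requested keys.
get_spec <- function(registry, protocol, deliverable, type) {
  hit <- registry$protocol == protocol &
    registry$deliverable == deliverable &
    registry$type == type
  registry$spec[hit]
}

dsur_adam <- get_spec(registry, "STUDY01", "DSUR", "ADaM")
```

A versioning layer could be approximated by adding a version column to the registry keys, though how that should interact with the underlying objects is an open design question.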
Hi @mstackhouse and @statasaurus, since this issue is still open, I would like to give my thoughts on this subject after using metacore for some months. I think, as @mstackhouse has pointed out, metacore cannot be used as a substitute for a full-fledged MDR and thus cannot account for every possible type of metadata. Maybe a higher-level object could handle metacore and other types of metadata objects. And I know you have been working on other exciting features (table metadata visualization, etc).
Hi @feigs - I think it would help if you could mock up some code examples to show us how you'd like metacore to be extended. We've discussed extensibility in metacore before, but we need to understand more about how users would like it to be extensible, so any feedback that you have here would definitely help. Furthermore, like Christina mentioned, there are some specific metadata use cases we've seen which drove the development of the package tfrmt. But we also note that there are viable use cases for things like titles and footnotes that need to be driven from external data like databases or spreadsheets. The lane that we're trying to stay in for metacore is that it is a container for external metadata, which can then be consumed by other packages like metatools or xportr to carry out actions on those metadata. Ultimately, I'm not opposed to creating separate R6 objects based on different use cases, or opening up extensibility of existing objects to add in additional non-standard tables. So let us know what / how you might like to see that implemented.
The idea of centralizing metadata into a relational data structure is a great one, although right now the current model only accounts for a "slice" of possible metadata storage within an R session. Think of one metacore object per dataset type (e.g. SDTM x ADaM), Task, Study, etc. One could of course just loop over the different dimensions and store individual metacore objects in a list, but some of this information could also be natively incorporated into the model. Are you currently discussing how to expand the structure to account for these other dimensions, or is this not part of the scope of the project?