Replies: 4 comments
-
Interesting! Here are my initial thoughts. In this context I would consider a genome to be a sample. "Sample" and "feature" are intentionally generic, and a lot of the downstream things we'd do would make perfect sense in this context (e.g., many alpha/beta diversity metrics would provide useful information for comparing genomes, as would "taxa" barplots illustrating functional composition by genome).
What are you thinking falls in that category of things we'd do with a |
-
Yeah, this is what @misialq and I were discussing: a sample is basically just a collection of observations (perhaps an oversimplification), in which case a genome fits the bill.
This is what it all comes down to, of course, and so far we don't have any such actions. All current actions that operate on a feature table should probably also work with genomes, e.g., filtering, alpha and beta diversity, differential abundance tests, supervised classification, and probably even rarefaction and grouping. In the end, it feels like there may just be some theoretical, not-yet-extant edge cases, which, as @misialq proposed, we could always control with properties. Maybe the one case in which a I personally would prefer using |
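To illustrate the point that diversity metrics carry over when genomes take the place of samples, here is a minimal sketch in plain Python. The genome IDs, annotation IDs, and counts are made up for illustration, and the two functions are simple textbook implementations of Shannon entropy and Bray-Curtis dissimilarity, not the implementations used by any existing action.

```python
import math

# Hypothetical annotation counts per genome (rows = genomes, columns = annotations).
# All identifiers and numbers here are invented for illustration.
genome_table = {
    "MAG_1": {"COG0001": 4, "COG0002": 4, "COG0003": 0},
    "MAG_2": {"COG0001": 1, "COG0002": 0, "COG0003": 7},
}


def shannon(counts):
    """Shannon entropy (in bits) of one row of counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    proportions = [c / total for c in counts.values() if c > 0]
    return -sum(p * math.log2(p) for p in proportions)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two rows of counts."""
    keys = set(a) | set(b)
    numerator = sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys)
    denominator = sum(a.get(k, 0) + b.get(k, 0) for k in keys)
    return numerator / denominator if denominator else 0.0


# "Alpha diversity" per genome: how evenly its annotations are distributed.
alpha = {genome: shannon(counts) for genome, counts in genome_table.items()}

# "Beta diversity" between genomes: how different their annotation profiles are.
beta = bray_curtis(genome_table["MAG_1"], genome_table["MAG_2"])
```

Nothing in either function depends on whether a row is a sample or a genome, which is the sense in which a genome "fits the bill" here.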
-
I also felt that having a separate |
-
As far as I know, there is no mechanism for type inheritance. I feel like a property makes the most sense here too: as @nbokulich mentions, it avoids the work of updating lots and lots of relevant actions. |
-
Hey @bokulich-lab/moshpit-team,
This topic is inspired by #49 (or rather is a follow-up to it). @nbokulich and I were discussing going from MAGs to annotations, and more specifically generating some form of a "feature table" which would represent counts of observed annotations per genome (MAG). We felt that calling it a FeatureTable may not be exactly appropriate, as traditionally one axis in those is supposed to represent samples. This raised a question: what actually is a sample? Perhaps we could consider a genome as a kind of sample (representing a collection of annotations)? What do you all think?
Now, what I would like to propose here is the following:
- from FeatureData[MAG], representing a collection of dereplicated MAGs, the eggnog-annotate action could produce a GenomeData[NOG] artifact (since we effectively obtain a set of annotations per genome, and that's exactly what we designed that type for)
- a new GenomeTable[Frequency] type, which works in a very similar way to a typical FeatureTable, but with a genome axis instead of the sample axis (using a separate type lets us distinguish between samples and genomes, so that a genome table is not used where it simply should not be)
- an action that takes GenomeTable[Frequency] and the corresponding FeatureTable[Frequency] (representing frequencies of MAGs per sample; to be figured out still) and collapses those two together to produce a new FeatureTable[Frequency], this time representing samples vs. functional features/annotations
That's just a dump of some "initial" thoughts. Please let me know what you all think - thanks! 🙏
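The last step of the proposal, collapsing a genome table with a MAG-per-sample table, could be sketched as follows. This is only one possible reading: it assumes "collapse" means weighting each genome's annotation counts by that genome's frequency in the sample and summing (effectively a matrix product of samples x MAGs and MAGs x annotations). The function name `collapse`, the table layout as nested dicts, and all IDs and counts are hypothetical; the post itself leaves the details open.

```python
from collections import defaultdict

# Hypothetical FeatureTable[Frequency]: frequencies of MAGs per sample.
mag_frequencies = {
    "sample_A": {"MAG_1": 10, "MAG_2": 0},
    "sample_B": {"MAG_1": 2, "MAG_2": 5},
}

# Hypothetical GenomeTable[Frequency]: counts of annotations per genome (MAG).
genome_annotations = {
    "MAG_1": {"COG0001": 1, "COG0002": 2},
    "MAG_2": {"COG0002": 1, "COG0003": 3},
}


def collapse(sample_by_mag, mag_by_annotation):
    """Combine the two tables into a samples x annotations table.

    Assumption: each annotation count is weighted by the frequency of the
    genome carrying it, then summed over all genomes in the sample.
    """
    result = {}
    for sample, mags in sample_by_mag.items():
        row = defaultdict(int)
        for mag, frequency in mags.items():
            # MAGs without annotations contribute nothing to the row.
            for annotation, count in mag_by_annotation.get(mag, {}).items():
                row[annotation] += frequency * count
        result[sample] = dict(row)
    return result


functional_table = collapse(mag_frequencies, genome_annotations)
```

With this reading, the genome axis disappears in the result, so the output has samples on one axis again and could be treated like any other sample-by-feature table.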