Redefine the `FeatureData[NOG]` type that is generated as output from `eggnog-annotate`, and add support for downstream analysis of this data
#49
Replies: 6 comments 1 reply
-
After doing a little research on this, I think we should make the following changes to the
* Notes on
An example of the current
@misialq, @nbokulich, @colinvwood, @ebolyen - thoughts on this?
-
The first point closely parallels what we already discussed wrt unnormalized taxonomic frequencies, e.g., those output by kraken2. I disagree that a
-
Thanks for looking into that @gregcaporaso! Here's what I think:
When looking at those two actions, I only now noticed that we are using contigs as input. Why is that? Shouldn't we rather process either reads directly or dereplicated MAGs? (I know that neither of those would solve the issue described above, but I'm just trying to think about the most probable routes that users would want to take here...)
-
Hey @gregcaporaso, some more thoughts after discussing this today with @nbokulich:
How does that sound?
-
I like it. Good call about keeping the original output accessible for users who may want that.
Works for me.
That makes sense, but I think the filtering can happen upstream on the
graph TD;
actEDS([eggnog-diamond-search])
actEA([eggnog-annotate])
artFSDB([filter-sample-data-blast6])
artFT(((FeatureTable)))
artSDB((("SampleData#91;BLAST6#93;")))
artSDN((("SampleData#91;NOG#93;")))
artFDN((("FeatureData#91;NOG#93;")))
actEDS-->artFT;
actEDS-->artSDB;
artSDB-->actEA;
artSDB-->artFSDB;
artFSDB-->artSDB;
actEA-->artSDN;
actEA-->artFDN;
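To make the upstream filtering step in the flowchart concrete, here is a minimal sketch in plain Python of filtering BLAST6-format (tabular, 12-column) search hits before they reach annotation. The function name `filter_blast6`, the thresholds, and the example rows are all hypothetical; this thread does not specify which criteria the filter would use, so e-value and percent identity are only assumptions.

```python
import csv
import io

# Standard column order of BLAST tabular output (outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_blast6(lines, max_evalue=1e-5, min_pident=0.0):
    """Return the hits that pass both thresholds.

    The thresholds are illustrative assumptions, not values from this thread.
    """
    reader = csv.DictReader(lines, fieldnames=BLAST6_COLUMNS, delimiter="\t")
    kept = []
    for row in reader:
        passes_evalue = float(row["evalue"]) <= max_evalue
        passes_pident = float(row["pident"]) >= min_pident
        if passes_evalue and passes_pident:
            kept.append(row)
    return kept


# Hypothetical hits: the second row has a weak e-value and should be dropped.
example = io.StringIO(
    "contig_1\tsubject_a\t98.5\t120\t2\t0\t1\t120\t5\t124\t1e-50\t230\n"
    "contig_1\tsubject_b\t45.0\t60\t30\t3\t10\t70\t1\t60\t0.5\t25.1\n"
    "contig_2\tsubject_c\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-20\t150\n"
)
hits = filter_blast6(example)
print([hit["sseqid"] for hit in hits])
```

Because the filtered result has the same columns as the input, it could be passed on to annotation in the same way as the unfiltered search results, which is what the `artFSDB-->artSDB` edge in the flowchart suggests.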
In my flowchart above,
Here's the eggNOG 6 paper, and the list of annotation sources including
If we go this route, I think we should only support v6 (v5 is a previous version now, and I think it's only a database update, not an emapper update). I still need to test v6 though. Let's circle back to this bit of the discussion after I've done a little more research.
-
I guess all the required information should be there already so that sounds good to me!
Agreed.
I fully support that, provided that v6 is already released officially (or would be at the time of our release). I was under the impression that it's not yet official but please correct me if I'm mistaken. I would like to highlight again that we should revisit which inputs we're actually going to support as this influences semantic types of the linked outputs - please see the other related topic: #50.
-
This isn't `FeatureData` in the way that we typically define it, in that the feature ids from the corresponding table are not the ids in this artifact. This seems more like `SampleData` to me. I'm currently investigating this, but I wanted to get an issue up now to make folks aware that this semantic type is likely going to change. We'll ultimately want to process this artifact to generate `FeatureData`.
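The id mismatch described above can be checked with a few set operations. The following is a minimal sketch in plain Python, assuming only the convention stated in this post: the ids in a `FeatureData` artifact should be the feature ids of the corresponding table. The function name `compare_ids` and all ids shown are hypothetical placeholders, not real eggNOG output.

```python
def compare_ids(table_feature_ids, annotation_ids):
    """Report how the ids of an annotation artifact relate to a table's feature ids."""
    table_ids = set(table_feature_ids)
    ann_ids = set(annotation_ids)
    return {
        "only_in_table": sorted(table_ids - ann_ids),
        "only_in_annotations": sorted(ann_ids - table_ids),
        "consistent": table_ids == ann_ids,
    }


# Hypothetical ids: the annotation artifact is keyed by something other than
# the table's feature ids, so nothing overlaps.
report = compare_ids(
    table_feature_ids=["feature_a", "feature_b"],
    annotation_ids=["query_1", "query_2", "query_3"],
)
print(report["consistent"])
```

A report with `consistent` set to `False` and no shared ids is the situation this post describes, where the artifact would need further processing before it could serve as `FeatureData` for that table.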