-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Induced subtree call fails on pruned taxa #84
Comments
Looking at this again. Have you all (@snacktavish and @LunaSare) implemented the two-step lookup -- i.e. if it fails, remove unknown ids and try again? |
I am going to streamline and condense the error-handling logic, which is messy and implemented in multiple places. This might make it possible to provide a list of reasons for name-lookup failure like ["pruned","unvalid"], so that we could ignore nodes where failure occurs for the listed reasons, and get the induced subtree of the other nodes. I'm not sure if "broken" fits in this categorization, but it seems like this issue is only about pruning taxa with reason "pruned_ott_id". |
I am not sure we want to change the output to make reasons clearer, since people could be depending on the current reasons, which consist of only "pruned_ott_id" and "invalid_ott_id". |
https://github.com/OpenTreeOfLife/python-opentree/blob/ejm-test-cov/opentree/ot_object.py#L367 this is the logic in python-opentree for catching and continuing |
Our current failure logic for looking up names in a set of names looks as follows. It is implemented in
I have refactored this on branch Some questions include:
Comments? @mtholder @snacktavish @kcranston @jimallman |
Oooh very interesting! Lets discuss on the call tomorrow. |
Based on discussion with Emily, Mark, and Karen, current decision is that we will leave the filtering to be implemented by the caller in (say) python. Doing this inside the call makes this "too smooth", giving the impression that you can ignore name-lookup issues. The user might then realize that names are being dropped only when by chance all names are dropped. While we could try and improve errors for looking up mrcaottXottY names, almost all of the issues come from looking up ottX names. We might want a better way of determining the status of a collection of ids than calling induced_subtree and throwing away the tree. There is the secret nodes_info (vs node_info) call. Mark told Emily about it, so the secret's out. All the "pruned_ott_id" nodes should be in the taxonomy. Users might possibly want a bit more info: what flags resulted in pruning this node from synth? was this invalid ott id ever valid? Mark pointed out that we also modify OTT to handle extinct things better, so whether a node is in "OTT" is a bit tricky. (We flag some things as extinct after OTT creation. Also there is the extinct taxon "bumping" procedure.) |
E.g.
This is a bit of a pain when getting trees for large sets of taxa (e.g. all famililes), as you need to parse the error message and rerun the call.
I think it would be nice to return the pruned subtree for the taxa that are in the tree, as well as the pruned taxa message. (cc @LunaSare )
The text was updated successfully, but these errors were encountered: