review old ontology IDs that do not resolve to InterLex #223
Comments
@tgbugs Currently in InterLex: 4938. I will add the 3949 entities to InterLex to close the gap left by entities that resolve in NIF-NIFSTD but not in NIFSTD-ILX. After that is complete I will move on to NIFSTD-SCR. Side note: I noticed the "NIF" IDs are the "NIFSTD" IDs for ILX and SCR. Should it be NIFSTD-NIF-mapping instead? |
@tmsincomb can you get me the list of the 3949 for review before you add them to InterLex? The NIF-NIFSTD mapping is fixed, those are the ids that only appear in the ontology, and there shouldn't be anything missing from those. There are a number of NIFSTD ids that were never in the NIF namespace at all, and we aren't going to put them there since the NIF form of the ids has never existed nor been promulgated anywhere. I think they go in the NLX-ILX mapping if they go anywhere (I don't think that mapping file has ever been committed to git). |
The SCR (Registry entries) should not be in InterLex as they are in the registry (I assume that is what you mean by SCR). We would need to figure out how to handle these - e.g. if someone puts in an SCR id we could divert to a special redirection page to explain and then send them to the resolver. |
@jgrethe They aren't. Any mapping in the NIFSTD-SCR should not be, and is not, in InterLex. We had discussed previously the desire to have the SCR results show up in InterLex search results and take people to their registry page so that they don't get added to InterLex. |
After loading the 3949, we still need to see what we are missing with regard to diseases (DO/MONDO/...), chemicals (ChEBI), and organisms (NCBI Taxonomy). This is noticeable with the term matching being done via Foundry. |
For now we should be able to use the code for loading taxon and ChEBI into the ontology to load them into InterLex (https://github.com/tgbugs/pyontutils/blob/master/nifstd/nifstd_tools/slimgen.py). Obviously, in the future we will want to use the mechanism that allows us to load and update from the ontology files directly, but that is farther out. |
@tgbugs Re: SCR ids - OK that is what I thought. Just wanted to confirm. Added a ticket to SciCrunch-UI for this. |
The potential issue for taxon and ChEBI was that a bunch of entries were missing because the ontology didn't incorporate all of NCBITaxon or ChEBI. However, the term mapping is finding terms that come from these non-included areas. |
@tgbugs Within InterLex should we set default ID to Mondo now for disease? |
Yes, we should flip disease over to the MONDO ids now. |
@tmsincomb can you cross reference what you are seeing now against the lists that we came up with in #124? |
In general I think that you need to run a bit deeper term matching on these. I have found existing terms that are in InterLex already that correspond where they were not pulled originally from the ontology when NeuroLex was loaded (re: #124 again). This time we have to deal with them. From the list of 3949 one issue is that there are 176 that are institutions. Those should not be loaded into InterLex and in theory should already be in SCR. You can filter them via the comment field. I'm guessing you just aren't including the subClassOf section for these terms in the report you sent, and that you would include them in the load into InterLex? Also, there are duplicates that we need to check over. For example cell types matched via
There are some others that seem to be coming from NIF-Organism, which is a mess (#70), but I think it is ok to pull those into InterLex and we will just deal with the cleanup later. |
For NIF-Organism - shouldn't these be in NCBI taxonomy (as #70 mentions many should be deprecated for NCBI taxonomy)? Perhaps we could do the full NCBI taxon load and then match to NIF-Organism for addition of information (ids, annotations, etc.)? |
NIF-Organism was created at a time when there was no way to easily import other ontologies. We should not be maintaining these old branches; they should be deprecated.
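A rough sketch of the institution filter mentioned above, in case it helps with the review list. Everything here is an assumption for illustration: the `comment` and `id` keys, the example ids, and the rule that an institution can be spotted by the word "institution" in its comment. The actual comment text in the report would need to be checked before relying on this.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Term = Dict[str, str]  # hypothetical record shape, e.g. {"id": ..., "comment": ...}


def looks_like_institution(term: Term) -> bool:
    """Assumed rule: the comment field mentions 'institution' (case-insensitive)."""
    return "institution" in term.get("comment", "").lower()


def split_institutions(
    terms: Iterable[Term],
    is_institution: Callable[[Term], bool] = looks_like_institution,
) -> Tuple[List[Term], List[Term]]:
    """Split terms into (to_load, institutions) without dropping any record."""
    to_load: List[Term] = []
    institutions: List[Term] = []
    for term in terms:
        (institutions if is_institution(term) else to_load).append(term)
    return to_load, institutions


# Hypothetical records, not taken from the real list of 3949.
sample = [
    {"id": "EX:1", "comment": "Institution; should live in the registry"},
    {"id": "EX:2", "comment": "cell type"},
    {"id": "EX:3"},  # no comment field at all
]
to_load, institutions = split_institutions(sample)
print([t["id"] for t in to_load], [t["id"] for t in institutions])
```

Keeping the institutions as a second list rather than discarding them makes it possible to check them against SCR afterwards.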
|
They should be deprecated and replaced by NCBITaxon, but we need to make sure that the old ids resolve so that people can find the new ones. |
So then perhaps an NCBI taxon import followed by reconciliation with the NIF-Organism ids (and any other associated ids, dbxrefs, etc.). |
There are a number of identifiers in
https://github.com/SciCrunch/NIF-Ontology/blob/dev/ttl/generated/NIF-NIFSTD-mapping.ttl
that have no corresponding entry in
https://github.com/SciCrunch/NIF-Ontology/blob/dev/ttl/generated/NIFSTD-ILX-mapping.ttl
nor in
https://github.com/SciCrunch/NIF-Ontology/blob/dev/ttl/generated/NIFSTD-SCR-mapping.ttl.
@tmsincomb We need to review these and make sure that they resolve. IIRC there is already code that does this or could do this with little additional work in https://github.com/tgbugs/pyontutils/blob/master/nifstd/nifstd_tools/mapnlxilx.py.
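As a minimal sketch of the check described above (not the logic in mapnlxilx.py, which is not reproduced here), the unresolved set is the NIFSTD side of the NIF-NIFSTD mapping minus everything that appears in either the ILX or the SCR mapping. The function name, the pair layout, and the direction of each mapping are assumptions; parsing the three `.ttl` files into pairs is left out.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]  # assumed layout: (source id, target id)


def unresolved_ids(
    nif_nifstd: Iterable[Pair],
    nifstd_ilx: Iterable[Pair],
    nifstd_scr: Iterable[Pair],
) -> Set[str]:
    """Return NIFSTD ids from the NIF-NIFSTD mapping with no ILX or SCR entry."""
    # Assumption: NIFSTD is the target in the first mapping and the source
    # in the other two. Swap the tuple positions if the files say otherwise.
    candidates = {nifstd for _nif, nifstd in nif_nifstd}
    mapped = {nifstd for nifstd, _ilx in nifstd_ilx}
    mapped |= {nifstd for nifstd, _scr in nifstd_scr}
    return candidates - mapped


# Hypothetical ids for illustration only.
missing = unresolved_ids(
    nif_nifstd=[("NIF:1", "NIFSTD:1"), ("NIF:2", "NIFSTD:2"), ("NIF:3", "NIFSTD:3")],
    nifstd_ilx=[("NIFSTD:1", "ILX:1")],
    nifstd_scr=[("NIFSTD:2", "SCR:1")],
)
print(sorted(missing))  # ['NIFSTD:3']
```

The resulting set is the list that needs manual review before anything is loaded into InterLex.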