How is "correlated disease" used in Monarch? #128
Comments
One reason I ask is that I see susceptibilities that are listed as correlated and susceptibilities that are listed as causal, i.e. POT1 |
Hi @ValWood - Thanks for submitting this question! @kevinschaper @cmungall @monicacecilia: Can you help Val out here? Thank you! |
@kevinschaper @cmungall @monicacecilia I have a blog post ready to go about PomBase using Mondo, but can't post it until someone answers Val's question. |
@nlharris I'm working on this right now. I'm just digging in but I'll try to get this for you as soon as I can. |
@amc-corey-cox my memory is that I made a biolink model PR to add causal and correlated gene categories to match the labels shown in the old UI, which is a pretty unsatisfying answer. |
Thanks for that information Kevin. I'll look up that PR. |
Okay, here is the start of an explanation of how we use correlated_disease. We have biolink:CausalGeneToDiseaseAssociation and biolink:CorrelatedGeneToDiseaseAssociation as subclasses of biolink:GeneToDiseaseAssociation. I believe this means that when there is evidence of a direct causal role of the gene for the disease, such as Mendelian heritability, we use biolink:CausalGeneToDiseaseAssociation. Any other association that links the gene causally to a disease, such as polygenic or susceptibility, would be biolink:CorrelatedGeneToDiseaseAssociation. Other associations that don't necessarily imply any form of causation would simply be biolink:GeneToDiseaseAssociation. This is my current hypothesis. I want to see if I can find these in the actual ingests, to check what evidence we're using to create these edges and so validate the above. The biolink model also has these descriptions: Does this seem reasonable or is there something obviously wrong that I've done? |
Okay, I think I have validation of the above. Here we discuss the terms Correlated or Causal gene to disease association. The associations are derived from these fields: This appears to mesh with my statements above. So, final answer to this question. We intend for 'correlated disease' to be used when a gene to disease association indicates some contribution to causing the disease condition but not including strict Mendelian association, for which we use the term 'causal'. It is possible that we've made a mistake in how these are derived and if so please bring this to our attention. However, I believe this should be correct based on what we are seeing with ATG16L1. |
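To make the rule above concrete, here is a minimal sketch of the mapping as I understand it from this thread. It is not the actual Monarch ingest code: the `Evidence` enum, its labels, and the `association_category` function are hypothetical names made up for illustration, and only the three biolink category strings come from the discussion above.

```python
from enum import Enum


class Evidence(Enum):
    """Hypothetical evidence labels; real ingests use source-specific fields."""
    MENDELIAN = "mendelian"
    POLYGENIC = "polygenic"
    SUSCEPTIBILITY = "susceptibility"
    UNSPECIFIED = "unspecified"


# Category names as quoted in this thread.
CAUSAL = "biolink:CausalGeneToDiseaseAssociation"
CORRELATED = "biolink:CorrelatedGeneToDiseaseAssociation"
GENERIC = "biolink:GeneToDiseaseAssociation"


def association_category(evidence: Evidence) -> str:
    """Map an evidence label to the association category described above."""
    if evidence is Evidence.MENDELIAN:
        # Direct causal role, e.g. strict Mendelian inheritance.
        return CAUSAL
    if evidence in (Evidence.POLYGENIC, Evidence.SUSCEPTIBILITY):
        # Contributes to the disease without being strictly Mendelian.
        return CORRELATED
    # No implied form of causation: fall back to the parent class.
    return GENERIC


if __name__ == "__main__":
    for ev in Evidence:
        print(ev.value, "->", association_category(ev))
```

Under this reading, a susceptibility association such as the ATG16L1 case would land in the correlated category, assuming the source data marks it as a susceptibility.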
This makes sense, so contributes_to should be polygenic (except I think many causal genes are classed as correlated). The POLD1 problem above would be resolved by adding the terms for the germ-line mutation diseases. There are quite a lot of inconsistencies. For example: Some of the issues are probably caused by conflating a heritable causal gene which increases susceptibility == It also seems strange for correlated genes to have definitions of the form: |
It would also be useful to have precise definitions on the Monarch website so that we could link to them. |
I guess for this it is OK |
This is great feedback @ValWood. Unfortunately, if the data we're ingesting has these marked inconsistently we will as well. However, we should also make sure we're ingesting them correctly. I'll discuss with my team how we should move forward with this. |
It is probably not a huge issue, but it would be useful to be precise about the meaning of the qualifiers. I still don't fully understand. My main issue is that genes flagged as "contributes_to" are described as "causal" for a disease in the ontology definitions. That seems misleading, and seems to be a Mondo issue rather than an ingest issue. I was chatting to the PomBase team about this in our group meeting, and we wondered why you need a qualifier AND "susceptibility to" in the term label. We wondered why the information could not be captured in the ontology rather than with a qualifier (because people frequently ignore qualifiers). |
@sabrinatoro 👀 👆 |
I think the main problem here is with the "susceptibility" terms. It is therefore correct that we have different ways to represent "susceptibility" concepts in Mondo/Monarch and their causal/correlated gene:
We need to review the representation of disease susceptibility in both Mondo and Monarch. (@monicacecilia I don't know where this falls on the priority list for both these projects. Let's discuss) |
Please describe your question, suggestion, or concern.
How is "correlated disease" used in Monarch?
I can't see where this is defined.
I had assumed it was used when a variant is correlated with a disease, but not known to be causal.
However, I see cases where it is used when the gene is causal (but not always; sometimes it's a susceptibility), for example
ATG16L1 and inflammatory bowel disease 10
I guess my question is whether "correlated disease" is:
A) always used for susceptibility (i.e. with other environmental conditions) or polygenic contributions,
OR
B) ever used for disease candidates (genetic correlation, i.e. via linkage disequilibrium)?
thanks,
Val
If your question or suggestion is specific to Mondo, please submit it here instead: https://github.com/monarch-initiative/mondo/issues