How is "correlated disease" used in Monarch? #128

ValWood · 2024-06-15T16:08:12Z

Please describe your question, suggestion, or concern.

How is "correlated disease" used in Monarch?
I can't see where this is defined.

I had assumed it was used when a variant is correlated with a disease, but not known to be causal.
However, I see cases where it is used when the gene is causal (but not always; it's a susceptibility), for example:

ATG16L1 and inflammatory bowel disease 10

I guess my question is: is "correlated disease"
A) always used for susceptibility (i.e. with other environmental conditions), or polygenic contributions.
OR
B) Would it ever be used for disease candidates (genetic correlation, ie. via linkage disequilibrium?)

thanks,

Val

If your question or suggestion is specific to Mondo, please submit it here instead: https://github.com/monarch-initiative/mondo/issues

ValWood · 2024-06-15T16:12:27Z

One reason I ask is that I see susceptibilities that are listed as correlated
(e.g. POLD1)
https://monarchinitiative.org/MONDO:0012953

and susceptibilities that are listed as causal, e.g. POT1
https://monarchinitiative.org/MONDO:0014368

sagehrke · 2024-06-21T17:15:52Z

Hi @ValWood - Thanks for submitting this question!

@kevinschaper @cmungall @monicacecilia: Can you help Val out here? Thank you!

nlharris · 2024-06-24T17:46:58Z

@kevinschaper @cmungall @monicacecilia I have a blog post ready to go about PomBase using Mondo, but can't post it until someone answers Val's question.

amc-corey-cox · 2024-06-24T18:11:09Z

@nlharris I'm working on this right now. I'm just digging in but I'll try to get this for you as soon as I can.

kevinschaper · 2024-06-24T21:04:08Z

@amc-corey-cox my memory is that I made a biolink model PR to add causal and correlated gene categories to match the labels shown in the old UI, which is a pretty unsatisfying answer.

amc-corey-cox · 2024-06-24T21:10:34Z

Thanks for that information Kevin. I'll look up that PR.

amc-corey-cox · 2024-06-24T21:21:28Z

Okay, here is the start of an explanation on how we use correlated_disease.

We have biolink:CausalGeneToDiseaseAssociation and biolink:CorrelatedGeneToDiseaseAssociation as subclasses of biolink:GeneToDiseaseAssociation. I believe this means that when there is evidence of a direct causal role of the gene for the disease, such as Mendelian heritability, we use biolink:CausalGeneToDiseaseAssociation. Any other association that links the gene causally to a disease, such as polygenic or susceptibility, would be biolink:CorrelatedGeneToDiseaseAssociation.

Other associations that don't necessarily imply any form of causation would be simply biolink:GeneToDiseaseAssociation.

This is my current hypothesis of the explanation. I want to see if I can find these in the actual ingests to see what evidence we're using to create these edges in order to validate the above.

The biolink model also has these descriptions:
biolink:GeneToDiseaseAssociation: gene in which variation is correlated with the disease, may be protective or causative or associative, or as a model
biolink:CausalGeneToDiseaseAssociation: gene in which variation is shown to cause the disease.
biolink:CorrelatedGeneToDiseaseAssociation: gene in which variation is shown to correlate with the disease.

Does this seem reasonable or is there something obviously wrong that I've done?

amc-corey-cox · 2024-06-24T22:12:13Z

Okay, I think I have validation of the above. Here we discuss the terms Correlated or Causal gene to disease association.
https://monarch-initiative.github.io/monarch-ingest/Sources/hpoa/#gene-to-disease

The associations are derived from these fields:
MENDELIAN: biolink:causes
POLYGENIC: biolink:contributes_to
UNKNOWN: biolink:gene_associated_with_condition

This appears to mesh with my statements above. So, final answer to this question. We intend for 'correlated disease' to be used when a gene to disease association indicates some contribution to causing the disease condition but not including strict Mendelian association, for which we use the term 'causal'. It is possible that we've made a mistake in how these are derived and if so please bring this to our attention. However, I believe this should be correct based on what we are seeing with ATG16L1.
Further in answer to the question of "genetic correlation, ie. via linkage disequilibrium", I believe we intend to use biolink:GeneToDiseaseAssociation for these broader correlations. Again, please let us know if this appears to be inconsistent.
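
As a rough illustration of the mapping described above, here is a minimal Python sketch. The names `ASSOCIATION_TYPE_TO_EDGE` and `classify_gene_to_disease` are hypothetical and are not taken from the monarch-ingest code. Pairing each predicate with an association category follows the interpretation given in this comment and is an assumption, not something confirmed against the ingest itself.

```python
# Hypothetical sketch; not the actual monarch-ingest implementation.
# Keys are the association types listed above. The predicate comes from that
# list; the paired category is this thread's interpretation (an assumption).
ASSOCIATION_TYPE_TO_EDGE = {
    "MENDELIAN": (
        "biolink:causes",
        "biolink:CausalGeneToDiseaseAssociation",
    ),
    "POLYGENIC": (
        "biolink:contributes_to",
        "biolink:CorrelatedGeneToDiseaseAssociation",
    ),
    "UNKNOWN": (
        "biolink:gene_associated_with_condition",
        "biolink:GeneToDiseaseAssociation",
    ),
}


def classify_gene_to_disease(association_type: str) -> dict:
    """Return the predicate and category for a source association type."""
    key = association_type.strip().upper()
    if key not in ASSOCIATION_TYPE_TO_EDGE:
        # Fail loudly rather than silently defaulting to a broad association.
        raise ValueError(f"Unrecognised association type: {association_type!r}")
    predicate, category = ASSOCIATION_TYPE_TO_EDGE[key]
    return {"predicate": predicate, "category": category}


if __name__ == "__main__":
    for source_type in ("MENDELIAN", "POLYGENIC", "UNKNOWN"):
        print(source_type, classify_gene_to_disease(source_type))
```

Under this reading, a gene shown as having a "correlated disease" would be one whose source record was typed POLYGENIC, which is why inconsistencies in the source typing would carry through to the display.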

ValWood · 2024-06-25T05:50:06Z

This makes sense, so contributes_to should be polygenic (except I think many causal genes are classed as correlated.
I can provide a partial list).

The POLD1 problem above would be resolved by adding the terms for the germ-line mutation diseases
monarch-initiative/mondo#7845
(as the current term does not differentiate between germ-line and sporadic)

There are quite a lot of inconsistencies. For example
colorectal cancer, susceptibility to, 12 (MONDO:0014038)
is_a
hereditary neoplastic syndrome
but this has contributes_to
however this is a single gene inherited disorder

Some of the issues are probably caused by conflating a heritable causal gene which increases susceptibility
with a susceptibility that is presumed to increase incrementally by variants in multiple genes.

==

It also seems strange for correlated genes to have definitions of the form:
Any type 2 diabetes mellitus in which the cause of the disease is a mutation in the TBC1D4 gene.
because for polygenic disorders, the gene isn't causal?

ValWood · 2024-06-25T05:53:15Z

It would also be useful to have precise definitions on the Monarch website so that we could link to them.
tks
v

ValWood · 2024-06-25T08:44:18Z

I guess for this it is OK
colorectal cancer, susceptibility to, 12 (MONDO:0014038)
because for any cancer subsequent changes are required....

amc-corey-cox · 2024-06-25T12:29:36Z

This is great feedback @ValWood. Unfortunately, if the data we're ingesting has these marked inconsistently we will as well. However, we should also make sure we're ingesting them correctly. I'll discuss with my team how we should move forward with this.

ValWood · 2024-06-25T12:44:25Z

It is probably not a huge issue but it would be useful to be precise about the meaning of the qualifiers. I still don't fully understand.

My main issue is that genes flagged as "contributes_to" are described as "causal" for a disease in the ontology definitions. That seems to be misleading, and seems to be a Mondo issue rather than an ingest issue.

I was chatting to the PomBase team about this in our group meeting, and we wondered why you need a qualifier AND "susceptibility to" in the term label. We wondered why the information could not be captured in the ontology rather than with a qualifier (because people frequently ignore qualifiers).

monicacecilia · 2024-10-09T23:22:12Z

@sabrinatoro 👀 👆

sabrinatoro · 2024-10-10T19:51:53Z

I think the main problem here is with the "susceptibility" terms.
These "susceptibility" terms come from OMIM, and are therefore added into Mondo. However, the data we get from the different sources more often relate to a disease and not necessarily to a "disease susceptibility".

It is therefore correct that we have different ways to represent "susceptibility" concepts in Mondo/Monarch and their causal/correlated gene:

"susceptibility to disease X" (in Mondo) - caused by a variation in gene X
"disease X" - correlated with gene X (because a variation in gene X confers a susceptibility to getting the disease.)

We need to review the representation of disease susceptibility in both Mondo and Monarch. (@monicacecilia I don't know where this falls on the priority list for both these projects. Let's discuss)
