Commit

Merge pull request #8 from bcgsc/release/v1.3.5
Release/v1.3.5
creisle authored Jul 7, 2020
2 parents 057ba42 + 22c9ea1 commit fe941d9
Showing 8 changed files with 291 additions and 23 deletions.
11 changes: 10 additions & 1 deletion .github/workflows/pytest.yml
Original file line number Diff line number Diff line change
@@ -33,7 +33,7 @@ jobs:
         pip install black
         black --check -S -l 100 graphkb tests
     - name: Test with pytest
-      run: pytest --junitxml=junit/test-results-${{ matrix.python-version }}.xml
+      run: pytest --junitxml=junit/test-results-${{ matrix.python-version }}.xml --cov graphkb --cov-report term --cov-report xml
       env:
         GRAPHKB_USER: ${{ secrets.GKB_TEST_USER }}
         GRAPHKB_PASS: ${{ secrets.GKB_TEST_PASS }}
@@ -44,3 +44,12 @@ jobs:
         path: junit/test-results-${{ matrix.python-version }}.xml
       # Use always() to always run this step to publish test results when there are test failures
       if: always()
+    - name: Update code coverage report to CodeCov
+      uses: codecov/codecov-action@v1
+      with:
+        token: ${{ secrets.CODECOV_TOKEN }}
+        file: ./coverage.xml
+        flags: unittests
+        env_vars: OS,PYTHON
+        name: codecov-umbrella
+        fail_ci_if_error: true
2 changes: 1 addition & 1 deletion README.md
@@ -1,7 +1,7 @@

# GraphKB (Python)

-![build](https://github.com/bcgsc/pori_graphkb_python/workflows/build/badge.svg) [![PyPi](https://img.shields.io/pypi/v/graphkb.svg)](https://pypi.org/project/graphkb)
+![build](https://github.com/bcgsc/pori_graphkb_python/workflows/build/badge.svg) [![PyPi](https://img.shields.io/pypi/v/graphkb.svg)](https://pypi.org/project/graphkb) [![codecov](https://codecov.io/gh/bcgsc/pori_graphkb_python/branch/master/graph/badge.svg)](https://codecov.io/gh/bcgsc/pori_graphkb_python)

Python adapter package for querying the GraphKB API. See the [user manual](https://bcgsc.github.io/pori_graphkb_python/)

6 changes: 6 additions & 0 deletions codecov.yml
@@ -0,0 +1,6 @@
coverage:
  status:
    project:
      default:
        target: 90%
        threshold: 1%
177 changes: 177 additions & 0 deletions docs/tutorial.md
@@ -0,0 +1,177 @@
# Tutorial

This tutorial will cover how to get started using GraphKB to annotate your variants.

## Install

Install graphkb as a dependency of your Python script (using a virtual environment is recommended).

```bash
pip install graphkb
```

## Connecting to the API

First, set up the connection to the API.

```python
from graphkb import GraphKBConnection

graphkb_conn = GraphKBConnection()
```

Next, use this connection to log in.

```python
graphkb_conn.login(username, password)
```

This stores the credentials on the connection object, which will log in again as required.
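
One way to avoid hard-coding `username` and `password` is to read them from environment variables. The sketch below is only an illustration: the variable names `GRAPHKB_USER` and `GRAPHKB_PASS` are an assumption (any names would work), and `read_credentials` is a hypothetical helper, not part of the graphkb package.

```python
import os
from typing import Mapping, Optional, Tuple


def read_credentials(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (username, password) from the environment, failing early if either is unset.

    Hypothetical helper for illustration; not part of the graphkb package.
    """
    env = os.environ if env is None else env
    # Collect every missing (or empty) variable so the error names all of them
    missing = [name for name in ('GRAPHKB_USER', 'GRAPHKB_PASS') if not env.get(name)]
    if missing:
        raise RuntimeError('missing environment variables: ' + ', '.join(missing))
    return env['GRAPHKB_USER'], env['GRAPHKB_PASS']


# Demonstrated with a plain dict so the example runs without real credentials
username, password = read_credentials({'GRAPHKB_USER': 'demo', 'GRAPHKB_PASS': 'secret'})
print(username)
```

With real variables set, calling `read_credentials()` with no argument reads from `os.environ`, and the result can be passed to `graphkb_conn.login(username, password)`.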

## Variant Matches

For this example we are going to try matching a protein change (`p.G12D`) on the gene (`KRAS`).

```python
from graphkb.match import match_positional_variant

variant_name = 'KRAS:p.G12D'
variant_matches = match_positional_variant(graphkb_conn, variant_name)

for match in variant_matches:
    print(variant_name, 'will match', match['displayName'])
```

From this step you should see something like this (actual content will vary depending on the
instance of the GraphKB API/DB you are using):

```text
KRAS:p.G12D will match KRAS:p.G12
KRAS:p.G12D will match KRAS:p.G12X
KRAS:p.G12D will match KRAS:p.G12D
KRAS:p.G12D will match KRAS:p.G12mut
KRAS:p.G12D will match KRAS:p.(G12_G13)mut
KRAS:p.G12D will match KRAS:p.?12mut
KRAS:p.G12D will match KRAS:p.G12D
KRAS:p.G12D will match chr12:g.25398284C>T
KRAS:p.G12D will match KRAS:p.G12mut
KRAS:p.G12D will match KRAS mutation
```

As shown above, the match function has pulled similar and equivalent variant representations, which
we will then use to match statements.
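
Some display names appear more than once in the output above, presumably because distinct records can share a display name. When printing a summary, it can help to collapse these. The following is a minimal sketch using hand-written stand-in records; `unique_display_names` is a hypothetical helper, not part of the graphkb package.

```python
from typing import Dict, Iterable, List


def unique_display_names(records: Iterable[Dict[str, str]]) -> List[str]:
    """Return each displayName once, in first-seen order."""
    seen = set()
    names = []
    for record in records:
        name = record['displayName']
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


# Hand-written stand-ins for a few of the match records shown above
sample_matches = [
    {'displayName': 'KRAS:p.G12D'},
    {'displayName': 'KRAS:p.G12mut'},
    {'displayName': 'KRAS:p.G12D'},
    {'displayName': 'KRAS mutation'},
]

for name in unique_display_names(sample_matches):
    print(name)
```

This is for display only. The full `variant_matches` list should still be used when querying, since records that share a display name may be different records.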

Next, use these variant matches to find the related statements.

## Statement Annotations

```python
from graphkb.constants import BASE_RETURN_PROPERTIES, GENERIC_RETURN_PROPERTIES
from graphkb.util import convert_to_rid_list

# return properties should be customized to the user's needs
return_props = (
    BASE_RETURN_PROPERTIES
    + ['sourceId', 'source.name', 'source.displayName']
    + [f'conditions.{p}' for p in GENERIC_RETURN_PROPERTIES]
    + [f'subject.{p}' for p in GENERIC_RETURN_PROPERTIES]
    + [f'evidence.{p}' for p in GENERIC_RETURN_PROPERTIES]
    + [f'relevance.{p}' for p in GENERIC_RETURN_PROPERTIES]
    + [f'evidenceLevel.{p}' for p in GENERIC_RETURN_PROPERTIES]
)

statements = graphkb_conn.query(
    {
        'target': 'Statement',
        'filters': {'conditions': convert_to_rid_list(variant_matches), 'operator': 'CONTAINSANY'},
        'returnProperties': return_props,
    }
)

for statement in statements[:5]:
    print(
        statement['relevance']['displayName'],
        statement['subject']['displayName'],
        statement['source']['displayName'] if statement['source'] else '',
    )
```

This should output lines similar to the following:

```text
resistance gefitinib [C1855] CIViC
likely pathogenic lung cancer [DOID:1324] DoCM
```
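
Once statements are retrieved, a quick tally per source can show where the annotations come from. The sketch below runs on hand-written records shaped like the results above (not real API output), and `count_by_source` is a hypothetical helper. It treats a missing `source` the same way the loop above does, by falling back to a placeholder.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def count_by_source(statements: Iterable[Dict[str, Optional[dict]]]) -> Counter:
    """Count statements per source display name, using 'unknown' when source is missing."""
    counts: Counter = Counter()
    for statement in statements:
        source = statement.get('source')
        counts[source['displayName'] if source else 'unknown'] += 1
    return counts


# Hand-written records shaped like the query results above (not real API output)
sample_statements = [
    {'relevance': {'displayName': 'resistance'}, 'source': {'displayName': 'CIViC'}},
    {'relevance': {'displayName': 'likely pathogenic'}, 'source': {'displayName': 'DoCM'}},
    {'relevance': {'displayName': 'resistance'}, 'source': None},
]

print(count_by_source(sample_statements))
```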

## Categorizing Statements

We often want to know whether a statement is therapeutic, prognostic, or belongs to some other
category. The naive approach is to base this on a list of known terms or a regex pattern. In GraphKB
we can leverage the ontology structure instead.

In this example we will look for all terms that would indicate a therapeutically relevant statement.

To do this we pick our 'base' terms. These are the terms we consider to be the highest level
of the ontology tree, the most general term for that category.

```python
from graphkb.vocab import get_term_tree


BASE_THERAPEUTIC_TERMS = 'therapeutic efficacy'

therapeutic_terms = get_term_tree(graphkb_conn, BASE_THERAPEUTIC_TERMS, include_superclasses=False)

print(f'Found {len(therapeutic_terms)} equivalent terms')

for term in therapeutic_terms:
    print('-', term['name'])
print()
```

This will result in output like the following:

```text
Found 13 equivalent terms
- therapeutic efficacy
- targetable
- response
- sensitivity
- likely sensitivity
- no sensitivity
- no response
- resistance
- reduced sensitivity
- likely resistance
- innate resistance
- acquired resistance
- no resistance
```
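
To make the idea of expanding a base term concrete, here is a toy traversal over a hand-written hierarchy. It is not how `get_term_tree` is implemented: the real function queries the API, and the parent/child edges below are invented for illustration (only the term names are borrowed from the output above).

```python
from collections import deque
from typing import Dict, List

# Hypothetical subclass edges (parent -> children); the real hierarchy lives in GraphKB
TOY_SUBCLASSES: Dict[str, List[str]] = {
    'therapeutic efficacy': ['sensitivity', 'resistance'],
    'sensitivity': ['likely sensitivity'],
    'resistance': ['acquired resistance', 'innate resistance'],
}


def collect_subtree(edges: Dict[str, List[str]], base_term: str) -> List[str]:
    """Breadth-first walk from base_term, returning it and every descendant once."""
    found = [base_term]
    seen = {base_term}
    queue = deque([base_term])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


print(collect_subtree(TOY_SUBCLASSES, 'resistance'))
```

The advantage over a fixed list of terms is that a new, more specific term added under the base term is picked up without changing the code.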

We can filter the statements we have already retrieved, or we can add this to our original query
and filter before we retrieve them from the API.

```python
statements = graphkb_conn.query(
    {
        'target': 'Statement',
        'filters': {
            'AND': [
                {'conditions': convert_to_rid_list(variant_matches), 'operator': 'CONTAINSANY'},
                {'relevance': convert_to_rid_list(therapeutic_terms), 'operator': 'IN'},
            ]
        },
        'returnProperties': return_props,
    }
)

for statement in statements:
    print(statement['relevance']['displayName'])
```
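
For the other option, filtering statements that were already retrieved, a minimal client-side sketch might look like the following. It assumes that `@rid` is among the properties returned for `relevance`, and it runs on hand-written records with made-up record IDs. `filter_by_relevance` is a hypothetical helper, not part of the graphkb package.

```python
from typing import Dict, Iterable, List


def filter_by_relevance(statements: Iterable[Dict], terms: Iterable[Dict]) -> List[Dict]:
    """Keep statements whose relevance record ID is among the given term records."""
    wanted = {term['@rid'] for term in terms}
    return [s for s in statements if s['relevance']['@rid'] in wanted]


# Hand-written records with made-up record IDs (not real API output)
sample_terms = [{'@rid': '#1:0', 'name': 'resistance'}]
sample_statements = [
    {'relevance': {'@rid': '#1:0', 'displayName': 'resistance'}},
    {'relevance': {'@rid': '#1:1', 'displayName': 'likely pathogenic'}},
]

for statement in filter_by_relevance(sample_statements, sample_terms):
    print(statement['relevance']['displayName'])
```

Filtering in the query keeps the response smaller, while filtering locally avoids a second round trip when the unfiltered statements are needed anyway.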

Similar filtering can be done for the other properties and any other base-term classification you
would like to use. Use the graph view at https://graphkb.bcgsc.ca to explore record relationships
and decide on the categories you would like to use.

The full code for this tutorial can be downloaded from the
[GitHub repo](https://github.com/bcgsc/pori_graphkb_python) under `docs/tutorial.py`.
69 changes: 69 additions & 0 deletions docs/tutorial.py
@@ -0,0 +1,69 @@
from graphkb import GraphKBConnection
from graphkb.match import match_positional_variant
import os
from graphkb.constants import BASE_RETURN_PROPERTIES, GENERIC_RETURN_PROPERTIES
from graphkb.util import convert_to_rid_list
from graphkb.vocab import get_term_tree


graphkb_conn = GraphKBConnection()
graphkb_conn.login(os.environ['USER'], os.environ['JIRA_PASS'])

variant_name = 'KRAS:p.G12D'
variant_matches = match_positional_variant(graphkb_conn, variant_name)

for match in variant_matches:
    print(variant_name, 'will match', match['displayName'])

# return properties should be customized to the user's needs
return_props = (
    BASE_RETURN_PROPERTIES
    + ['sourceId', 'source.name', 'source.displayName']
    + [f'conditions.{p}' for p in GENERIC_RETURN_PROPERTIES]
    + [f'subject.{p}' for p in GENERIC_RETURN_PROPERTIES]
    + [f'evidence.{p}' for p in GENERIC_RETURN_PROPERTIES]
    + [f'relevance.{p}' for p in GENERIC_RETURN_PROPERTIES]
    + [f'evidenceLevel.{p}' for p in GENERIC_RETURN_PROPERTIES]
)

statements = graphkb_conn.query(
    {
        'target': 'Statement',
        'filters': {'conditions': convert_to_rid_list(variant_matches), 'operator': 'CONTAINSANY'},
        'returnProperties': return_props,
    }
)

for statement in statements[:5]:
    print(
        statement['relevance']['displayName'],
        statement['subject']['displayName'],
        statement['source']['displayName'] if statement['source'] else '',
    )


BASE_THERAPEUTIC_TERMS = 'therapeutic efficacy'

therapeutic_terms = get_term_tree(graphkb_conn, BASE_THERAPEUTIC_TERMS, include_superclasses=False)

print(f'\nFound {len(therapeutic_terms)} equivalent terms')

for term in therapeutic_terms:
    print('-', term['name'])
print()

statements = graphkb_conn.query(
    {
        'target': 'Statement',
        'filters': {
            'AND': [
                {'conditions': convert_to_rid_list(variant_matches), 'operator': 'CONTAINSANY'},
                {'relevance': convert_to_rid_list(therapeutic_terms), 'operator': 'IN'},
            ]
        },
        'returnProperties': return_props,
    }
)

for statement in statements:
    print(statement['relevance']['displayName'])
46 changes: 26 additions & 20 deletions graphkb/genes.py
@@ -4,7 +4,7 @@
from typing import Any, Dict, List, cast

from . import GraphKBConnection
-from .types import Ontology
+from .types import Ontology, Statement, Variant

ONCOKB_SOURCE_NAME = 'oncokb'
ONCOGENE = 'oncogenic'
@@ -29,21 +29,24 @@
 def _get_oncokb_gene_list(conn: GraphKBConnection, relevance: str) -> List[Ontology]:
     source = conn.get_source(ONCOKB_SOURCE_NAME)['@rid']

-    statements = conn.query(
-        {
-            'target': 'Statement',
-            'filters': [
-                {'source': source},
-                {'relevance': {'target': 'Vocabulary', 'filters': {'name': relevance}}},
-            ],
-            'returnProperties': [f'subject.{prop}' for prop in GENE_RETURN_PROPERTIES],
-        },
-        ignore_cache=False,
+    statements = cast(
+        List[Statement],
+        conn.query(
+            {
+                'target': 'Statement',
+                'filters': [
+                    {'source': source},
+                    {'relevance': {'target': 'Vocabulary', 'filters': {'name': relevance}}},
+                ],
+                'returnProperties': [f'subject.{prop}' for prop in GENE_RETURN_PROPERTIES],
+            },
+            ignore_cache=False,
+        ),
     )
     genes: Dict[str, Ontology] = {}

     for statement in statements:
-        if statement['subject']['biotype'] == 'gene':
+        if statement['subject'].get('biotype', '') == 'gene':
             record_id = statement['subject']['@rid']
             genes[record_id] = statement['subject']

@@ -90,14 +93,17 @@ def get_genes_from_variant_types(
     Returns:
         List.<dict>: gene (Feature) records
     """
-    variants = conn.query(
-        {
-            'target': 'Variant',
-            'filters': [
-                {'type': {'target': 'Vocabulary', 'filters': {'name': types, 'operator': 'IN'}}}
-            ],
-            'returnProperties': ['reference1', 'reference2'],
-        },
+    variants = cast(
+        List[Variant],
+        conn.query(
+            {
+                'target': 'Variant',
+                'filters': [
+                    {'type': {'target': 'Vocabulary', 'filters': {'name': types, 'operator': 'IN'}}}
+                ],
+                'returnProperties': ['reference1', 'reference2'],
+            },
+        ),
     )

     genes = set()
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -1,6 +1,7 @@
 site_name: GraphKB Python
 nav:
   - Home: index.md
+  - Tutorial: tutorial.md
   - Reference:
       - graphkb: reference/graphkb/__init__.md
       - graphkb.genes: reference/graphkb/genes.md
2 changes: 1 addition & 1 deletion setup.py
@@ -30,7 +30,7 @@

 setup(
     name='graphkb',
-    version='1.3.4',
+    version='1.3.5',
     description='python adapter for interacting with the GraphKB API',
     url='https://github.com/bcgsc/pori_graphkb_python',
     packages=find_packages(),