diff --git a/manual/source/index.rst b/manual/source/index.rst
index cf924f7..d571da0 100644
--- a/manual/source/index.rst
+++ b/manual/source/index.rst
@@ -40,7 +40,6 @@ The best place to leave feedback, ask questions, and report bugs is the `OntoLib
     install
     tutorial_io
     tutorial_hpo
-    tutorial_annotation
     tutorial_similarity
 
 .. toctree::
diff --git a/manual/source/install.rst b/manual/source/install.rst
index b213e49..783fb93 100644
--- a/manual/source/install.rst
+++ b/manual/source/install.rst
@@ -4,9 +4,9 @@
 Installation
 ============
 
-------------------
-Pre-built Binaries
-------------------
+--------------------------
+Use Maven Central Binaries
+--------------------------
 
 .. note::
 
@@ -16,7 +16,18 @@ Simply use the following snippet for your ``pom.xml`` for using OntoLib modules
 
 .. code-block:: xml
 
-  TODO
+  <dependencies>
+    <dependency>
+      <groupId>com.github.phenomics</groupId>
+      <artifactId>ontolib-core</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>com.github.phenomics</groupId>
+      <artifactId>ontolib-io</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+  </dependencies>
 
 .. _install_from_source:
 
diff --git a/manual/source/tutorial_annotation.rst b/manual/source/tutorial_annotation.rst
deleted file mode 100644
index be4bf48..0000000
--- a/manual/source/tutorial_annotation.rst
+++ /dev/null
@@ -1,5 +0,0 @@
-.. _tutorial_annotation:
-
-========================
-Working with Annotations
-========================
diff --git a/manual/source/tutorial_hpo.rst b/manual/source/tutorial_hpo.rst
index 51e912f..432da4d 100644
--- a/manual/source/tutorial_hpo.rst
+++ b/manual/source/tutorial_hpo.rst
@@ -3,3 +3,104 @@
 ================
 Working with HPO
 ================
+
+OntoLib supports you in working with the HPO in the following ways:
+
+- The ``HpoOntology`` class supports the standard ``Ontology`` interface.
+  In addition, the "phenotypic abnormality" sub ontology is extracted on construction and is available through ``HpoOntology.getPhenotypicAbnormalitySubOntology()``.
+- The classes ``HpoFrequencyTermIds``, ``HpoModeOfInheritanceTermIds``, and ``HpoSubOntologyRootTermIds`` provide shortcuts to important/special terms that come in handy when using the HPO.
+- OntoLib provides means to parse disease-to-phenotype and disease-to-gene annotations from the HPO project into ``HpoDiseaseAnnotation`` and ``HpoGeneAnnotation`` objects.
+
+This section demonstrates these three features.
+
+---------------------
+The HpoOntology Class
+---------------------
+
+To iterate over all primary (i.e., not alternative), non-obsolete term IDs, use the ``getNonObsoleteTermIds()`` method, which returns a ``Set`` of ``TermId`` objects:
+
+.. code-block:: java
+
+  // final HpoOntology hpo = ...
+  System.err.println("Term IDs in HPO (primary, non-obsolete)");
+  for (TermId termId : hpo.getNonObsoleteTermIds()) {
+    System.err.println(termId);
+  }
+
+You can obtain the ``HpoTerm`` instance for a given ``TermId`` by looking it up in the ``Map`` returned by ``getTermMap()``:
+
+.. code-block:: java
+
+  System.err.println("Term IDs+names in HPO (primary IDs, non-obsolete)");
+  for (TermId termId : hpo.getNonObsoleteTermIds()) {
+    final HpoTerm term = hpo.getTermMap().get(termId);
+    System.err.println(termId + "\t" + term.getName());
+  }
+
+The "phenotypic abnormality" sub ontology can be accessed directly and then used like any other ``Ontology`` object:
+
+.. code-block:: java
+
+  // final HpoOntology hpo = ...
+  final Ontology<HpoTerm, HpoTermRelation> subOntology =
+      hpo.getPhenotypicAbnormalitySubOntology();
+  System.err.println("Term IDs in phenotypic abnormality sub ontology");
+  for (TermId termId : subOntology.getNonObsoleteTermIds()) {
+    System.err.println(termId);
+  }
+
+-------------------------------
+Shortcuts to Important Term IDs
+-------------------------------
+
+The shortcut term IDs can be accessed as follows.
+
+.. code-block:: java
+
+  System.err.println("ALWAYS_PRESENT\t" + HpoFrequencyTermIds.ALWAYS_PRESENT);
+  System.err.println("FREQUENT\t" + HpoFrequencyTermIds.FREQUENT);
+
+  System.err.println("X_LINKED_RECESSIVE\t" + HpoModeOfInheritanceTermIds.X_LINKED_RECESSIVE);
+  System.err.println("AUTOSOMAL_DOMINANT\t" + HpoModeOfInheritanceTermIds.AUTOSOMAL_DOMINANT);
+
+  System.err.println("PHENOTYPIC_ABNORMALITY\t" + HpoSubOntologyRootTermIds.PHENOTYPIC_ABNORMALITY);
+  System.err.println("FREQUENCY\t" + HpoSubOntologyRootTermIds.FREQUENCY);
+  System.err.println("MODE_OF_INHERITANCE\t" + HpoSubOntologyRootTermIds.MODE_OF_INHERITANCE);
+
+------------------------
+Parsing Annotation Files
+------------------------
+
+You can parse the phenotype-to-disease annotation files as follows.
+
+.. code-block:: java
+
+  File inputFile = new File("phenotype_annotation.tab");
+  try {
+    HpoDiseaseAnnotationParser parser = new HpoDiseaseAnnotationParser(inputFile);
+    while (parser.hasNext()) {
+      HpoDiseaseAnnotation anno = parser.next();
+      // work with anno
+    }
+  } catch (IOException e) {
+    System.err.println("Problem reading from file.");
+  } catch (TermAnnotationException e) {
+    System.err.println("Problem parsing file.");
+  }
+
+The phenotype-to-gene annotation file can be parsed as follows.
+
+.. code-block:: java
+
+  File inputFile = new File("genes_to_phenotype.txt");
+  try {
+    HpoGeneAnnotationParser parser = new HpoGeneAnnotationParser(inputFile);
+    while (parser.hasNext()) {
+      HpoGeneAnnotation anno = parser.next();
+      // ...
+    }
+  } catch (IOException e) {
+    System.err.println("Problem reading from file.");
+  } catch (TermAnnotationException e) {
+    System.err.println("Problem parsing file.");
+  }
diff --git a/manual/source/tutorial_io.rst b/manual/source/tutorial_io.rst
index 11c97c0..aeedf4f 100644
--- a/manual/source/tutorial_io.rst
+++ b/manual/source/tutorial_io.rst
@@ -3,3 +3,29 @@
 ==============
 Input / Output
 ==============
+
+OntoLib provides support for loading OBO files into objects implementing the ``Ontology`` interface.
+There is no generic OBO parser yet, so select one of the supported ontologies (GO, HPO, MPO, or ZPO) and use its specialized parser.
+For example, for the Gene Ontology:
+
+.. code-block:: java
+
+  final GoOboParser parser = new GoOboParser(inputFile);
+  final GoOntology go;
+  try {
+    go = parser.parse();
+  } catch (IOException e) {
+    // handle error
+  }
+
+Similarly, for the Human Phenotype Ontology:
+
+.. code-block:: java
+
+  final HpoOboParser parser = new HpoOboParser(inputFile);
+  final HpoOntology hpo;
+  try {
+    hpo = parser.parse();
+  } catch (IOException e) {
+    // handle error
+  }
diff --git a/manual/source/tutorial_similarity.rst b/manual/source/tutorial_similarity.rst
index f656af1..dbffdd2 100644
--- a/manual/source/tutorial_similarity.rst
+++ b/manual/source/tutorial_similarity.rst
@@ -3,3 +3,36 @@
 =============================
 Querying with Term Similarity
 =============================
+
+OntoLib provides multiple routines for computing the similarity of terms, given an ontology.
+For the "classic" similarity measures, the ``Similarity`` interface provides the common interface definition.
+
+Here is how to compute the Jaccard similarity between two sets of terms.
+
+.. code-block:: java
+
+  // final HpoOntology hpo = ...
+  JaccardSimilarity<HpoTerm, HpoTermRelation> similarity =
+      new JaccardSimilarity<>(hpo);
+  // List<TermId> queryTerms = ...
+  // List<TermId> dbTerms = ...
+  double score = similarity.computeScore(queryTerms, dbTerms);
+
+Computing the Resnik similarity is more involved because it requires precomputing the information content of each term.
+
+.. code-block:: java
+
+  // final ArrayList<HpoDiseaseAnnotation> diseaseAnnotations = ...
+  InformationContentComputation<HpoTerm, HpoTermRelation> computation =
+      new InformationContentComputation<>(hpo);
+  Map<TermId, Collection<String>> termLabels =
+      TermAnnotations.constructTermAnnotationToLabelsMap(hpo, diseaseAnnotations);
+  Map<TermId, Double> informationContent =
+      computation.computeInformationContent(termLabels);
+  PairwiseResnikSimilarity<HpoTerm, HpoTermRelation> pairwise =
+      new PairwiseResnikSimilarity<>(hpo, informationContent);
+  ResnikSimilarity<HpoTerm, HpoTermRelation> similarity =
+      new ResnikSimilarity<>(pairwise, /* symmetric = */ true);
+  // List<TermId> queryTerms = ...
+  // List<TermId> dbTerms = ...
+  double score = similarity.computeScore(queryTerms, dbTerms);
diff --git a/ontolib-io/src/main/java/com/github/phenomics/ontolib/io/obo/hpo/HpoGeneAnnotationParser.java b/ontolib-io/src/main/java/com/github/phenomics/ontolib/io/obo/hpo/HpoGeneAnnotationParser.java
index 3a0852f..cd501e1 100644
--- a/ontolib-io/src/main/java/com/github/phenomics/ontolib/io/obo/hpo/HpoGeneAnnotationParser.java
+++ b/ontolib-io/src/main/java/com/github/phenomics/ontolib/io/obo/hpo/HpoGeneAnnotationParser.java
@@ -19,7 +19,7 @@
  * </p>
  *
  * <pre>
- * File inputFile = "genes_to_phenotype.txt";
+ * File inputFile = new File("genes_to_phenotype.txt");
  * try {
  *   HpoGeneAnnotationParser parser = new HpoGeneAnnotationParser(inputFile);
  *   while (parser.hasNext()) {