TerminoDiff - Diff for 🔥 Terminology

TerminoDiff is a graphical application to quickly compare HL7 FHIR CodeSystem resources.

This work was presented at the Medical Informatics Europe 2022 conference. The paper is available as follows (CC-BY-NC):

Wiedekopf J, Drenkhahn C, Rosenau L, Ulrich H, Kock-Schoppenhauer AK, Ingenerf J. TerminoDiff - Detecting Semantic Differences in HL7 FHIR CodeSystems. Stud Health Technol Inform. 2022 May 25;294:362-366. doi: 10.3233/SHTI220475. PMID: 35612097.

How do I run this?

There are executable distributions available under the Releases section.

Executables are available for:

Windows 64-bit (built via GitHub Actions on Windows Server 2022)
macOS Intel x64 (built via GitHub Actions on Big Sur)
macOS aarch64 (built on Monterey)
Debian and derivatives, 64-bit (built via GitHub Actions on Ubuntu 20.04)

These releases include a Java VM and the needed run-time components, and can be run as-is. The macOS binaries are code-signed, but not notarized by Apple, and may ask for permission to run.

You can build the project yourself using Gradle, either from the console or via an IDE such as IntelliJ. If you want to build a native distribution yourself, you might need to edit the key composeBuildOs within the file gradle.properties to only include the operating system and target format you require - cross-building is not yet supported by the Compose Desktop framework. Valid strings for composeBuildOs are:

ubuntu, debian, deb for building DEBs on Linux
redhat, rpm for building RPMs on Linux
mac, macos for DMGs on macOS
windows, win for installable EXEs on Windows.

Capitalization of these values is ignored.
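
The alias matching described above can be pictured with a short Kotlin sketch. It is illustrative only and not taken from the project's build script: the `PackageFormat` enum, the `resolvePackageFormats` function, and the use of a comma to separate several values are all assumptions.

```kotlin
// Hypothetical stand-in for the packaging formats named above.
enum class PackageFormat { DEB, RPM, DMG, EXE }

// Maps a composeBuildOs value to package formats, ignoring capitalization.
// Assumption: several values may be given, separated by commas.
fun resolvePackageFormats(composeBuildOs: String): Set<PackageFormat> =
    composeBuildOs.split(",")
        .map { it.trim().lowercase() }
        .filter { it.isNotEmpty() }
        .map { token ->
            when (token) {
                "ubuntu", "debian", "deb" -> PackageFormat.DEB
                "redhat", "rpm" -> PackageFormat.RPM
                "mac", "macos" -> PackageFormat.DMG
                "windows", "win" -> PackageFormat.EXE
                else -> throw IllegalArgumentException("Unknown composeBuildOs value: $token")
            }
        }
        .toSet()

fun main() {
    println(resolvePackageFormats("Debian"))
    println(resolvePackageFormats("macOS, WIN"))
}
```

Rejecting unknown values early, as this sketch does, makes a typo in gradle.properties visible before a long packaging run starts.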

If you want to run from code, use at least JDK 11. If you want to build a native distribution, you need JDK 15 or later (17 recommended, due to problems we encountered using 15 on recent macOS), since jpackage is not available in prior versions.

Scoping Review

For more details on the scoping review we carried out, visit this page.

Why this app?

Determining how HL7 FHIR CodeSystem resources differ proves to be a very difficult task without specialized tooling, since there are many aspects to consider in these resources. This makes maintenance of well-formed FHIR CodeSystem resources much more difficult than it should be.

Level	Aspect	Example(-s)	TerminoDiff's Approach
0	Serialization format	JSON vs. XML	Reading using HAPI FHIR allows ignoring wire format, since the formats are semantically identical
1	Metadata level		Presentation as a table (lower half) in the GUI
1.1	Simple differences	`title`, `name`, `version`	String comparisons
1.2	Differences within lists	`language`, `version`	(keyed) difference lists, e.g. by using `language.code` as the key
2	Concept level		Presentation as a table (upper half) in the GUI
2.1	Simple differences	`display`, `designation`	String comparisons
2.2	Differences within lists	`property`, `designation`	(keyed) difference lists, e.g. by using `property.code` as the key
2.3	Unilaterality of concepts	Deletions and additions of codes / concepts across versions	Surfacing within the table with dedicated filter and highlight
3	Edge differences	Deletions and additions of `parent`/`child` properties; other properties linking concepts also considered	Creation and display of a difference graph with multiple color-coded types of edges.
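
To make the "(keyed) difference list" idea from rows 1.2 and 2.2 more concrete, here is a minimal Kotlin sketch. It is not TerminoDiff's implementation: the `Property` class and the `keyedDiff` function are hypothetical, FHIR properties are reduced to a code and a string value, and each key is assumed to occur at most once per list.

```kotlin
enum class DiffKind { IDENTICAL, CHANGED, ONLY_LEFT, ONLY_RIGHT }

data class KeyedDiff<K, V>(val key: K, val left: V?, val right: V?, val kind: DiffKind)

// Pairs up the items of two lists by a key and classifies each pair.
// If a key occurs more than once in a list, only the last item is kept (simplification).
fun <T : Any, K, V : Any> keyedDiff(
    left: List<T>,
    right: List<T>,
    keyOf: (T) -> K,
    valueOf: (T) -> V
): List<KeyedDiff<K, V>> {
    val leftByKey = left.associateBy(keyOf)
    val rightByKey = right.associateBy(keyOf)
    // The union of both key sets keeps the left-hand order first.
    return (leftByKey.keys + rightByKey.keys).map { key ->
        val l = leftByKey[key]?.let(valueOf)
        val r = rightByKey[key]?.let(valueOf)
        val kind = when {
            l == null -> DiffKind.ONLY_RIGHT
            r == null -> DiffKind.ONLY_LEFT
            l == r -> DiffKind.IDENTICAL
            else -> DiffKind.CHANGED
        }
        KeyedDiff<K, V>(key, l, r, kind)
    }
}

// Hypothetical, reduced form of a concept property.
data class Property(val code: String, val value: String)

fun main() {
    val left = listOf(Property("status", "active"), Property("parent", "A"))
    val right = listOf(Property("status", "retired"), Property("child", "B"))
    keyedDiff(left, right, { it.code }, { it.value }).forEach { println(it) }
}
```

In this example, `status` is reported as changed, `parent` as present only on the left, and `child` as present only on the right.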

What do I see?

When you start the application, you will have to load two HL7 FHIR CodeSystem resources. It does not matter whether the resources are stored as JSON or XML files, since the HAPI FHIR library handles both formats.

You will be able to load data from the file system, but also from a FHIR Terminology Server:

If you do not have a FHIR server of your own, you can use the following URLs:

https://r4.ontoserver.csiro.au/fhir (a public test instance for Ontoserver, a commercial FHIR Terminology server)
http://hapi.fhir.org/baseR4 (a public instance of HAPI FHIR JPA Server)

Once loaded, you will be presented with the view of differences over the two loaded CodeSystems:

The app is also translated into German, and a dark theme is implemented as well:

Many columns are searchable (using a fuzzy search function, so that near matches can be found as well). This is indicated by looking glass icons. Searches can be combined as well, by specifying multiple filters.

The fuzzy search requires a partial match of at least 75 % similarity.

Top section

In the top half of the app, you can inspect the concept differences in more detail. The filters at the top allow you to select different subsets of the concepts - you can view all concepts; those that are identical; those that are different (this is the default view); those where the concept is on both sides of the comparison, but different; and those that are only in one of the two CodeSystems. Differences are highlighted using colors.

The Properties / Designations buttons are clickable to reveal a more detailed comparison:

If the concept is unilateral (only in left / only in right), the dialog is accessible regardless.

You can also click on the "Graph" button in each row to get an in-depth look at the focused concept:

The focus concept, which you clicked on, is highlighted with a thick black border. All concepts connected to this concept are shown, with their respective edges, in orange (if they were deleted going from left to right), green (if they were added), or blue (if they are in both).

At the bottom, there are controls for zooming out in this graph. By default, the graph starts at layer 1, showing all concepts immediately connected to the focus concept. Orange and green edges are an exception: they are always traversed when they are encountered, because they show chains of edits to the hierarchy. Only one "layer" of blue edges is traversed for every layer selected below:

Layer 1	Layer 2	Layer 3
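
One way to read the layer rule is as a shortest-path search in which blue (unchanged) edges cost one layer, while orange and green edges cost nothing. The Kotlin sketch below follows that reading. It is an interpretation rather than the application's code: the names `DiffEdge`, `EdgeKind`, and `neighborhood` are hypothetical, and edges are treated as undirected for the purpose of finding neighbors.

```kotlin
enum class EdgeKind { DELETED, ADDED, UNCHANGED }

data class DiffEdge(val from: String, val to: String, val kind: EdgeKind)

// Returns all concepts reachable from the focus concept using at most `layers`
// unchanged edges; deleted and added edges are always followed for free.
fun neighborhood(edges: List<DiffEdge>, focus: String, layers: Int): Set<String> {
    // Adjacency list that ignores edge direction.
    val adjacency = mutableMapOf<String, MutableList<Pair<String, EdgeKind>>>()
    for (e in edges) {
        adjacency.getOrPut(e.from) { mutableListOf() }.add(e.to to e.kind)
        adjacency.getOrPut(e.to) { mutableListOf() }.add(e.from to e.kind)
    }
    // 0-1 breadth-first search: free edges go to the front of the queue.
    val cost = mutableMapOf(focus to 0)
    val queue = ArrayDeque<String>()
    queue.addFirst(focus)
    while (queue.isNotEmpty()) {
        val node = queue.removeFirst()
        val nodeCost = cost.getValue(node)
        for ((next, kind) in adjacency[node].orEmpty()) {
            val step = if (kind == EdgeKind.UNCHANGED) 1 else 0
            val nextCost = nodeCost + step
            if (nextCost > layers) continue
            val known = cost[next]
            if (known == null || nextCost < known) {
                cost[next] = nextCost
                if (step == 0) queue.addFirst(next) else queue.addLast(next)
            }
        }
    }
    return cost.keys.toSet()
}

fun main() {
    val edges = listOf(
        DiffEdge("A", "B", EdgeKind.UNCHANGED),
        DiffEdge("B", "C", EdgeKind.UNCHANGED),
        DiffEdge("C", "D", EdgeKind.ADDED),
        DiffEdge("D", "E", EdgeKind.UNCHANGED)
    )
    for (layers in 1..3) println("Layer $layers: ${neighborhood(edges, "A", layers)}")
}
```

With this example data, layer 2 reaches C over two unchanged edges and then also includes D, because the added edge between C and D does not consume a layer.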

Bottom section

This section represents differences in the CodeSystem metadata. Attributes are represented as rows in the table, and the respective value is shown in the right-hand part. Every metadata comparison item is compared and represented using colored chips.

If the value is identical, the columns are merged, otherwise there will be a left and a right value. For properties that are lists of values in FHIR, such as Identifier, the colored chip will be a clickable button (as long as there is data in that property):

Difference Graph

At the very top of the app, there are three buttons to view a graphical representation of the two CodeSystems, as well as the computed difference graph. To illustrate, consider these two CodeSystems:

Left CS	Right CS

Going from "left" to "right", the concept C was removed, leading to changes in the edge going from D to A. Also, a new type of edge, related-to was introduced.

The dashed child edges are special, since they are handled internally as parent edges. This is because FHIR specifies three ways of handling parent-child relationships:

the concept property within concept, allowing arbitrarily deep nesting,
the implicit parent property,
and/or the implicit child property.

Since these properties are to be handled in the same fashion by implementing applications, we reduce all of these three options to the "canonical" parent property (which allows a poly-hierarchy, which concept does not).
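
The reduction to a single "canonical" parent relation can be sketched in a few lines of Kotlin. This is a simplified illustration, not the code used in TerminoDiff: the `Concept` and `ParentEdge` classes and the `canonicalParentEdges` function are hypothetical, and the `parent` and `child` properties are reduced to plain lists of codes.

```kotlin
// Hypothetical, reduced model of a CodeSystem concept.
data class Concept(
    val code: String,
    val parentCodes: List<String> = emptyList(), // values of the `parent` property
    val childCodes: List<String> = emptyList(),  // values of the `child` property
    val nested: List<Concept> = emptyList()      // nested `concept` elements
)

data class ParentEdge(val child: String, val parent: String)

// Collects all three notations into one set of child -> parent edges.
fun canonicalParentEdges(concepts: List<Concept>): Set<ParentEdge> {
    val edges = linkedSetOf<ParentEdge>()
    fun visit(concept: Concept) {
        concept.parentCodes.forEach { edges += ParentEdge(child = concept.code, parent = it) }
        // A `child` property is the same relation, seen from the other end.
        concept.childCodes.forEach { edges += ParentEdge(child = it, parent = concept.code) }
        // Nesting implies that the enclosing concept is the parent.
        concept.nested.forEach { nestedConcept ->
            edges += ParentEdge(child = nestedConcept.code, parent = concept.code)
            visit(nestedConcept)
        }
    }
    concepts.forEach { visit(it) }
    return edges
}

fun main() {
    val concepts = listOf(
        Concept("A", childCodes = listOf("D"), nested = listOf(Concept("B"))),
        Concept("C", parentCodes = listOf("A", "B")),
        Concept("D")
    )
    canonicalParentEdges(concepts).forEach { println(it) }
}
```

Because the result is a set, a relationship that is stated redundantly (for example both as `parent` on one concept and as `child` on the other) yields only one edge.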

The difference graph for these two fictitious CodeSystem resources could look like this:

Red and dotted edges refer to deletions (going from left to right), while green solid edges refer to insertions.
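
Conceptually, the classification of edges into deletions and insertions can be expressed with set operations on the two edge sets. The Kotlin sketch below shows this idea only; the `Edge` and `GraphDiff` classes, the `diffEdges` function, and the example edges are hypothetical and merely loosely modeled on the description above.

```kotlin
data class Edge(val from: String, val to: String, val type: String)

data class GraphDiff(val deleted: Set<Edge>, val added: Set<Edge>, val unchanged: Set<Edge>)

// Edges only on the left were deleted, edges only on the right were added.
fun diffEdges(left: Set<Edge>, right: Set<Edge>): GraphDiff =
    GraphDiff(
        deleted = left - right,
        added = right - left,
        unchanged = left intersect right
    )

fun main() {
    // Assumed example data: concept C disappears, D is re-attached to A,
    // and a `related-to` edge appears.
    val left = setOf(
        Edge("B", "A", "parent"),
        Edge("C", "A", "parent"),
        Edge("D", "C", "parent")
    )
    val right = setOf(
        Edge("B", "A", "parent"),
        Edge("D", "A", "parent"),
        Edge("D", "B", "related-to")
    )
    val diff = diffEdges(left, right)
    println("deleted:   ${diff.deleted}")
    println("added:     ${diff.added}")
    println("unchanged: ${diff.unchanged}")
}
```

Including the edge type in the `Edge` class means that a change of type between the same two concepts shows up as one deletion plus one insertion.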

The difference graph implemented in the application looks very similar:

At the top of the dialog, you can choose from any of the available layout algorithms. Since CodeSystems generally have directed edges, and are often hierarchical, we find that the Sugiyama and Eiglsperger algorithms do a fine job at visualizing these graphs.

How is this built?

The application is written in Kotlin and utilizes JetBrains' Compose Multiplatform toolkit. This toolkit brings the declarative Jetpack Compose library from Android over to the desktop, allowing a Kotlin-first approach to GUI development.

We utilize the following libraries alongside Compose:

HAPI FHIR for processing FHIR resources
slf4j for logging
JGraphT for representation of CodeSystem and diff graphs
jungrapht-visualization for drawing the graphs (using Swing)
colormap for the colormap in the graph visualizations
NativeJFileChooser for the file selection dialog on Windows and Linux
Apache Commons Lang
FlatLaf for dark window chrome on Windows
ktor for coroutine-based HTTP
JavaWuzzy for fuzzy string matching, a port of FuzzyWuzzy in Python
RSyntaxTextArea for the JSON editor in the ConceptMap panel

Localization

The localization framework was built from scratch in the file LocalizedStrings.kt, since no suitable alternative for localization in Kotlin could be found. Currently, we support English (default) and German strings. Every component that displays strings receives an instance of LocalizedStrings, which declares a number of properties that are implemented in EnglishStrings and GermanStrings. Default arguments in LocalizedStrings represent strings that are identical in English and German - if you implement another language, you may need to make some default properties explicit in the derived classes.

Members with a trailing underscore represent formatting functions that are implemented as lambda expressions. These are referenced from the GUI using localizedStrings.member_.invoke(param1, param2).
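
The pattern described in this section might look roughly like the following Kotlin sketch. Only the class names LocalizedStrings, EnglishStrings, and GermanStrings come from the text above; the members `languageName`, `graph`, and `conceptsIn_`, as well as the `describe` helper, are hypothetical examples and do not reflect the actual contents of LocalizedStrings.kt.

```kotlin
abstract class LocalizedStrings(
    val languageName: String,
    // Default argument: this string is the same in English and German.
    val graph: String = "Graph",
    // Trailing underscore: a formatting function implemented as a lambda.
    val conceptsIn_: (String, Int) -> String
)

class EnglishStrings : LocalizedStrings(
    languageName = "English",
    conceptsIn_ = { name, count -> "$name contains $count concepts" }
)

class GermanStrings : LocalizedStrings(
    languageName = "Deutsch",
    conceptsIn_ = { name, count -> "$name enthält $count Konzepte" }
)

// A component only depends on the abstract type and receives an instance.
fun describe(strings: LocalizedStrings): String =
    "${strings.graph}: " + strings.conceptsIn_.invoke("ExampleCS", 3)

fun main() {
    println(describe(EnglishStrings()))
    println(describe(GermanStrings()))
}
```

A further language whose word for "Graph" differs would pass `graph = ...` explicitly in its constructor call, which is the case the note on default arguments refers to.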

Tables

Since DataTables are not (anymore) part of the Compose toolkit, we implemented our own table component. The generic implementation takes in a list of column specifications that render the provided type parameter using a composable body. The table supports merging adjacent columns (if a predicate returns true), tooltips (e.g. on the lower table, when English is not selected as the language, the default name of the FHIR attribute is shown in the tooltip of the left-hand column), and zebra striping. In the top table, the table data is also pre-filtered using a generic set of filter buttons. Column specs can declare that they are searchable, which yields a search button next to the column name. Search parameters can be combined at will.

Graph window

While interoperability between Kotlin and Java is generally very good, the jungrapht-visualization library is better called from Java than from Kotlin. Also, the performance of integrating these heavy components into the composable framework is not sufficient, so that these windows are implemented using Swing instead of Compose.

The Swing code makes liberal use of the sample implementations in the jungrapht-visualization libraries, e.g. for the rubber-banding satellite viewer. The graph viewer supports a range of mouse operations, explained in more detail in the documentation.

Metadata differences

We have implemented a very generic difference engine for the metadata table, so that new diff items can be added very easily. Most elements are basically string comparisons, which can be added using a single line of code in MetadataDiffItems.kt (function generateComparisonDefinitions). Elements such as identifiers are a bit more involved, since they require the definition of a key and a string representation of the value, as well as column definitions for the key columns (to render the details dialog shown above), but they are also not very challenging to implement.
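
As a rough illustration of why a string comparison needs only one line per element, consider the following Kotlin sketch. It borrows the function name generateComparisonDefinitions from the text, but its signature and everything else here (`Metadata`, `MetadataResult`, `StringComparisonItem`) are hypothetical simplifications and not the contents of MetadataDiffItems.kt.

```kotlin
// Hypothetical, heavily reduced stand-in for the metadata of a CodeSystem.
data class Metadata(
    val title: String? = null,
    val name: String? = null,
    val version: String? = null
)

enum class MetadataResult { IDENTICAL, DIFFERENT }

// One row of the metadata table: a label plus a way to extract the compared string.
class StringComparisonItem(val label: String, val extract: (Metadata) -> String?) {
    fun compare(left: Metadata, right: Metadata): MetadataResult =
        if (extract(left) == extract(right)) MetadataResult.IDENTICAL else MetadataResult.DIFFERENT
}

// Each simple element is declared in a single line.
fun generateComparisonDefinitions(): List<StringComparisonItem> = listOf(
    StringComparisonItem("title") { it.title },
    StringComparisonItem("name") { it.name },
    StringComparisonItem("version") { it.version }
)

fun main() {
    val left = Metadata(title = "Demo", name = "demo", version = "1.0")
    val right = Metadata(title = "Demo", name = "demo", version = "2.0")
    for (item in generateComparisonDefinitions()) {
        println("${item.label}: ${item.compare(left, right)}")
    }
}
```

List-valued elements such as identifiers would need an additional key and value representation per entry, which is why they do not fit this one-line form.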

What's planned?

We are looking at implementing the following features (please look at our issue tracker on GitHub for more details):

a search for concepts in the CodeSystems by code and other characteristics
filters for the metadata table, similar to those in the concept table
query of resources from FHIR Terminology servers (by physical URL and canonical URL plus version)
support for the vread mechanism to compare across instance versions on FHIR Terminology servers
a visualization of the neighborhood of any concept in the graph, to view the connections a concept has across the network of concepts
- integrating this feature into the difference graph, so that layers of context can be added iteratively
generation of a concept map for transitioning from the prior version to the newer one
generation of release notes in Markdown for support of terminologists
support for other types of resources in FHIR, especially ValueSet and ConceptMap, likely with TS support.

How do I cite this?

Please cite this work as:

Wiedekopf J, Drenkhahn C, Rosenau L, Ulrich H, Kock-Schoppenhauer AK, Ingenerf J. TerminoDiff - Detecting Semantic Differences in HL7 FHIR CodeSystems. Stud Health Technol Inform. 2022 May 25;294:362-366. doi: 10.3233/SHTI220475. PMID: 35612097.

Can I help?

Absolutely 🔥! Please feel free to open an issue if you would like another feature added to the app. We are committed to improving on this app to provide a better experience for terminologists around the globe.

If you improve on this app, we only ask that your changes remain freely accessible, and that you create a pull request on GitHub. Thanks!

