Improvements to the documentation related to usage instructions and metadata fields.
csnyulas committed Jul 23, 2024
1 parent 9eb8900 commit 5677aa7
Showing 5 changed files with 62 additions and 33 deletions.
3 changes: 1 addition & 2 deletions docs/antora/modules/ROOT/nav.adoc
@@ -1,9 +1,8 @@

* xref:index.adoc[Home]
** xref:software_requirements.adoc[Software Requirements]
** xref:methodology.adoc[Mapping Methodology]
** xref:package_structure.adoc[Mapping Suite Structure]
** xref:toolchain.adoc[Toolchain]
** xref:usage.adoc[Usage Instructions]
* [.separated]#**General References**#
24 changes: 23 additions & 1 deletion docs/antora/modules/ROOT/pages/package_structure.adoc
@@ -12,7 +12,23 @@ The content of the "Metadata" sheet of an eForms Conceptual Mapping (which is sl

image:metadata_sheet.png[]

The comments in column C provide explanations, where necessary, about what each field in column A stands for, and how the values in column B should be formatted. Note that, in this example, as in most of the practical cases, there is no value specified for the "Start Date" and "End Date" fields. Those fields can be used to restrict the applicability of a given mapping package to notices published in a certain date range, which can be useful in certain scenarios (e.g. for testing), but will not be used in setups where the conversion of all the notices that belong to a certain eForms Subtype and were published according to an eForms SDK is desired.
The fields specified in the Metadata sheet are split into two sections based on the type of information they provide. They describe either (a) the suite of mapping rules, or (b) the constraints necessary to identify notices eligible for transformation with this mapping suite (applicable in the context of the TED-SWS pipeline).

*Mapping suite metadata:*

- *Identifier* - a unique alphanumeric sequence to identify the mapping suite
- *Title* - human-readable name of the mapping suite
- *Description* - human-readable concise statement about the mapping suite
- *Mapping version* - the version of the mapping suite
- *ePO version* - the version of the target ontology

*Metadata constraints:*

- *eForms Subtype* - a comma-separated list of eForms Subtype IDs
- *Start date & End date* - the time interval within which an eForms notice must have been published. By default, both are empty
- *eForms SDK version* - a comma-separated list of eForms SDK versions. Only the major and minor versions of the SDK are relevant, not the patch numbers.

The comments in column C explain, where necessary, what each field in column A stands for and how the values in column B should be formatted. Note that in the example above, as in most practical cases, no value is specified for the "Start Date" and "End Date" fields. Those fields can be used to restrict the applicability of a given mapping suite to notices published in a certain date range, which can be useful in certain scenarios (e.g. for testing). They are not used in setups that aim to convert all the notices that belong to a certain eForms Subtype and were published according to a certain eForms SDK.
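
The way these constraints select notices can be illustrated with a minimal Python sketch. It is not the TED-SWS pipeline's implementation: the function names, the parameter names, and the choice to treat both date bounds as inclusive are assumptions made for illustration only.

```python
import re
from datetime import date
from typing import Optional, Sequence


def major_minor(sdk_version: str) -> str:
    """Reduce a version such as '1.9.1' to its major.minor part, '1.9'."""
    match = re.search(r"(\d+)\.(\d+)", sdk_version)
    if match is None:
        raise ValueError(f"no major.minor version found in {sdk_version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def is_eligible(
    notice_subtype: str,
    notice_sdk_version: str,
    notice_published: date,
    subtypes: Sequence[str],
    sdk_versions: Sequence[str],
    start_date: Optional[date] = None,
    end_date: Optional[date] = None,
) -> bool:
    """Hypothetical check of a notice against the metadata constraints."""
    if notice_subtype not in subtypes:
        return False
    # Patch numbers are ignored: only major.minor is compared.
    if major_minor(notice_sdk_version) not in {major_minor(v) for v in sdk_versions}:
        return False
    # Empty (None) dates place no restriction; bounds are assumed inclusive.
    if start_date is not None and notice_published < start_date:
        return False
    if end_date is not None and notice_published > end_date:
        return False
    return True
```

With both dates left empty, as in the example sheet, only the subtype and the SDK version decide whether a notice qualifies.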

== The eForms `metadata.json` file
After the content of a mapping package is prepared, a `metadata.json` file, like https://github.com/OP-TED/ted-rdf-mapping-eforms/blob/1.0.0-rc.3/mappings/package_cn_v1.9/metadata.json[this one], is generated from the "Metadata" sheet of the CM, with content similar to this:
@@ -56,4 +72,10 @@ After the content of a mapping package is prepared, from the "Metadata" sheet of
}
```

The meaning of the keys used in this JSON file should be straightforward to interpret based on the explanation provided above for the <<_the_eforms_metadata_sheet,Metadata sheet of the CM>>. They are essentially lowercase, snake_case versions of the field names in the Metadata sheet. In addition, there are three more keys:

- *created_at* - the creation timestamp of the mapping suite, more specifically of the `metadata.json` file
- *mapping_type* - the type of the mapping suite. It can be either `standard_forms` or `eforms` (and is the latter for all packages in this project)
- *mapping_suite_hash_digest* - a hash digest that can serve as a unique key representative of the content of this mapping suite

*Note:* The `mapping_suite_hash_digest` is computed from the entire content of the mapping package and serves as a validation key or signature, assuring the TED-SWS pipeline software that the metadata in this file is valid for the current content of the package.
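
The idea of a content-derived digest can be sketched as follows. This is an illustration only and is not the algorithm used by the TED-SWS pipeline: the function name `package_digest`, the use of SHA-256, and the exclusion of `metadata.json` (which stores the digest itself) are all assumptions.

```python
import hashlib
from pathlib import Path
from typing import FrozenSet


def package_digest(
    package_dir: Path,
    exclude: FrozenSet[str] = frozenset({"metadata.json"}),
) -> str:
    """Hypothetical digest over every file in a mapping package folder."""
    digest = hashlib.sha256()
    # Sort by relative POSIX path so the result does not depend on
    # directory listing order or on the platform's path separator.
    files = sorted(
        (path.relative_to(package_dir).as_posix(), path)
        for path in package_dir.rglob("*")
        if path.is_file()
    )
    for relative, path in files:
        if relative in exclude:
            continue
        digest.update(relative.encode("utf-8"))
        digest.update(b"\0")  # separate the file name from the file content
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

Any change to a file in the package yields a different digest, which is what lets a consumer detect that the metadata no longer matches the package content.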
13 changes: 0 additions & 13 deletions docs/antora/modules/ROOT/pages/software_requirements.adoc

This file was deleted.

17 changes: 0 additions & 17 deletions docs/antora/modules/ROOT/pages/toolchain.adoc

This file was deleted.

38 changes: 38 additions & 0 deletions docs/antora/modules/ROOT/pages/usage.adoc
@@ -0,0 +1,38 @@
= How to use these mapping packages?

The mapping packages provided in this project are to be used within *version 2* of the https://github.com/OP-TED/ted-rdf-conversion-pipeline[TED-SWS Conversion Pipeline]. The pipeline documentation describes how such packages are loaded and used in the transformation steps of the pipeline.
//TODO provide a link to the antora documentation page, when the documentation provided in the word document will be made publicly available

Below we provide some additional technical information about how these packages can be used outside or inside the pipeline.

== Software Requirements

Users need to install the following external software tools, libraries
and/or runtimes only if they are developing and testing the RML mappings:

- Java 11+ (tested up to 17)
- RMLMapper-Java==v6.2.2

RMLMapper is currently pinned to v6.2.2 because of an
https://github.com/RMLio/rmlmapper-java/issues/236[issue with conditional
instantiation] (already fixed, but
https://github.com/RMLio/rmlmapper-java/blob/144f9b4cb1ca3c7174f9453f28ec626996c19020/CHANGELOG.md[not
yet released]).
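
To check the Java requirement before running RMLMapper, the major version can be read from the output of `java -version`. The following is a minimal sketch; the helper name `java_major_version` is hypothetical, and the pattern assumes the usual `version "..."` line that the JDK prints.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the Java major version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as "1.x" (e.g. "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major
```

Note that `java -version` writes to standard error, so the text to parse would come from, for example, `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`. A result of 11 or higher satisfies the requirement listed above.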

== Toolchain

=== No custom tooling

While there is a comprehensive set of https://docs.ted.europa.eu/SWS/mapping_suite/toolchain.html[command-line software tools for the SF mapping project], no such tooling is provided for the eForms mapping project. The complete transformation cycle is already supported and carried out by the https://github.com/OP-TED/ted-rdf-conversion-pipeline[pipeline].

=== RMLMapper for development

For development, simple https://github.com/RMLio/rmlmapper-java[RMLMapper] commands (which the SF tools use) can be employed directly, for example:

```
rmlmapper -m $MAPPINGS_FOLDER/* -s turtle > $OUTPUT_FILE
```

where `$MAPPINGS_FOLDER` is a folder with the right package structure, and
`$OUTPUT_FILE` is the desired RDF output file, usually suffixed with the `.ttl`
extension (for the Turtle serialization selected with the `-s` argument).
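
When the command above needs to be driven from a script, the argument list can be assembled programmatically. The sketch below is only an illustration: the function name `rmlmapper_command` and the jar location are hypothetical, and it assumes RMLMapper is invoked as `java -jar <jar>` with its `-m` (mapping files), `-s` (serialization) and `-o` (output file) options. Unlike the shell glob, it passes regular files only.

```python
from pathlib import Path
from typing import List


def rmlmapper_command(jar: Path, mappings_folder: Path, output_file: Path) -> List[str]:
    """Build an RMLMapper invocation for all mapping files in a folder."""
    # Sorted so that the command is reproducible across runs.
    mapping_files = sorted(
        str(path) for path in mappings_folder.iterdir() if path.is_file()
    )
    return [
        "java", "-jar", str(jar),
        "-m", *mapping_files,
        "-s", "turtle",
        "-o", str(output_file),
    ]
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Using `-o` instead of shell redirection avoids the need for a shell.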
