MaastrichtU-IDS/RdfUpload

About

See the Data2Services documentation to run RdfUpload as part of workflows that generate RDF knowledge graphs from structured data.

This project uploads an RDF file into a specified SPARQL endpoint. A username and password can optionally be provided for authentication.
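
For orientation, an upload of this kind can be approximated by hand with curl against the RDF4J REST API, which GraphDB also exposes. This is a minimal sketch and not necessarily what RdfUpload does internally. The helper names `statements_url` and `upload_rdf` are hypothetical, and the Turtle content type is an assumption about the input file.

```shell
# Sketch only: these helpers are hypothetical and not part of RdfUpload.

# Build the RDF4J statements URL from a server URL and a repository ID.
statements_url() {
  base="${1%/}"   # drop one trailing slash, if present
  repo="$2"
  printf '%s/repositories/%s/statements\n' "$base" "$repo"
}

# POST a Turtle file to the repository; username and password are optional.
# Example: upload_rdf /data/rdfu/rdf_output.ttl http://localhost:7200 test USERNAME PASSWORD
upload_rdf() {
  file="$1"; user="${4:-}"; pass="${5:-}"
  url="$(statements_url "$2" "$3")"
  if [ -n "$user" ]; then
    # Authenticated upload using HTTP basic auth.
    curl --fail -X POST -H 'Content-Type: text/turtle' \
      -u "$user:$pass" --data-binary "@$file" "$url"
  else
    # Anonymous upload.
    curl --fail -X POST -H 'Content-Type: text/turtle' \
      --data-binary "@$file" "$url"
  fi
}
```

Splitting the URL construction from the HTTP call keeps the first function checkable without a running server.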

Docker

Build

docker build -t rdf-upload .

Usage

docker run -it --rm rdf-upload -?

Usage: rdfupload [-?] [-ep=<endpoint>] -if=<inputFile> [-pw=<passWord>]
                 -rep=<repository> [-uep=<updateEndpoint>] [-un=<userName>]
                 -url=<url>
  -?, --help   display a help message
      -if, --inputFile=<inputFile>
               RDF file path
      -pw, --password=<passWord>
               Password used for authentication
      -rep, --repositoryId=<repository>
               Repository ID
      -un, --username=<userName>
               Username used for authentication
      -url, --database-url=<uri>
               URL to access GraphDB (e.g.: http://localhost:7200)

Run

  • Linux / OSX
# RDF4J server URL + repository ID
docker run -it --rm -v /data/rdfu:/data rdf-upload -if "/data/rdf_output.ttl" -url "http://localhost:7200" -rep "test" -un USERNAME -pw PASSWORD

# Upload multiple files with full SPARQL endpoint URL 
docker run -it --rm -v /data/rdfu:/data rdf-upload -if "/data/*.ttl" -url "http://localhost:7200/repositories/test" -un USERNAME -pw PASSWORD
  • Windows
docker run -it --rm -v c:/data/rdfu:/data rdf-upload -if "/data/rdf_output.ttl" -url "http://localhost:7200" -rep "test" -un USERNAME -pw PASSWORD
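
In the commands above, `-v /data/rdfu:/data` mounts the host directory into the container, so the path given to `-if` is the path as seen inside the container, not on the host. The mapping can be sketched as follows; `to_container_path` is a hypothetical helper used only for illustration.

```shell
# Sketch only: to_container_path is a hypothetical helper, not part of RdfUpload.
# Translate a host path under the mounted directory into the path the
# container sees, given the two sides of the -v host_dir:container_dir mount.
to_container_path() {
  host_path="$1"
  host_dir="${2%/}"        # e.g. /data/rdfu
  container_dir="${3%/}"   # e.g. /data
  case "$host_path" in
    "$host_dir"/*)
      # Replace the host directory prefix with the container directory.
      printf '%s/%s\n' "$container_dir" "${host_path#"$host_dir"/}"
      ;;
    *)
      echo "error: $host_path is not under $host_dir" >&2
      return 1
      ;;
  esac
}
```

For example, `to_container_path /data/rdfu/rdf_output.ttl /data/rdfu /data` gives the value to pass to `-if`. A file outside the mounted directory is rejected, since the container cannot see it.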

Preload

docker run -d --rm --name graphdb -p 7200:7200 \
  -v /data/graphdb:/opt/graphdb/home \
  -v /data/graphdb-import:/root/graphdb-import \
  graphdb

Known issue: GraphDB must be stopped while the preload tool runs, but killing the Java process also stops the container.

Try starting the container without the --rm flag, so that it is not removed when it stops.

/opt/graphdb/dist/bin/preload -f -i <repo-name> <RDF data file(s)>

preload -f -i test /data/graphdb-preload/biogrid_dataset.ttl


# --entrypoint takes only the executable; its arguments go after the image name
docker run -it -v /data/graphdb:/opt/graphdb/home -v /data/graphdb-import:/root/graphdb-import --entrypoint /opt/graphdb/dist/bin/preload graphdb -f -i test /opt/graphdb/home/data/rdf_output.nq

# Does not fail, but nothing is loaded when GraphDB is started afterwards
docker run -it -v /data/graphdb:/opt/graphdb/home -v /data/graphdb-import:/root/graphdb-import --entrypoint /opt/graphdb/dist/bin/preload graphdb -c /root/graphdb-import/repo-config.ttl /opt/graphdb/home/data/rdf_output.nq

# Create the test repository from a config file. Works, but GraphDB should not be running at the same time
docker exec -it graphdb /opt/graphdb/dist/bin/preload -c "/root/graphdb-import/repo-config.ttl" "/opt/graphdb/home/data/rdf_output.nq"
# Use existing test repo
docker exec -it graphdb /opt/graphdb/dist/bin/preload -f -i test "/opt/graphdb/home/data/rdf_output.nq"
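
Since preload expects GraphDB to be stopped, one possible sequence is: stop the container, run preload in a one-off container on the same volumes, then start GraphDB again. The sketch below is untested and only prints the commands by default. The `run` and `preload_offline` helpers and the `DRY_RUN` variable are hypothetical. Note that a similar one-off run is reported above as not making the data visible, so treat this as a starting point rather than a confirmed procedure.

```shell
# Sketch only: run, preload_offline and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

preload_offline() {
  repo="$1"
  rdf_file="$2"   # path as seen inside the container

  # 1. Stop the running GraphDB container so the repository is not in use.
  run docker stop graphdb

  # 2. Run preload in a one-off container that shares the same volumes.
  run docker run --rm \
    -v /data/graphdb:/opt/graphdb/home \
    -v /data/graphdb-import:/root/graphdb-import \
    --entrypoint /opt/graphdb/dist/bin/preload \
    graphdb -f -i "$repo" "$rdf_file"

  # 3. Start GraphDB again on the same data directory.
  run docker run -d --rm --name graphdb -p 7200:7200 \
    -v /data/graphdb:/opt/graphdb/home \
    -v /data/graphdb-import:/root/graphdb-import \
    graphdb
}
```

Calling `preload_offline test /opt/graphdb/home/data/rdf_output.nq` prints the three commands for review. Setting `DRY_RUN=0` executes them instead.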

repo-config.ttl:

# Configuration template for a GraphDB-Free repository
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rep: <http://www.openrdf.org/config/repository#>.
@prefix sr: <http://www.openrdf.org/config/repository/sail#>.
@prefix sail: <http://www.openrdf.org/config/sail#>.
@prefix owlim: <http://www.ontotext.com/trree/owlim#>.
[] a rep:Repository ;
    rep:repositoryID "test" ;
    rdfs:label "Test repo" ;
    rep:repositoryImpl [
        rep:repositoryType "graphdb:FreeSailRepository" ;
        sr:sailImpl [
            sail:sailType "graphdb:FreeSail" ;
            # ruleset to use
            owlim:ruleset "empty" ;
            # context index is enabled here; set to "false" if the data does not use contexts
            owlim:enable-context-index "true" ;
            # indexes to speed up the read queries
            owlim:enablePredicateList "true" ;
            owlim:enable-literal-index "true" ;
            owlim:in-memory-literal-properties "true" ;
        ]
    ].