Retain relative URIs at the RDF model level #62

Open
dbooth-boston opened this issue Mar 28, 2019 · 11 comments
Labels
Category: language features For language features of RDF itself -- model and syntax

Comments

@dbooth-boston
Collaborator

At the RDF model level, relative URIs currently do not exist: all URIs are absolute. Even though relative URIs are permitted in some RDF serializations, they are converted to absolute URIs during processing, and therefore lost.

URI allocation is a problem in RDF, as explained in issue #12, because allocating permanent absolute URIs is considerably more difficult in practice than in theory. One way to reduce the difficulty of URI allocation would be to allow relative URIs at the RDF model level, so that they are retained during processing. Relative URIs would allow the author to allocate mnemonic URIs based on natural keys, without incurring the up-front burden of assigning permanent absolute URIs. If desired, those relative URIs could be changed later to permanent absolute URIs.

Since relative URIs are only unique within a particular scope of use -- such as a file -- when combining data from different scopes or sources, those relative URIs should be renamed prior to merging RDF data, in a way that ensures continued uniqueness in the merged result. Two possibilities:

  • Permanent absolute URIs could be assigned prior to merging.

  • New relative URIs could be assigned prior to merging, by prepending source tags to the old relative URIs. For example, if you are merging data from two sources x and y, then you could prepend "x." or "y." to all relative URIs in those sources (respectively) before merging. The relative URI <jane> from source x would become <x.jane>, while the relative URI <jane> from source y would become <y.jane>. This would guarantee continued uniqueness in the merged result.
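A minimal sketch of the second option, assuming triples are modelled as plain tuples of IRI strings (no literals or blank nodes) and that a term counts as relative when it has no scheme. The names `is_relative` and `tag_relative` are hypothetical, not part of any RDF library.

```python
from typing import List, Tuple
from urllib.parse import urlsplit

Triple = Tuple[str, str, str]

def is_relative(iri: str) -> bool:
    """Treat an IRI reference without a scheme as relative (an assumption of this sketch)."""
    return urlsplit(iri).scheme == ""

def tag_relative(triples: List[Triple], source_tag: str) -> List[Triple]:
    """Prepend '<source_tag>.' to every relative IRI; absolute IRIs are left untouched."""
    def rename(term: str) -> str:
        return f"{source_tag}.{term}" if is_relative(term) else term
    return [(rename(s), rename(p), rename(o)) for s, p, o in triples]

KNOWS = "http://xmlns.com/foaf/0.1/knows"
source_x = [("jane", KNOWS, "bob")]
source_y = [("jane", KNOWS, "alice")]

# After tagging, the two <jane> nodes no longer collide in the merged data.
merged = tag_relative(source_x, "x") + tag_relative(source_y, "y")
```

Here `merged` contains `x.jane` and `y.jane` as distinct subjects, while the absolute predicate is unchanged.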

This approach would require a small change to the RDF standards. Note that this change was previously proposed for RDF 1.1, but not adopted. (The RDF 1.1 charter was tightly constrained for backward compatibility with RDF 1.0.)

Tools should also be updated to:

  • retain relative URIs while processing -- details TBD;

  • (optionally) rename relative URIs when merging data, by prepending a source tag to each relative URI; and

  • (optionally) warn when merging data containing potentially conflicting relative URIs.
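The optional warning could look roughly like the following. This is only a sketch under the same simplifying assumptions (triples as tuples of IRI strings, "relative" meaning "no scheme"); `relative_iris`, `conflicting_relative_iris`, and `merge_with_warning` are hypothetical helper names.

```python
import warnings
from typing import List, Set, Tuple
from urllib.parse import urlsplit

Triple = Tuple[str, str, str]

def relative_iris(triples: List[Triple]) -> Set[str]:
    """Collect every term that has no scheme, i.e. every relative IRI reference."""
    return {term for triple in triples for term in triple
            if urlsplit(term).scheme == ""}

def conflicting_relative_iris(a: List[Triple], b: List[Triple]) -> Set[str]:
    """Relative IRIs used in both sources; these may denote different things."""
    return relative_iris(a) & relative_iris(b)

def merge_with_warning(a: List[Triple], b: List[Triple]) -> List[Triple]:
    """Concatenate two triple lists, warning if they share any relative IRI."""
    clashes = conflicting_relative_iris(a, b)
    if clashes:
        warnings.warn(f"potentially conflicting relative IRIs: {sorted(clashes)}")
    return a + b

KNOWS = "http://xmlns.com/foaf/0.1/knows"
source_x = [("jane", KNOWS, "bob")]
source_y = [("jane", KNOWS, "alice")]
```

Calling `merge_with_warning(source_x, source_y)` would warn about `jane`, since both sources use it and nothing guarantees they mean the same person.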

See also TimBL's Design Issues note on relative URIs

@dbooth-boston dbooth-boston added the Category: language features For language features of RDF itself -- model and syntax label Mar 28, 2019
@iherman
Member

iherman commented Mar 29, 2019

@dbooth-boston is it really something to be done on the model? I realize the difficulty of minting all types of URIs, but isn’t a serialization a better place? Turtle already allows relative URIs, and what may be enough is to define some processing rules for Turtle (and for other languages like JSON-LD) to let the ‘base’ be defined at processing time (or something like that...)
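To illustrate what "base defined at processing time" might mean, here is a rough sketch using only Python's standard `urllib.parse.urljoin`. The function `resolve_triples` and both base IRIs are assumptions made for the example, and triples are simplified to tuples of IRI strings.

```python
from typing import List, Tuple
from urllib.parse import urljoin

Triple = Tuple[str, str, str]

def resolve_triples(triples: List[Triple], base: str) -> List[Triple]:
    """Resolve each term against a base supplied by the caller.

    urljoin returns absolute IRIs unchanged, so only relative references move.
    """
    return [(urljoin(base, s), urljoin(base, p), urljoin(base, o))
            for s, p, o in triples]

KNOWS = "http://xmlns.com/foaf/0.1/knows"
data = [("jane", KNOWS, "bob")]  # stored with relative references

# The same stored data, bound to two different (hypothetical) bases.
draft = resolve_triples(data, "http://example.com/draft/")
published = resolve_triples(data, "https://data.example.org/people/")
```

The stored data stays relative; only the resolved views (`draft`, `published`) carry absolute IRIs.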

@dbooth-boston
Collaborator Author

@iherman, if RDF processing were limited to Turtle-only tools that would always retain relative URIs, then that approach could work. But if we want to allow the full range of RDF tools to be used -- including SPARQL stores -- then those relative URIs need to be retained uniformly across all tools, which to my mind probably means those relative URIs should be respected at the RDF model level. The problem right now is that relative URIs disappear when the RDF is processed: they are forever transformed into absolute URIs.

@namedgraph

@dbooth-boston this is by design, as one of RDF's main strengths is global identifiers.
Relative URIs would certainly not be global.

@dbooth-boston
Collaborator Author

@namedgraph , yes I am well aware of that, and it is important. However, there is a significant downside of requiring absolute URIs, as issue #12 explains. IMO the way things are right now, rigid adherence to global identifiers is causing more harm than good.

If we had an easy way for people to allocate mnemonic permanent absolute URIs (instead of relative URIs) then that might be a better solution. But so far I have not seen any approach that is easy enough, and many have been proposed. Any approach that requires domain name ownership is too much of a barrier, and approaches like UUIDs are not mnemonic and not guaranteed unique. (In theory UUIDs should be unique enough for most use cases when generated from truly random sources. But in practice most generators are pseudo-random, with unknown entropy, which means you do not really know how unique they are, and that causes FUD.) For debugging, it is important to support URIs that are mnemonic and/or based on natural keys.

@HughGlaser
Collaborator

HughGlaser commented Mar 30, 2019

@namedgraph says

this is by design, as one of RDF's main strengths is global identifiers.
Relative URIs would certainly not be global.

I'm right with this.
I react with some horror to the idea that URIs might not be globally addressable, and possibly even ambiguous.
And I don't see the need to change - issue #12 has never seemed an issue to me in this respect.
People put RDF on the web/internet at some location - at the worst, the tooling should enable me to use the document (or SPARQL endpoint?) location as a base.
If you don't have that much control, where exactly are you putting the RDF that you are publishing and I am consuming?
Of course, I realise that, as always, I live in the LD world - I guess in a SemWeb world you might have emailed me the document.
But does that happen a lot?

@cyocum

cyocum commented Mar 30, 2019

@namedgraph says

this is by design, as one of RDF's main strengths is global identifiers.
Relative URIs would certainly not be global.

I'm right with this.
I react with some horror to the idea that URIs might not be globally addressable, and possibly even ambiguous.
And I don't see the need to change - issue #12 has never seemed an issue to me in this respect.
People put RDF on the web/internet at some location - at the worst, the tooling should enable me to use the document (or SPARQL endpoint?) location as a base.

If someone had told me many years ago that I could use relative URLs and did not need to put money down for a DNS name, I would have started on my project then rather than years and years later, when I realized I could just use "http://example.com" as my base URL and triplestores would not care. I assumed that I had to shell out money for a DNS name before I could even get started. If someone had basically said: "You do not need to have all this infrastructure (web server, DNS, SPARQL endpoint, setup, maintenance, etc.) and you can just play in your own little pool until you feel comfortable enough to open things up.", I would have been able to move forward more quickly.

Or maybe I am just slow? Either way, I feel relative URIs would make playing with this stuff far easier for people who just want to get started and not have to deal with everything else.

@namedgraph

In my world there is not a reason that would justify not having absolute URIs at the model (not syntax) level, so to me this discussion is moot.

@chiarcos

chiarcos commented Mar 30, 2019

@namedgraph: I doubt anyone would seriously contest the need for absolute URIs at the model level, and the interpretation of relative URIs relative to the URI of a named graph is an easy way to enforce this. But even if the model spec is done without reference to any particular serialization, it would be a good place to articulate the recommendation that serializations of the RDF data model should support relative URIs as syntactic sugar that automatically resolves into absolute URIs, as this would improve the acceptability of RDF to newbies.

@afs
Contributor

afs commented Mar 31, 2019

Out of curiosity, in what way is this not the case currently? N-Triples doesn't; Turtle and JSON-LD do.

That's not to say work isn't needed: the implications of relative URIs are not easy as documents get moved or cached, and the resolution mechanism (RFC 3986 sec 5.1) could be explained in ways more specific to use with RDF documents.
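As a small illustration of the moved-or-cached problem, the sketch below resolves the same relative references against two base URIs using Python's standard `urllib.parse.urljoin`, which performs relative reference resolution. Both document locations and the helper `resolve_all` are invented for the example.

```python
from typing import Dict, List
from urllib.parse import urljoin

# Hypothetical locations: where the document was published, and a cached copy.
ORIGINAL = "http://example.com/data/people.ttl"
MOVED = "http://mirror.example.org/cache/people.ttl"

def resolve_all(base: str, refs: List[str]) -> Dict[str, str]:
    """Map each relative reference to its absolute form under the given base."""
    return {ref: urljoin(base, ref) for ref in refs}

REFS = ["jane", "#jane", "../jane", "/jane"]

at_original = resolve_all(ORIGINAL, REFS)
at_moved = resolve_all(MOVED, REFS)

# Every reference now names a different resource than it did before the move.
changed = [ref for ref in REFS if at_original[ref] != at_moved[ref]]
```

With the retrieval URI as the base, every one of these references ends up identifying something different once the document is served from the mirror, unless the document declares its own base.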

@azaroth42

We have seen a lot of interest in document and base vocabulary relative IRIs in the JSON-LD space.

Which is not to say that I personally am looking for this functionality, but to endorse that it has proponents in practice.

@aucampia

I assumed that I had to shell out money for a DNS name before I could even get started. If someone had basically said: "You do not need to have all this infrastructure (web server, DNS, SPARQL endpoint, setup, maintenance, etc.) and you can just play in your own little pool until you feel comfortable enough to open things up.", I would have been able to move forward more quickly.

Some options that are available now:


9 participants