Conversation

@domel
Contributor

@domel domel commented Oct 23, 2025

fixes #89

Summary

This PR clarifies that Turtle processors MUST treat absolute IRIs (with a scheme) as verbatim and MUST NOT apply RFC 3986 §5 reference resolution or path normalization to them. Only relative IRI references are resolved against the current base IRI per RFC 3986 §5.1–§5.2.

Rationale

Without this clarification, an absolute IRI containing dot-segments can be altered depending on whether a BASE directive is in scope, leading to context-dependent RDF terms. The change aligns Turtle with RDF 1.2 Concepts: IRI equality uses simple string comparison (no further normalization), while relative IRI references must be resolved against a base.

Changes

  • Add a normative “Reference Resolution” subsection under IRIs:
    • Absolute IRIs: use verbatim; no resolution/normalization.
    • Relative IRIs: resolve against base IRI as per RFC 3986 §5.1–§5.2.
  • Add concise examples showing unchanged absolute IRIs and resolved relatives.
  • Add an informative note referencing RDF Reference IRIs guidance for data publishers.
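A minimal sketch of the rule proposed above, assuming a hypothetical helper `iri_from_iriref` (not part of any Turtle library) and using Python's `urllib.parse.urljoin` as a stand-in for RFC 3986 §5.2 resolution of relative references. `urljoin` targets URIs, so non-ASCII IRI handling is outside the scope of this illustration.

```python
import re
from urllib.parse import urljoin

# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def iri_from_iriref(ref: str, base: str) -> str:
    """Hypothetical helper illustrating the rule proposed in this PR."""
    if _SCHEME.match(ref):
        # Absolute IRI: used verbatim, dot-segments are left untouched.
        return ref
    # Relative IRI reference: resolved against the base in scope.
    return urljoin(base, ref)


base = "http://example.org/dir/doc.ttl"
print(iri_from_iriref("http://example.org/a/../b", base))  # unchanged
print(iri_from_iriref("../other", base))                   # resolved
```

Under this sketch the first call returns its argument as written, whatever base is in scope, while the second depends on the base.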

Preview | Diff

@domel domel requested a review from afs October 23, 2025 18:19
Co-authored-by: Ted Thibodeau Jr <[email protected]>
@domel
Contributor Author

domel commented Oct 24, 2025

I forgot to mention that the above (draft) PR is based on w3c/rdf-concepts#255

@afs
Contributor

afs commented Oct 28, 2025

w3c/rdf-concepts#255 isn't finished yet.

I don't think a sub-section 2.5.1 (non-normative) in Turtle is the right thing to do. Sections 6.3 IRI References and 7.2 RDF Term Constructors already cover the processing of IRIREF.

I believe we should focus on useful IRIs, not write at length about corner cases. The text should also reflect that Turtle (human-readable) and N-Triples (database dump) have different goals.

Dot-segments are only intended for use in the first segment of relative-paths in IRI references [see below], that is, with no leading / on the path. The RFC grammar does not capture this, but the grammar is not the whole of the definition (as it isn't in SPARQL). (Writing a comprehensive BNF grammar would worsen readability, if it is possible at all.)

Processing IRI references is defined by RFC 3986, not RDF. If it is defined in RDF as well as in RFC 3986, misalignment or misunderstanding will occur. Sometimes the RFC text isn't ideal for RDF, but it is what it is.

N-Triples IRIs says:
"IRIs may be written only as resolved IRIs."

"whether a BASE directive is present."

There is always a base URI (https://datatracker.ietf.org/doc/html/rfc3986#section-5.1).
BASE is only one of the ways the base URI is determined. In fact, BASE only changes the base because it is in-document.

BASE <abcd> is legal as the first line in a Turtle file. The <abcd> is processed as an IRIREF before the action of BASE.
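As a rough illustration of that ordering, the sketch below models the base in scope as parser state. The class `BaseTracker` and its method name are hypothetical, and `urllib.parse.urljoin` is assumed to be an acceptable stand-in for RFC 3986 resolution in this simple case.

```python
from urllib.parse import urljoin


class BaseTracker:
    """Hypothetical parser state holding the base IRI currently in scope."""

    def __init__(self, initial_base: str) -> None:
        # RFC 3986 section 5.1: a base always exists (for example the
        # retrieval IRI or an application default), even without BASE.
        self.base = initial_base

    def on_base_directive(self, iriref: str) -> None:
        # The IRIREF is resolved against the base in scope first,
        # and only then replaces that base.
        self.base = urljoin(self.base, iriref)


tracker = BaseTracker("http://example.org/dir/doc.ttl")
tracker.on_base_directive("abcd")  # models: BASE <abcd>
print(tracker.base)
```

Here `BASE <abcd>` on the first line yields a new base derived from the externally supplied one, which is why the directive is legal even though `abcd` is relative.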

Dot-segments have a unique property. An absolute IRI reference that contains them always resolves to the same IRI regardless of the base; the base does not even have to use the same scheme. This holds for either the strict or non-strict parser choice in RFC 3986 §5.2.2.
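To make the base-independence point concrete, here is a minimal sketch of the strict transform from RFC 3986 §5.2.2 together with `remove_dot_segments` from §5.2.4. The function names are illustrative, and the sketch relies on `urllib.parse.urlsplit`, which cannot distinguish an undefined component from an empty one, so it is a simplification rather than a conforming resolver.

```python
from urllib.parse import urlsplit, urlunsplit


def remove_dot_segments(path: str) -> str:
    """RFC 3986 section 5.2.4, steps A to E."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]          # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]          # "/../" becomes "/"
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to output.
            idx = path.find("/", 1)
            if idx == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:idx])
                path = path[idx:]
    return "".join(output)


def resolve_strict(ref: str, base: str) -> str:
    """Simplified strict resolution per RFC 3986 section 5.2.2."""
    r, b = urlsplit(ref), urlsplit(base)
    if r.scheme:
        # Reference has its own scheme: no component of the base is used.
        parts = (r.scheme, r.netloc, remove_dot_segments(r.path), r.query)
    elif r.netloc:
        parts = (b.scheme, r.netloc, remove_dot_segments(r.path), r.query)
    elif not r.path:
        parts = (b.scheme, b.netloc, b.path, r.query or b.query)
    elif r.path.startswith("/"):
        parts = (b.scheme, b.netloc, remove_dot_segments(r.path), r.query)
    else:
        if b.netloc and not b.path:
            merged = "/" + r.path
        else:
            merged = b.path[: b.path.rfind("/") + 1] + r.path
        parts = (b.scheme, b.netloc, remove_dot_segments(merged), r.query)
    return urlunsplit(parts + (r.fragment,))


for base in ("http://other.example/x/y", "file:///tmp/doc.ttl"):
    print(resolve_strict("http://example.org/a/../b", base))
```

Both iterations print the same IRI because the first branch never reads the base. Note that this RFC transform does remove the dot-segments, whereas the PR summary proposes leaving absolute IRIs verbatim.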

About the use of dot-segments

RFC 3986 3.3 Path text:

"""
The path segments "." and "..", also known as dot-segments, are
defined for relative reference within the path name hierarchy. They
are intended for use at the beginning of a relative-path reference
(Section 4.2) to indicate relative position within the hierarchical
tree of names.
"""
Note that this says relative-path, not relative-ref.

RFC 3986 4.2 Relative Reference text:

"""
A relative reference that begins with a single slash character is
termed an absolute-path reference. A relative reference that does
not begin with a slash character is termed a relative-path reference.
"""

A relative-path reference does not start with "/".
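The terminology quoted from RFC 3986 §4.2 can be sketched as a small classifier. The function names and the label strings are made up for illustration; only the distinctions themselves (scheme present, leading "//", leading "/", or neither) come from the RFC.

```python
import re

# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def classify_reference(ref: str) -> str:
    """Label a reference using the RFC 3986 section 4.2 terms (labels are illustrative)."""
    if _SCHEME.match(ref):
        return "has-scheme"        # not a relative reference at all
    if ref.startswith("//"):
        return "network-path"
    if ref.startswith("/"):
        return "absolute-path"
    return "relative-path"


def has_leading_dot_segment(ref: str) -> bool:
    """True when a relative-path reference begins with "." or ".." as its first segment."""
    if classify_reference(ref) != "relative-path":
        return False
    first_segment = re.split(r"[/?#]", ref, maxsplit=1)[0]
    return first_segment in (".", "..")


for ref in ("../a", "a/../b", "/a/../b", "http://example.org/a/../b"):
    print(ref, classify_reference(ref), has_leading_dot_segment(ref))
```

Only the first reference in the loop matches the intended use described in §3.3: a relative-path reference that starts with a dot-segment.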

Development

Successfully merging this pull request may close these issues.

Different parsing of the same absolute IRI with or without base IRI

4 participants