In a recent project I needed to use SPARQL CONSTRUCT queries to reconstruct an RDF list in triples. I needed to minimize the number of triples in the result and support additional graph patterns on list items. The solution did not need to handle empty lists.
RDF lists are notoriously unwieldy, and SPARQL 1.1's collection syntax only helps when the list length is fixed and known in advance. I found a few web pages with hints on how to approach the problem.
None of the solutions addressed all my requirements, so I had to come up with my own query.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
CONSTRUCT {
<http://example.com/dummysubject> <http://example.com/dummypredicate> ?rdfList .
?rdfList rdf:first ?rdfListItem0 .
?rdfListItem0 schema:name ?rdfListItem0Name .
?rdfList rdf:rest ?rdfListRest0 .
?rdfListRestN rdf:first ?rdfListItemN .
?rdfListItemN schema:name ?rdfListItemNName .
?rdfListRestN rdf:rest ?rdfListRestNBasic .
} WHERE {
<http://example.com/dummysubject> <http://example.com/dummypredicate> ?rdfList .
?rdfList rdf:first ?rdfListItem0 .
?rdfListItem0 <http://schema.org/name> ?rdfListItem0Name .
?rdfList rdf:rest ?rdfListRest0 .
OPTIONAL {
?rdfList rdf:rest+ ?rdfListRestN .
?rdfListRestN rdf:first ?rdfListItemN .
?rdfListItemN schema:name ?rdfListItemNName .
?rdfListRestN rdf:rest ?rdfListRestNBasic .
}
}
<http://example.com/dummysubject> <http://example.com/dummypredicate> ?rdfList .
This could be any pattern, depending on the application.
?rdfList rdf:first ?rdfListItem0 .
The query treats rdf:first and rdf:rest separately because the former points to a list item. We need that because we need to …
?rdfListItem0 <http://schema.org/name> ?rdfListItem0Name .
This is application-specific, and could be any set of graph patterns.
?rdfList rdf:rest ?rdfListRest0 .
This is needed to reconstruct the list. One of the StackOverflow solutions was missing this.
OPTIONAL { ... }
The OPTIONAL block covers lists with more than one item; a single-item list is fully matched by the patterns above.
?rdfList rdf:rest+ ?rdfListRestN .
?rdfListRestN rdf:rest ?rdfListRestNBasic .
The rdf:rest+ one-or-more property path matches every node reachable from ?rdfList by following rdf:rest one or more times. CONSTRUCT templates can’t contain property paths, so the second WHERE graph pattern binds each individual rdf:rest triple so that it can be emitted by the CONSTRUCT.
?rdfListRestN rdf:first ?rdfListItemN .
?rdfListItemN schema:name ?rdfListItemNName .
For every list tail, match its item and the additional graph pattern on items.
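To see what the property path binds before it feeds the CONSTRUCT, a diagnostic SELECT can help. This is only a sketch, assuming the same dummy subject and predicate as the query above; wrapping the two patterns in OPTIONAL is an addition made here so that rdf:nil, which the path also reaches, shows up as a row with no item.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Diagnostic sketch: list every node reachable from the list head via rdf:rest+,
# together with its item and its own rdf:rest object, when present.
SELECT ?rdfListRestN ?rdfListItemN ?rdfListRestNBasic
WHERE {
  <http://example.com/dummysubject> <http://example.com/dummypredicate> ?rdfList .
  ?rdfList rdf:rest+ ?rdfListRestN .
  # rdf:nil is reachable too, but it has no rdf:first or rdf:rest,
  # so both variables stay unbound in its row.
  OPTIONAL { ?rdfListRestN rdf:first ?rdfListItemN . }
  OPTIONAL { ?rdfListRestN rdf:rest ?rdfListRestNBasic . }
}
```

In the CONSTRUCT query those two patterns are required inside the OPTIONAL block rather than optional themselves, which is why rdf:nil never produces a solution there.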
Insert the following test data:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
INSERT DATA {
<http://example.com/2> schema:name "test2" .
<http://example.com/1> schema:name "test1" .
<http://example.com/dummysubject> <http://example.com/dummypredicate> (<http://example.com/1> <http://example.com/2>) .
}
which contains the following 7 triples (expanded from the RDF list syntactic sugar):
<http://example.com/1> <http://schema.org/name> "test1" .
<http://example.com/dummysubject> <http://example.com/dummypredicate> _:ec9519d9f6cb0c9c81662fee09b4b6b3 .
<http://example.com/2> <http://schema.org/name> "test2" .
_:e5a3560dac4aee8d386839c6b568ada2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
_:e5a3560dac4aee8d386839c6b568ada2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://example.com/2> .
_:ec9519d9f6cb0c9c81662fee09b4b6b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:e5a3560dac4aee8d386839c6b568ada2 .
_:ec9519d9f6cb0c9c81662fee09b4b6b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://example.com/1> .
Running the query produces exactly those triples.
The query only handles non-empty lists, since I never serialize empty lists. To handle an empty list, you’d need to UNION the query with an rdf:nil pattern, similar to the following:
{
<http://example.com/dummysubject> <http://example.com/dummypredicate> rdf:nil .
}
UNION
{
<http://example.com/dummysubject> <http://example.com/dummypredicate> ?rdfList .
?rdfList rdf:first ?rdfListItem0 .
?rdfListItem0 <http://schema.org/name> ?rdfListItem0Name .
?rdfList rdf:rest ?rdfListRest0 .
OPTIONAL {
?rdfList rdf:rest+ ?rdfListRestN .
?rdfListRestN rdf:first ?rdfListItemN .
?rdfListItemN schema:name ?rdfListItemNName .
?rdfListRestN rdf:rest ?rdfListRestNBasic .
}
}
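Folded into the full query, the empty-list branch might look like the sketch below. It rests on one assumption that goes beyond the fragment above: a CONSTRUCT template skips any triple that contains an unbound variable, so the rdf:nil branch has to bind ?rdfList for the first template triple to appear in the output. The VALUES clause is one way to do that; it is not part of the original query.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <http://schema.org/>

CONSTRUCT {
  <http://example.com/dummysubject> <http://example.com/dummypredicate> ?rdfList .
  ?rdfList rdf:first ?rdfListItem0 .
  ?rdfListItem0 schema:name ?rdfListItem0Name .
  ?rdfList rdf:rest ?rdfListRest0 .
  ?rdfListRestN rdf:first ?rdfListItemN .
  ?rdfListItemN schema:name ?rdfListItemNName .
  ?rdfListRestN rdf:rest ?rdfListRestNBasic .
} WHERE {
  {
    # Empty list: bind ?rdfList to rdf:nil so the first template triple is emitted.
    # Every other template triple has an unbound variable here and is skipped.
    VALUES ?rdfList { rdf:nil }
    <http://example.com/dummysubject> <http://example.com/dummypredicate> ?rdfList .
  }
  UNION
  {
    # Non-empty list: same patterns as the original query.
    <http://example.com/dummysubject> <http://example.com/dummypredicate> ?rdfList .
    ?rdfList rdf:first ?rdfListItem0 .
    ?rdfListItem0 schema:name ?rdfListItem0Name .
    ?rdfList rdf:rest ?rdfListRest0 .
    OPTIONAL {
      ?rdfList rdf:rest+ ?rdfListRestN .
      ?rdfListRestN rdf:first ?rdfListItemN .
      ?rdfListItemN schema:name ?rdfListItemNName .
      ?rdfListRestN rdf:rest ?rdfListRestNBasic .
    }
  }
}
```

The two branches cannot both match the same list head, because rdf:nil has no rdf:first triple in well-formed data.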
The query has two variables for items: ?rdfListItem0 for the first item in the list and ?rdfListItemN for subsequent items. I wasn’t able to find a solution that used a single item variable without producing extraneous triples. Unfortunately, that means that additional item patterns (like the one with schema:name above) must be duplicated for ?rdfListItem0 and ?rdfListItemN.