Queries: using SPARQL

SimPhoNy sessions store information about ontology individuals following the RDF standard, in an RDF graph object from the RDFLib library. This makes them naturally compatible with the SPARQL 1.1 Query Language for RDF graphs.

SPARQL queries can be invoked either through the sparql function from the search module or through the sparql method of the session object. The two are equivalent, except that the function accepts a target session as an argument, whereas the method always queries the session it belongs to.

[1]:
from simphony_osp.tools.search import sparql

Freiburg and Paris will serve as an example again to showcase this functionality.

[2]:
from simphony_osp.namespaces import city, owl, rdfs
from simphony_osp.session import core_session

# Create a city called "Freiburg"
freiburg = city.City(name="Freiburg", coordinates=[47.997791, 7.842609])
freiburg_neighborhoods = {
    city.Neighborhood(name=name, coordinates=coordinates)
    for name, coordinates in [
        ('Altstadt', [47.99525, 7.84726]),
        ('Stühlinger', [47.99888, 7.83774]),
        ('Neuburg', [48.00021, 7.86084]),
        ('Herdern', [48.00779, 7.86268]),
        ('Brühl', [48.01684, 7.843]),
    ]
}
freiburg_citizens = {
    city.Citizen(name='Nikola', age=35,
                 iri="http://example.org/entities#Nikola"),
    city.Citizen(name='Lena', age=70,
                 iri="http://example.org/entities#Lena"),
}
freiburg[city.hasPart] |= freiburg_neighborhoods
freiburg[city.hasInhabitant] |= freiburg_citizens

# Create a city called "Paris"
paris = city.City(name="Paris", coordinates=[48.85333, 2.34885])
paris_neighborhoods = {
    city.Neighborhood(name=name, coordinates=coordinates)
    for name, coordinates in [
        ('Louvre', [48.86466, 2.33487]),
        ('Bourse', [48.86864, 2.34146]),
        ('Temple', [48.86101, 2.36037]),
        ('Hôtel-de-Ville', [48.85447, 2.35902]),
        ('Panthéon', [48.84466, 2.3471]),
    ]
}
paris_citizens = {
    city.Citizen(name='François', age=32)
}
paris[city.hasPart] |= paris_neighborhoods
paris[city.hasInhabitant] = paris_citizens

Start by getting all objects connected to Freiburg. The query returns a result object that inherits from RDFLib’s SPARQLResult class. The example below illustrates its basic functionality. Check RDFLib’s documentation to learn about all the capabilities of SPARQLResult objects.

[3]:
result = sparql(  # no session specified, uses the default session (Core Session in this example)
    f"""SELECT ?o WHERE {{
        <{freiburg.identifier}> ?p ?o .
    }}
    """
)

print(
    len(result),  # number of rows in the result
    bool(result)  # True when at least one match has been found
)

for row in result:  # iterating the result yields ResultRow objects
    print(repr(row))
    # ResultRows inherit from tuples
    # the order of the variables passed to the query is respected

    print(repr(row[0]))  # a specific variable can be accessed using either its position,
    print(repr(row['o']))  # or name

    print(row.get('unknown_variable', None))  # a dict-like `get` method is available

    print(row.asdict())  # transforms the row into a dictionary

    break  # only one result is shown in order not to flood this page
10 True
(rdflib.term.Literal('13YFp0RR93AD@t&xBo{#)k4YS)LtJz', datatype=rdflib.term.URIRef('https://www.simphony-osp.eu/types#Vector')),)
rdflib.term.Literal('13YFp0RR93AD@t&xBo{#)k4YS)LtJz', datatype=rdflib.term.URIRef('https://www.simphony-osp.eu/types#Vector'))
rdflib.term.Literal('13YFp0RR93AD@t&xBo{#)k4YS)LtJz', datatype=rdflib.term.URIRef('https://www.simphony-osp.eu/types#Vector'))
None
{'o': rdflib.term.Literal('13YFp0RR93AD@t&xBo{#)k4YS)LtJz', datatype=rdflib.term.URIRef('https://www.simphony-osp.eu/types#Vector'))}
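
Note that the queries on this page are written as Python f-strings. The braces that delimit the SPARQL graph pattern therefore have to be doubled, while single braces interpolate Python values such as an identifier. The following minimal sketch shows this in isolation. The helper objects_query and the IRI passed to it are hypothetical and are not part of SimPhoNy.

```python
def objects_query(subject_iri: str) -> str:
    """Hypothetical helper: build a query for all objects of a subject."""
    # Doubled braces produce literal "{" and "}" in the resulting string,
    # single braces interpolate the Python value.
    return f"""SELECT ?o WHERE {{
        <{subject_iri}> ?p ?o .
    }}
    """


# Made-up IRI, used here only for illustration.
query = objects_query("http://example.org/entities#Freiburg")
print(query)
```

The resulting string contains single braces only, so it can be passed as-is to a SPARQL processor.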

By default, all query results are RDFLib objects (e.g. URIRef, Literal, …). However, SimPhoNy query results can be converted to other data types by calling the result object with keyword arguments that map variable names to the desired types.

For example, to query all the citizens in the session together with their names and ages, and to obtain the results as ontology individual objects, Python strings and Python integers, use the following.

[4]:
from simphony_osp.ontology import OntologyIndividual

result = sparql(
    f"""SELECT ?person ?name ?age WHERE {{
        ?person rdf:type <{city.Citizen.identifier}> .
        ?person <{city['name'].identifier}> ?name .
        ?person <{city.age.identifier}> ?age .
    }}
    """
)

for row in result(person=OntologyIndividual, name=str, age=int):
    print(row)
(<OntologyIndividual: http://example.org/entities#Nikola>, 'Nikola', 35)
(<OntologyIndividual: http://example.org/entities#Lena>, 'Lena', 70)
(<OntologyIndividual: d78308dc-6db8-4216-be15-76fa0072c1c7>, 'François', 32)
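
Conceptually, each keyword argument pairs a query variable with a callable that is applied to the corresponding column of every row. The sketch below imitates that idea on plain tuples. It is not SimPhoNy's implementation: the helper convert_rows and the sample rows are hypothetical, and leaving unnamed columns unchanged is a choice made for this sketch only.

```python
from typing import Any, Callable, Iterable, Iterator, Sequence, Tuple


def convert_rows(
    variables: Sequence[str],
    rows: Iterable[Sequence[Any]],
    **converters: Callable[[Any], Any],
) -> Iterator[Tuple[Any, ...]]:
    """Hypothetical helper: apply one converter per named column."""
    for row in rows:
        yield tuple(
            # Columns without a converter are passed through untouched.
            converters[var](value) if var in converters else value
            for var, value in zip(variables, row)
        )


# Sample rows standing in for query results (values arrive as strings).
raw_rows = [("Nikola", "35"), ("Lena", "70")]
converted = list(convert_rows(("name", "age"), raw_rows, age=int))
print(converted)
```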

By default, the ontologies installed with pico are not included in the search. To make use of this terminological knowledge, pass the keyword argument ontology=True. The example below looks for persons instead of citizens, so the terminological knowledge (which states that citizens are persons) is needed to obtain the desired results.

[5]:
result = sparql(
    f"""SELECT ?person ?name ?age WHERE {{
        ?person rdf:type/rdfs:subClassOf <{city.Person.identifier}> .
        ?person <{city['name'].identifier}> ?name .
        ?person <{city.age.identifier}> ?age .
    }}
    """,
    ontology=False
)

print("Query without ontology:", len(result), "results")

result = sparql(
    f"""SELECT ?person ?name ?age WHERE {{
        ?person rdf:type/rdfs:subClassOf <{city.Person.identifier}> .
        ?person <{city['name'].identifier}> ?name .
        ?person <{city.age.identifier}> ?age .
    }}
    """,
    ontology=True
)

print("Query with ontology:", len(result), "results")

for row in result(person=OntologyIndividual, name=str, age=int):
    print(row)
Query without ontology: 0 results
Query with ontology: 3 results
(<OntologyIndividual: http://example.org/entities#Nikola>, 'Nikola', 35)
(<OntologyIndividual: http://example.org/entities#Lena>, 'Lena', 70)
(<OntologyIndividual: d78308dc-6db8-4216-be15-76fa0072c1c7>, 'François', 32)

Note

When using ontology=True, the current version of SimPhoNy assumes that there is virtually no latency between your computer and the session that is being queried. If ontology=True is used, for example, with a session connected to a triplestore located on a remote server, the query will be extremely slow.