{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Queries: using SPARQL\n", "\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/simphony/docs/v4.0.0?filepath=docs%2Fusage%2Fsessions%2Fsparql.ipynb \"Click to run this tutorial yourself!\")\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SimPhoNy sessions store information about ontology individuals using the [RDF standard](https://www.w3.org/TR/rdf-concepts/) in an [RDF graph object](https://rdflib.readthedocs.io/en/stable/intro_to_graphs.html) from the [RDFLib](https://github.com/RDFLib/rdflib) library. As a result, they are naturally compatible with the [SPARQL 1.1 Query Language](https://www.w3.org/TR/sparql11-query/) for RDF graphs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SPARQL queries can be invoked either through the `sparql` function from the search module or through the [`sparql` method of the session object](../../api_reference.md#simphony_osp.session.Session.sparql). Both are equivalent, except that a target session can be passed to the function from the search module, whereas the `sparql` method always queries the session it belongs to." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from simphony_osp.tools.search import sparql" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Freiburg and Paris will once again serve as an example to showcase this functionality." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from simphony_osp.namespaces import city, owl, rdfs\n", "from simphony_osp.session import core_session\n", "\n", "# Create a city called \"Freiburg\"\n", "freiburg = city.City(name=\"Freiburg\", coordinates=[47.997791, 7.842609])\n", "freiburg_neighborhoods = [\n", "    city.Neighborhood(name=name, coordinates=coordinates)\n", "    for name, coordinates in [\n", "        ('Altstadt', [47.99525, 7.84726]),\n", "        ('Stühlinger', [47.99888, 7.83774]),\n", "        ('Neuburg', [48.00021, 7.86084]),\n", "        ('Herdern', [48.00779, 7.86268]),\n", "        ('Brühl', [48.01684, 7.843]),\n", "    ]\n", "]\n", "freiburg_citizens = {\n", "    city.Citizen(name='Nikola', age=35,\n", "                 iri=\"http://example.org/entities#Nikola\"),\n", "    city.Citizen(name='Lena', age=70,\n", "                 iri=\"http://example.org/entities#Lena\"),\n", "}\n", "freiburg[city.hasPart] |= freiburg_neighborhoods\n", "freiburg[city.hasInhabitant] |= freiburg_citizens\n", "\n", "# Create a city called \"Paris\"\n", "paris = city.City(name=\"Paris\", coordinates=[48.85333, 2.34885])\n", "paris_neighborhoods = {\n", "    city.Neighborhood(name=name, coordinates=coordinates)\n", "    for name, coordinates in [\n", "        ('Louvre', [48.86466, 2.33487]),\n", "        ('Bourse', [48.86864, 2.34146]),\n", "        ('Temple', [48.86101, 2.36037]),\n", "        ('Hôtel-de-Ville', [48.85447, 2.35902]),\n", "        ('Panthéon', [48.84466, 2.3471]),\n", "    ]\n", "}\n", "paris_citizens = {\n", "    city.Citizen(name='François', age=32)\n", "}\n", "paris[city.hasPart] |= paris_neighborhoods\n", "paris[city.hasInhabitant] = paris_citizens" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start by getting all objects connected to Freiburg. This will return a query result object. This object inherits from [RDFLib's SPARQLResult object](https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.plugins.sparql.html#rdflib.plugins.sparql.processor.SPARQLResult). 
The example below illustrates its basic functionality. Check RDFLib's documentation to learn about all the capabilities of the [SPARQLResult object](https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.plugins.sparql.html#rdflib.plugins.sparql.processor.SPARQLResult)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10 True\n", "(rdflib.term.Literal('13YFp0RR93AD@t&xBo{#)k4YS)LtJz', datatype=rdflib.term.URIRef('https://www.simphony-osp.eu/types#Vector')),)\n", "rdflib.term.Literal('13YFp0RR93AD@t&xBo{#)k4YS)LtJz', datatype=rdflib.term.URIRef('https://www.simphony-osp.eu/types#Vector'))\n", "rdflib.term.Literal('13YFp0RR93AD@t&xBo{#)k4YS)LtJz', datatype=rdflib.term.URIRef('https://www.simphony-osp.eu/types#Vector'))\n", "None\n", "{'o': rdflib.term.Literal('13YFp0RR93AD@t&xBo{#)k4YS)LtJz', datatype=rdflib.term.URIRef('https://www.simphony-osp.eu/types#Vector'))}\n" ] } ], "source": [ "result = sparql(  # no session specified, uses the default session (Core Session in this example)\n", "    f\"\"\"SELECT ?o WHERE {{\n", "        <{freiburg.identifier}> ?p ?o .\n", "    }}\n", "    \"\"\"\n", ")\n", "\n", "print(\n", "    len(result),  # number of rows in the result\n", "    bool(result)  # True when at least one match has been found\n", ")\n", "\n", "for row in result:  # iterating the result yields ResultRow objects\n", "    print(repr(row))\n", "    # ResultRows inherit from tuples\n", "    # the order of the variables passed to the query is respected\n", "\n", "    print(repr(row[0]))  # a specific variable can be accessed using either its position,\n", "    print(repr(row['o']))  # or name\n", "\n", "    print(row.get('unknown_variable', None))  # a dict-like `get` method is available\n", "\n", "    print(row.asdict())  # transforms the row into a dictionary\n", "\n", "    break  # only one result is shown in order not to flood this page" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All results 
from the query are RDFLib objects by default (e.g. `URIRef`, `Literal`, ...). However, SimPhoNy query results can be easily converted to other data types using keyword arguments.\n", "\n", "For example, to query all the citizens in the session together with their name and age, and to obtain the results as ontology individual objects, Python strings and Python integers, use the following." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(, 'Nikola', 35)\n", "(, 'Lena', 70)\n", "(, 'François', 32)\n" ] } ], "source": [ "from simphony_osp.ontology import OntologyIndividual\n", "\n", "result = sparql(\n", "    f\"\"\"SELECT ?person ?name ?age WHERE {{\n", "        ?person rdf:type <{city.Citizen.identifier}> .\n", "        ?person <{city['name'].identifier}> ?name .\n", "        ?person <{city.age.identifier}> ?age .\n", "    }}\n", "    \"\"\"\n", ")\n", "\n", "for row in result(person=OntologyIndividual, name=str, age=int):\n", "    print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, the ontologies installed with [pico](../ontologies/pico.md) are not included in the search. To make use of this terminological knowledge, pass the keyword argument `ontology=True`. The example below looks for persons instead of citizens, so including the terminological knowledge is necessary to obtain the desired results." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Query without ontology: 0 results\n", "Query with ontology: 3 results\n", "(, 'Nikola', 35)\n", "(, 'Lena', 70)\n", "(, 'François', 32)\n" ] } ], "source": [ "# Without the ontology, the subclass relationship between\n", "# citizens and persons is not visible to the query\n", "result = sparql(\n", "    f\"\"\"SELECT ?person ?name ?age WHERE {{\n", "        ?person rdf:type/rdfs:subClassOf <{city.Person.identifier}> .\n", "        ?person <{city['name'].identifier}> ?name .\n", "        ?person <{city.age.identifier}> ?age .\n", "    }}\n", "    \"\"\",\n", "    ontology=False\n", ")\n", "\n", "print(\"Query without ontology:\", len(result), \"results\")\n", "\n", "# Same query, this time including the terminological knowledge\n", "result = sparql(\n", "    f\"\"\"SELECT ?person ?name ?age WHERE {{\n", "        ?person rdf:type/rdfs:subClassOf <{city.Person.identifier}> .\n", "        ?person <{city['name'].identifier}> ?name .\n", "        ?person <{city.age.identifier}> ?age .\n", "    }}\n", "    \"\"\",\n", "    ontology=True\n", ")\n", "\n", "print(\"Query with ontology:\", len(result), \"results\")\n", "\n", "for row in result(person=OntologyIndividual, name=str, age=int):\n", "    print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "**Note**\n", "\n", "When using `ontology=True`, the current version of SimPhoNy assumes that there is virtually no latency between your computer and the session being queried. If `ontology=True` is used, for example, with a session connected to a triplestore located on a remote server, the query will be extremely slow.\n", "
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.7" }, "toc-showcode": false, "toc-showtags": true }, "nbformat": 4, "nbformat_minor": 4 }