{ "cells": [ { "cell_type": "markdown", "id": "boxed-professional", "metadata": {}, "source": [ "# RDF Import and export" ] }, { "cell_type": "markdown", "id": "operational-honey", "metadata": {}, "source": [ "
\n", " \n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/simphony/docs/v4.0.0?filepath=docs%2Fusage%2Fsessions%2Fimport_export.ipynb \"Click to run this tutorial yourself!\")\n", " \n", "
\n", "\n", "SimPhoNy sessions store the ontology individual information using the [RDF standard](https://www.w3.org/TR/rdf-concepts/) in an [RDF graph object](https://rdflib.readthedocs.io/en/stable/intro_to_graphs.html) from the [RDFLib](https://github.com/RDFLib/rdflib) library. Exporting such RDF graph is possible using the functions [import_file](../../api_reference.md#simphony_osp.tools.import_file) and [export_file](../../api_reference.md#simphony_osp.tools.export_file)." ] }, { "cell_type": "code", "execution_count": 1, "id": "94ca2509-eabd-4311-b531-1fd2269d27ad", "metadata": {}, "outputs": [], "source": [ "from simphony_osp.tools import import_file, export_file" ] }, { "cell_type": "markdown", "id": "qualified-works", "metadata": {}, "source": [ "
\n", "
Tip
\n", " \n", "The full API specifications of the import and export functions can be found on the\n", "[API reference page](../../api_reference.md).\n", " \n", "
\n" ] }, { "cell_type": "markdown", "id": "driving-injury", "metadata": {}, "source": [ "In the examples on this page, the [city ontology](../ontologies/ontologies_included.md#the-city-ontology) is used. Make sure the city ontology is installed. If not, run the following command:" ] }, { "cell_type": "code", "execution_count": 2, "id": "dying-accreditation", "metadata": {}, "outputs": [], "source": [ "!pico install city" ] }, { "cell_type": "markdown", "id": "lonely-listening", "metadata": {}, "source": [ "Then create a few ontology individuals" ] }, { "cell_type": "code", "execution_count": 3, "id": "considered-leonard", "metadata": {}, "outputs": [], "source": [ "from simphony_osp.namespaces import city\n", "\n", "freiburg = city.City(name=\"Freiburg\", coordinates=[47.997791, 7.842609])\n", "peter = city.Citizen(name=\"Peter\", age=30)\n", "anne = city.Citizen(name=\"Anne\", age=20)\n", "freiburg[city.hasInhabitant] += peter, anne" ] }, { "cell_type": "markdown", "id": "e15dc482-63aa-4eae-9682-2f32f363bb4c", "metadata": {}, "source": [ "## Exporting individuals" ] }, { "cell_type": "markdown", "id": "worth-province", "metadata": {}, "source": [ "The `export_file` function allows to export either all the contents of a session, or select a few ontology individuals to be exported." ] }, { "cell_type": "markdown", "id": "cf167dfe-aa6d-4b12-955f-edf3d64cf9b0", "metadata": {}, "source": [ "For example, exporting Freiburg and Peter" ] }, { "cell_type": "code", "execution_count": 4, "id": "monthly-anxiety", "metadata": {}, "outputs": [], "source": [ "export_file({freiburg, peter}, file='./data.ttl', format='turtle')" ] }, { "cell_type": "markdown", "id": "795aea94-ded2-43e0-8037-07412a9b93b0", "metadata": {}, "source": [ "creates the file `data.ttl` with the following content." ] }, { "cell_type": "code", "execution_count": 5, "id": "outstanding-wound", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "@prefix ns1: .\n", "@prefix xsd: .\n", "\n", " a ns1:City ;\n", " ns1:coordinates \"13YFp0RR93AD@t&xBo{#)k4YS)LtJz\"^^ ;\n", " ns1:hasInhabitant ;\n", " ns1:name \"Freiburg\"^^xsd:string .\n", "\n", " a ns1:Citizen ;\n", " ns1:age 30 ;\n", " ns1:name \"Peter\"^^xsd:string .\n", "\n" ] } ], "source": [ "from sys import platform\n", "\n", "if platform == 'win32':\n", " !more data.ttl\n", "else:\n", " !cat data.ttl" ] }, { "cell_type": "markdown", "id": "e10d2793-7763-4929-b017-e5e66087b09a", "metadata": {}, "source": [ "Note that when individuals that are connected are **exported together**, their connections are kept in the exported file." ] }, { "cell_type": "markdown", "id": "afd633af-b85c-449b-94c7-2ff99a097150", "metadata": {}, "source": [ "If instead, you wish to export all the individuals in a session, then pass the session object to be exported." ] }, { "cell_type": "code", "execution_count": 6, "id": "7eb2c80b-42ad-447b-9b46-47365a639cbc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "@prefix city: .\n", "@prefix xsd: .\n", "\n", " a city:City ;\n", " city:coordinates \"13YFp0RR93AD@t&xBo{#)k4YS)LtJz\"^^ ;\n", " city:hasInhabitant ,\n", " ;\n", " city:name \"Freiburg\"^^xsd:string .\n", "\n", " a city:Citizen ;\n", " city:age 30 ;\n", " city:name \"Peter\"^^xsd:string .\n", "\n", " a city:Citizen ;\n", " city:age 20 ;\n", " city:name \"Anne\"^^xsd:string .\n", "\n" ] } ], "source": [ "from simphony_osp.session import core_session\n", "\n", "export_file(core_session, file='./data.ttl', format='turtle')\n", "\n", "if platform == 'win32':\n", " !more data.ttl\n", "else:\n", " !cat data.ttl" ] }, { "cell_type": "markdown", "id": "offshore-cotton", "metadata": {}, "source": [ "You can change the output format by entering a different value for the parameter `format`. A list of supported formats is available on [this page](../ontologies/supported_formats.html#supported-formats)." ] }, { "cell_type": "markdown", "id": "601d4d0d-b00d-4680-9b9f-1c80899cdc03", "metadata": {}, "source": [ "## Importing individuals" ] }, { "cell_type": "markdown", "id": "derived-advancement", "metadata": {}, "source": [ "To import data, use the `import_file` method. Let's assume we wish to import the data from the previous example in a new session. The following code will help us achieve our aim:" ] }, { "cell_type": "code", "execution_count": 7, "id": "stable-session", "metadata": {}, "outputs": [], "source": [ "from simphony_osp.session import Session\n", "from simphony_osp.tools import import_file\n", "\n", "session = Session(); session.locked = True\n", "\n", "with session:\n", " import_file('./data.ttl')" ] }, { "cell_type": "markdown", "id": "virgin-river", "metadata": {}, "source": [ "
\n", "
Note
\n", " \n", "The format is automatically inferred from the file extension. To specify it explicitly, you can add the `format` parameter, like so: `import_file('./data.ttl', format='turtle')`.\n", " \n", "
" ] }, { "cell_type": "markdown", "id": "40390384-0a1c-4920-98af-3f09f03c6bec", "metadata": {}, "source": [ "
\n", "
Note
\n", " \n", "A `session` keyword argument can be optionally provided. When not specified, data is imported to the default sesion. You can specify the session explicitly like so: `import_file('./data.ttl', session=session)`.\n", " \n", "
" ] }, { "cell_type": "markdown", "id": "higher-short", "metadata": {}, "source": [ "Now we can verify the data was indeed imported:" ] }, { "cell_type": "code", "execution_count": 8, "id": "suspended-albert", "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "SimPhoNy semantic2dot\n", "\n", "\n", "\n", "https___www.simphony-osp.eu_entity#70d74233-989f-4cb2-9b67-3635801b2037\n", "\n", "70d74233...037\n", "classes: Citizen (city)\n", "age: 20\n", "name: Anne\n", "\n", "\n", "\n", "https___www.simphony-osp.eu_entity#3153c78e-a5ee-4065-913a-776c33c30c9e\n", "\n", "3153c78e...c9e\n", "classes: Citizen (city)\n", "age: 30\n", "name: Peter\n", "\n", "\n", "\n", "https___www.simphony-osp.eu_entity#e0eb6516-b2f8-4323-a407-f4a98bf46a61\n", "\n", "e0eb6516...a61\n", "classes: City (city)\n", "coordinates: [47.997791  7.842609]\n", "name: Freiburg\n", "\n", "\n", "\n", "https___www.simphony-osp.eu_entity#e0eb6516-b2f8-4323-a407-f4a98bf46a61->https___www.simphony-osp.eu_entity#70d74233-989f-4cb2-9b67-3635801b2037\n", "\n", "\n", "has inhabitant (city)\n", "\n", "\n", "\n", "https___www.simphony-osp.eu_entity#e0eb6516-b2f8-4323-a407-f4a98bf46a61->https___www.simphony-osp.eu_entity#3153c78e-a5ee-4065-913a-776c33c30c9e\n", "\n", "\n", "has inhabitant (city)\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from simphony_osp.tools import semantic2dot\n", "\n", "semantic2dot(session)" ] }, { "cell_type": "markdown", "id": "8b84a2b7-0bd9-4cf4-aa09-a1b2138ba43e", "metadata": {}, "source": [ "## Interpretation of RDF files" ] }, { "cell_type": "markdown", "id": "81211841-d564-4974-b0a7-eef4982fa3e4", "metadata": {}, "source": [ "The [ontology languages supported by SimPhoNy](../ontologies/supported_formats.md) can be serialized as RDF files, but the [RDF standard](https://www.w3.org/TR/rdf-concepts/) can store data that does not necessarily have anything to do with an ontology. Moreover, as described in the [introduction to sessions](introduction.ipynb), SimPhoNy sessions have been designed to exlusively store assertional knowledge (ontology individuals). \n", "\n", "Due to these factors, SimPhoNy enforces a few constraints when importing and exporting individuals from/to RDF files, that can, however, **also be disabled if desired**.\n", "\n", "1. No terminological knowledge can be present in the file. Any RDF triple with predicate `rdf:type` and one of `owl:Class`, `rdfs:Class`, `owl:AnnotationProperty`, `rdf:Property`, `owl:DatatypeProperty`, `owl:ObjectProperty`, `owl:Restriction` as object raises an exception.\n", "\n", "2. The subjects of any RDF triple with predicate `rdf:type` and an IRI not listed above as object are considered to be the identifiers of ontology individuals. Therefore, SymPhoNy will look into its installed ontologies for a class matching the object of the triple. If this lookup fails, an exception is raised. This behavior is meant to prevent you from making the mistake of importing data for which you have not installed the corresponding ontology.\n", "\n", "3. If an RDF triple where the subject has been succesfully identified as an ontology individual has a predicate different from `rdf:type` that cannot be recognized as an annotation, relationship nor an attribute, an exception will be raised.\n", "\n", "4. If an RDF triple where the subject has been succesfully identified as an ontology individual has a predicate that can be recognized as a relationship, but has an object that cannot be recognized as an ontology individual (for example, because no `rdf:type` has been defined for it), then an exception will be raised.\n", "\n", "5. If an RDF triple where the subject has been succesfully identified as an ontology individual has a predicate that can be recognized as a relationship, but has a literal as an object, an exception will be raised.\n", "\n", "6. If an RDF triple where the subject has been succesfully identified as an ontology individual has a predicate that can be recognized as an attribute, but has a IRI as an object, an exception will be raised.\n", "\n", "7. Any triples whose subject cannot be identified as terminological knowledge, as ontology individuals, or for which (2) does not apply are **not imported**. No warning nor exception of any kind is raised for such triples." ] }, { "cell_type": "markdown", "id": "97478c07-c469-4f64-b219-ff1d12e3b322", "metadata": {}, "source": [ "
\n", "
Warning
\n", " \n", "Point (7) implies that using the default options, you can lose data that was originally in the source, **without** warnings nor errors to notify you about it. Keep reading to learn how to prevent it. \n", " \n", "
" ] }, { "cell_type": "markdown", "id": "29fbacc8-3c05-4f4c-a7e2-f8836c1c7535", "metadata": {}, "source": [ "There are two keyword arguments that can be passed to the import and export functions to bypass these checks.\n", "\n", "- `all_triples`: When set to `True`, no exceptions will be raised for points (2)*, (3), (4), (5), (6). Warnings will still be emitted.\n", "- `all_statements`: When set to `True`, none of the points above apply. All RDF triples are imported, and **no information is lost**. No warnings are emitted." ] }, { "cell_type": "markdown", "id": "e3208b1d-bc73-4a8c-a541-dd30f0a970f8", "metadata": {}, "source": [ "
\n", "
Note
\n", " \n", "\\* To be recognized as an ontology individual, at least one of the types of the subject need to be defined in the [installed ontologies](../ontologies/pico.md). If you use `all_triples=True` when none of the types are defined in the installed ontologies, then (7) applies to such subject (its information is lost).\n", " \n", "
" ] }, { "cell_type": "markdown", "id": "b15e4a71-b146-45da-9b55-e7306725dba0", "metadata": {}, "source": [ "
\n", "
Warning
\n", " \n", "Be careful when using `all_triples` and especially, when using `all_statements`. SimPhoNy's sessions have been designed to work only with ontology individuals. If you use `all_statements=True`, then also classes, relationships, annotations and attributes will be imported to the session, but as SimPhoNy is not ready to deal with this situation, this may lead to errors.\n", " \n", "
\n" ] }, { "cell_type": "markdown", "id": "00cbc181-d4b3-4c12-847e-5b3c97f88579", "metadata": {}, "source": [ "Below there is sample RDF graph that helps understanding all the cases." ] }, { "cell_type": "markdown", "id": "96b841a5-80fa-4a9b-9a07-44bde08c31c0", "metadata": {}, "source": [ "![Sample RDF graph exemplifying cases where the constraints above apply](../../_static/importexport_graph.png)" ] }, { "cell_type": "markdown", "id": "d8a3156b-c999-4097-8478-b003f6a357f6", "metadata": {}, "source": [ "1. The triple (`example:Skaters`, `rdf:type`, `owl:Class`) would trigger this case, because it defines terminological knowledge.\n", "2. The triple (`example:Marco`, `rdf:type`, `foaf:Person`) matches this case, because the namespace `foaf` is not installed.\n", "3. The triple (`example:Klaus`, `foaf:knows`, `example:Lena`) raises an exception, because the namespace `foaf` is not installed.\n", "4. The triple (`example:Freiburg`, `city:hasInhabitant`, `example:Rob`) triggers this case, because `example:Rob` does not have any type assigned and therefore, it is not identified as an ontology individual.\n", "5. The triple (`example:Freiburg`, `city:hasWorker`, `Lena (xsd:string)`) matches this case, as \"has worker\" is a relationship, but the object is a literal.\n", "6. The triple (`example:Freiburg`, `city:coordinates`, ``) raises an exception, because \"coordinates\" is an attribute, but the object is an IRI.\n", "7. The triple (`example:Something`, `example:predicate`, `example:Thing`) fits into this case." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" }, "metadata": { "interpreter": { "hash": "301cd6007de04cbbf15bca26f0bc1cb48004d089278091d760363de622bdd0c8" } } }, "nbformat": 4, "nbformat_minor": 5 }