Published October 18, 2021
| Version v1
Conference paper
Biodiversity Knowledge Graphs: Time to move up a gear!
Contributors
Others:
- Web-Instrumented Man-Machine Interactions, Communities and Semantics (WIMMICS) ; Inria Sophia Antipolis - Méditerranée (CRISAM) ; Institut National de Recherche en Informatique et en Automatique (Inria) ; Scalable and Pervasive softwARe and Knowledge Systems (Laboratoire I3S - SPARKS) ; Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS) ; COMUE Université Côte d'Azur (2015-2019) (COMUE UCA) ; Centre National de la Recherche Scientifique (CNRS) ; Université Côte d'Azur (UCA)
- Scalable and Pervasive softwARe and Knowledge Systems (Laboratoire I3S - SPARKS) ; Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS) ; COMUE Université Côte d'Azur (2015-2019) (COMUE UCA) ; Centre National de la Recherche Scientifique (CNRS) ; Université Côte d'Azur (UCA)
- Muséum national d'Histoire naturelle (MNHN)
Description
Harnessing worldwide biodiversity data requires integrating myriad pieces of information, often sparse and incomplete, into a global, coherent data space. To do so, projects like the Global Biodiversity Information Facility, Catalogue of Life and Encyclopedia of Life have set up platforms that gather, consolidate, and centralize billions of records from multiple data sources. This approach lowers the entry barrier for scientists willing to consume aggregated biodiversity data but tends to build silos that hamper cross-platform interoperability.
The Web of Data embodies a different approach underpinned by the Linked Open Data (LOD) principles (Heath and Bizer 2011). These principles lead to the construction of a large, distributed, cross-domain knowledge graph (KG), wherein data description relies on vocabularies with shared, formal, machine-processable semantics. So far, however, little biodiversity data has been published this way. Early efforts focused primarily on taxonomic registers, such as NCBI, VTO and AGROVOC. More recent efforts have started paving the way for the publication of more diverse biodiversity KGs (Page 2019, Penev et al. 2019, Michel et al. 2017).
Today, we believe that it is time for more biodiversity data producers to join in and start publishing connected KGs spanning a much broader set of domains, far beyond just taxonomic registers. In this talk, we wish to present an ongoing endeavor in line with this vision. In previous work, we published TAXREF-LD (Michel et al. 2017), a LOD representation of the French taxonomic register developed and maintained by the French National Museum of Natural History. We modeled nomenclatural information as a thesaurus of scientific names and taxonomic information as an ontology of classes denoting taxa, and we also represented additional information such as ranks and vernacular names.
Recently, we have extended the scope of TAXREF-LD to represent and interlink data as varied as geographic locations, species interactions, development stages, trophic levels, as well as conservation, biogeographic, and legal statuses (regulations, protections, etc.).
We put specific effort into working out a model that accurately accounts for the semantics of the data while respecting knowledge engineering practices. For instance, a common design shortcoming is to attach all information as properties of a taxon. This is a sound choice for some properties like a scientific name or conservation status, but properties that actually pertain to the biological individuals themselves, e.g. habitat and trophic level, are better attached to class members. With the presentation of this work, we wish to advance the discussion about integration scenarios based on knowledge graphs with the different biodiversity data stakeholders.
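The distinction between taxon-level and member-level properties can be illustrated with a minimal sketch. It is not the TAXREF-LD model itself: the `http://example.org/` identifiers, the `ex:` property names, the made-up scientific name, and the use of an OWL-style restriction to describe class members are all assumptions chosen for illustration, and triples are held as plain Python tuples rather than in an RDF library.

```python
# Illustrative sketch only: all identifiers and ex: terms below are hypothetical.
EX = "http://example.org/"
TAXON = EX + "taxon/T1"   # a taxon, modeled as a class
NAME = EX + "name/N1"     # a scientific name, modeled as a thesaurus concept

# Each triple is a (subject, predicate, object) tuple.
TRIPLES = {
    # Facts about the taxon as a whole are attached to the class itself.
    (TAXON, "rdf:type", "owl:Class"),
    (TAXON, "ex:hasScientificName", NAME),
    (TAXON, "ex:conservationStatus", "ex:Vulnerable"),
    # Facts about individual organisms are stated about class members,
    # here through a restriction that every member of the class satisfies.
    (TAXON, "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "ex:habitat"),
    ("_:r1", "owl:someValuesFrom", "ex:FreshwaterHabitat"),
    # The scientific name lives in a separate thesaurus (made-up label).
    (NAME, "rdf:type", "skos:Concept"),
    (NAME, "skos:prefLabel", "Examplus fictus"),
}


def objects(subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for (s, p, o) in TRIPLES if s == subject and p == predicate)


def taxon_level(taxon):
    """Properties asserted directly on the taxon class (structural triples skipped)."""
    structural = {"rdf:type", "rdfs:subClassOf"}
    return {p: o for (s, p, o) in TRIPLES if s == taxon and p not in structural}


def member_level(taxon):
    """Properties that hold for members of the class, read from its restrictions."""
    result = {}
    for node in objects(taxon, "rdfs:subClassOf"):
        if "owl:Restriction" in objects(node, "rdf:type"):
            prop = objects(node, "owl:onProperty")[0]
            result[prop] = objects(node, "owl:someValuesFrom")[0]
    return result


if __name__ == "__main__":
    print("taxon-level:", taxon_level(TAXON))
    print("member-level:", member_level(TAXON))
```

In this sketch, the scientific name and conservation status are read straight off the taxon, whereas habitat only appears when inspecting what is asserted about the members of the class, which keeps statements about organisms from being mistaken for statements about the taxon concept.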
Abstract
International audience
Additional details
Identifiers
- URL
- https://hal.archives-ouvertes.fr/hal-03373536
- URN
- urn:oai:HAL:hal-03373536v1
Origin repository
- Origin repository
- UNICA