Published January 13, 2019 | Version v1
Conference paper

OligoArchive: Using DNA in the DBMS storage hierarchy

Others:
Eurecom [Sophia Antipolis]
Institut de pharmacologie moléculaire et cellulaire (IPMC) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS) ; COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)
Laboratoire d'Informatique, Signaux, et Systèmes de Sophia-Antipolis (I3S) / Projet MEDIACODING ; Signal, Images et Systèmes (Laboratoire I3S - SIS) ; Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS) ; COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)-Université Nice Sophia Antipolis (1965 - 2019) (UNS) ; COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)-Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS) ; COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)-Université Nice Sophia Antipolis (1965 - 2019) (UNS) ; COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)
Imperial College London

Description

The demand for data-driven decision making coupled with need to retain data to meet regulatory compliance requirements has resulted in a rapid increase in the amount of archival data stored by enterprises. As data generation rate far outpaces the rate of improvement in storage density of media like HDD and tape, researchers have started investigating new architectures and media types that can store such "cold", infrequently accessed data at very low cost. Synthetic DNA is one such storage media that has received some attention recently due to its high density and durability. In this paper, we investigate the problem of integrating DNA in the database storage hierarchy. More specifically, we ask the following two questions: (i) how can database knowledge help optimize DNA encoding and decoding? and (ii) how can biochemical mechanisms used for DNA manipulation be used to perform in-vitro, near-data SQL query processing? In answering these questions, we present OligoArchive, an architecture for using DNA-based storage system as the archival tier of a relational database. We demonstrate that OligoArchive can be realized in practice by building archiving and recovery tools (pg_oligo_dump and pg_oligo_restore) for PostgreSQL that perform schema-aware encoding and decoding of relational data on DNA, and using these tools to archive a 12KB TPC-H database to DNA, perform in-vitro computation, and restore it back again.

Abstract

International audience

Additional details

Created:
December 3, 2022
Modified:
December 1, 2023