Coverage-based rewriting for data preparation

Accinelli C.; Minisi S.; Catania B.

Published 2020 | Version v1

Publication Metadata-only

The development of technological solutions that satisfy non-discrimination requirements is currently one of the main challenges for data processing. Concepts like fairness, i.e., lack of bias, and diversity, i.e., the degree to which different kinds of objects are represented in a dataset, have recently been taken into account in designing non-discriminating set selection, ranking, and OLAP approaches. Information extraction is, however, also at the basis of back-end data processing: data are prepared, e.g., extracted and transformed, usually by means of SQL queries, before being loaded into a data warehouse for further front-end processing. An unfair data preparation process might have a relevant impact on front-end analysis. As an example, an underrepresented category in the warehouse might lead to an underrepresentation of that category in most of the subsequent processes. The guarantee that each category of interest is sufficiently represented in a dataset is known as coverage. In this paper, we start from this consideration and propose an approach for automatically rewriting back-end queries whose results do not satisfy given coverage constraints into the "closest" queries that do satisfy them. Through rewriting, coverage-based modifications of data preparation steps are traced for further processing. We also present some preliminary experimental results and identify some directions for future work.
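
The idea of rewriting a query into the "closest" one that satisfies a coverage constraint can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the table `applicants`, its columns, the helper names, and the notion of closeness (the smallest decrease of a single numeric selection threshold) are all assumptions made for illustration only.

```python
import sqlite3


def build_demo():
    """Create a small in-memory table (hypothetical data, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE applicants (name TEXT, gender TEXT, income INTEGER)")
    conn.executemany(
        "INSERT INTO applicants VALUES (?, ?, ?)",
        [("a", "M", 90), ("b", "M", 80), ("c", "F", 85),
         ("d", "F", 60), ("e", "F", 55), ("f", "M", 50)],
    )
    return conn


def count_category(conn, threshold, category):
    """Count rows of `category` returned by: income >= threshold."""
    row = conn.execute(
        "SELECT COUNT(*) FROM applicants WHERE income >= ? AND gender = ?",
        (threshold, category),
    ).fetchone()
    return row[0]


def rewrite_threshold(conn, threshold, category, k):
    """Relax `income >= threshold` as little as possible so that at least
    k rows of `category` are selected. Returns the new threshold, or None
    if the table holds fewer than k rows of that category.

    Assumption: 'closest' means the highest threshold not above the
    original one that still satisfies the constraint.
    """
    if k <= 0:
        return threshold  # nothing to guarantee, keep the query unchanged
    rows = conn.execute(
        "SELECT income FROM applicants WHERE gender = ? "
        "ORDER BY income DESC LIMIT ?",
        (category, k),
    ).fetchall()
    if len(rows) < k:
        return None  # constraint cannot be met by relaxing the threshold
    kth_income = rows[-1][0]  # k-th highest income within the category
    return min(threshold, kth_income)
```

In this sketch the original query `income >= 70` selects only one row with `gender = 'F'`; asking for at least two such rows relaxes the threshold to the income of the second-ranked row in that category, while a query that already satisfies the constraint is left unchanged. Recording the pair (original threshold, rewritten threshold) would be one simple way to trace the modification for later processing.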

Additional details

URL: http://hdl.handle.net/11567/1019006
URN: urn:oai:iris.unige.it:11567/1019006

Origin repository: UNIGE
