Poisoning complete-linkage hierarchical clustering

Creators: Biggio, Battista; Rota Bulò, S.; Pillai, I.; Mura, M.; Mequanint, E. Z.; Pelillo, M.; Roli, Fabio

Others: Biggio, Battista; Rota Bulò, S.; Pillai, I.; Mura, M.; Mequanint, E. Z.; Pelillo, M.; Roli, Fabio

Description

Clustering algorithms are widely adopted in security applications as a means of detecting malicious activities, although little attention has been paid to preventing deliberate attacks from subverting the clustering process itself. Recent work has introduced a methodology for the security analysis of data clustering in adversarial settings, aimed at identifying potential attacks against clustering algorithms and evaluating their impact. The authors have shown that single-linkage hierarchical clustering can be severely affected by a very small fraction of carefully crafted poisoning samples injected into the input data, highlighting that the clustering algorithm may itself be the weakest link in a security system. In this paper, we extend this analysis to the case of complete-linkage hierarchical clustering by devising an ad hoc poisoning attack. We verify its effectiveness on artificial data and on application examples related to the clustering of malware and handwritten digits.
