We investigate the implementation of multi-label classification algorithms with a reject option, as a means to reduce the time required of human annotators and to attain higher classification accuracy on automatically classified samples than can be obtained without a reject option. Based on a recently proposed model of manual...
-
2011 (v1)Publication
Uploaded on: May 13, 2023
-
2013 (v1)Publication
We consider multi-label classification problems in application scenarios where classifier accuracy is not satisfactory, but manual annotation is too costly. In single-label problems, a well-known solution consists of using a reject option, i.e., allowing a classifier to withhold unreliable decisions, leaving them (and only them) to human...
Uploaded on: April 14, 2023 -
2014 (v1)Publication
Developing learning algorithms for multilabel classification problems, when the goal is to maximize the micro-averaged F measure, is a difficult problem for which no solution was known so far. In this paper we provide an exact solution for the case when the popular binary relevance approach is used for designing a multilabel classifier. We...
Uploaded on: May 13, 2023 -
2013 (v1)Publication
Many multi-label classifiers provide a real-valued score for each class. A well-known design approach consists of tuning the corresponding decision thresholds by optimising the performance measure of interest. We address two open issues related to the optimisation of the widely used F measure and precision–recall (P–R) curve, with respect to...
Uploaded on: April 14, 2023 -
2006 (v1)Publication
In recent years anti-spam filters have become necessary tools for Internet service providers to face up to the continuously growing spam phenomenon. Current server-side anti-spam filters are made up of several modules aimed at detecting different features of spam e-mails. In particular, text categorisation techniques have been investigated by...
Uploaded on: April 14, 2023 -
2011 (v1)Publication
While it is known that multiple classifier systems can also be effective in multi-label problems, only the classifier fusion approach has been considered so far. In this paper we focus on the classifier selection approach instead. We propose an implementation of this approach specific to multi-label classifiers, based on selecting the outputs...
Uploaded on: May 13, 2023 -
2012 (v1)Publication
When a multi-label classifier outputs a real-valued score for each class, a well-known design strategy consists of tuning the corresponding decision thresholds by optimising the performance measure of interest on validation data. In this paper we focus on the F-measure, which is widely used in multi-label problems. We derive two properties of...
Uploaded on: May 13, 2023 -
2011 (v1)Publication
In their arms race against developers of spam filters, spammers have recently introduced the image spam trick to make the analysis of emails' body text ineffective. It consists in embedding the spam message into an attached image, which is often randomly modified to evade signature-based detection, and obfuscated to prevent text recognition by...
Uploaded on: April 14, 2023 -
2007 (v1)Publication
We address the problem of recognizing so-called image spam, which consists in embedding the spam message into attached images to defeat techniques based on the analysis of e-mails' body text, and in using content obscuring techniques to defeat OCR tools. We propose an approach to recognize image spam based on detecting the presence of...
Uploaded on: February 7, 2024 -
2011 (v1)Publication
A rapid diffusion of stereoscopic image acquisition devices is expected in the coming years. Among the different potential applications that depth information can enable, in this paper we focus on its exploitation as a novel information source in the task of scene classification, and in particular to discriminate between indoor and outdoor...
Uploaded on: February 14, 2024 -
2008 (v1)Publication
No description
Uploaded on: May 13, 2023 -
2007 (v1)Publication
We address the problem of filtering image spam, a rapidly spreading kind of spam in which the text message is embedded into attached images to defeat spam filtering techniques based on the analysis of e-mails' body text. We propose an approach based on low-level image processing techniques to detect one of the main characteristics of most image...
Uploaded on: May 13, 2023 -
2007 (v1)Publication
In this paper we focus on so-called image spam, which consists in embedding the spam message into images attached to e-mails to circumvent statistical techniques based on the analysis of the body text of e-mails (such as "Bayesian filters"), and in applying content obscuring techniques to such images to make them unreadable by standard OCR...
Uploaded on: February 11, 2024 -
2007 (v1)Publication
No description
Uploaded on: February 7, 2024 -
2013 (v1)Publication
Clustering algorithms have been increasingly adopted in security applications to spot dangerous or illicit activities. However, they were not originally devised to deal with deliberate attack attempts that may aim to subvert the clustering process itself. Whether clustering can be safely adopted in such settings thus remains questionable....
Uploaded on: February 14, 2024 -
2014 (v1)Publication
Clustering algorithms are widely adopted in security applications as a vehicle to detect malicious activities, although little attention has been paid to preventing deliberate attacks from subverting the clustering process itself. Recent work has introduced a methodology for the security analysis of data clustering in adversarial settings, aimed...
Uploaded on: February 14, 2024