In this chapter, we address so-called “error generic” data poisoning (DP) attacks (hereafter simply DP attacks) on classifiers. Unlike backdoor attacks, DP attacks aim to degrade overall classification accuracy. (Previous chapters were concerned with “error specific” DP attacks involving specific backdoor patterns and source and target classes for classification applications.) To effectively mislead classifier training using relatively few poisoned samples, an attacker introduces “feature collision” into the training samples by, for example, flipping the class labels of clean samples. Another possibility is to poison with synthetic data not typical of any class. The information extracted from the clean and poisoned samples labeled to the same class (as well as from clean samples that originate from the same class as the (mislabeled) poisoned samples) is largely inconsistent, which prevents the learning of an accurate class decision boundary. We develop a BIC-based framework for both detection and cleansing of such data poisoning. This method is compared with existing DP defenses on both image and document classification domains.
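To make the label-flipping idea concrete, the following minimal sketch (using scikit-learn on synthetic data; it is not the chapter's BIC-based defense, and the dataset, victim model, and 20% flip rate are arbitrary choices for illustration) simulates an “error generic” poisoning attack and measures the resulting drop in test accuracy.

    # Minimal sketch of an "error generic" label-flipping data poisoning attack.
    # Assumptions: synthetic 2-class data, a logistic regression victim model,
    # and a 20% flip rate -- all chosen only for illustration.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

    # Clean baseline.
    clean_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)

    # Poison: flip the labels of a random 20% of the training samples,
    # creating "feature collisions" between clean and mislabeled points.
    rng = np.random.default_rng(0)
    flip = rng.choice(len(y_tr), size=int(0.2 * len(y_tr)), replace=False)
    y_poisoned = y_tr.copy()
    y_poisoned[flip] = 1 - y_poisoned[flip]

    poisoned_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_poisoned).score(X_te, y_te)
    print(f"clean test accuracy:    {clean_acc:.3f}")
    print(f"poisoned test accuracy: {poisoned_acc:.3f}")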
In this chapter, we introduce the design of statistical anomaly detectors. We discuss the types of data encountered in practice – continuous, discrete categorical, and discrete ordinal features. We then discuss how to model such data, in particular to form a null model for statistical anomaly detection, with emphasis on mixture densities. The EM algorithm is developed for estimating the parameters of a mixture density; K-means is a specialization of EM for Gaussian mixtures. The Bayesian information criterion (BIC), widely used for estimating the number of components in a mixture density, is discussed and developed. We also discuss parsimonious mixtures, which economize on the number of model parameters in a mixture density by sharing parameters across components. These models allow BIC to obtain accurate model-order estimates even when the feature dimensionality is huge and the number of data samples is small (a case where BIC applied to traditional mixtures grossly underestimates the model order). Key performance measures are discussed, including the true positive rate, the false positive rate, and the receiver operating characteristic (ROC) curve with its associated area under the curve (ROC AUC). The density models are used in attack detection defenses in Chapters 4 and 13. The detection performance measures are used throughout the book.
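The BIC-based model-order selection can be illustrated with a short sketch. The example below uses scikit-learn's GaussianMixture (which fits mixture parameters by EM); the synthetic data and candidate orders are arbitrary assumptions for illustration, and the sketch uses traditional (not parsimonious) mixtures.

    # Minimal sketch: EM fitting of Gaussian mixtures and BIC model-order selection.
    # Assumptions: synthetic 3-component data in 2D; candidate orders 1..6 -- illustrative only.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # Draw samples from a true 3-component Gaussian mixture.
    X = np.vstack([
        rng.normal(loc=[0, 0], scale=0.5, size=(300, 2)),
        rng.normal(loc=[4, 0], scale=0.5, size=(300, 2)),
        rng.normal(loc=[0, 4], scale=0.5, size=(300, 2)),
    ])

    # Fit mixtures of increasing order by EM and score each with BIC (lower is better).
    bic_scores = {}
    for k in range(1, 7):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
        bic_scores[k] = gmm.bic(X)

    best_k = min(bic_scores, key=bic_scores.get)
    print("BIC by model order:", {k: round(v, 1) for k, v in bic_scores.items()})
    print("selected number of components:", best_k)

    # The selected mixture can then serve as a null model: samples with low
    # log-likelihood under it are flagged as anomalies.
    null_model = GaussianMixture(n_components=best_k, random_state=0).fit(X)
    scores = null_model.score_samples(X)  # per-sample log-likelihoods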
In this chapter we consider attacks that do not alter the machine learning model, but instead “fool” the classifier (and any supplementary defense, including human monitoring) into making erroneous decisions. These are known as test-time evasion attacks (TTEs). In addition to representing a threat, TTEs reveal the non-robustness of existing deep learning systems: one can alter the class decision made by a DNN through small changes to the input, changes that would not alter the (robust) decision-making of a human being performing, for example, visual pattern recognition. Thus, TTEs are a foil to claims that deep learning currently achieves truly robust pattern recognition, let alone that it is close to achieving true artificial intelligence. At the same time, TTEs are a spur to the machine learning community to devise more robust pattern recognition systems. We survey various TTE attacks, including FGSM, JSMA, and CW. We then survey several types of defenses, including anomaly detection as well as robust classifier training strategies. Experiments are included for anomaly detection defenses based on classical statistical anomaly detection, as well as on a class-conditional generative adversarial network that effectively learns to discriminate “normal” from adversarial samples without any supervision (i.e., without supervising attack examples).
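As a concrete example of one of the surveyed attacks, here is a minimal FGSM sketch in PyTorch. The stand-in classifier, random inputs, and epsilon value are placeholders assumed for illustration; the chapter's experiments use trained image classifiers.

    # Minimal sketch of the Fast Gradient Sign Method (FGSM).
    # Assumptions: a small linear classifier on random 28x28 "images" and
    # epsilon = 0.1 -- placeholders chosen only for illustration.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
    loss_fn = nn.CrossEntropyLoss()

    def fgsm_attack(model, x, y, epsilon=0.1):
        """Return a perturbed copy of x that increases the classification loss."""
        x_adv = x.clone().detach().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        loss.backward()
        # One signed-gradient step, then clamp back to the valid input range.
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()

    # Example usage on a random batch (a real attack would use actual test images).
    x = torch.rand(8, 1, 28, 28)
    y = torch.randint(0, 10, (8,))
    x_adv = fgsm_attack(model, x, y)
    print("max perturbation:", (x_adv - x).abs().max().item())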