Biometric systems are increasingly being used in applications that require positive identification of individuals for access authorization. Traditional biometric systems rely on a single-biometric, single-sensor paradigm for authentication or identification. For high-security and real-world requirements, this paradigm is inadequate for reliably providing a high level of accuracy. Multibiometrics, the technique of using multiple biometric modalities and sensors, promises to rise to the challenge of making biometric authentication truly robust and reliable. Moreover, using multiple biometrics extends coverage to the section of the population that is unable to provide any single biometric. Multiple biometrics are also naturally more robust against spoof attacks, since hackers must contend with more than one biometric. Further, fusing multibiometrics enables the indexing of large databases for the identification of individuals.
Compared with other books on the same topic, the key features of this book are the following:
It covers the entire gamut of multibiometrics topics, including multiple modalities, multiple sensors, multiple levels of fusion, multiple algorithms, and multiple data acquisition instances.
It includes chapters on the latest sensing devices for novel multibiometrics modalities, security assessment of multibiometrics systems and their dynamic management, and theoretically sound and novel approaches for fusion.
It provides information on publicly available multibiometrics databases and addresses research issues related to performance modeling, prediction, and validation of multibiometrics systems.
Detecting whether the video of a speaking person in frontal head pose corresponds to the accompanying audio track is of interest in numerous multimodal biometrics-related applications. In many practical situations, the audio and visual modalities may not be in sync; for example, we may observe static faces in images, the camera may be focusing on a nonspeaker, or a subject may be speaking in a foreign language with the audio translated into another language. Spoofing attacks on audiovisual biometric systems also often involve audio and visual data streams that are not in sync. Audiovisual (AV) synchrony indicates consistency between the audio and visual streams and thus the likelihood that the segments belong to the same individual. Such segments could then serve as building blocks for generating bimodal fingerprints of the different individuals present in the AV data, which can be important for security, authentication, and biometric purposes. AV segmentation can also be important for speaker turn detection, as well as for automatic indexing and retrieval of different occurrences of a speaker.
The problem of AV synchrony detection has already been considered in the literature. We refer to Bredin and Chollet (2007) for a comprehensive review on this topic, where the authors present a detailed discussion on different aspects of AV synchrony detection, including feature processing, dimensionality reduction, and correspondence detection measures. In that paper, AV synchrony detection is applied to the problem of identity verification, but the authors also mention additional applications in sound source localization, AV sequence indexing, film postproduction, and speech separation.
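As a concrete (and deliberately simplified) illustration of a correspondence detection measure, the sketch below computes the Pearson correlation between a per-frame audio-energy track and a mouth-motion track. Both feature streams and the function name `sync_score` are hypothetical stand-ins for the richer features and measures surveyed by Bredin and Chollet (2007).

```python
import numpy as np

def sync_score(audio_energy, mouth_motion):
    """Pearson correlation between per-frame audio energy and
    mouth-region motion magnitude; values near 1 suggest synchrony."""
    a = np.asarray(audio_energy, dtype=float)
    v = np.asarray(mouth_motion, dtype=float)
    a = (a - a.mean()) / a.std()
    v = (v - v.mean()) / v.std()
    return float(np.mean(a * v))

# Synthetic example: a mouth-motion track linearly related to the
# audio energy scores near 1; a temporally shifted copy scores lower.
t = np.arange(100)
audio = np.sin(0.3 * t) + 1.5      # invented audio-energy track
mouth = 0.8 * audio + 0.2          # perfectly in-sync visual track
```

A high score on temporally aligned streams and a clearly lower score on a misaligned copy (e.g., `np.roll(audio, 30)`) is the basic behavior a synchrony detector would build upon.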
Face recognition stands as the most appealing biometric modality, since it is the natural mode of identification among humans and is totally unobtrusive. At the same time, however, it is one of the most challenging modalities (Zhao et al. 2003). Several face recognition algorithms have been developed in recent years, mostly in the visible and a few in the infrared domains. A serious problem in visible face recognition is light variability, due to the reflective nature of incident light in this band. This can clearly be seen in Figure 4.1. The visible image of the same person in Figure 4.1(a) acquired in the presence of normal light appears totally different from that in Figure 4.1(b), which was acquired in low light.
Many research efforts in thermal face recognition aimed narrowly at seeing in the dark or reducing the deleterious effect of light variability (Figure 4.1) (Socolinsky et al. 2001; Selinger and Socolinsky 2004). Methodologically, such approaches did not differ much from face recognition algorithms in the visible band, which can be classified as appearance-based (Chen et al. 2003) and feature-based (Buddharaju et al. 2004). Recently, attempts have been made to fuse the visible and infrared modalities to increase face recognition performance (Socolinsky and Selinger 2004a; Wang et al. 2004; Chen et al. 2005; Kong et al. 2005).
The human body has fascinated scientists for thousands of years. Studying the shape of the human body offers opportunities to open up entirely new areas of research. The shape of the human body can be used to infer personal characteristics and features. Body type and muscle strength, for instance, can be used to distinguish gender. The presence or absence of wrinkles around the eyes and loose facial skin suggests a person's age. In addition, the size and shape of a person's face, belly, thighs, and arms can determine a body habitus: slim, healthy, or overweight. The length of individual limbs such as the legs, and their postural sway when a person walks, suggest an underlying misalignment of the bone structure. It is, in fact, possible to identify people by the shape of the entire body. Unlike traditional physical measures of the height, width, and length of body parts, body shape is represented as a closed surface of the entire human body, a 2-manifold. The surface is digitally captured and described by geometric primitives such as vertices, lines, and curves. It is useful for health professionals and technical experts to retrieve personal data from a shared database whenever the need arises. Using a large number of body shape measurements, valuable personal characteristics can be statistically analyzed. If there is a strong correlation between body shape and a medical disease, for instance, we can avoid invasive diagnostic procedures such as medical imaging methods that use electromagnetic radiation.
The term “biometrics” refers to the analysis of unique physiological or behavioral characteristics to verify the claimed identity of an individual. Biometric identification has gradually assumed a much broader relevance as a new technological solution for more intuitive computer interfaces (Hong et al. 1999; Jain et al. 1999).
Multibiometric systems (Jain and Ross 2004) have been devised to overcome some of the limitations of unimodal biometric systems. In general terms, multiple biometric traits are combined by grouping multiple sources of information. These systems utilize more than one physiological or behavioral characteristic, or a combination of both, for enrollment and identification. For example, the problem of nonuniversality can be overcome, because multiple traits together provide sufficient population coverage. Multibiometrics also offers an efficient countermeasure to spoofing, because it would be difficult for an impostor to spoof multiple biometric traits of a genuine user simultaneously (Jain and Ross 2004). And when the sensor data are corrupted or noisy, the use of multiple biometric traits helps reduce the effects of errors and noise in the data.
Ross and Jain (2003) presented a wide overview of multimodal biometrics describing different possible levels of fusion, within several scenarios, modes of operation, integration strategies, and design issues.
Biometrics-based personal identification systems offer automated or semiautomated solutions to various aspects of security management problems. These systems ensure controlled access to protected resources and provide higher security and convenience to users. The security of the protected resources and information can be further enhanced with the use of multibiometric systems. Multibiometric systems are known to offer enhanced security and antispoofing capabilities while achieving higher performance. These systems can utilize multiple biometric modalities, multiple biometric samples, multiple classifiers, multiple features, and/or normalization schemes to achieve performance improvement (refer to chapter x for more details). However, the higher security and reliability offered by multibiometric systems often come with additional computational requirements and user inconvenience, which can include privacy and hygienic concerns. Therefore the deployment of multibiometric systems for civilian and commercial applications is often a judicious compromise between these conflicting requirements. The management of multibiometric systems to adaptively ensure varying levels of security requirements, user convenience, and constraints has attracted very little attention in the literature; very little work has been done on the theory, architecture, implementation, or performance estimation of multibiometric systems that dynamically ensure varying levels of security requirements.
Why Dynamic Security Management?
The expected security requirements of multibiometric systems are typically expressed in terms of the error rates and reliability of the employed system.
These error rates are the false acceptance rate (FAR), the rate at which impostors are accepted as genuine users, and the false rejection rate (FRR), the rate at which genuine users are rejected by the system as impostors.
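The two rates can be computed directly from sets of match scores once a decision threshold is fixed. The sketch below is a minimal illustration; the score arrays and threshold are invented for the example, and real systems sweep the threshold to trade FAR against FRR.

```python
import numpy as np

def far_frr(genuine_scores, impostor_scores, threshold):
    """FAR: fraction of impostor scores accepted (score >= threshold).
       FRR: fraction of genuine scores rejected (score < threshold)."""
    far = float(np.mean(np.asarray(impostor_scores) >= threshold))
    frr = float(np.mean(np.asarray(genuine_scores) < threshold))
    return far, frr

# Toy data: three genuine and three impostor similarity scores.
far, frr = far_frr([0.9, 0.8, 0.4], [0.3, 0.6, 0.1], threshold=0.5)
```

Lowering the threshold reduces the FRR but raises the FAR, and vice versa; the operating point is chosen to match the application's security requirements.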
Biometric systems deployed in current real-world applications are primarily unimodal – they depend on the evidence of a single biometric marker for personal identity authentication (e.g., ear or face). Unimodal biometrics are limited, because no single biometric is generally considered both sufficiently accurate and robust to hindrances caused by external factors (Ross and Jain 2004).
Some of the problems that these systems regularly contend with are the following: (1) noise in the acquired data due to alterations in the biometric marker (e.g., a surgically modified ear) or improperly maintained sensors; (2) intraclass variations that may occur when a user interacts with the sensor (e.g., varying head pose) or through physiological transformations that take place with aging; (3) interclass similarities, arising when a biometric database comprises a large number of users whose feature spaces overlap, requiring increased complexity to discriminate between them; (4) nonuniversality – the biometric system may not be able to acquire meaningful biometric data from a subset of users (for instance, in face biometrics, a face image may be blurred because of abrupt head movement or partially occluded because of off-axis pose); and (5) susceptibility of certain biometric markers to spoof attacks – situations in which a user successfully masquerades as another by falsifying biometric data.
PART I
MULTIMODAL AND MULTISENSOR BIOMETRIC SYSTEMS
By Ying Hao, National Laboratory of Pattern Recognition; Zhenan Sun, National Laboratory of Pattern Recognition; Tieniu Tan, National Laboratory of Pattern Recognition
Edited by Bir Bhanu, University of California, Riverside; Venu Govindaraju, State University of New York, Buffalo
In everyday life, human beings use their hands to perceive and reconstruct their surrounding environment. The hand's prevalence in the field of biometrics is therefore not surprising. Along with the maturing of fingerprint and hand geometry recognition, palmprint and palm/palm-dorsa vein recognition have become new members of the hand-based biometric family. Although increasingly high recognition rates are reported in the literature, the acquisition of a hand image usually relies on contact devices with pegs, which raises hygiene concerns and reluctance to use the devices (Kong 2009; Morales 2008; Michael 2008). Recently a growing trend toward relieving users of the contact device has emerged, and the idea of peg-free, or even contact-free, palm biometrics has been proposed. However, the accuracy of hand-based biometric systems degrades with the removal of the pegs and contact plane (Han 2007a, b; Doublet 2007; Michael 2008). The underlying reason is that the hand is essentially a three-dimensional (3D) object with a large number of degrees of freedom. Naturally stretched-out hands of different subjects may therefore appear substantially different on the image plane. Scale changes, in-depth rotation, and nonlinear skin deformation originating from pose changes are the most commonly encountered image variations in a touch-free environment (Morales 2008; Michael 2008).
To increase the accuracy of individual human recognition, sensor fusion techniques (Bhanu and Govindaraju 2009; Kuncheva 2006; Ross et al. 2004) for multimodal biometric systems are widely used today. For example, one can combine a face recognition system and a fingerprint recognition system to achieve better human recognition. By fusing different biometrics, the additional information available can yield improved recognition/identification, a reduction in false alarms, and a broader range of populations for which the fused system functions satisfactorily.
There are four possible levels of fusion in a multimodal biometric system (Waltz and Llinas 1990; Hall and Llinas 2001): data level, feature level, score level, and decision level. Data-level fusion is the combination of unprocessed data to produce new data expected to be more informative and synthetic than the single-biometric data (Borghys et al. 1998). This kind of fusion requires a pixel-level registration of the raw images. When the sensors are alike, we can consider all data at the data level, but fusion is more complicated when several different sensors are used, because of problems associated with resolution, registration, etc. (Han and Bhanu 2007; Nadimi and Bhanu 2004). Feature-level fusion is believed to be promising because feature sets provide more information about the input biometrics than the other levels (Ross and Jain 2004). However, different feature sets are sometimes in conflict or unavailable, which makes feature-level fusion more challenging than the other levels of fusion (Ross and Govindarajan 2005; Zhou and Bhanu 2008).
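Score-level fusion, by contrast, only requires match scores from each matcher, which makes it straightforward to implement. The sketch below is a minimal, generic illustration of weighted-sum score-level fusion with min-max normalization; the score arrays and the weight are invented for the example and are not tied to any specific system in this book.

```python
import numpy as np

def min_max_norm(scores):
    """Map raw matcher scores into [0, 1] so heterogeneous matchers
    (e.g., face and fingerprint) become comparable."""
    s = np.asarray(scores, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

def fuse_scores(scores_a, scores_b, w_a=0.5):
    """Weighted-sum score-level fusion of two normalized score sets."""
    return w_a * min_max_norm(scores_a) + (1.0 - w_a) * min_max_norm(scores_b)

# Toy example: raw scores from two matchers on the same three probes.
fused = fuse_scores([0, 5, 10], [10, 20, 30], w_a=0.5)
```

The weight `w_a` would typically be tuned on a validation set to reflect the relative reliability of the two matchers.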
With the wider deployment of biometric authentication systems and the increased number of enrolled persons in such systems, the problem of correctly predicting performance has become important. The number of available testing samples is usually smaller than the number of enrolled persons that the biometric system is expected to handle. Accurate performance prediction allows system integrators to optimally select the biometric matchers for the system, as well as to properly set the decision thresholds.
Research in predicting the performance in large-scale biometric systems is still limited and mostly theoretical. Wayman (1999) introduced multiple operating scenarios for biometric systems and derived the equations for predicted performance assuming that the densities of genuine and impostor scores are known. Jarosz et al. (2005) presented an overview of possible performance estimation methods including extrapolation of large-scale performance given the performance on smaller-scale databases, binomial approximation of performance, and the application of extreme value theory. Bolle et al. (2005) derived the performance of identification systems (CMC curve) assuming that the performance of the corresponding biometric verification system (ROC curve) is known. The major assumption used in all these works is that the biometric match scores are independent and identically distributed, that is, genuine scores are randomly drawn from a genuine score distribution, and impostor scores are randomly and independently drawn from an impostor score distribution. As we will show in this chapter, this assumption does not generally hold, and using it leads to the underestimation of identification performance.
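Under the i.i.d. assumption, the rank-1 identification rate for a gallery of N subjects can be predicted from verification-style scores as the expectation, over genuine scores s, of P(impostor < s) raised to the power N-1. The sketch below uses an empirical impostor CDF; the function name and data are illustrative, and the chapter's point is precisely that such predictions tend to underestimate real identification performance because the independence assumption fails.

```python
import numpy as np

def predicted_rank1(genuine_scores, impostor_scores, gallery_size):
    """Rank-1 identification rate predicted under the i.i.d. assumption:
    P(correct) = E_genuine[ P(impostor < s) ** (gallery_size - 1) ]."""
    imp = np.sort(np.asarray(impostor_scores, dtype=float))
    # Empirical impostor CDF evaluated at each genuine score.
    cdf = np.searchsorted(imp, genuine_scores, side='left') / len(imp)
    return float(np.mean(cdf ** (gallery_size - 1)))

# Toy example: well-separated scores predict perfect rank-1 performance.
r1 = predicted_rank1([10.0, 11.0], [1.0, 2.0, 3.0], gallery_size=100)
```

With overlapping score distributions the predicted rate drops as the gallery grows, which is the qualitative behavior the binomial-style approximations in the cited works capture.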
Face recognition is one of the most widely researched topics in computer vision because of a wide variety of applications that require identity management. Most existing face recognition studies are focused on two-dimensional (2D) images with nearly frontal-view faces and constrained illumination. However, 2D facial images are strongly affected by varying illumination conditions and changes in pose. Thus, although existing methods are able to provide satisfactory performance under constrained conditions, they are challenged by unconstrained pose and illumination conditions.
FRVT 2006 explored the feasibility of using three-dimensional (3D) data for both enrollment and authentication (Phillips et al. 2007). The algorithms using 3D data have demonstrated their ability to provide good recognition rates. For practical purposes, however, it is unlikely that large-scale deployments of 3D systems will take place in the near future because of the high cost of the hardware. Nevertheless, it is not unreasonable to assume that an institution may want to invest in a limited number of 3D scanners, if having 3D data for enrollment can yield higher accuracy for 2D face authentication/identification.
In this respect we have developed a face recognition method that makes use of 3D face data for enrollment while requiring only 2D data for authentication. During enrollment, unlike existing methods (e.g., Blanz and Vetter 2003) that use a 2D image to infer a 3D model in the gallery, we use 2D+3D data (2D texture plus 3D shape) to build subject-specific annotated 3D models.
With the ever-increasing demand for security and identification systems, the adoption of biometric systems is becoming widespread. There are many reasons for developing multibiometric systems; for example, a subject may conceal or lack the biometric a system is based on. This can be a significant problem with noncontact biometrics in some applications (e.g., surveillance). Many noncontact biometric modalities exist. Of these, face recognition has been the most widely studied, so both its benefits and drawbacks are well understood. Others include gait, ear, and soft biometrics. Automatic gait recognition is attractive because it enables identification of a subject from a distance, meaning that it will find applications in a variety of environments (Nixon et al. 2005). The advantage of the ear biometric is that the problems associated with aging appear to be slight, though enrollment can be impeded by hair (Hurley et al. 2008). There are also new approaches using semantic descriptions to enhance biometric capability, sometimes known as soft biometrics (Samangooei et al. 2008). The semantic data can be used alone or in tandem with other biometrics, and are particularly suited to the analysis of surveillance data.
The deployment of multibiometric systems is largely still at a research phase (Ross et al. 2006). Of the biometrics discussed here, some approaches fuse face with gait (Shakhnarovich and Darrell 2002; Liu and Sarkar 2007; Zhou and Bhanu 2007; Yamauchi et al. 2009) and some fuse ear with face (Chang et al. 2003). No approach has yet fused gait and ear data.