A face in the real world is almost always in motion, whether through
the relative motion of the observer or the movement of the person's
head and body. Perceiving a moving face therefore involves more than
perceiving a static picture of a face, yet studies of face recognition
have too often focused on picture recognition rather than the
recognition of real faces. In the course of this project, we have
developed computationally efficient methods and systems for tracking
and recognising moving faces in real time, both in near-frontal views
[11, 12, 16] and across views [7, 8]. In these studies, the following
issues were addressed:
- the ratio of the head/face region to the video frame size (which
determines the camera field of view),
- the acceptable tracking continuity and head movement variation,
- the scalability of models learned from sparse samples,
- the ability to generalise across views,
- the conditions under which sufficient moving face sequences can be
sampled for learning,
- the degree of immunity to ``distortions'' such as spectacles, facial
hair and hair-style changes.
In particular, recognising moving faces across views was the focus of
the study throughout this project. Recognising the faces of moving
people requires not only the ability to label novel face images with
known identities, but also the detection and tracking of faces over
time. We refer to this as the task of associating faces. We adopt
the view that such a task is better achieved using view-based
appearance models rather than explicit 3D models. One of the
difficulties in associating faces using view-based representations is
that face images of the same person seen from different viewpoints are
significantly more dissimilar than images of different people
appearing in the same view. The task is greatly simplified, however,
if poses are known. The ability to estimate and predict the 3D
orientation of faces, and the ways in which it changes over time, also
imposes temporal continuity on recognition. Consequently, the ability
to locate, track and predict the head pose of a moving person is an
integral part of recognition. Here we present a method for learning to
associate faces across the view-sphere based on similarity measures to
prototypes in multiple views. Although similar work has been proposed
for recognition using similarity measures, and for novel-view
generalisation and synthesis using linear combinations of prototypes,
our method extends the idea into a unified framework that addresses
both pose and identity recognition and their tracking over time. The
method uses training data from a database of 3D pose-labelled face
images across the view-sphere, captured by an automated data
acquisition system.
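
As a concrete illustration, the sketch below shows how pose and
identity might be read off jointly from similarity measures to
prototypes stored across the view-sphere. The normalised-correlation
measure and the function names are assumptions made for illustration,
not the project's actual implementation.

    import numpy as np

    def similarity(a, b):
        # Normalised cross-correlation between two aligned, equally
        # sized grey-level face images (numpy arrays).
        a = (a - a.mean()) / (a.std() + 1e-8)
        b = (b - b.mean()) / (b.std() + 1e-8)
        return float(np.mean(a * b))

    def estimate_pose_and_identity(face, prototypes):
        # prototypes: list of (identity, (yaw, tilt), image) tuples
        # sampled across the view-sphere. The best-matching prototype
        # yields both an identity label and a pose estimate at once.
        best = max(prototypes, key=lambda p: similarity(face, p[2]))
        identity, pose, _ = best
        return identity, pose
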
The problems addressed here are:
- automatic acquisition of labelled face data across the view-sphere
for learning,
- real-time recognition and tracking of face image location, scale,
3D pose and identities over time (sketched below).
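
The last of these can be illustrated as follows: per-frame similarity
scores are accumulated while a face is tracked, so that the identity
assigned to a sequence reflects temporal continuity rather than any
single frame. The exponential-average update rule below is an assumed
scheme, not necessarily the one used in the project.

    from collections import defaultdict

    class IdentityAccumulator:
        def __init__(self, decay=0.9):
            # decay controls how quickly old frames are forgotten.
            self.decay = decay
            self.evidence = defaultdict(float)

        def update(self, frame_scores):
            # frame_scores: {identity: similarity} for the current frame.
            # Blend each identity's running evidence with its new score.
            for identity in set(self.evidence) | set(frame_scores):
                self.evidence[identity] = (
                    self.decay * self.evidence[identity]
                    + (1.0 - self.decay) * frame_scores.get(identity, 0.0))

        def best(self):
            # Identity with the strongest accumulated evidence so far.
            return max(self.evidence, key=self.evidence.get) if self.evidence else None
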
The proposed method takes a view-based approach in which face
appearance models are learned from example views without recourse to
any explicit 3D model. Furthermore, the views are aligned using only
simple image-plane transformations such as translation and scaling, or
at most an affine transformation. In particular, no dense
correspondences between feature points on different faces are
required and, as a result, real-time performance is obtained. The
models are constructed from aligned image data labelled with pose
angles. Efficient focus of attention based on colour and motion cues
is used to bootstrap the face image search in the image plane.
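
As an illustration of the colour cue, the sketch below builds a
skin-colour model by histogram back-projection in normalised r-g
colour space and produces a per-pixel face probability map for seeding
the search. The bin count and function names are assumptions; the
project's actual system combined colour with motion cues.

    import numpy as np

    def rg_histogram(skin_pixels, bins=32):
        # Build a 2D histogram over normalised (r, g) from example
        # skin pixels given as an (N, 3) RGB array.
        s = skin_pixels.sum(axis=1, keepdims=True) + 1e-8
        rg = skin_pixels[:, :2] / s
        hist, _, _ = np.histogram2d(rg[:, 0], rg[:, 1], bins=bins,
                                    range=[[0, 1], [0, 1]])
        return hist / (hist.sum() + 1e-8)

    def backproject(image, hist, bins=32):
        # Map each pixel of an (H, W, 3) RGB image to its skin
        # probability; high-probability regions seed the face search.
        s = image.sum(axis=2, keepdims=True) + 1e-8
        rg = image[:, :, :2] / s
        idx = np.clip((rg * bins).astype(int), 0, bins - 1)
        return hist[idx[:, :, 0], idx[:, :, 1]]
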