A face in the real world is almost always in motion, whether through
the relative motion of the observer or the movement of the person's
head and body. Perceiving a moving face therefore involves more than
perceiving a static picture of a face, yet studies of face recognition
have too often focused on picture recognition rather than the
recognition of real faces. In the course of this project, we have
developed computationally efficient methods and systems for tracking
and recognising moving faces in real time, both in near-frontal views
[11, 12, 16] and across views [7, 8]. In these studies, the following
issues were addressed:
- the ratio of the head/face region to the video frame size (which
determines the camera field of view),
- the acceptable tracking continuity and head movement variation,
- the scalability of models learned from sparse samples,
- the ability to generalise across views,
- the conditions under which sufficient moving face sequences can be
sampled for learning,
- the degree of immunity to ``distortions'' such as spectacles, facial
hair and hair-style changes.
In particular, recognising moving faces across views was the focus of
the study throughout this project. Recognising the faces of moving
people requires not only the ability to label novel face images with
known identities, but also the detection and tracking of faces over
time. We refer to this as the task of associating faces. We adopt
the view that such a task is better achieved using view-based
appearance models rather than explicit 3D models. One of the
difficulties in associating faces using view-based representations is
that face images of the same person seen from different viewpoints are
significantly more dissimilar than images of different people
appearing in the same view. The task is greatly simplified, however,
if poses are known. The ability to estimate and predict the 3D
orientation of faces, and the ways in which it changes over time, also
imposes temporal continuity on recognition. Consequently, the ability
to locate, track and predict the head pose of a moving person is an
integral part of recognition. Here we present a method for learning to
associate faces across the view-sphere based on similarity measures to
prototypes in multiple views. Although similar work has been proposed
for recognition using similarity measures, and for novel-view
generalisation and synthesis using linear combinations of prototypes,
our method extends the idea into a unified framework that addresses
both pose and identity recognition and their tracking over time. The
method uses training data from a database of 3D pose-labelled face
images across the view-sphere, captured by an automated data
acquisition system.
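
As a concrete illustration, the sketch below shows how pose and
identity might be read off jointly from similarity measures to
prototypes stored across the view-sphere. The normalised-correlation
measure and the function names are assumptions made for illustration,
not the project's actual implementation.

    import numpy as np

    def similarity(a, b):
        # Normalised cross-correlation between two aligned, equally
        # sized grey-level face images (numpy arrays).
        a = (a - a.mean()) / (a.std() + 1e-8)
        b = (b - b.mean()) / (b.std() + 1e-8)
        return float(np.mean(a * b))

    def estimate_pose_and_identity(face, prototypes):
        # prototypes: list of (identity, (yaw, tilt), image) tuples
        # sampled across the view-sphere. The best-matching prototype
        # yields both an identity label and a pose estimate at once.
        best = max(prototypes, key=lambda p: similarity(face, p[2]))
        identity, pose, _ = best
        return identity, pose
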
The problems addressed here are:
- automatic acquisition of labelled face data across the view-sphere
for learning,
- real-time recognition and tracking of face image location, scale,
3D pose and identities over time (sketched below).
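
The last of these can be illustrated as follows: per-frame similarity
scores are accumulated while a face is tracked, so that the identity
assigned to a sequence reflects temporal continuity rather than any
single frame. The exponential-average update rule below is an assumed
scheme, not necessarily the one used in the project.

    from collections import defaultdict

    class IdentityAccumulator:
        def __init__(self, decay=0.9):
            # decay controls how quickly old frames are forgotten.
            self.decay = decay
            self.evidence = defaultdict(float)

        def update(self, frame_scores):
            # frame_scores: {identity: similarity} for the current frame.
            # Blend each identity's running evidence with its new score.
            for identity in set(self.evidence) | set(frame_scores):
                self.evidence[identity] = (
                    self.decay * self.evidence[identity]
                    + (1.0 - self.decay) * frame_scores.get(identity, 0.0))

        def best(self):
            # Identity with the strongest accumulated evidence so far.
            return max(self.evidence, key=self.evidence.get) if self.evidence else None
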
The proposed method takes a view-based approach in which face
appearance models are learned from example views without recourse to
any explicit 3D model. Furthermore, the views are aligned using only
simple image-plane transformations such as translation and scaling, or
at most an affine transformation. In particular, no dense
correspondences between feature points on different faces are
required and, as a result, real-time performance is obtained. The
models are constructed from aligned image data labelled with pose
angles. Efficient focus of attention based on colour and motion cues
is used to bootstrap the face image search in the image plane.
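
As an illustration of the colour cue, the sketch below builds a
skin-colour model by histogram back-projection in normalised r-g
colour space and produces a per-pixel face probability map for seeding
the search. The bin count and function names are assumptions; the
project's actual system combined colour with motion cues.

    import numpy as np

    def rg_histogram(skin_pixels, bins=32):
        # Build a 2D histogram over normalised (r, g) from example
        # skin pixels given as an (N, 3) RGB array.
        s = skin_pixels.sum(axis=1, keepdims=True) + 1e-8
        rg = skin_pixels[:, :2] / s
        hist, _, _ = np.histogram2d(rg[:, 0], rg[:, 1], bins=bins,
                                    range=[[0, 1], [0, 1]])
        return hist / (hist.sum() + 1e-8)

    def backproject(image, hist, bins=32):
        # Map each pixel of an (H, W, 3) RGB image to its skin
        # probability; high-probability regions seed the face search.
        s = image.sum(axis=2, keepdims=True) + 1e-8
        rg = image[:, :, :2] / s
        idx = np.clip((rg * bins).astype(int), 0, bins - 1)
        return hist[idx[:, :, 0], idx[:, :, 1]]
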