Video-Based Face Recognition Using Identity Surfaces
Yongmin Li, Shaogang Gong and Heather Liddell
 Identity Surfaces
 Video-Based Face Recognition
 Constructing Identity Surfaces of Faces
 Pattern Distances and Trajectory Distances to Identity Surfaces
 Relevant Publications
Assuming that only the appearance variation caused by rotation in
depth is of concern, i.e. that the variation from expression,
illumination and facial make-up is excluded, each face class can be
represented by a unique hypersurface indexed by pose. For each pose
(tilt and yaw angles), there is one unique ``point'' on this surface
for a face class. We call this surface an identity surface. Face
recognition can then be performed by computing and comparing the
distances between a given pattern and a set of identity surfaces.
Figure 1:
Identity surfaces for face recognition

Psychological and physiological research suggests that modelling and
recognising moving faces dynamically can achieve superior
performance compared to recognition from static images. As shown in Figure 1, when a face is
detected and tracked in an input video sequence, one obtains the
object trajectory of the face in the feature space. Its
projection onto each of the identity surfaces, with the same
poses and temporal order, forms a model trajectory of the
corresponding face class. Face recognition can then be carried out by
matching the object trajectory against a set of model
trajectories. Compared to face recognition on static images, this
approach can be more robust and accurate. For example, it is difficult
to decide from a single pattern whether the pattern X in Figure 1 belongs to subject A
or subject B; however, if we know that X is
tracked along the object trajectory, it is much clearer that X is more
likely to be subject A than B.
If sufficient patterns of a face class in different views are
available, the identity surface of this face class can be
constructed precisely. However, we do not presume such a strict
condition. In this work, we develop a method to synthesise the
identity surface of a face class from a small sample of face
patterns which sparsely cover the view sphere. The basic idea is to
approximate the identity surface using a set of Np
planes separated by Nv predefined views. The
problem can then be formulated as a quadratic optimisation problem,
which can be solved using the interior point method.
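The piecewise-planar idea can be sketched as follows. This is a simplified illustration, not the paper's method: instead of the joint quadratic programme solved by an interior point method, each plane is fitted independently by least squares over its pose region, and the yaw axis alone is used to partition the views. All function and variable names are hypothetical.

```python
import numpy as np

def fit_identity_planes(poses, features, view_edges):
    """Approximate an identity surface with planes, one per yaw interval.

    poses:      (N, 2) array of (tilt, yaw) angles for the sample patterns
    features:   (N, d) array of feature vectors (e.g. KDA components)
    view_edges: sorted yaw boundaries separating the planes

    NOTE: a simplified sketch -- the paper couples the planes in one
    quadratic programme; here each plane is fitted independently.
    """
    planes = []
    bins = np.digitize(poses[:, 1], view_edges)
    for b in range(len(view_edges) + 1):
        mask = bins == b
        if mask.sum() < 3:               # need >= 3 points to define a plane
            planes.append(None)
            continue
        # Plane model: f(tilt, yaw) = [tilt, yaw, 1] @ A, with A of shape (3, d)
        X = np.hstack([poses[mask], np.ones((mask.sum(), 1))])
        A, *_ = np.linalg.lstsq(X, features[mask], rcond=None)
        planes.append(A)
    return planes

def evaluate_surface(planes, view_edges, tilt, yaw):
    """Look up the feature vector predicted by the plane covering `yaw`."""
    b = int(np.digitize([yaw], view_edges)[0])
    return np.array([tilt, yaw, 1.0]) @ planes[b]
```

Given a tracked pose, `evaluate_surface` returns the model point on the synthesised surface, which is what the pattern distances below are measured against.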
Figure 2 shows the real
identity surface of a face class constructed from all 45 views and the
identity surface synthesised from only 15 views. Note that a sparse sample of face
patterns can already provide satisfactory results.
Figure 2:
The identity surface constructed from all 45 views (first row) and
that synthesised from 15 prototype patterns (second row). Only the
first three KDA components are shown here.

The pattern distance of an unknown face pattern to one of the
identity surfaces is computed as the Euclidean distance
between the pattern and the corresponding point on the identity
surface, i.e. the point at the same pose. It is important to note that the Euclidean distance is
more appropriate when LDA (Linear Discriminant Analysis) or KDA
(Kernel Discriminant Analysis) is used, while the Mahalanobis distance is more
effective when PCA (Principal Component Analysis) or KPCA (Kernel
Principal Component Analysis) is adopted, since the discriminant
features are crucial in the former case, while the overall variation of
all patterns matters in the latter.
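Recognition from a single pattern then reduces to a nearest-surface rule, which can be sketched as below. The sketch assumes the surface point at the pattern's pose has already been looked up for each class (e.g. from a synthesised identity surface); names are illustrative.

```python
import numpy as np

def pattern_distance(x, surface_point):
    """Euclidean distance between an observed feature pattern and the
    point on an identity surface at the same (tilt, yaw) pose."""
    return float(np.linalg.norm(x - surface_point))

def classify(x, surface_points):
    """surface_points: dict mapping class id -> the identity-surface point
    for the pose of x. Returns the class with the smallest pattern distance."""
    return min(surface_points,
               key=lambda c: pattern_distance(x, surface_points[c]))
```

With a Mahalanobis metric instead (as suggested for PCA/KPCA features), only `pattern_distance` would change; the argmin rule stays the same.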
Figure 3 shows the
pattern distances of a set of 45 face patterns from the same subject to
12 identity surfaces. The results for the ground-truth face
class are highlighted with a solid line and circles. In this experiment,
KDA was adopted to represent the face patterns. All 45 faces were
correctly recognised.
Figure 3:
Recognising multi-view faces using distances to the identity
surfaces. The solid line denotes the results for the ground-truth
subject.

When a face is tracked continuously through a video sequence, face
recognition can be performed by computing and matching the object and
model trajectories. These trajectories encode the spatio-temporal
information of a moving face. A preliminary realisation of this
approach computes the weighted sum of the
pattern distances over all frames up to the current time.
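The accumulated trajectory distance described above can be written as a one-line weighted sum. The weighting scheme here is an assumption (uniform by default); the source does not specify the weights used.

```python
import numpy as np

def trajectory_distance(pattern_dists, weights=None):
    """Trajectory distance up to the current frame: a weighted sum of the
    per-frame pattern distances d_1..d_t to one identity surface.

    weights defaults to uniform -- the actual weighting scheme is an
    assumption, not taken from the source.
    """
    d = np.asarray(pattern_dists, dtype=float)
    w = np.ones_like(d) if weights is None else np.asarray(weights, dtype=float)
    return float(np.sum(w * d))
```

At each frame, the subject whose model trajectory yields the smallest accumulated distance is reported, so early ambiguous frames are gradually outweighed by the evidence that follows.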
We demonstrate the performance of this approach on a small-scale
multi-view face recognition problem. Twelve sequences, one from each
of 12 subjects, were used as training sequences to construct the
identity surfaces. The number of frames in each
sequence varies from 40 to 140. Only 10 KDA dimensions were used to
construct the identity surfaces. Recognition was then
performed on new test sequences of these subjects. Figure 4 shows sample
images fitted by our multi-view dynamic model and the warped
shape-and-pose-free texture patterns from a test sequence.
Figure 4:
Video-based multi-view face recognition. From top to bottom: sample
images from a test sequence at an interval of 10 frames, images
fitted by the multi-view dynamic face model, and the shape-and-pose-free
texture patterns.

The object and model trajectories (in the first two KDA dimensions)
are shown in Figure 5.
Figure 5:
The object and model trajectories in the first two KDA
dimensions. The object trajectories are the solid lines, with dots
denoting the face patterns in each frame. The others are model
trajectories, where those from the ground-truth subject are highlighted
with solid curves.

The pattern distances to the identity surfaces in each
individual frame are shown on the left side of Figure 6, while the
trajectory distances are shown on the right side. These results show
that, although the per-frame pattern distances to the identity
surfaces already provide sufficient recognition accuracy, a more
robust performance is achieved when recognition is carried out using
the trajectory distances, which accumulate evidence over time.
Figure 6:
Pattern distances and trajectory distances. The ground-truth subject
is highlighted with solid lines. Using KDA and identity
surfaces, the pattern distances alone already give an accurate result.
However, the trajectory distances provide more robust performance,
especially through their accumulated effect (i.e. discriminating ability)
over time.

Relevant Publications

Y. Li, S. Gong, and H. Liddell.
Video-based online face recognition using identity surfaces.
Technical report, Queen Mary, University of London, 2001.

Y. Li, S. Gong, and H. Liddell.
Modelling faces dynamically across views and over time.
Technical report, Queen Mary, University of London, 2001.

Y. Li, S. Gong, and H. Liddell.
Recognising the dynamics of faces across multiple views.
In British Machine Vision Conference, pages 242-251, Bristol, England, September 2000.