Queen Mary University of London Department of Electronic Engineering
Home | Publications | Research | Teaching | Students | Bio | Email



Video object extraction
Many computer vision problems require change detection as a preliminary step. Change detection is the process of identifying differences in the state of objects or phenomena by observing them at different times. Applications that benefit from change detection include multimedia, advanced video surveillance, remote sensing, and interactive and immersive gaming, to mention a few. They require efficient and accurate procedures to detect and label the changes in image sets or image sequences.
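
As a rough illustration of the idea, the simplest form of change detection compares two frames pixel by pixel and marks the pixels whose intensity differs by more than a threshold. The sketch below is a generic baseline, not the technique whose result is shown on this page; the function name change_mask, the threshold value, and the toy frames are assumptions made for the example.

```python
def change_mask(frame_a, frame_b, threshold=25):
    """Return a binary mask: 1 where the grey level changed by more than threshold.

    Frames are lists of rows of integer grey levels (0-255). The threshold
    value is an arbitrary choice for this illustration.
    """
    if len(frame_a) != len(frame_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    ):
        raise ValueError("frames must have the same dimensions")
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


# Toy data: a static background and a frame in which two pixels changed strongly.
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
current = [
    [10, 10, 10, 10],
    [10, 200, 210, 10],
    [10, 10, 12, 10],  # the small change (10 -> 12) stays below the threshold
]

mask = change_mask(background, current)
for row in mask:
    print(row)
```

A fixed global threshold is sensitive to camera noise and illumination changes, which is one reason practical methods go beyond plain frame differencing.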

Here on the left, you can see the result of a novel technique for the extraction of multiple moving objects (the detected objects are shown, while the background is color-coded in white).



Video object tracking

Object-based coding and description offer a new range of capabilities, where the objects are separately coded and described. Industry standards, such as MPEG-4 and MPEG-7, provide the user with flexibility in content-based access and manipulation of multimedia data. To maximize the benefits of object-based representation, these standards need to be complemented with automatic techniques for extracting the video objects from video data, a problem that remains largely unsolved.

An object-based representation therefore requires prior decomposition of sequences into semantically meaningful, physical objects. In a way, this corresponds to retracing the steps in the video-creation process. This complex problem can be formulated as one of identifying objects in the scene and separating them from the background. Tracking is a fundamental step in video object extraction. In this framework, the goal of tracking is to follow video objects in the scene and to update their 2D shape from frame to frame. After a frame of the image sequence has been segmented into objects, the objects are tracked in the subsequent frames. The aim of temporal tracking is to establish a correspondence between instances of video objects over frames.
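
To make the notion of correspondence concrete, the sketch below pairs the objects segmented in one frame with those segmented in the next by measuring how much their pixel masks overlap (intersection over union). This is a deliberately simple stand-in and not the hierarchy-based tracker described on this page; the names iou and match_objects, the min_iou value, and the toy masks are assumptions made for the example.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def match_objects(prev_objects, curr_objects, min_iou=0.3):
    """Greedily pair previous-frame labels with current-frame labels.

    Both arguments map an object label to its set of pixels. Pairs are
    accepted in order of decreasing overlap, each label is used at most once,
    and pairs overlapping less than min_iou are left unmatched.
    """
    candidates = sorted(
        (
            (iou(prev_mask, curr_mask), prev_label, curr_label)
            for prev_label, prev_mask in prev_objects.items()
            for curr_label, curr_mask in curr_objects.items()
        ),
        key=lambda item: item[0],
        reverse=True,
    )
    matches, used_prev, used_curr = {}, set(), set()
    for score, prev_label, curr_label in candidates:
        if score < min_iou:
            break  # remaining candidates overlap even less
        if prev_label in used_prev or curr_label in used_curr:
            continue
        matches[prev_label] = curr_label
        used_prev.add(prev_label)
        used_curr.add(curr_label)
    return matches


# Toy data: the "car" shifts one pixel to the right, the "person" grows by one pixel.
prev_frame = {
    "car": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "person": {(5, 5), (5, 6)},
}
curr_frame = {
    "A": {(0, 1), (0, 2), (1, 1), (1, 2)},
    "B": {(5, 5), (5, 6), (6, 5)},
}

correspondence = match_objects(prev_frame, curr_frame)
print(correspondence)
```

Overlap-based matching assumes that objects move little between consecutive frames; fast motion, occlusion, or objects that split and merge call for richer models.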

By clicking on the image on the left, you can see the result of a tracking technique that exploits an image representation as a partition hierarchy and tracks video objects based on interactions between different levels of the hierarchy. The hierarchy is composed of a semantic level and a region level. The semantic level defines the topology of the video objects. The region level defines the topology of the homogeneous areas constituting the objects.

Collaboration Olivier Steiger


Shadow detection in video sequences

Shadows can provide important information about the scene represented in a digital image. They contain cues about the shapes and the relative positions of objects, as well as about the characteristics of surfaces and light sources in the scene. On the other hand, the presence of shadows represents a problem for applications requiring the identification of objects through segmentation. The shape and the color of segmented objects are in fact modified by the presence of shadows. Therefore, in order to provide a correct description of the objects, shadows have to be identified.

Here on the left, you can see the result of a novel technique for shadow recognition (the detected shadows are color-coded in green).

Collaboration Elena Salvador




Semantic transcoding

By clicking on the image on the left, you can see the result of an automatic content-based video transcoding algorithm which is based on how humans perceive visual information. The transcoder supports multiple video objects and their description. First, the video is decomposed into meaningful objects through semantic segmentation. Then the transcoder adapts its behaviour to code relevant (foreground) and non-relevant objects differently. Both object-based and frame-based encoders are combined with semantic segmentation. Experimental results show that the use of semantics and description prior to transcoding reduces the bandwidth requirements and makes it possible to adapt the video representation to limited network and terminal device capabilities while still retaining the essential information.
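
The principle of coding relevant and non-relevant areas differently can be illustrated with a toy example in which foreground pixels are quantized finely and background pixels coarsely. This is only a sketch of the idea under strong simplifications: it is not the transcoder described above and it does not use an MPEG-4 encoder. The names quantize and semantic_quantize, the step sizes, and the toy data are assumptions made for the example.

```python
def quantize(value, step):
    """Round a grey level down to the nearest multiple of step."""
    return (value // step) * step


def semantic_quantize(frame, foreground_mask, fg_step=4, bg_step=32):
    """Quantize foreground pixels finely and background pixels coarsely.

    frame holds integer grey levels; foreground_mask holds 1 for pixels that
    belong to a relevant object and 0 otherwise. The step sizes are
    illustrative values, not taken from the described system.
    """
    return [
        [
            quantize(value, fg_step if is_foreground else bg_step)
            for value, is_foreground in zip(row, mask_row)
        ]
        for row, mask_row in zip(frame, foreground_mask)
    ]


# Toy data: a 2x3 frame with three foreground pixels.
frame = [
    [100, 101, 37],
    [203, 77, 90],
]
foreground_mask = [
    [1, 1, 0],
    [0, 0, 1],
]

coded = semantic_quantize(frame, foreground_mask)
for row in coded:
    print(row)
```

A coarser step leaves fewer distinct values to encode in the background, which is where the saving comes from, while the foreground keeps most of its detail.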

Collaboration Olivier Steiger





 Queen Mary, University of London 2003