Efficient and Scalable Video Coding
Digital video is now pervasive in every domain, in applications that range from digital cinema through broadcast TV, corporate messaging and training, science, medicine and the military to consumer social networks.
As the quantity, volume and resolution of media increase, conventional compression schemes are running into serious difficulties. Over the last few years, the MMV group at QMUL has been working on enabling compression technology that allows full usage of available bandwidth and end-user terminals. It is the technology needed to fully exploit video streaming over the Internet, where bandwidth variability is high and depends on users' connectivity and traffic.
Scalable Video Coding (SVC) is built on the principle "encode once, decode many times at any resolution, capacity and/or quality". Once the complex encoding process has been done, a simple bit-stream truncation along the delivery chain adapts the video stream to bandwidth limitations and follows bandwidth variability. Furthermore, the received “truncated” stream can be decoded and played on any terminal, from high-definition television to mobile handsets. The impact of SVC technology is vast in any application involving video transmission, from high-resolution video broadcasting to narrow-band video streaming over the Internet and wireless channels. The uniqueness of the SVC framework developed by the MMV group, aceSVC, is due to two main characteristics: a) it provides fine-granularity (bit-level) scalability; b) it relies on a wavelet-based model, in contrast to the block-based approach adopted by other relevant frameworks. aceSVC allows optimal use of available channel capacity and best visual quality when displayed. Work related to the development of aceSVC has impacted the research community by providing an added-value alternative to available coding standards and, to some extent, by influencing the development of related commercial products.
Impact on Standards
As documented under Contributions to standards, coding techniques developed by the MMV group have been submitted to the MPEG standardisation body. This work has a) triggered comparative performance evaluation with the best codecs world-wide; b) improved coding performance and led to a patent application [1a]; and c) been used to perform several subjective and objective comparative analyses towards the definition of the MPEG reference software.
Impact on large Industry-Academia grants
This work led to an invitation from the European Commission, DG Networked Electronic Media, to serve as a member of the task force for the Future Media Internet (12 invited experts across Europe).
The most recently secured grant builds on previous cooperation with UTRC. The large EU Lasie project involves UTRC and the MMV group and will bring a grant with a total project cost of 1,033,600 Euros for the MMV group.
In addition, the BBC recently offered studentship support totalling 4,000 GBP and started a TSB project proposal involving the BBC and the MMV group with a potential grant of 90,000 GBP.
Conventional video coders consist of two main modules: encoder and decoder. aceSVC also contains these two modules. However, they are implemented according to completely different theoretical coding models, including wavelet analysis, motion estimation and compensation, rate-distortion optimization and embedded entropy coding. The encoder is the cornerstone of aceSVC and produces compressed content represented in an embedded fashion. It also performs the most computationally complex operations.
The decoder decompresses and plays the video. Furthermore, an additional very low complexity module is also available: the aceSVC extractor. It is the simplest component in the system and is used to achieve real-time adaptation of the coded video. It parses the scalable coded bit-stream, selecting only relevant portions according to the available transmission or display capacity or other adaptation parameters. The truncated stream can be read and played by a low complexity aceSVC decoder.
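The role of the extractor can be illustrated with a minimal sketch. The `Packet` layout, the field names and the `extract` function below are hypothetical and are not taken from the aceSVC bit-stream syntax; the sketch only assumes that the stream is stored most-important-first, so that dropping packets and cutting the tail still leaves a decodable stream.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    """Hypothetical unit of a scalable bit-stream (not the aceSVC format)."""
    spatial: int    # 0 = lowest resolution
    temporal: int   # 0 = lowest frame rate
    quality: int    # 0 = base quality layer
    payload: bytes


def extract(stream: List[Packet], max_spatial: int, max_temporal: int,
            byte_budget: int) -> List[Packet]:
    """Select the packets a given terminal and channel can use.

    Packets above the target resolution or frame rate are skipped; the
    remaining packets are kept in order until the byte budget is reached.
    No decoding or re-encoding takes place, only parsing and copying.
    """
    kept: List[Packet] = []
    used = 0
    for packet in stream:
        if packet.spatial > max_spatial or packet.temporal > max_temporal:
            continue  # not needed for this display size / frame rate
        if used + len(packet.payload) > byte_budget:
            break     # truncate: everything after this point is dropped
        kept.append(packet)
        used += len(packet.payload)
    return kept
```

Because the function only filters and truncates, its cost is linear in the number of packets, which is consistent with the description of the extractor as the simplest component in the system.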
Currently the MMV group is working with the BBC on further developments targeting Highly Efficient Scalable Video Coding. Once a compressed file has been created, the motion estimation data is only used for decoding purposes: after decompression the motion vectors (which take about 90% of the computation effort to calculate) are usually discarded. So, to perform any image processing or quality control functions, or to transcode into a different format, the motion vectors must be calculated all over again.
The need to revert to baseband and re-compress media every time we want to personalise, manipulate or resend it imposes very significant additional time, cost and computational load. A core aim of the work with the BBC is to find more efficient schemes for motion estimation targeting real-time encoding. This is also the main scope of the most recent patent filed [1a].
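To give a sense of why motion estimation dominates the encoding effort, the sketch below implements textbook full-search block matching with a sum of absolute differences (SAD) cost. This is a generic baseline chosen for illustration, not the scheme developed with the BBC or the one claimed in [1a]; the function names, block size and search radius are assumptions.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of luma samples


def sad(cur: Frame, ref: Frame, bx: int, by: int,
        dx: int, dy: int, n: int) -> int:
    """Sum of absolute differences between an n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur: Frame, ref: Frame, bx: int, by: int,
                n: int = 4, radius: int = 2) -> Tuple[Tuple[int, int], int]:
    """Return the motion vector (dx, dy) with the lowest SAD, and that SAD."""
    height, width = len(ref), len(ref[0])
    best_vec = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates whose reference block falls outside the frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_vec, best_cost = (dx, dy), cost
    return best_vec, best_cost
```

Each block costs up to (2 * radius + 1)^2 * n^2 sample comparisons, and the search is repeated for every block of every frame, so discarding the resulting vectors after decoding means paying this cost again whenever the video is re-encoded.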
The development of aceSVC has helped to train people highly skilled in advanced video coding technology: 10 PhD students graduated, 2 postdoctoral researchers went to work for key industrial players, 8 postdoctoral RAs were employed, 9 external postdoctoral RAs were attracted, and 6 research students are currently working on aceSVC.
Improving performance of existing businesses
- 3 relevant contracts (United Technologies, Motorola and STMicroelectronics)
- Over 10 relevant papers and reports co-authored with industrial players
- Over £40,000 income from intellectual property
- 3 patents co-authored with or sold to industrial players 
The SVC framework developed at the Multimedia and Vision Research Group has been released under the acronym aceSVC. Its uniqueness is due to three main characteristics:
- It provides fine granularity (at bit level) scalability. Observe that other relevant “SVC coders” are not fully scalable.
- It relies on a wavelet-based model contrasting the block-based approach adopted by other relevant frameworks.
- aceSVC compresses the video stream in such a way that it can be optimally truncated according to the channel capacity and display size of the end-user device. Consequently, it allows optimal use of available channel capacity and best visual quality when displayed.
Work on aceSVC led to substantial funding from the EU, including the integrating project aceMedia [5d]. Initially, the MMV group received over half a million pounds in funding to develop the basis for the system over a period of three years (2005-2008). During this time the basic architecture of the framework and its main functionality were developed and implemented. A working encoder/decoder was integrated into the aceMedia framework [3a-b]. The developed system consists of two main modules: encoder and decoder.
These modules are implemented according to completely different theoretical coding models, including:
- wavelet analysis
- motion estimation and compensation
- rate-distortion optimization
- embedded entropy coding.
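As a rough illustration of the wavelet analysis step, the sketch below applies one level of a Haar transform to a 1-D signal. The Haar filter, the function names and the sample values are assumptions chosen for brevity; the text does not state which wavelet filters aceSVC uses. The point is that the low-pass band is itself a half-resolution version of the input, which is what makes resolution scalability natural in a wavelet-based coder.

```python
from typing import List, Sequence, Tuple


def haar_analysis(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One decomposition level: pairwise averages (low band) and
    pairwise half-differences (high band)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def haar_synthesis(low: Sequence[float], high: Sequence[float]) -> List[float]:
    """Invert haar_analysis exactly."""
    out: List[float] = []
    for average, half_diff in zip(low, high):
        out.extend([average + half_diff, average - half_diff])
    return out


row = [4, 6, 10, 12, 8, 8, 2, 0]
low, high = haar_analysis(row)
# `low` alone is a half-resolution approximation of `row`;
# adding `high` back restores the full-resolution samples.
restored = haar_synthesis(low, high)
```

A decoder that receives only the low band can display a smaller picture, while a decoder that also receives the high band can reconstruct the original resolution. Applying the same split again to the low band gives further resolution levels.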
The encoder is the cornerstone of aceSVC and produces compressed content represented in an embedded fashion. It also performs the most computationally complex operations [3a-b]. The decoder decompresses and plays the video. Furthermore, an additional very low complexity module is also available: the aceSVC extractor. It is the simplest component in the system and is used to achieve real-time adaptation of the coded video. It parses the scalable coded bit-stream, selecting only relevant portions according to the available transmission or display capacity or other adaptation parameters. The truncated stream can be read and played by a low complexity aceSVC decoder.
Additional research innovations underpinning this technology include:
- a novel highly flexible decomposition scheme [1a]
- adaptive decorrelation functions. Part of this technology was bought and patented by Motorola research UK in 2008 (see reference [2b]).
- a framework for generalised spatio-temporal decomposition that enables adaptive scalability range including complexity scalability [4b]
- design of flexible bit-stream organisation supporting the encoder, adaptation tools and decoder [1b]
Most of these technological developments stretched beyond the aceMedia project. In 2008 the MMV group received over 0.8 million pounds in funding from the EU to develop several extensions and applications (PetaMedia [5b]). These extensions are underpinned by relevant research innovations including:
- post-compression rate-distortion optimisation applied to the entropy coder of transform coefficients [1c]
- introduction and application of the idea of motion adaptive spatial wavelet transform [1b]
- development of a connectivity-map decomposition scheme with application in motion adaptive transform and object-based video coding.
In cooperation with United Technologies (USA) the framework was further improved and adapted for specific surveillance applications during 2009 and 2010 [3c], [3d], [4a]. UTRC gave the group an industrial grant worth 240,000 USD for this cooperative work over 3 years.
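The general idea behind post-compression rate-distortion optimisation can be sketched as follows. This is the generic Lagrangian formulation known from embedded block coders, not the specific method of [1c]; the data layout, function names and numbers are hypothetical. Each independently coded block offers several truncation points, and a single multiplier `lam` is tuned by bisection so that the chosen points fit a total rate budget.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (rate in bytes, distortion) at a truncation point


def pick_truncation(points: Sequence[Point], lam: float) -> int:
    """Index of the truncation point minimising distortion + lam * rate."""
    return min(range(len(points)),
               key=lambda i: points[i][1] + lam * points[i][0])


def allocate(blocks: Sequence[Sequence[Point]], budget: float,
             lam_max: float = 1e6, iterations: int = 50) -> List[int]:
    """Choose one truncation point per block so the total rate fits `budget`.

    Assumes the first point of every block has zero rate (block dropped),
    so the all-zero choice is always feasible, and that lam_max is large
    enough to force that choice.
    """
    best = [0] * len(blocks)
    lo, hi = 0.0, lam_max
    for _ in range(iterations):
        lam = (lo + hi) / 2
        choice = [pick_truncation(points, lam) for points in blocks]
        rate = sum(points[i][0] for points, i in zip(blocks, choice))
        if rate > budget:
            lo = lam       # over budget: penalise rate more strongly
        else:
            best = choice  # feasible: remember it and try spending more
            hi = lam
    return best


blocks = [
    [(0, 100), (10, 40), (20, 30)],
    [(0, 50), (10, 45), (20, 10)],
]
chosen = allocate(blocks, budget=30)
```

Because the optimisation runs after the blocks have been coded, it only needs the recorded rate and distortion at each truncation point, and the same coded data can be re-truncated for a different budget without re-encoding.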
1 Patents co-authored with or sold to industrial players
- Ebroul Izquierdo, Saverio Blasi, UK Patent Application No. 1203952.5, An enhanced inter-frame predictor based on on-the-fly transformation of the reference frame, filed in April 2012.
- Ebroul Izquierdo, Marta Mrak, Nikola Sprljan, Intra-Block Adaptive Spatial Wavelet Transform for Enhanced Video Coding (Docket No. CML02951EV), described, illustrated and claimed in an application for Letters Patent of the United States of America having U.S. Patent Application Number 12/089785, filed on April 10, 2008. Sold to Motorola for £12,000 and £60,894 on a joint EPSRC-Motorola Industry Case award.
- E. Izquierdo and L. Q. Xu, “Data-Driven Nonlinear Diffusion for Object Segmentation”, with British Telecom patent, PCT Application Number: GB00/04707, BT Case Ref: A25861, 48 pages script.
Contributions to standards
- G.C.K. Abhayaratne, N. Sprljan, M. Mrak and E. Izquierdo, “Response to CFP on scalable video coding”, ISO/IEC JTC1/SC29/WG11 MPEG 2004/M10569-S24, Mar. 2004, Munich, Germany
- T. Zgaljic, M. Mrak, Y. Andreopoulos, and E. Izquierdo, “Proposal to Draft a Call for Evidence Test in AIC,” Tech. Rep. N4190, 41st JPEG meeting, ISO/IEC JTC1/SC29/WG1, San Jose, USA, Nov. 2007.
- N. Adami and T. Zgaljic, “Preliminary Proposal of an Application Scenario and Test Conditions in AIC”, Tech. Rep. N4117, 40th JPEG meeting, ISO/IEC JTC1/SC29/WG1, Jeju, Korea, Nov. 2006.
- T. Zgaljic, M. Mrak, N. Ramzan, and E. Izquierdo, “Scalable Coding Using Motion Adapted Wavelet Transform,” Tech. Rep. N4048, 40th JPEG meeting, ISO/IEC JTC1/SC29/WG1, Jeju, Korea, Nov. 2006.
- N. Adami, E. Izquierdo, R. Leonardi, M. Mrak, A. Signoroni, and T. Zgaljic, “Efficient Wavelet-based Video Compression”, Tech. Rep. N3954, 39th JPEG meeting, ISO/IEC JTC1/SC29/WG1, Perugia, Italy, July 2006.
- M. Mrak, N. Sprljan, T. Zgaljic, N. Ramzan, S. Wan, and E. Izquierdo, “Performance Evidence of Software Proposal for Wavelet Video Coding Exploration Group,” Tech. Rep. M13146, 76th MPEG Meeting, ISO/IEC JTC1/SC29/WG11/MPEG2005, Montreux, Switzerland, Apr. 2006.
- N. Sprljan, M. Mrak, T. Zgaljic, and E. Izquierdo, “Software Proposal for Wavelet Video Coding Exploration Group,” Tech. Rep. M12941, 75th MPEG Meeting, ISO/IEC JTC1/SC29/WG11/MPEG2005, Bangkok, Thailand, Jan. 2006.
Selected papers and reports co-authored with industrial players
- MOTOROLA: P. Hobson, E. Izquierdo, “Knowledge-based Media Analysis”, ISBN 0-902-23810-8, 2004, 466 pages
- UTC, USA: T. Zgaljic, N. Ramzan, M. Akram, E. Izquierdo, R. Caballero, A. Finn, H. Wang and Z. Xiong, “Surveillance Centric Coding”, Proc. 5th International Conference on Visual Information Engineering (VIE), Xi’an, China, Aug. 2008.
- UTC, USA: N. Ramzan, T. Zgaljic, E. Izquierdo, “An Efficient Optimisation Scheme for Scalable Surveillance Centric Video Communications”, Signal Processing: Image Communication, Vol. 24, No. 6, July 2009, 14 pages.
Relevant contracts with key industrial players
- British Broadcasting Corporation studentship support, Start date: 01/03/2013, End date: 31/07/2013, MMV funding: $ 4,000. Principal investigator: Prof Ebroul Izquierdo
- Surveillance Centric Codec (2006-2010), Industrial project funded by United Technologies, USA, Start date: 01/03/2006, End date: 02/10/2010, MMV funding: $ 240,000. Principal investigator: Prof Ebroul Izquierdo
- Motion adaptive spatial wavelet transform (2005), Industrial project funded by Motorola, UK, MMV funding: £ 11,162, Principal investigator: Prof Ebroul Izquierdo
- Optimization of aceSVC (2007), ST Microelectronics, MMV funding: £ 16,000, Principal investigator: Dr. Y. Andreopoulos
Relevant grants that enabled this underpinning research
- SARACEN (2010-2012), Socially Aware, collaboRative, scAlable Coding mEdia distribution EU FP7 funded project, Start date: 01/01/2010, End date: 31/12/2012, MMV funding: € 380,188, Principal investigator: Prof Ebroul Izquierdo
- PetaMedia (2007-2011), Peer-to-peer Tagged Media, EU funded IST project, MMV funding: € 806,000, Principal investigator: Prof Ebroul Izquierdo
- MESH (2006-2009), Multimedia Semantic Syndication for Enhanced News Services, EU funded IST project, MMV funding: € 840,000, Principal investigator: Prof Ebroul Izquierdo
- aceMedia (2004-2007), Integrating knowledge, semantics and content for user-centred intelligent media services, EU funded IST project MMV funding: £ 509,102, Principal investigator: Prof Ebroul Izquierdo
The impact of aceSVC is evidenced by a series of eight papers published in the IEEE transactions with highest impact factor in the field. Four large cooperative projects were the main originators of this work: aceMedia, RUSHES, MESH and PetaMedia. This work led to over £2.5 Million funding for the MMV group from the EU, EPSRC and Royal Society.
aceSVC technological developments were also proposed to the MPEG standardisation body, triggering comparative performance evaluation with best codecs world-wide.
Based on this work, MMV received over €1 million in EU funding. This work led to an invitation from the European Commission, DG Networked Electronic Media, to serve as a member of the task force for the Future Media Internet (12 invited experts across Europe). Two patents are pending on this work.
Scientist able to give evidence of the work's impact within existing business: Alan Finn, United Technologies, East Hartford, USA
The latest work on aceSVC has led to cooperation with the BBC.