Multiple Cameras in Smart Rooms: Analysis Strategies
Prof. Josep R. Casas
Smart rooms are a paradigm of ambient intelligence and pervasive computing, combining analysis from sensors, response through actuators, and modeling of the situation to develop consistent strategies for a particular "service" provided by the room. Technically speaking, a smart room is simply an advanced computer interface equipped with sensors and actuators. The tutorial starts from these concepts in order to justify the need for multiple cameras in a smart room for unobtrusive analysis of the scene. After a brief discussion of the camera setup, framing and spatial coverage, the tutorial focuses on several visual analysis strategies and algorithms that can provide valid scene descriptions for the computer interface to work properly. A review of different fusion approaches for visual analysis in the context of a smart room introduces the low-level visual analysis tasks of person location and tracking, person identification, articulated body tracking and head-pose estimation. Higher-level analysis requirements then provide an insight into the more semantically meaningful tasks of focus-of-attention detection, gesture recognition, and activity and event classification. Finally, the concept of the smart room "service" is revisited to close the tutorial, showing how the visual analysis strategies described before are integrated in a service environment.
Josep R. Casas is Professor Titular (Associate Professor) at the Department of Signal Theory and Communications, Technical University of Catalonia (UPC) in Barcelona. He graduated in Telecommunications Engineering in 1990 and received his PhD in 1996, both from UPC. He currently teaches Signals and Systems, Image Processing and Television Systems at the School of Telecommunications Engineering. He was a visiting researcher at CSIRO Mathematics & Information Sciences in Canberra, Australia, in 2000. Josep R. Casas is principal investigator of the project PROVEC ("Video Processing for Controlled Environments") of the Spanish R&D&I Plan, and has led or contributed to a number of industry-sponsored and European projects. In particular, he coordinated UPC's contribution to CHIL ("Computers in the Human Interaction Loop"), an Integrated Project of the IST/EU 6th Framework Programme in the strategic objective of Multimodal Interfaces, involving video, audio and natural language technologies. In the framework of this project, a smart room was built at UPC, equipped with 12 cameras and 100 microphones. The room provides synchronous audiovisual data for algorithmic development and testing in the area of multimodal interfaces. The presenter has authored over 10 papers in international journals, 12 papers in LNCS, 50 contributions to conferences, 4 book chapters and a teaching book in the areas of video coding, analysis, indexing and image processing.
Multimodal Human-centered Vision Systems
Prof. Nicu Sebe and Prof. Hamid Aghajan
In this tutorial, we take a holistic approach to the problem of human-centered vision systems. We aim to identify the opportunities in addressing novel applications and the potential for fruitful future research directions in this area. In particular, we introduce key concepts and discuss technical approaches and open issues in three areas: (1) multimodal interaction: visual (body, gaze, gesture) and audio (emotion) analysis; (2) smart environments; and (3) distributed and collaborative fusion of visual information. The tutorial sets forth application design examples in which a user-centric methodology is adopted across the different stages, from feature and pose estimation in early vision to user behavior modeling in high-level reasoning. The role of querying for user feedback will be discussed with examples from smart home applications. The course will motivate the use of multiple sensors in the environment as well as contextual information for effective data and decision fusion, and will focus on user interaction techniques formulated from the perspective of key human factors such as adaptation to user preferences and behavior models. Several applications based on the notion of user-centric design will be introduced and discussed.
Nicu Sebe is with the Faculty of Cognitive Sciences, University of Trento, Italy, where he leads research in multimedia information retrieval and human-computer interaction in computer vision applications. He is the author of Robust Computer Vision: Theory and Applications (Kluwer, April 2003) and Machine Learning in Computer Vision (Springer, May 2005). He has been involved in organizing major conferences and workshops addressing the computer vision and human-centered aspects of multimedia information retrieval, serving as General Co-Chair of the IEEE Automatic Face and Gesture Recognition Conference (FG 2008) and of the ACM International Conference on Image and Video Retrieval (CIVR 2007), and as one of the initiators and a Program Co-Chair of the Human-Centered Multimedia track of ACM Multimedia 2007. He is the general chair of WIAMIS 2009 and ACM CIVR 2010, and a track chair of WWW 2009 and ICPR 2010. He has served as guest editor for several special issues of IEEE Computer, Computer Vision and Image Understanding, Image and Vision Computing, Multimedia Systems, and ACM TOMCCAP. He has been a visiting professor at the Beckman Institute, University of Illinois at Urbana-Champaign, and at the Electrical Engineering Department, Darmstadt University of Technology, Germany. He was the recipient of a British Telecom Fellowship. He is co-chair of the IEEE Computer Society Task Force on Human-Centered Computing and an associate editor of IEEE Transactions on Multimedia, Machine Vision and Applications, Image and Vision Computing, Electronic Imaging and the Journal of Multimedia.
Hamid Aghajan has been a professor of Electrical Engineering (consulting) at Stanford University since 2003. Research in his group spans multi-camera networks and human interfaces for ambient intelligence and smart environments, with applications to smart homes, occupancy-based services, assisted living and well-being, ambience control, smart meetings and speaker assistance systems, and avatar-based communication and social interactions. Hamid is co-editor-in-chief of the Journal of Ambient Intelligence and Smart Environments. He has co-edited three volumes: Multi-Camera Networks: Principles and Applications, Human-Centric Interfaces for Ambient Intelligence, and Handbook of Ambient Intelligence and Smart Environments. He has been an editorial board member of the book series on Artificial Intelligence and Smart Environments by IOS Press, associate editor of Machine Vision and Applications, guest editor of the IEEE J-STSP special issue on Distributed Processing in Vision Networks, and guest editor of the CVIU special issue on Multimodal Sensor Fusion. Hamid was co-founder and technical co-chair of the first International Conference on Distributed Smart Cameras (ICDSC 2007), and general co-chair of ICDSC 2008.
He has organized short courses on Distributed Vision Processing in Multi-Camera Networks at CVPR 2007 and 2008, ACIVS 2007, ICASSP 2009, ICDSC 2009, and ICIAP 2009. He has chaired the special session on Distributed Processing in Smart Camera Networks at ICASSP 2007, the workshop on Behavior Monitoring and Interpretation at the German AI Conference in 2008 and 2009, the special session on Vision-Based Reasoning at the AITAmI workshop at ECAI 2008, the workshop on Multi-Camera and Multi-Modal Sensor Fusion Algorithms and Applications at ECCV 2008, the special session on Multi-Sensor HCI for Smart Environments at the Face and Gesture Conference 2008, the workshop on Vision Networks for Behavior Analysis (VNBA) at ACM Multimedia 2008, the workshop on Human Computer Interaction at ICCV 2009, and the workshop on Use of Context in Vision Processing at ICMI-MLMI 2009. Hamid obtained his Ph.D. degree in electrical engineering from Stanford University in 1995.
Multi-camera and distributed video surveillance
Prof. Rita Cucchiara
This tutorial addresses algorithms and techniques of computer vision and pattern recognition for multi-camera and distributed video surveillance. When multiple (heterogeneous) cameras are connected in a forest of sensors, the standard techniques used in single fixed-camera surveillance are no longer sufficient. Different approaches should be considered depending on the camera layout (e.g., overlapping or non-overlapping fields of view), the camera motion (e.g., fixed or PTZ cameras), the network capability, and the availability of computational resources in the smart camera for early processing. The tutorial presents a short survey of research activities in this area, focusing mainly on people surveillance; models and algorithms for object segmentation and tracking in multi-camera environments will be presented in detail, with several demos from ImageLab of Modena. Techniques for people detection in cluttered environments will be presented. Finally, recent advances in trajectory analysis for people behaviour classification in distributed camera systems will be discussed. Benchmark videos with ground truth and tutorial material will be available to the tutorial attendees.
Rita Cucchiara (MS in Electronic Engineering, 1989; PhD in Computer Engineering, 1993, both from the University of Bologna) is Full Professor of Computer Architecture and Computer Vision at the University of Modena and Reggio Emilia. She is Vice-Dean of the Faculty of Engineering in Modena and coordinator of the PhD course in "Computer Engineering and Science" of the Doctorate School in ICT in Modena. She is director of the Softech center for software technologies for enterprise and coordinates the ImageLab laboratory in Modena. Her current interests include pattern recognition and computer vision for video surveillance, medical imaging, machine vision and multimedia. Her video surveillance activity is devoted to new models for object segmentation, shadow detection, tracking, and people behaviour analysis in indoor and outdoor applications. Rita Cucchiara is responsible for many Italian and international projects; examples are the BESAFE NATO project in the Science for Peace program, the FREESURF Italian PRIN project, and an abandoned-pack detection project funded by the Australian Council. She coordinates the EU JPR project THIS (Transportation Hubs Intelligent Surveillance) and collaborates in the EU projects VIDI-VIDEO and SAFIRE. Rita Cucchiara is the author of more than 150 papers in national and international journals and conference proceedings. Since 2006 she has been a Fellow of the IAPR. In 2009 she will be general chair of the Italian Conference on Artificial Intelligence.