
5th Multitel Spring workshop on video analysis – June 15, 2010

Organized in the framework of portfolio and platform.

  • 10:00 Welcome and coffee
  • 10:25 Opening Remarks and Welcome

Paper Session 1: Computer Vision Frameworks & Systems

  • 10:30 “Generic and optimized framework for multi-content analysis based on learning approaches”, Quentin Besnehard, Barco, Medical Imaging R&D Department.

During the European Cantata project (ITEA project, 2006-2009), a Multi-Content Analysis framework for the classification of compound images in various categories (text, graphical user interface, medical images, other complex images) was developed within Barco. The framework consists of six parts: a dataset, a feature selection method, a machine learning based Multi-Content Analysis (MCA) algorithm, a Ground Truth, an evaluation module based on metrics and a presentation module. This methodology was built on a cascade of decision tree-based classifiers combined and trained with the AdaBoost meta-algorithm. In order to be able to train these classifiers on large training datasets without excessively increasing the training time, various optimizations were implemented. These optimizations were performed at two levels: the methodology itself (feature selection / elimination, dataset pre-computation) and the decision-tree training algorithm (binary threshold search, dataset presorting and alternate splitting algorithm). These optimizations have little or no negative impact on the classification performance of the resulting classifiers. As a result, the training time of the classifiers was significantly reduced, mainly because the optimized decision-tree training algorithm has a lower algorithmic complexity. The time saved through this optimized methodology was used to compare the results of a greater number of different training parameters.
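
The training-time optimization described above (presorting feature values so that each threshold search becomes a single linear scan) can be sketched as follows. This is an illustrative minimal sketch with depth-1 stumps standing in for the decision trees of the actual framework; all names are hypothetical and not Barco's code.

```python
import numpy as np

def train_stump(X, y, w):
    """Weighted decision stump. Presorting each feature column turns the
    threshold search into one cumulative-sum scan, instead of
    re-partitioning the dataset for every candidate threshold."""
    n, d = X.shape
    best = (np.inf, 0, 0.0, 1)  # (weighted error, feature, threshold, polarity)
    for j in range(d):
        order = np.argsort(X[:, j])                 # dataset presorting
        xs, ys, ws = X[order, j], y[order], w[order]
        pos_below = np.cumsum(ws * (ys == 1))       # weight of positives <= split
        neg_below = np.cumsum(ws * (ys == -1))      # weight of negatives <= split
        pos_total, neg_total = pos_below[-1], neg_below[-1]
        for i in range(n - 1):
            if xs[i] == xs[i + 1]:
                continue                            # no threshold between equal values
            thr = 0.5 * (xs[i] + xs[i + 1])
            # polarity +1 predicts -1 below thr and +1 above it
            err_pos = pos_below[i] + (neg_total - neg_below[i])
            err = min(err_pos, pos_total + neg_total - err_pos)
            pol = 1 if err == err_pos else -1
            if err < best[0]:
                best = (err, j, thr, pol)
    return best

def stump_predict(X, feat, thr, pol):
    return pol * np.where(X[:, feat] > thr, 1, -1)

def adaboost(X, y, rounds=10):
    """Minimal AdaBoost loop over weighted stumps."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        err, feat, thr, pol = train_stump(X, y, w)
        err = max(err, 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(X, feat, thr, pol)
        w *= np.exp(-alpha * y * pred)              # reweight misclassified samples
        w /= w.sum()
        ensemble.append((alpha, feat, thr, pol))
    return ensemble

def predict(ensemble, X):
    score = sum(a * stump_predict(X, f, t, p) for a, f, t, p in ensemble)
    return np.sign(score)
```

The presorted scan makes each round O(n·d) after an O(n log n) sort per feature, which is the lower algorithmic complexity the abstract refers to.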

  • 11:00 “Browsing multimedia libraries with the MediaCycle software”, Xavier Siebert, UMONS

The MediaCycle project aims at developing a novel browsing environment for multimedia (sounds, images, videos) databases, that offers an alternative to conventional search-by-query. We will discuss techniques to organize and to display databases, and present an application for navigating in a dance videos collection. The resulting software offers a wide range of potential applications, from browsing a medical images library to manipulating sounds for an artistic performance. MediaCycle is part of the numediart research program (www.numediart.org).

  • 11:30 “The Kernel, an Expert System Dedicated to Image Processing”, Sebastien Delhay, Ecole Royale Militaire, CISS Department.

Image interpreters at the Belgian Defense Geographic Service have to face the following challenge: vectorizing huge amounts of scanned maps in limited time and with constrained human capacity. They have thus asked us to implement a computer-based solution that would carry out the whole processing chain, from raw map image to useable information. A first issue lies in the plethora of available image processing techniques and the automatic choice of an optimal processing strategy based upon those techniques. Another problem resides in the difficulty to map low-level features extracted from images to high-level, meaningful information, also known as the “semantic gap” problem. We address both issues with the Kernel, an expert system dedicated to image processing. Indeed, the Kernel enables the formalization of high-level knowledge which it uses to propose strategies to the user, much as a human expert would do. Moreover the system applies knowledge on the results of the chosen strategy in order to derive high-level information, hence encompassing the whole processing chain. Finally, we intend to run the system in-house as a didactic platform for documentation and testing of the tools available at the lab, enabling a better exchange of knowledge between researchers.
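The strategy-proposal idea can be illustrated with a minimal rule-based sketch, in the spirit of an expert system: conditions on known facts about the image map to a proposed processing chain. The rules, fact names, and operator names below are purely hypothetical and do not reflect the Kernel's actual knowledge representation.

```python
# (condition on known image facts, proposed processing strategy)
RULES = [
    (lambda f: f.get("type") == "scanned_map" and f.get("noise") == "high",
     ["median_filter", "binarize", "vectorize"]),
    (lambda f: f.get("type") == "scanned_map",
     ["binarize", "vectorize"]),
]

def propose_strategy(facts):
    """Return the first strategy whose condition matches the known facts,
    much as a human expert would suggest a processing chain."""
    for cond, strategy in RULES:
        if cond(facts):
            return strategy
    return []
```

A real system would also apply knowledge to the results of the chosen strategy, closing the loop from low-level features to high-level information.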

  • 12:00 “Live, interactive, 3D-stereo, full-HD, high bandwidth capture, transmission, and projection of a neurosurgical operation”, Jacques Verly, ULg, Jérôme Meessen, IntoPIX.

In the framework of the 9th edition of the ImagéSanté festival, a neurosurgical operation was captured and retransmitted live, in 3D stereo and in full HD. The operation was carried out at the University Hospital of Liège and watched by a mesmerized audience in the Sauvenière movie theater, 16 km (10 miles) from the hospital. The numerous spectators, in an overcrowded theater, were able to interact live with the neurosurgeon throughout the operation thanks to an audio duplex system linking the two sites. Professor Jacques G. Verly and his team from the Laboratory for Signal and Image Exploitation of the Department of Electrical Engineering and Computer Science of the University of Liège designed the architecture of the entire capture, transmission, and projection chain, and provided the technical management of this 3D event. PRISTINE electronic boards from IntoPIX (implementing the JPEG 2000 standard) and computers designed at the University of Liège kept the left and right stereoscopic streams separate while compressing them simultaneously, for a total transmission rate of 500 Mbit/s, i.e., half the capacity of each optical fiber used. For comparison, the rate at the input of the encoding board, before compression, is on the order of 2.6 Gbit/s. Note that the conventional strategy for this type of live 3D transmission is to mix the two streams (while reducing their resolution) so that traditional transmission channels can be used; that was not done here.
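
A quick back-of-the-envelope check of the figures quoted above (the 1 Gbit/s fiber capacity is inferred from "half of the capacity of each optical fiber"):

```python
input_rate_gbps = 2.6    # uncompressed rate at the encoder input
output_rate_gbps = 0.5   # 500 Mbit/s total after JPEG 2000 compression
fiber_capacity_gbps = 1.0  # inferred: 500 Mbit/s is half of each fiber

ratio = input_rate_gbps / output_rate_gbps
print(f"JPEG 2000 compression ratio: {ratio:.1f}:1")        # about 5.2:1
print(f"Fiber utilisation: {output_rate_gbps / fiber_capacity_gbps:.0%}")  # 50%
```

A roughly 5:1 compression ratio is modest by video standards, consistent with the visually lossless quality JPEG 2000 is used for in this kind of live medical transmission.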

Lunch break

(A buffet lunch will be offered by Multitel)

Paper Session 2: Low Level Features Extraction

  • 13:30 “Particle tracking for pedestrian and vehicle detection”, Christophe Chaudy, Frédéric Salvador, ACIC.

In recent R&D projects, ACIC has been investigating applications of the particle tracking method. This approach uses many simple particles, randomly generated over the image, that follow moving objects. Particles use motion estimation to update their positions and die when they stop or leave the image. An aggregation process analyzes the behavior of groups of particles and produces object-level detections. This technique has been used to detect pedestrians on crosswalks and stopped vehicles on highways. We will show results for both applications and discuss the strengths and drawbacks of particle tracking.
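
The particle life cycle described above can be sketched as follows. This is a simplified illustration, not ACIC's implementation: the flow field, death criterion, and grid-based aggregation are all assumptions.

```python
import numpy as np

H, W = 240, 320  # frame size (illustrative)

def spawn(n):
    """Random particles scattered over the image."""
    return np.column_stack([np.random.uniform(0, W, n),
                            np.random.uniform(0, H, n)])

def step(particles, flow):
    """Move each particle along the estimated motion at its position;
    particles die when they stop or leave the image.
    `flow(x, y) -> (dx, dy)` is a hypothetical motion-estimation oracle."""
    moved = []
    for x, y in particles:
        dx, dy = flow(x, y)
        if abs(dx) < 0.5 and abs(dy) < 0.5:
            continue                      # particle dies: no motion
        nx, ny = x + dx, y + dy
        if 0 <= nx < W and 0 <= ny < H:
            moved.append((nx, ny))        # particle survives
    return np.array(moved) if moved else np.empty((0, 2))

def aggregate(particles, cell=20, min_count=5):
    """Naive aggregation: grid cells holding many surviving particles
    are reported as object-level detections."""
    if len(particles) == 0:
        return []
    cells, counts = np.unique((particles // cell).astype(int),
                              axis=0, return_counts=True)
    return [tuple(c * cell) for c, k in zip(cells, counts) if k >= min_count]
```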

  • 14:00 “3D scene reconstruction using a Time-Of-Flight camera”, Fabien Lavigne.

In the 3DMEDIA project, we investigate the use of Time-Of-Flight (TOF) cameras (sensors providing a dense 3D map of the scene at high frame rates) applied to the Simultaneous Localization And Mapping (SLAM) problem. In brief, for a static scene filmed by a moving camera, we want to recover the camera trajectory and build a 3D representation of the environment. This can provide useful applications, for example in robotics (autonomous navigation) or augmented reality. The proposed application, based on a pipeline of camera acquisition, data filtering, motion recovery, and 3D map building, allows a fine reconstruction of the observed scene.
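
The motion-recovery and map-building steps can be sketched as below, assuming point correspondences between consecutive TOF frames are already established (a strong simplification; the abstract does not state the actual motion-recovery method, so the Kabsch least-squares alignment here is a stand-in).

```python
import numpy as np

def rigid_transform(P, Q):
    """Least-squares rigid motion (R, t) aligning point set P onto Q
    (Kabsch algorithm), given matching rows in P and Q (shape (N, 3))."""
    cp, cq = P.mean(0), Q.mean(0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1, 1, d]) @ U.T
    t = cq - R @ cp
    return R, t

def build_map(frames):
    """Toy SLAM loop: chain per-frame motions and accumulate each frame's
    points in a common world frame (frames = list of (N, 3) arrays with
    matching rows -- correspondences are assumed known)."""
    R_w, t_w = np.eye(3), np.zeros(3)
    world = [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        R, t = rigid_transform(cur, prev)    # motion of cur into prev's frame
        R_w, t_w = R_w @ R, R_w @ t + t_w    # compose with the global pose
        world.append(cur @ R_w.T + t_w)
    return np.vstack(world)
```

A real TOF pipeline would precede this with the data-filtering stage (TOF depth maps are noisy) and use a robust registration such as ICP rather than known correspondences.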

  • 14:30 “Ball detection in a multiview basketball game”, Pascaline Parisot, UCL.

The APIDIS European project aims at automatically summarizing a basketball game captured by several cameras. To interpret the actions of the game, detecting the players and the ball is essential. For ball detection, we propose an approach that is not based on pattern matching and that is independent of the actual texture and color of the ball. Our detection is based on three features computed for each frame: the foreground mask, a 3D distance map, and an isophote center map.
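
One simplistic way to combine the three per-frame cues is a normalized product followed by an argmax, sketched below; the actual APIDIS combination rule is not specified in the abstract, so this fusion is an assumption for illustration.

```python
import numpy as np

def fuse_ball_evidence(foreground, distance_map, isophote_map):
    """Fuse the three cues into one likelihood map and return the most
    likely (row, col) ball position. All inputs are same-shape 2D arrays."""
    def norm(m):
        m = m.astype(float)
        return (m - m.min()) / (m.max() - m.min() + 1e-9)  # scale to [0, 1]
    score = norm(foreground) * norm(distance_map) * norm(isophote_map)
    return np.unravel_index(np.argmax(score), score.shape)
```

The multiplicative fusion means a position must be supported by all three cues at once, which is what makes the approach robust without any appearance model of the ball.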

Paper Session 3: Human Behavior Modelling

  • 15:00 “Extracting Human Behaviour from Video Stream”, Ioan Marius Bilasco, LIFL, FOX-MIIRE.

This presentation gives an overview of the research topics of the FOX team from the LIFL research lab related to the extraction of human behavior from video streams. We address two classes of approaches: when the subject is a single person (eye gaze, individual motion recognition), and when the subject is a crowd (abnormal event detection, crowd behavior). The first type of situation can be exploited in applications where implicit feedback from the person is needed. We have experimented with such techniques in applications related to the way users perceive multimedia content, or to the identification of regions visually perceived by a person in a constrained environment (a shelf in a store, a large display). The second class of approaches is more related to crowd behavior observation and analysis. We exploit flow-related information in order to detect unexpected behavior (falling down, disordered movements, etc.) or specific behaviors/events (crowd joining, crowd splitting).

  • 15:30 “Dense crowd analysis through bottom-up and top-down attention”, Matei Mancas, UMONS.

Video analysis of difficult scenarios such as dense crowds can greatly benefit from algorithms that model part of human attention. Interesting motion, i.e., motion that is new or surprising, can be detected in large groups of people with a two-step approach: a bottom-up attention model built upon the rarity of a motion compared to the rest of the motion in the same frame, and a top-down approach that inhibits image regions whose behavior is too repetitive. This algorithm points out abnormal activities, which can be used in surveillance but also to analyze and even foster social interaction.
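
The two steps can be sketched as follows: bottom-up rarity as self-information of a motion-feature histogram, and a top-down mask suppressing repetitive regions. This is a minimal sketch under assumed definitions (histogram binning, an `activity` map in [0, 1], a fixed threshold), not the authors' actual model.

```python
import numpy as np

def rarity_map(motion, bins=16):
    """Bottom-up saliency: the rarer a motion value is within the current
    frame, the higher its attention score (self-information, -log p)."""
    hist, edges = np.histogram(motion, bins=bins,
                               range=(motion.min(), motion.max() + 1e-9))
    p = hist / hist.sum()
    idx = np.clip(np.digitize(motion, edges) - 1, 0, bins - 1)
    return -np.log(p[idx] + 1e-9)

def inhibit_repetitive(saliency, activity, threshold=0.8):
    """Top-down step: zero out regions whose temporal activity is too
    repetitive (activity = assumed fraction of recent frames with motion
    at that pixel)."""
    out = saliency.copy()
    out[activity > threshold] = 0.0
    return out
```

In a crowd, the dominant flow fills the high-probability histogram bins, so only motion that deviates from it, e.g. someone falling or moving against the flow, receives a high score.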

  • 16:00 Discussion & Closing