
6th Multitel Spring workshop on video analysis – June 28, 2011

Organized in the framework of portfolio and platform.

  • 10:00 Welcome and coffee
  • 10:25 Opening Remarks and Welcome

Paper Session 1: Database indexing and image retrieval

  • 10:30 “A Higher-level Visual Representation for Semantic Learning in Image Databases”, Ismail El Sayad, LIFL.

With the availability of massive amounts of digital images in personal and on-line collections, effective techniques for navigating, indexing and searching images are becoming increasingly crucial. In this thesis, we rely on the visual content of an image as the main source of information to represent it. Starting from the bag of visual words (BOW) representation, a higher-level visual representation is learned in which each image is modeled as a mixture of the visual topics depicted in the image and related to high-level topics. First, we enhance the BOW representation by characterizing the spatial-color constitution of an image with a mixture of n Gaussians in the feature space. This leads us to propose a novel descriptor, the Edge Context, which complements the SURF descriptor. Together, these enhancements incorporate different kinds of image content information. Second, we introduce a new probabilistic topic model, the Multilayer Semantic Significance Analysis (MSSA) model, to study the semantic inference of the constructed visual words. From it, we generate the Semantically Significant Visual Words (SSVWs). Third, we strengthen the discriminative power of the SSVWs by constructing Semantically Significant Visual Phrases (SSVPs) from frequently co-occurring SSVWs that are semantically coherent. We partially bridge the intra-class visual diversity of the images by re-indexing the SSVWs and the SSVPs based on their distributional clustering. This leads us to generate a Semantically Significant Invariant Visual Glossary (SSIVG) representation. Finally, we propose a new spatial weighting scheme and a Multiclass Vote-Based Classifier (MVBC) based on the proposed SSIVG representation. Extensive large-scale experiments show that the proposed higher-level visual representation outperforms traditional part-based image representations in retrieval, classification and object recognition.
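As an illustration of the baseline BOW representation that the abstract starts from, here is a minimal sketch. It assumes local descriptors (for example SURF vectors) have already been extracted and that a visual vocabulary is given; the function names `nearest_word` and `bow_histogram` are hypothetical, and the sketch does not cover the Edge Context descriptor, the MSSA model or any other contribution of the talk.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (Euclidean distance)."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))

def bow_histogram(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one image.

    descriptors: list of local feature vectors (assumed already extracted).
    vocabulary:  list of codewords, e.g. cluster centers (assumed given).
    """
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]
```

Two images can then be compared through their histograms, which is the starting point that higher-level representations such as visual phrases build upon.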

  • 11:00 “Content-based browsing of multimedia libraries with the MediaCycle software”, Xavier Siebert, UMONS.

The MediaCycle software aims at providing a content-based browsing environment for multimedia (sounds, images, videos, text, …) databases. We will discuss recent advances in the development of the software, ranging from text analysis to the incorporation of segmentation tools for videos. Practical applications of the software in artistic projects will also be presented. MediaCycle is part of the numediart research program (www.numediart.org).

Paper Session 2: 3D scene analysis and projection

  • 11:30 “Spatio-temporal template matching for ball detection”, Amit Kumar K.C., Pascaline Parisot, Christophe De Vleeschouwer, UCL.

The talk considers the detection of the ball in a basketball game covered by multiple synchronized cameras. First, plausible ball candidates are detected on the nodes of a 3D grid defined around the basket, by correlating, independently in each view, the spatial template of the ball with a pre-computed foreground mask. An efficient implementation of this step relies on integral images. Afterwards, false positives are filtered out based on a temporal analysis of the ball trajectory. This analysis builds on the RANSAC method, with a ballistic trajectory model.
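The temporal filtering step can be illustrated with a minimal RANSAC sketch under a ballistic model. This is not the authors' implementation: the candidate format `(t, (x, y, z))`, the gravity constant `G` acting along the negative z axis, the tolerance `tol` and all function names are assumptions made for the example. Because gravity is known, two candidates at distinct times are enough to determine the initial position and velocity.

```python
import math
import random

G = 9.81  # assumed gravity (m/s^2), acting along the negative z axis

def fit_ballistic(s1, s2):
    """Fit p(t) = p0 + v*t + 0.5*a*t^2, a = (0, 0, -G), through two samples."""
    (t1, p1), (t2, p2) = s1, s2
    # Removing the known gravity term makes every coordinate linear in t.
    q1 = (p1[0], p1[1], p1[2] + 0.5 * G * t1 * t1)
    q2 = (p2[0], p2[1], p2[2] + 0.5 * G * t2 * t2)
    v = tuple((b - a) / (t2 - t1) for a, b in zip(q1, q2))
    p0 = tuple(a - vi * t1 for a, vi in zip(q1, v))
    return p0, v

def predict(model, t):
    """Position predicted by the ballistic model at time t."""
    p0, v = model
    return (p0[0] + v[0] * t,
            p0[1] + v[1] * t,
            p0[2] + v[2] * t - 0.5 * G * t * t)

def ransac_ballistic(candidates, n_iter=200, tol=0.15, seed=0):
    """Return (model, inliers) for the trajectory supported by most candidates.

    candidates: list of (t, (x, y, z)) detections, true balls and false positives.
    tol: maximum 3D distance (assumed in meters) for a candidate to count as inlier.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iter):
        s1, s2 = rng.sample(candidates, 2)
        if s1[0] == s2[0]:
            continue  # same time instant: velocity is undefined
        model = fit_ballistic(s1, s2)
        inliers = [c for c in candidates
                   if math.dist(predict(model, c[0]), c[1]) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

Candidates that are not in the returned inlier set are treated as false positives. A practical system would typically refit the model on all inliers and handle several trajectory segments (passes, bounces), which this sketch leaves out.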

  • 12:00 “Monumental Projections”, Radhwan BenMadhkour, UMONS.

The numediart institute has developed software that manages multi-projector projection. This software addresses the synchronisation of the projections, the calibration of the projectors and interactivity. We present state-of-the-art approaches and ways of improving those techniques to simplify the calibration process.

Lunch break

(A buffet lunch will be offered by Multitel)

Paper Session 3: Surveillance and applications

  • 14:00 “Particle-Based Tracking Model for Automatic Anomaly Detection”, Erwan Jouneau, Multitel.

We present a new method to automatically discover recurrent activities occurring in a video scene and to identify the temporal relations between these activities, e.g. to discover the different flows of cars at a road intersection and to identify the traffic light sequence that governs these flows.
The proposed method is based on particle-based trajectories, analyzed through a cascade of Hidden Markov Model (HMM) and Hierarchical Dirichlet Process HMM (HDP-HMM) models. We demonstrate the effectiveness of our model for a scene activity recognition task on a road intersection dataset. Finally, we show that our model is also able to perform on-the-fly abnormal event detection (by identifying activities or relations that do not fit the usual, discovered ones), with encouraging performance.

  • 14:30 “Human behavior analysis from video using optical flow”, Yassine Benabbas, LIFL.

Behavior recognition and prediction in public and private areas are still major concerns in video surveillance. Automatic surveillance systems are widely used to extract and analyze complex behaviors through logical and mathematical rules.
In our work, we propose three applications: crowd event detection, human activity recognition and motion pattern extraction. To achieve this goal, we followed an approach based on three levels of analysis. The first level is the detection of low-level features, which are retrieved from the pixels of each video frame. The second level is the intermediate-level descriptor, which is extracted from the low-level features and carries more semantics. The third level uses the features of the intermediate level to produce human-readable and useful information. For example, it is responsible for displaying which event occurred and when. The advantage of this approach is that the information from certain levels can be reused across different applications. This allowed us to use the optical flow as a low-level feature and the direction model as an intermediate-level descriptor in all our applications. We have tested our approach using different datasets and different types of video, such as daily-life action videos and traffic videos.
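To make the first two levels concrete, the sketch below turns a set of optical flow vectors (low-level features) into a magnitude-weighted histogram of motion directions. This is a simplified stand-in for an intermediate-level descriptor, not the direction model used in the talk; the names `direction_histogram` and `dominant_direction`, the number of bins and the `min_mag` threshold are assumptions chosen for illustration.

```python
import math

def direction_histogram(flow, n_bins=8, min_mag=0.5):
    """Magnitude-weighted, normalized histogram of flow directions.

    flow: iterable of (dx, dy) optical flow vectors, assumed already computed.
    min_mag: vectors shorter than this are treated as noise and ignored.
    """
    hist = [0.0] * n_bins
    for dx, dy in flow:
        mag = math.hypot(dx, dy)
        if mag < min_mag:
            continue
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        b = int(angle / (2 * math.pi) * n_bins) % n_bins
        hist[b] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def dominant_direction(hist):
    """Center angle, in degrees, of the most populated direction bin."""
    n = len(hist)
    b = max(range(n), key=hist.__getitem__)
    return (b + 0.5) * 360.0 / n
```

A third level could then map such descriptors to readable output, for instance by reporting an event when the dominant direction of a region changes abruptly between consecutive time windows.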

  • 15:00 “Computational Attention and Social Signal Processing for Video Surveillance”, Matei Mancas, UMONS.

The detection of abnormal behavior in scenes viewed at different scales is a challenge for surveillance systems. Computational attention algorithms, which look for rare events, combined with features related to social signals, such as proxemics, may help address part of the problem.

  • 15:30 Discussion & Closing