The EU-funded CARETAKER project (2006-2008) aimed to study, develop and assess multimedia knowledge-based content analysis, knowledge extraction components, and metadata management sub-systems in the context of automated situation awareness, diagnosis and decision support. More precisely, CARETAKER focused on the extraction of structured knowledge from large multimedia collections recorded over networks of cameras and microphones deployed in real sites. Beyond their use for surveillance and safety, the resulting audio-visual streams, when automatically analysed, represented a useful source of information for urban planning, resource optimisation and environment planning.

CARETAKER modelled and accounted for two types of knowledge. On the one hand, for safety operators and decision-makers, CARETAKER provided a contextual description of the sensory data. On the other hand, the content knowledge, characterised by a basic level of primitive events (e.g. object trajectories) followed by a second layer of higher semantic events, provided insight into complex events and their interacting relationships. Patterns of behaviour over extended periods of time were also extracted. Both knowledge types were modelled through a specified language scheme, also known as an ontology. The extracted metadata was incorporated into knowledge management systems and used in two capacities:

  • an alarm mode, where the system detects a specific condition which needs operator attention, for instance, a fight in a public transport system;
  • a data mining mode, with knowledge discovery and retrieval capabilities.
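The two-layer event model and the two usage modes can be illustrated with a minimal sketch. It is not the CARETAKER implementation: the class names, the `close_contact` label, the distance rule and the function names are all hypothetical, chosen only to show how primitive events (trajectory samples) might feed higher semantic events that are then used either for alarms or for retrieval.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class PrimitiveEvent:
    """First layer: one trajectory sample for a tracked object (hypothetical schema)."""
    object_id: int
    time_s: float
    x: float
    y: float


@dataclass(frozen=True)
class SemanticEvent:
    """Second layer: a higher-level event derived from primitive events."""
    label: str
    object_ids: Tuple[int, int]
    time_s: float


def derive_semantic_events(primitives: Iterable[PrimitiveEvent],
                           max_dist: float = 1.0) -> List[SemanticEvent]:
    """Emit a 'close_contact' event when two objects are within max_dist at the same time.

    The rule is an illustrative assumption, not a rule taken from the project.
    """
    by_time = {}
    for p in primitives:
        by_time.setdefault(p.time_s, []).append(p)

    events = []
    for t in sorted(by_time):
        for a, b in combinations(by_time[t], 2):
            if hypot(a.x - b.x, a.y - b.y) <= max_dist:
                ids = tuple(sorted((a.object_id, b.object_id)))
                events.append(SemanticEvent("close_contact", ids, t))
    return events


def alarm_mode(events: Iterable[SemanticEvent], alarm_labels: Set[str]) -> List[SemanticEvent]:
    """Alarm mode: keep only the events that need operator attention."""
    return [e for e in events if e.label in alarm_labels]


def mining_mode(events: Iterable[SemanticEvent], label: str,
                start_s: float, end_s: float) -> List[SemanticEvent]:
    """Data mining mode: retrieve stored events by label and time window."""
    return [e for e in events if e.label == label and start_s <= e.time_s <= end_s]
```

In this sketch both modes read the same stream of semantic events: alarm mode filters it as events arrive, while data mining mode queries it after the fact.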

The main areas of scientific innovation of the project were as follows:

  1. The processing of large amounts of audio-visual data from multiple sources, and the development of innovative techniques for event detection, recognition, and multi-modal event tracking.
  2. The specification of an ontology that represented user and scene knowledge. This provided a standard by which a surveillance database could be queried, and was flexible enough to accommodate new scenarios.
  3. Systematic testing and evaluation procedures, and the definition of suitable representative metrics for assessing the system performance.

The project made use of video and audio data collected in public spaces such as the Roma and Torino underground stations.

ROMA metro

TORINO metro

Annotated Datasets

Data annotations are provided for one type of video analysis: train stop detection.

These annotations provide metadata about train stop times.
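Annotations of this kind can serve as ground truth when evaluating a train stop detector. The sketch below assumes, purely for illustration, that each annotated or detected stop is an (arrival, departure) pair in seconds; the actual annotation format is not specified here. It matches detections to annotations greedily by temporal overlap and reports precision and recall. The function names and the 0.5 overlap threshold are hypothetical.

```python
from typing import List, Tuple

# Assumed representation: (arrival_s, departure_s) for one train stop.
Interval = Tuple[float, float]


def temporal_iou(a: Interval, b: Interval) -> float:
    """Intersection-over-union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detected: List[Interval], annotated: List[Interval],
                     min_iou: float = 0.5) -> Tuple[float, float]:
    """Greedily match each detection to the best remaining annotated stop."""
    unmatched = list(annotated)
    true_positives = 0
    for d in detected:
        best = max(unmatched, key=lambda g: temporal_iou(d, g), default=None)
        if best is not None and temporal_iou(d, best) >= min_iou:
            true_positives += 1
            unmatched.remove(best)  # each annotation can be matched only once
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    return precision, recall
```

Greedy matching is the simplest choice; an optimal one-to-one assignment could be substituted if detections overlap several annotated stops.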

