Probabilistic Framework for Multiple Video Streams

We propose to design, implement, and test a system that can detect and track humans automatically. Our system will recognize the activities of individuals and patterns of activities within and between groups. This information could be used to provide alerts of potential threats to facilities and personnel.
Our system will be composed of a hierarchy of stages:
> Take as input video streams from a network of cameras, which collectively monitor a site
> Detect and track humans at the "kinematic chain" level (treat the body as articulated and determine the positions and velocities of individual links such as the left upper arm, torso, and head) using probabilistic models and particle-filtering techniques
> Analyze the movement of each individual in terms of "movemes", brief packets of human motion that can be used as a vocabulary to characterize general human movements
> Recognize actions and activities, and where possible hostile intent, from a library of models
> Provide situation awareness by categorizing the activities or patterns of activity
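As an illustration of the particle-filtering technique named in the tracking stage, the following is a minimal sketch of a bootstrap particle filter that tracks a single link along one image axis. It is not the proposed system: the constant-velocity motion model, the Gaussian observation model, the noise parameters, and the names `particle_filter_step` and `estimate` are all assumptions made for the example.

```python
import math
import random


def particle_filter_step(particles, observation, rng, dt=1.0,
                         process_std=0.5, obs_std=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter.

    Each particle is a hypothetical (position, velocity) state for one link.
    The motion and observation models here are illustrative assumptions.
    """
    # Predict: perturb the velocity, then advance the position.
    predicted = []
    for pos, vel in particles:
        vel = vel + rng.gauss(0.0, process_std)
        pos = pos + vel * dt
        predicted.append((pos, vel))

    # Weight: Gaussian likelihood of the observed position for each hypothesis.
    weights = [math.exp(-0.5 * ((observation - pos) / obs_std) ** 2)
               for pos, _ in predicted]
    if sum(weights) == 0.0:
        # Every hypothesis is far from the observation; fall back to uniform.
        weights = [1.0] * len(predicted)

    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Mean position and velocity over the particle set."""
    n = len(particles)
    mean_pos = sum(pos for pos, _ in particles) / n
    mean_vel = sum(vel for _, vel in particles) / n
    return mean_pos, mean_vel


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [(rng.uniform(-5.0, 5.0), rng.uniform(-2.0, 2.0))
                 for _ in range(500)]
    # Synthetic target moving one unit per frame, observed without noise.
    for frame in range(1, 21):
        particles = particle_filter_step(particles, float(frame), rng)
    print(estimate(particles))
```

A full kinematic-chain tracker would carry a much larger state (one set of joint parameters per link) and an image-based likelihood, but the predict, weight, and resample cycle would keep the same shape.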
A system this complicated must be built in a principled way. Probabilistic reasoning supplies a principled framework in which to combine different sources of (uncertain) evidence: bottom-up and top-down, from multiple cameras, and over varying spatiotemporal scales. Learning models for prototypical movemes and activities can then be posed as the problem of estimating appropriate generative and discriminative models.
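To make the idea of combining uncertain evidence concrete, the sketch below fuses per-camera likelihoods with a top-down prior using Bayes' rule. It assumes, purely for illustration, that the cameras are conditionally independent given the activity and that all probabilities are strictly positive. The activity labels, the numbers, and the function name `fuse_camera_evidence` are hypothetical.

```python
import math


def fuse_camera_evidence(prior, camera_likelihoods):
    """Posterior over activities from a prior and per-camera likelihoods.

    Assumes cameras are conditionally independent given the activity
    (a naive-Bayes simplification) and that every probability is > 0.
    """
    # Accumulate in log space to avoid underflow with many cameras.
    log_post = {activity: math.log(p) for activity, p in prior.items()}
    for likelihood in camera_likelihoods:
        for activity in log_post:
            log_post[activity] += math.log(likelihood[activity])

    # Normalize with the log-sum-exp trick.
    peak = max(log_post.values())
    norm = sum(math.exp(v - peak) for v in log_post.values())
    return {activity: math.exp(v - peak) / norm
            for activity, v in log_post.items()}


if __name__ == "__main__":
    # Top-down prior: walking is far more common than loitering.
    prior = {"walking": 0.9, "loitering": 0.1}
    # Bottom-up evidence: P(observation | activity) from two cameras.
    cameras = [
        {"walking": 0.2, "loitering": 0.8},
        {"walking": 0.3, "loitering": 0.7},
    ]
    print(fuse_camera_evidence(prior, cameras))
```

In this toy case, two cameras that each weakly favor "loitering" are enough to roughly cancel a strong prior for "walking", which is the kind of trade-off between top-down and bottom-up evidence that a probabilistic framework makes explicit.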
We will build four testbeds to test both the individual components and the entire system. Our research will be enriched by a major collaboration with the U.C. Police Department, which uses video surveillance extensively and has an established need for automatic methods to support various police activities. Its needs are similar to defense-related needs.
We have assembled a team of researchers from U.C. Berkeley (J. Malik, J. Canny, D. Forsyth, M. Jordan, S. Russell), Stanford University (C. Bregler), the California Institute of Technology (P. Perona), and the University of Southern California (M. Mataric) to accomplish this task. This team has proven excellence in both the principles and the applications of machine vision, human-machine interfaces, and machine learning. Furthermore, it has a history of established collaborations and a track record of successful delivery of working systems.