People's heads have been manually annotated in every frame of a set of 15 videos (those enclosed by the red boxes below). Only this subset is annotated, as annotating every video would be a major task. Each individual is given a unique identifier, so that people-tracking approaches can be tested. In each frame, each individual is represented by a rectangle (centroid coordinates, width and height in pixels). The annotations are contained in *.xgtf (XML) files, which were prepared using the Viper-GT annotation tool. Each XML file is organised by person. To convert this format to simpler CSV files, you can use the software provided here (this is provided "as is" with no support). The CSV files are also provided on this website for convenience. Each line in a CSV file gives the bounding box coordinates of one head in one frame (<class> is always "head"):
<frame_number> <person_id> <class> <top_x> <top_y> <width> <height>
where the top-left corner of each image has coordinates (0, 0).
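As an illustration of the line format above, the following is a minimal Python sketch of how the CSV annotations might be read and grouped into one track per person. The names `HeadBox`, `parse_line` and `group_by_person` are hypothetical and are not part of the software provided with the dataset. The sketch assumes that fields may be separated by either commas or whitespace, that coordinates are numeric, and it keeps `person_id` as a string so that no assumption is made about the form of the identifier.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class HeadBox(NamedTuple):
    """One annotation line: a head bounding box in a single frame (hypothetical type)."""
    frame: int
    person_id: str   # kept as text; the identifier format is not assumed
    cls: str         # always "head" according to the description above
    top_x: float
    top_y: float
    width: float
    height: float


def parse_line(line: str) -> HeadBox:
    """Parse '<frame_number> <person_id> <class> <top_x> <top_y> <width> <height>'."""
    # Assumption: fields are separated by commas and/or whitespace.
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    frame, person_id, cls, x, y, w, h = fields
    return HeadBox(int(frame), person_id, cls,
                   float(x), float(y), float(w), float(h))


def group_by_person(lines: Iterable[str]) -> Dict[str, List[HeadBox]]:
    """Build one frame-ordered list of boxes per person, skipping blank lines."""
    tracks: Dict[str, List[HeadBox]] = defaultdict(list)
    for line in lines:
        if line.strip():
            box = parse_line(line)
            tracks[box.person_id].append(box)
    for boxes in tracks.values():
        boxes.sort(key=lambda b: b.frame)
    return dict(tracks)
```

To use it on a downloaded file, the open file object can be passed directly, for example `group_by_person(open(path))`, where `path` points to one of the CSV files.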
The videos are taken from the 2008/Overview/0mm (floor height)/800mm (door width) section of the full dataset.
A=Alight (get off), B=Board (get on)
To download each video from this section of the dataset, please click on the links below.
|Process||Door width||Height||Video||Ground truth||CSV