MuHAVi:
Multicamera Human Action Video Data
including selected
action sequences with
MAS:
Manually Annotated Silhouette Data
MuHAVi-uncut: Full videos with realistic Silhouette Data
for the evaluation of
human action recognition methods
Last updated September 2017
Originally part of the REASON project
funded by the
UK's Engineering and
Physical
Sciences Research Council (EPSRC)
Then part of the OBSERVE project, funded by the Fondecyt
Regular Program of
the Chilean Research
Council for Science and Technology (Conicyt),
grant number 1140209.
Currently part of
the UC3M-Conex project CloseVU (Close View Visual Understanding),
funded by the Marie Curie EU Programme.
This
dataset was originally put
together by a team based at Kingston University's then Digital Imaging Research Centre
and continued by a team at the Department of
Informatics Engineering at the University of Santiago de Chile.
The updates in Feb-July
2014 and from Sept 2015 have been done
as part of a Chair of Excellence/Marie Curie Professorship stay by
Prof. Sergio A Velastin at the
Applied Artificial Intelligence Group of the Universidad
Carlos III de Madrid.
The experiments and data collection were organised and performed by the project's team
based at Kingston University together with the partners based at Reading University
and
UCL's Pamela Laboratory.
NOTE: We have also
made our virtual human action silhouette data available online (visit
our
ViHASi page).
If you publish work that uses this
dataset, please use the following references:
For MuHAVi-uncut:
@article{murtaza2016multi,
title={Multi-view human action recognition using 2D motion
templates based on MHIs and their HOG description},
author={Murtaza, Fiza and Yousaf, Muhammad Haroon and Velastin,
Sergio A},
journal={IET Computer Vision},
volume={10},
number={7},
pages={758--767},
year={2016},
publisher={IET Digital Library}
}
DOI 10.1049/iet-cvi.2015.0416
For MuHAVi-MAS:
@inproceedings{singh2010muhavi,
title={Muhavi: A multicamera human action video dataset for the
evaluation of action recognition methods},
author={Singh, Sanchit and Velastin, Sergio A and Ragheb,
Hossein},
booktitle={Advanced Video and Signal Based Surveillance (AVSS),
2010 Seventh IEEE International Conference on},
pages={48--55},
year={2010},
organization={IEEE}
}
DOI: 10.1109/AVSS.2010.63
You can also find here a list
of publications that use the MuHAVi dataset.
New (12.09.2017):
- A brand new set of temporal ground truths for
MuHAVi-uncut has been prepared that defines the start and end of each
"sub-action" within an action
- Silhouettes for each of the sub-actions in MuHAVi-uncut are now available
Introduction
We
have collected a large body of human action video (MuHAVi) data using 8
cameras. There are 17 action classes, as listed in Table 2, performed by
14 actors. We initially processed videos corresponding to 7 actors in
order to split the actions and provide the JPG image frames. These
include some image frames before and after the actual action,
for the purposes of background subtraction, tracking, etc. The longest
pre-action frames correspond to the actor called Person1. Note that
what we provide here are therefore temporally pre-segmented actions, as this
was typical when the dataset was first released. We now (see
below) provide long unsegmented sequences
for people to work on temporal segmentation.
Each actor performs each action several times in the
action zone highlighted using white tapes on the scene floor. As actors
were amateurs, the leader had to interrupt the actors in some cases and
ask them to redo the action for consistency. As shown in Fig. 1 and
Table 1, we have used 8 CCTV Schwan cameras located at 4 sides and 4
corners of a rectangular platform. Note that these cameras are not synchronised. Camera calibration information may be
included here in the future. Meanwhile, one can use the patterns on the
scene floor to calibrate the cameras of interest.
Note that to prepare training data for action recognition
methods, each of our action classes may be broken into at least two
primitive actions. For instance, the action "WalkTurnBack" consists of
walk and turn-back primitive actions. Further, although it is not quite
natural to have a collapse action due to a shotgun followed by a standing-up
action, one can simply split them into two separate action classes.
We
make the data available to researchers in the computer vision community
through a password-protected server at the Universidad Carlos III de Madrid, Spain. The data may be accessed by
sending an email (with the subject "MuHAVi-MAS Data") to Prof Sergio A
Velastin at
sergio.velastin@ieee.org
giving the names, email addresses and institution(s) of the researchers who wish to use
the data and their main purposes. We request this only to build a list
of people using this dataset so as to form a "MuHAVi community" with whom to
communicate. The only requirement for using the MuHAVi data is to
refer to this site and to our publication(s) in any resulting publications.
Figure 1. The top
view of the configuration of 8 cameras used to capture the actions in
the blue action zone (which is marked with white tapes on the scene
floor).
camera symbol | camera name
Table 1. Camera view
names appearing in the MuHAVi data folders and the corresponding
symbols used in Fig. 1.
*** This section is mainly of historical interest. It is better to download the data in the MuHAVi-uncut set ****
In the table below,
you can click on the links to download the data (JPG images) for the
corresponding action.
Important: We noted
that some earlier versions of MS Internet
Explorer could not download files over 2 GB in size, so we recommend using
alternative browsers such as Firefox or Chrome.
Each
tar file contains 7 folders corresponding to 7 actors (Person1 to
Person7), each of which contains 8 folders corresponding to 8 cameras
(Camera_1 to Camera_8). Image frames corresponding to every combination
of action/actor/camera are named with image frame numbers starting from
00000001.jpg for simplicity. The video frame rate is 25 frames per
second and the resolution of image frames (except for Camera_8) is 720
x 576 pixels (columns x rows). The image resolution is 704 x 576 for
Camera_8.
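As an illustration, here is a minimal Python sketch (assuming one of the action tar files has been unpacked into a local folder; the folder name below is hypothetical) that walks the Person/Camera hierarchy described above and reports the number of frames and the approximate duration of each sequence at 25 fps:

import glob
import os

root = "WalkTurnBack"  # hypothetical folder holding one unpacked action tar file

for actor in sorted(os.listdir(root)):                 # Person1 ... Person7
    actor_dir = os.path.join(root, actor)
    if not os.path.isdir(actor_dir):
        continue
    for camera in sorted(os.listdir(actor_dir)):       # Camera_1 ... Camera_8
        frames = sorted(glob.glob(os.path.join(actor_dir, camera, "*.jpg")))
        if frames:                                     # frames are numbered from 00000001.jpg
            print(f"{actor}/{camera}: {len(frames)} frames "
                  f"(~{len(frames) / 25.0:.1f} s at 25 fps)")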
action class | action name | size
Table 2. Action
class names appearing in the MuHAVi data folders and the corresponding
symbols used in Fig. 3.
Table 3. Actor names
appearing in the MuHAVi data folders and the corresponding symbols used
in Fig. 3.
*** end of historical note
NEW MuHAVi "uncut" (November
2014)
So far, MuHAVi has consisted of:
- (Temporally) manually pre-segmented
action sequences (in JPEG files)
- Manually annotated silhouettes for a
small sub-set of actors/actions/cameras (MuHAVi-MAS
dataset).
Thanks to work done by Dr Zezhi Chen of Kingston University, Jorge
Sepúlveda of the University of Santiago de Chile (USACH) and Prof.
Sergio A Velastin from
the Universidad
Carlos III de Madrid, we are now able to provide:
- Un-cut original video sequences
(mainly in MPEG2) for each camera (the recordings are continuous and
contain the acted actions but also the gaps and breaks in
between).
- Ground truth describing the times of
start and completion (frame numbers) of each sub-action in each video file
by each actor (Note: the community's views are welcome so as to agree on a set
of metrics to evaluate temporal segmentation methods).
- Silhouettes computed by Z. Chen's algorithm
(the rationale is that these are realistic silhouettes typical of the
state of the art, and people are invited to test the robustness of their
human action recognition and temporal segmentation algorithms on
such realistic, and "imperfect", segmentation).
Here are a couple of samples that do not need a user name and password to download:
Camera2A video sample
Camera2A sample silhouettes (parameter = 6.0)
You can now download the full-length
videos from here:
Because of the
length of these videos, use Right-click/"Save Link As" and use a high-speed
network:
(A and B occur because, in the original recordings, the
recorder had to be stopped to change media!)
We provide these sets of silhouettes,
generated by varying one parameter in the foreground detection
algorithm (this is explained after the table):
Each compressed archive contains files named %d.png where the number is
the frame number. In each file, black (0) represents the background,
white (255) the foreground, i.e. the silhouette, and grey (127) is a
detected shadow
(normally to be considered as background).
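For example, a minimal sketch (assuming OpenCV and NumPy are available; the file name is hypothetical) showing how one of these PNG masks could be split into foreground, shadow and background using the grey levels described above:

import cv2
import numpy as np

mask = cv2.imread("123.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frame from an archive

foreground = mask == 255   # silhouette pixels
shadow     = mask == 127   # detected shadow, normally treated as background
background = mask == 0

# Binary silhouette with shadow folded into the background
binary = np.where(foreground, 255, 0).astype(np.uint8)
print("foreground pixels:", int(foreground.sum()))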
The "Parameter" is one of the factors that affect foreground detection
in
terms of true positives vs false positives. When tested against the manually annotated silhouettes,
a value of 3.0 produces a TPR (true positives rate) of around 0.78 and
a FPR (false positives rate) of around 0.027
while 4.0 gives around 0.71 and 0.013 and 5.0 gives 0.625 and less than
0.01 (i.e. less noise but less foreground). As many of the false
positives tend to be noise outside the main silhouette, we expect that
most people will use the set with higher TPR and reduce the false
positives e.g. with morphologial filtering. When publishing results can
you please ensure that you give full details of any pre-processing of
this kind.
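As a hedged illustration of this kind of post-processing and evaluation (not the exact procedure used to produce the numbers above), the sketch below applies a small morphological opening/closing to suppress isolated false positives and then computes TPR and FPR against a manually annotated silhouette; file names are hypothetical, and both images are assumed to be the same size with 255 marking foreground:

import cv2
import numpy as np

pred = cv2.imread("pred_silhouette.png", cv2.IMREAD_GRAYSCALE) == 255  # automatic silhouette
gt   = cv2.imread("GT-silhouette.png", cv2.IMREAD_GRAYSCALE) == 255    # manual annotation

# Morphological cleaning: opening removes small isolated blobs, closing fills small holes
kernel = np.ones((3, 3), np.uint8)
cleaned = cv2.morphologyEx(pred.astype(np.uint8) * 255, cv2.MORPH_OPEN, kernel)
cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_CLOSE, kernel) == 255

tp = np.logical_and(cleaned, gt).sum()
fp = np.logical_and(cleaned, ~gt).sum()
fn = np.logical_and(~cleaned, gt).sum()
tn = np.logical_and(~cleaned, ~gt).sum()

tpr = tp / (tp + fn)   # true positive rate
fpr = fp / (fp + tn)   # false positive rate
print(f"TPR={tpr:.3f}  FPR={fpr:.3f}")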
**** Historical note
When we first published MuHAVi we provided a spreadsheet with the times (frame numbers) of when an action started and when it finished. Incidentally, it also described how
the MuHAVi JPEG sequences were obtained from AVI files
extracted from manually obtained temporal markers. Please also note
that we discovered that there was a bug in mplayer (used to convert from
AVI to JPEGs)
that resulted in some skipped frames in the JPEG sequences. In any case, we have found this to be of less use than we expected because:
- Each action (e.g. "walk and turn back") was
conducted by each actor a number of times (typically 3), but the
annotation only contained the start and end of the (3) actions as a
whole and not of each one separately.
- Actions such as "walk and turn back" could really be
regarded as two or three sub-actions: walk (toward one end of the
stage), turn, walk (back to the other end of the stage), and it would be
nice to annotate them separately.
- Finally, we found that there were errors in the annotation!
**** end of historical note
The ground truth file can be obtained here
in spreadsheet format (we are grateful to Erwann Nguyen-van-Sang,
intern MSc student from the U. of Strasbourg, who spent many hours to
produce this annotation).
Below is an extract from the spreadsheet.
- The first column refers to the
camera and actor numbers.
- The second column header gives the
action (e.g. "WalkTurnBack", "RunStop").
- The numbers in the second column for
each person give the frame number, in the video sequence, where the
action starts, and the third column gives where it ends (this is somewhat
subjective, of course, and the community needs to agree on a metric that
would not unjustly penalise algorithms).
- If the action was repeated (which is
almost always the case) the start and end frames are given in the
fourth and fifth columns, and so on.
Camera 2 from dvcam3-1-6.0 and dvcam3-2-6.0

        | WalkTurnBack                          | RunStop
        | Start S1 | End S1 | Start S2 | End S2 | Start S1 | End S1 | Start S2 | End S2 | Start S3 | End S3 | Start S4 | End S4
Actor 1 | 377      | 607    | 627      | 867    | 7387     | 7526   | 7527     | 7666   | 7667     | 7776   | 7777     | 7837
Actor 2 | 1297     | 1517   | 1537     | 1777   | 8267     | 8366   | 8367     | 8446   | 8447     | 8546   | 8547     | 8597
Actor 3 | 2087     | 2327   | 2347     | 2597   | 8867     | 8946   | 8947     | 9046   | 9047     | 9156   | 9157     | 9227
Actor 4 | 2967     | 3187   | 3217     | 3457   | 9717     | 9796   | 9797     | 9886   | 9887     | 9976   | 9977     | 10037
Actor 5 | 3847     | 4077   | 4097     | 4367   | 10347    | 10446  | 10447    | 10546  | 10547    | 10626  | 10627    | 10687
Actor 6 | 4747     | 5017   | 5047     | 5337   | 11187    | 11296  | 11297    | 11396  | 11397    | 11496  | 11497    | 11577
Actor 7 | 5677     | 5947   | 5967     | 6207   | 11927    | 12036  | 12037    | 12126  | 12127    | 12226  | 12227    | 12307
To help those working with this data, we have extracted (from the long
silhouette sequences) each sub-action described in the above
spreadsheet into separate sub-sequences (divided up by actor, action and
camera). As there are many of those, it is best to download the whole
set from here (2.7GB)
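For those who prefer to do the cutting themselves, here is a minimal sketch (paths and the annotation dictionary are hypothetical, using frame numbers from the extract above) of how the long silhouette sequences could be sliced into per-sub-action folders from the start/end frame numbers:

import os
import shutil

src_dir = "Camera2_silhouettes"            # hypothetical folder of unpacked %d.png files
annotations = {                            # (actor, action, repetition): (start, end) frames
    ("Actor1", "WalkTurnBack", 1): (377, 607),
    ("Actor1", "WalkTurnBack", 2): (627, 867),
}

for (actor, action, rep), (start, end) in annotations.items():
    out_dir = os.path.join("subactions", f"{actor}_{action}_S{rep}")
    os.makedirs(out_dir, exist_ok=True)
    for frame in range(start, end + 1):
        src = os.path.join(src_dir, f"{frame}.png")
        if os.path.exists(src):            # some frames may be missing
            shutil.copy(src, out_dir)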
**** Historical note
Note: this material is only
historical (as we do not have well documented sources of these results)
Masks
obtained by applying two different Tracking/Background Subtraction
Methods to some of our Composite Actions
Each zip file
contains masks (in their bounding boxes) corresponding to several
sequences of composite actions performed by the actor A1 and captured
from two camera views (V3 and V4) for the purpose of testing
silhouette-based action recognition methods against more realistic
input data (in conjunction with our MAS training data provided below),
where the need for a temporal segmentation method is also clear.
More data and
information to be added ...
*** end of historical section
MuHAVi-MAS: Manually Annotated Silhouette
Data
We
recommend using this subset of MuHAVi to test Human Action Recognition
(HAR) algorithms independently of the quality of silhouettes. For a
fuller evaluation of a HAR algorithm, please consider using MuHAVi
"uncut" instead.
We
have selected 5 action classes and manually annotated the corresponding
image frames to generate the corresponding silhouettes of the actors.
These actions are listed in Table 4. It can be seen that we have only
selected 2 actors and 2 camera views for these 5 actions. The
silhouette images are in PNG format and each action combination can be
downloaded as a small zip file (between 1 and 3 MB). We have also added
the 3 constant characters "GT-" to the beginning of every original image
name to label them as ground truth images.
In the table below,
you can click on the links to download the silhouette data for the
corresponding action combinations.
action class | action name | combinations for silhouette annotation
Table 4. Action
combinations corresponding to the MAS data for which ground truth
silhouettes have been generated.
NEW! The table
below contains links to the corresponding AVI video files (in MPEG2)
from which the JPEG file sequences were extracted, and which were then used
by the manual annotators to produce the silhouettes. (Note that, due to a
software bug, the JPEG sequences had a couple of frames missing towards
the end of the sequence, therefore the AVI files do not exactly
correspond to the silhouette frames. As this happens toward the end, it
should not significantly affect work that evaluates automatic
silhouette segmentation and that uses performance metrics based on
aggregating and averaging results over the whole sequence.)
action class | action name | combinations for silhouette annotation
Table 5. Original AVI video files (in MPEG2) corresponding to the MAS action
combinations for which ground truth silhouettes have been generated.
Finally, the following table documents the frames that
were manually segmented so that you can test foreground segmentation
algorithms (i.e. this table tells you the correspondence between JPEG,
AVI and PNG frames in the dataset). Please note that the human
annotators worked on the JPEG files and hence there is a one-to-one
correspondence between JPEG and PNG files. Because of a bug we later
discovered in the version of mplayer
that was used to generate the JPEG frames, there is a small difference in
the number of frames in the AVI files, but we still suggest you use the
AVI files as the JPEGs were effectively transcoded from the original
MPEG2 videos.
ActionActorCamera             | GT InitFrame | GT EndFrame | GT NFrames | JPG NFrames | AVI NFrames
KickPerson1Camera3            | 2370         | 2911        | 542        | 3001        | 3003
KickPerson1Camera4            | 2370         | 2911        | 542        | 2997        | 2999
KickPerson4Camera3            | 200          | 628         | 429        | 731         | 733
KickPerson4Camera4            | 200          | 628         | 429        | 721         | 723
PunchPerson1Camera3           | 2140         | 2607        | 468        | 2746        | 2748
PunchPerson1Camera4           | 2140         | 2607        | 468        | 2750        | 2752
PunchPerson4Camera3           | 92           | 536         | 445        | 642         | 643
PunchPerson4Camera4           | 92           | 536         | 445        | 645         | 647
RunStopPerson1Camera3         | 980          | 1418        | 439        | 1572        | 1574
RunStopPerson1Camera4         | 980          | 1418        | 439        | 1572        | 1574
RunStopPerson4Camera3         | 293          | 618         | 326        | 751         | 753
RunStopPerson4Camera4         | 293          | 618         | 326        | 749         | 751
ShotGunCollapsePerson1Camera3 | 267          | 1104        | 838        | 1444        | 1446
ShotGunCollapsePerson1Camera4 | 267          | 1104        | 838        | 1443        | 1445
ShotGunCollapsePerson4Camera3 | 319          | 1208        | 890        | 1424        | 1426
ShotGunCollapsePerson4Camera4 | 319          | 1208        | 890        | 1424        | 1426
WalkTurnBackPerson1Camera3    | 216          | 682         | 467        | 866         | 868
WalkTurnBackPerson1Camera4    | 216          | 682         | 467        | 860         | 862
WalkTurnBackPerson4Camera3    | 207          | 672         | 466        | 836         | 838
WalkTurnBackPerson4Camera4    | 207          | 672         | 466        | 839         | 841
GTInitFrame: Frame number for the start of the manual annotation
GTEndFrame: Frame number for the end of the manual annotation
GTNFrames: Number of manually annotated frames = (GTEndFrame - GTInitFrame + 1)
JPGNFrames: Total number of frames in the JPEG sequence (slightly less than AVINFrames)
AVINFrames: Total number of frames in the AVI sequence
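As a small illustration (not part of the dataset tooling), the sketch below checks the GTNFrames relationship for one row of the table above and builds the corresponding JPEG and ground-truth file names; the folder path and the exact ground-truth naming (the "GT-" prefix applied to the 8-digit frame name with a .png extension) are assumptions based on the descriptions given earlier:

import os

# One row from the table above: KickPerson1Camera3
gt_init, gt_end, gt_nframes = 2370, 2911, 542
assert gt_end - gt_init + 1 == gt_nframes           # GTNFrames definition

jpg_dir = "Kick/Person1/Camera_3"                    # hypothetical unpacked JPEG folder
jpg_names = [f"{n:08d}.jpg" for n in range(gt_init, gt_end + 1)]
gt_names = ["GT-" + name.replace(".jpg", ".png") for name in jpg_names]  # assumed GT naming
print(jpg_names[0], "->", gt_names[0])               # 00002370.jpg -> GT-00002370.png
print(os.path.join(jpg_dir, jpg_names[0]))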
We have reorganized
these 5 composite action classes as 14 primitive action classes as
shown in the table below.
You may
download the data by clicking
here
(32MB).
primitive action class | primitive action name | no. of samples
C1  | CollapseRight   | 4 * 2 = 8
C2  | CollapseLeft    | 4 * 2 = 8
C3  | StandupRight    | 4 * 2 = 8
C8  | GuardToPunch    | 4 * 4 = 16
C9  | RunRightToLeft  | 4 * 2 = 8
C10 | RunLeftToRight  | 4 * 2 = 8
C11 | WalkRightToLeft | 4 * 2 = 8
C12 | WalkLeftToRight | 4 * 2 = 8
C13 | TurnBackRight   | 4 * 2 = 8
C14 | TurnBackLeft    | 4 * 1 = 4
These 14 primitive
action classes may also be reorganized into 8 classes where similar
actions make a single class, as shown in the table below.
primitive action class | primitive action name | no. of samples
C1 | Collapse (Right/Left) | 4 * 4 = 16
C2 | Standup (Right/Left)  | 4 * 3 = 12
C5 | Guard (ToKick/Punch)  | 4 * 8 = 32
C6 | Run (Right/Left)      | 4 * 4 = 16
C7 | Walk (Right/Left)     | 4 * 4 = 16
C8 | TurnBack (Right/Left) | 4 * 3 = 12
Figure 2. Sample
images of annotated silhouettes from the MAS data (for actor A1)
corresponding to 20 selected action sequences (5 action classes, 2
actors and 2 cameras) from the MuHAVi data (as listed in Table 4).
Figure 3. Sample
image frames from the MuHAVi data for 17 action classes, 7 actors and 8
camera views (as listed in Tables 1, 2 and 3, and shown in Fig. 1).
Publications that use
the MuHAVi dataset (if you have any not listed here please let me know):
Singh, Sanchit, Sergio A. Velastin, and Hossein
Ragheb. "Muhavi: A multicamera human action video dataset for the
evaluation of action recognition methods." In Advanced Video and Signal
Based Surveillance (AVSS), 2010 Seventh IEEE International Conference
on, pp. 48-55. IEEE, 2010.
Marlon Alcântara, Thierry Moreira, and Hélio
Pedrini, “Real-time action recognition based on cumulative motion
shapes,” in Acoustics, Speech and Signal Processing (ICASSP), 2014.
@inproceedings{alcantara2014,
author = {Alc\^antara, Marlon and Moreira, Thierry and Pedrini,
H\'elio},
title = {Real-Time Action Recognition Based On Cumulative Motion
Shapes},
booktitle = {Acoustics, Speech and Signal Processing (ICASSP)},
year = {2014},
}
A. A. Chaaraoui, P. Climent-Pérez, and F.
Flórez-Revuelta, “A review on vision techniques applied to Human
Behaviour Analysis for Ambient-Assisted Living,” Expert Systems
with Applications, vol. 39, no. 12, pp. 10873–10888, 2012.
Available at ScienceDirect: http://www.sciencedirect.com/science/article/pii/S0957417412004757
Climent-Pérez,
Pau, Alexandros Andre Chaaraoui, and Francisco Flórez-Revuelta. "Useful
Research Tools for Human Behaviour Understanding in the Context of
Ambient Assisted Living." In Ambient Intelligence-Software and
Applications, pp. 201-205. Springer Berlin Heidelberg, 2012.
Available at SpringerLink: http://link.springer.com/chapter/10.1007%2F978-3-642-28783-1_25?LI=true
Also uploaded at ResearchGate: http://www.researchgate.net/publication/224960636_Useful_Research_Tools_for_Human_Behaviour_Understanding_in_the_Context_of_Ambient_Assisted_Living
Chaaraoui,
Alexandros Andre, Pau Climent-Pérez, and Francisco Flórez-Revuelta. "An
efficient approach for multi-view human action recognition based on
bag-of-key-poses." In Human Behavior Understanding, pp. 29-40. Springer
Berlin Heidelberg, 2012.
Available at SpringerLink: http://link.springer.com/chapter/10.1007%2F978-3-642-34014-7_3?LI=true
Also uploaded at ResearchGate: http://www.researchgate.net/publication/232297472_An_Efficient_Approach_for_Multi-view_Human_Action_Recognition_Based_on_Bag-of-Key-Poses
A. A. Chaaraoui, P.
Climent-Pérez, and F. Flórez-Revuelta, “Silhouette-based Human Action
Recognition using Sequences of Key Poses,” Pattern Recognition
Letters, vol. 34, no. 15, pp. 1799-1807, 2013.
Available at ScienceDirect: http://www.sciencedirect.com/science/article/pii/S0167865513000342
Also uploaded at ResearchGate: http://www.researchgate.net/publication/236306638_Silhouette-based_Human_Action_Recognition_using_Sequences_of_Key_Poses
Chaaraoui, Alexandros Andre, and Francisco Flórez-Revuelta. "Human
action recognition optimization based on evolutionary feature subset
selection." In Proceeding of the fifteenth annual conference on Genetic
and evolutionary computation conference, pp. 1229-1236. ACM, 2013.
Available at: http://hdl.handle.net/10045/33675
A. A. Chaaraoui, and F. Flórez-Revuelta, “Optimizing
human action recognition based on a cooperative coevolutionary
algorithm,” Engineering Applications of Artificial Intelligence,
Available online 30 October 2013, ISSN 0952-1976,
http://dx.doi.org/10.1016/j.engappai.2013.10.003. Available at
ScienceDirect: http://www.sciencedirect.com/science/article/pii/S0952197613002066
A. A. Chaaraoui, and F. Flórez-Revuelta, “Vision-based
Recognition of Human Behaviour for Intelligent Environments”, PhD
Thesis, University of Alicante, 2014.
Available at: http://hdl.handle.net/10045/36395
Chaaraoui, Alexandros Andre, José Ramón Padilla-López, Francisco Javier
Ferrández-Pastor, Mario Nieto-Hidalgo, and Francisco Flórez-Revuelta.
"A Vision-Based System for Intelligent Monitoring: Human Behaviour
Analysis and Privacy by Context." Sensors 14, no. 5 (2014): 8895-8925.
Cheema, Shahzad, Abdalrahman Eweiwi, Christian Thurau, and Christian
Bauckhage. "Action recognition by learning discriminative key poses."
In Computer Vision Workshops (ICCV Workshops), 2011 IEEE International
Conference on, pp. 1302-1309. IEEE, 2011.
Moghaddam, Zia, and Massimo Piccardi. "Robust density modelling using
the student's t-distribution for human action recognition." In Image
Processing (ICIP), 2011 18th IEEE International Conference on, pp.
3261-3264. IEEE, 2011.
Martinez-Contreras, Francisco, Carlos Orrite-Urunuela, Elias
Herrero-Jaraba, Hossein Ragheb, and Sergio A. Velastin. "Recognizing
human actions using silhouette-based HMM." In Advanced Video and Signal
Based Surveillance, 2009. AVSS'09. Sixth IEEE International Conference
on, pp. 43-48. IEEE, 2009.
Eweiwi, Abdalrahman, Shahzad Cheema, Christian Thurau, and Christian
Bauckhage. "Temporal key poses for human action recognition." In
Computer Vision Workshops (ICCV Workshops), 2011 IEEE International
Conference on, pp. 1310-1317. IEEE, 2011.
Kumari, Sonal, and Suman K. Mitra. "Human Action Recognition Using
DFT." In Computer Vision, Pattern Recognition, Image Processing and
Graphics (NCVPRIPG), 2011 Third National Conference on, pp. 239-242.
IEEE, 2011.
López, Dennis Romero, Anselmo Frizera Neto, and Teodiano Freire Bastos.
"Reconocimiento en-línea de acciones humanas basado en patrones de RWE
aplicado en ventanas dinámicas de momentos invariantes." Revista
Iberoamericana de Automática e Informática Industrial RIAI 11, no. 2
(2014): 202-211.
Karthikeyan, Shanmugavadivel, Utkarsh Gaur, Bangalore S. Manjunath, and
Scott Grafton. "Probabilistic subspace-based learning of shape dynamics
modes for multi-view action recognition." In Computer Vision Workshops
(ICCV Workshops), 2011 IEEE International Conference on, pp. 1282-1286.
IEEE, 2011.
Martínez-Usó, Adolfo, G. Salgues, and Sergio A. Velastin. "Evaluation
of unsupervised segmentation algorithms for silhouette extraction in
human action video sequences." In Visual Informatics: Sustaining
Research and Innovations, pp. 13-22. Springer Berlin Heidelberg, 2011.
Piccardi, Massimo, and Zia Moghaddam. "Robust Density Modelling Using
the Student's t-distribution for Human Action Recognition." (2011).
Wu, Xinxiao, and Yunde Jia. "View-invariant action recognition using
latent kernelized structural SVM." In Computer Vision–ECCV 2012, pp.
411-424. Springer Berlin Heidelberg, 2012.
Moghaddam, Zia, and Massimo Piccardi. "Histogram-based training
initialisation of hidden markov models for human action recognition."
In Advanced Video and Signal Based Surveillance (AVSS), 2010 Seventh
IEEE International Conference on, pp. 256-261. IEEE, 2010.
Gallego, Jaime, and Montse Pardas. "Enhanced bayesian foreground
segmentation using brightness and color distortion region-based model
for shadow removal." In Image Processing (ICIP), 2010 17th IEEE
International Conference on, pp. 3449-3452. IEEE, 2010.
Rahman, Md Junaedur, J. Martínez del Rincón, Jean-Christophe Nebel, and
Dimitrios Makris. "Body Pose based Pedestrian Tracking in a Particle
Filtering Framework." (2013).
El-Sallam, Amar A., and Ajmal S. Mian. "Human body pose estimation from
still images and video frames." In Image Analysis and Recognition, pp.
176-188. Springer Berlin Heidelberg, 2010.
Htike, Zaw Zaw, Simon Egerton, and Kuang Ye Chow. "Monocular viewpoint
invariant human activity recognition." In Robotics, Automation and
Mechatronics (RAM), 2011 IEEE Conference on, pp. 18-23. IEEE, 2011.
Holte, Michael B., Cuong Tran, Mohan M. Trivedi, and Thomas B.
Moeslund. "Human action recognition using multiple views: a comparative
perspective on recent developments." In Proceedings of the 2011 joint
ACM workshop on Human gesture and behavior understanding, pp. 47-52.
ACM, 2011.
Adeli Mosabbeb, Ehsan, Kaamran Raahemifar, and Mahmood Fathy.
"Multi-View Human Activity Recognition in Distributed Camera Sensor
Networks." Sensors 13, no. 7 (2013): 8750-8770.
Cheng, Zhongwei, Lei Qin, Yituo Ye, Qingming Huang, and Qi Tian. "Human
daily action analysis with multi-view and color-depth data." In
Computer Vision–ECCV 2012. Workshops and Demonstrations, pp. 52-61.
Springer Berlin Heidelberg, 2012.
Abdul Rahman, Farah Yasmin, Aini Hussain, Wan Mimi Diyana Wan Zaki,
Halimah Badioze Zaman, and Nooritawati Md Tahir. "Enhancement of
Background Subtraction Techniques Using a Second Derivative in Gradient
Direction Filter." Journal of Electrical and Computer Engineering 2013
(2013).
Concha, Oscar Perez, Richard Yi Da Xu, and Massimo Piccardi. "Robust
Dimensionality Reduction for Human Action Recognition." In Digital
Image Computing: Techniques and Applications (DICTA), 2010
International Conference on, pp. 349-356. IEEE, 2010.
Moghaddam, Zia, and Massimo Piccardi. "Training Initialization of
Hidden Markov Models in Human Action Recognition." 1-15.
Templates, Motion. "Independent Viewpoint Silhouette-Based Human Action
Modeling and Recognition." Handbook on Soft Computing for Video
Surveillance (2012): 185.
Borzeshi, Ehsan Zare, Massimo Piccardi, and R. Y. D. Xu. "A
discriminative prototype selection approach for graph embedding in
human action recognition." In Computer Vision Workshops (ICCV
Workshops), 2011 IEEE International Conference on, pp. 1295-1301. IEEE,
2011.
Gallego, Jaime, Montse Pardàs, and Gloria Haro. "Enhanced foreground
segmentation and tracking combining Bayesian background, shadow and
foreground modeling." Pattern Recognition Letters 33, no. 12 (2012):
1558-1568.
Piccardi, Massimo, Yi Da Xu, and Ehsan Zare Borzeshi. "A discriminative
prototype selection approach for graph embedding in human action
recognition." (2011).
Borzeshi, Ehsan Zare, Oscar Perez Concha, and Massimo Piccardi. "Human
action recognition in video by fusion of structural and spatio-temporal
features." In Structural, Syntactic, and Statistical Pattern
Recognition, pp. 474-482. Springer Berlin Heidelberg, 2012.
Tweed, David S., and James M. Ferryman. "Enhancing change detection in
low-quality surveillance footage using markov random fields." In
Proceedings of the 1st ACM workshop on Vision networks for behavior
analysis, pp. 23-30. ACM, 2008.
Chen, Fan, and Christophe De Vleeschouwer. "Robust volumetric
reconstruction from noisy multi-view foreground occupancy masks." In
Asia-Pacific Signal and Information Processing Association Annual
Summit and Conference. 2011.
Määttä, Tommi, Aki Härmä, and Hamid Aghajan. "On efficient use of
multi-view data for activity recognition." In Proceedings of the Fourth
ACM/IEEE International Conference on Distributed Smart Cameras, pp.
158-165. ACM, 2010.
Nebel, Jean-Christophe, Paul Kuo, and Dimitrios Makris. "2D and 3D Pose
Recovery from a Single Uncalibrated Video." In Multimedia Analysis,
Processing and Communications, pp. 391-411. Springer Berlin Heidelberg,
2011.