Consiglio Nazionale delle Ricerche

Product type: Journal article
Title: LSTM-based real-time action detection and prediction in human motion streams
Year of publication: 2019
Format:
  • Electronic
  • Print
Author(s): Carrara F.; Elias P.; Sedmidubsky J.; Zezula P.
Author affiliations: CNR-ISTI, Pisa, Italy; Masaryk University, Brno, Czech Republic; Masaryk University, Brno, Czech Republic; Masaryk University, Brno, Czech Republic
CNR authors and affiliations:
  • FABIO CARRARA
Language(s):
  • English
Abstract: Motion capture data digitally represent human movements by sequences of 3D skeleton configurations. Such spatio-temporal data, often recorded in the stream-based nature, need to be efficiently processed to detect high-interest actions, for example, in human-computer interaction to understand hand gestures in real time. Alternatively, automatically annotated parts of a continuous stream can be persistently stored to become searchable, and thus reusable for future retrieval or pattern mining. In this paper, we focus on multi-label detection of user-specified actions in unsegmented sequences as well as continuous streams. In particular, we utilize the current advances in recurrent neural networks and adopt a unidirectional LSTM model to effectively encode the skeleton frames within the hidden network states. The model learns what subsequences of encoded frames belong to the specified action classes within the training phase. The learned representations of classes are then employed within the annotation phase to infer the probability that an incoming skeleton frame belongs to a given action class. The computed probabilities are finally compared against a learned threshold to automatically determine the beginnings and endings of actions. To further enhance the annotation accuracy, we utilize a bidirectional LSTM model to estimate class probabilities by considering not only the past frames but also the future ones. We extensively evaluate both the models on the three use cases of real-time stream annotation, offline annotation of long sequences, and early action detection and prediction. The experiments demonstrate that our models outperform the state of the art in effectiveness and are at least one order of magnitude more efficient, being able to annotate 10 k frames per second.
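Note: the annotation pipeline described in the abstract (a unidirectional LSTM encoding skeleton frames, per-frame multi-label class probabilities, and a threshold that delimits action beginnings and endings) can be illustrated with a minimal sketch. The sketch below assumes PyTorch and hypothetical dimensions (frame size, hidden size, number of classes, threshold value); it is not the authors' exact architecture or training setup, only an illustration of the technique.

import torch
import torch.nn as nn

class FrameAnnotator(nn.Module):
    """Per-frame multi-label action annotator over a skeleton-frame stream."""

    def __init__(self, frame_dim=93, hidden_dim=512, num_classes=15):
        super().__init__()
        # Unidirectional LSTM encodes incoming frames using only the past;
        # the paper's offline variant uses a bidirectional LSTM instead
        # (bidirectional=True here, with a correspondingly wider head).
        self.lstm = nn.LSTM(frame_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, frames, state=None):
        # frames: (batch, time, frame_dim); `state` carries the LSTM hidden
        # and cell states across consecutive chunks of a continuous stream.
        encoded, state = self.lstm(frames, state)
        probs = torch.sigmoid(self.head(encoded))  # per-frame class probabilities
        return probs, state

# Stream annotation: feed chunks of frames, keep the recurrent state, and
# compare probabilities against a threshold to mark action begin/end frames.
model = FrameAnnotator()
chunk = torch.randn(1, 100, 93)   # 100 hypothetical skeleton frames
probs, state = model(chunk)       # reuse `state` for the next chunk
active = probs > 0.5              # illustrative threshold; learned in the paper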
Abstract language: English
Other abstract: -
Other abstract language: -
Pages from: 27309
Pages to: 27331
Total pages: -
Journal: Multimedia tools and applications
Active since: 1995
Publisher: Kluwer Academic Publishers, Dordrecht
Country of publication: United States of America
Language: English
ISSN: 1380-7501
Key title: Multimedia tools and applications
Journal volume number: 78
Journal issue: -
DOI: 10.1007/s11042-019-07827-3
Refereed: Yes (international)
Publication status: Published version
Indexing (in controlled databases):
  • Scopus (code: 2-s2.0-85067826053)
  • ISI Web of Science (WOS) (code: 000485298000024)
Keywords: Action detection and recognition, Action prediction, LSTM, Motion capture data, Stream annotation
Link (URL, URI): https://link.springer.com/article/10.1007/s11042-019-07827-3
Parallel title: -
License: -
Embargo expiry: -
Acceptance date: -
Notes/Other information: -
CNR structures:
  • ISTI - Istituto di scienza e tecnologie dell'informazione "Alessandro Faedo"
CNR modules/activities/subprojects: -
European projects: -
Attachments:
  • LSTM-based real-time action detection and prediction in human motion streams (private document)
    Description: published version
    Document type: application/pdf
  • LSTM-based real-time action detection and prediction in human motion streams
    Description: Postprint
    Document type: application/pdf