Detecting head movements in video-recorded dyadic conversations

Patrizia Paggio; Bart Jongejan; Manex Agirrezabal; Costanza Navarretta

doi:10.1145/3281151.3281152

Department of Nordic Studies and Linguistics

Abstract

This paper is about the automatic recognition of head movements in videos of face-to-face dyadic conversations. We present an approach where recognition of head movements is cast as a multimodal frame classification problem based on visual and acoustic features. The visual features include velocity, acceleration, and jerk values associated with head movements, while the acoustic ones are pitch and intensity measurements from the co-occurring speech. We present the results obtained by training and testing a number of classifiers on manually annotated data from two conversations. The best performing classifier, a Multilayer Perceptron trained using all the features, obtains 0.75 accuracy and outperforms the mono-modal baseline classifier.
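The abstract does not say how the visual features are computed, so the following is only a minimal sketch of one plausible reading: velocity, acceleration, and jerk taken as successive finite differences of a per-frame head position, then joined with per-frame pitch and intensity into one feature row per frame. The function names, the one-dimensional position track, the default frame rate of 25 fps, and the zero padding of the first frame are all assumptions made for illustration, not details from the paper.

```python
from typing import List, Sequence


def finite_difference(values: Sequence[float], fps: float) -> List[float]:
    """Backward difference per frame, scaled by the frame rate.

    The first frame has no predecessor, so it is padded with 0.0
    (an assumption; this padding also affects the next derivative).
    """
    if not values:
        return []
    dt = 1.0 / fps
    diffs = [0.0]
    for prev, curr in zip(values, values[1:]):
        diffs.append((curr - prev) / dt)
    return diffs


def frame_features(
    positions: Sequence[float],
    pitch: Sequence[float],
    intensity: Sequence[float],
    fps: float = 25.0,  # hypothetical frame rate
) -> List[List[float]]:
    """Build one [velocity, acceleration, jerk, pitch, intensity] row per frame."""
    if not (len(positions) == len(pitch) == len(intensity)):
        raise ValueError("all per-frame sequences must have the same length")
    velocity = finite_difference(positions, fps)
    acceleration = finite_difference(velocity, fps)
    jerk = finite_difference(acceleration, fps)
    return [
        [v, a, j, p, i]
        for v, a, j, p, i in zip(velocity, acceleration, jerk, pitch, intensity)
    ]


if __name__ == "__main__":
    # Toy data: four frames of a 1-D head position plus made-up acoustic values.
    rows = frame_features(
        positions=[0.0, 1.0, 3.0, 6.0],
        pitch=[120.0, 121.0, 119.0, 118.0],
        intensity=[60.0, 61.0, 62.0, 63.0],
        fps=1.0,
    )
    for row in rows:
        print(row)
```

Each row could then be passed, together with its manually annotated label, to any frame-level classifier; dropping the last two columns would give a visual-only feature set for comparison.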

Original language	English
Title	Proceedings of the International Conference on Multimodal Interaction: Adjunct
Number of pages	6
Place of publication	New York
Publisher	Association for Computing Machinery
Publication date	16 Oct. 2018
Pages	1-6
ISBN (Print)	978-1-4503-6002-9
DOI	https://doi.org/10.1145/3281151.3281152
Status	Published - 16 Oct. 2018

Access to document

10.1145/3281151.3281152

Other files and links

https://dl.acm.org/citation.cfm?doid=3281151.3281152

Citation formats

Detecting head movements in video-recorded dyadic conversations. / Paggio, Patrizia ; Jongejan, Bart ; Agirrezabal, Manex et al.
Proceedings of the International Conference on Multimodal Interaction: Adjunct. New York: Association for Computing Machinery, 2018. p. 1-6.

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

@inproceedings{f810685cd29b4210895ccb8b4b4e2605,
title = "Detecting head movements in video-recorded dyadic conversations",
abstract = "This paper is about the automatic recognition of head movements in videos of face-to-face dyadic conversations. We present an approach where recognition of head movements is cast as a multimodal frame classification problem based on visual and acoustic features. The visual features include velocity, acceleration, and jerk values associated with head movements, while the acoustic ones are pitch and intensity measurements from the co-occurring speech. We present the results obtained by training and testing a number of classifiers on manually annotated data from two conversations. The best performing classifier, a Multilayer Perceptron trained using all the features, obtains 0.75 accuracy and outperforms the mono-modal baseline classifier.",
author = "Patrizia Paggio and Bart Jongejan and Manex Agirrezabal and Costanza Navarretta",
year = "2018",
month = oct,
day = "16",
doi = "10.1145/3281151.3281152",
language = "English",
isbn = "978-1-4503-6002-9",
pages = "1--6",
booktitle = "Proceedings of the International Conference on Multimodal Interaction: Adjunct",
publisher = "Association for Computing Machinery",
}

TY  - GEN
T1  - Detecting head movements in video-recorded dyadic conversations
AU  - Paggio, Patrizia
AU  - Jongejan, Bart
AU  - Agirrezabal, Manex
AU  - Navarretta, Costanza
PY  - 2018/10/16
Y1  - 2018/10/16
N2  - This paper is about the automatic recognition of head movements in videos of face-to-face dyadic conversations. We present an approach where recognition of head movements is cast as a multimodal frame classification problem based on visual and acoustic features. The visual features include velocity, acceleration, and jerk values associated with head movements, while the acoustic ones are pitch and intensity measurements from the co-occurring speech. We present the results obtained by training and testing a number of classifiers on manually annotated data from two conversations. The best performing classifier, a Multilayer Perceptron trained using all the features, obtains 0.75 accuracy and outperforms the mono-modal baseline classifier.
AB  - This paper is about the automatic recognition of head movements in videos of face-to-face dyadic conversations. We present an approach where recognition of head movements is cast as a multimodal frame classification problem based on visual and acoustic features. The visual features include velocity, acceleration, and jerk values associated with head movements, while the acoustic ones are pitch and intensity measurements from the co-occurring speech. We present the results obtained by training and testing a number of classifiers on manually annotated data from two conversations. The best performing classifier, a Multilayer Perceptron trained using all the features, obtains 0.75 accuracy and outperforms the mono-modal baseline classifier.
UR  - https://dl.acm.org/citation.cfm?doid=3281151.3281152
U2  - 10.1145/3281151.3281152
DO  - 10.1145/3281151.3281152
M3  - Article in proceedings
SN  - 978-1-4503-6002-9
SP  - 1
EP  - 6
BT  - Proceedings of the International Conference on Multimodal Interaction: Adjunct
PB  - Association for Computing Machinery
CY  - New York
ER  -
