Perception of Paralinguistic Traits in Synthesized Voices

Alice Emily Baird, Stina Hasse Jørgensen, Emilia Parada-Cabaleiro, Simone Hantke, Nicholas Cummins, Bjorn Schuller

5 Citations (Scopus)

Abstract

Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we ascribe to ourselves. When the recorded voice is synthesized, does our perception of its new machine embodiment change, and can we consider an alternative, more inclusive form? To begin evaluating the impact of aesthetic design, this study presents a first-step perception test to explore the paralinguistic traits of the synthesized voice. Using a corpus of 13 synthesized voices, constructed from acoustic concatenative speech synthesis, we assessed the response of 23 listeners from differing cultural backgrounds. To evaluate if perception shifts from the defined traits, we asked listeners to assign traits of age, gender, accent origin, and human-likeness. Results show a difference in perception for age and human-likeness across voices, and a general agreement across listeners for both gender and accent origin. Connections found between age, gender, and human-likeness call for further exploration into a more participatory and inclusive synthesized vocal identity.

Original language: English
Title: Proceedings of the 12th International Audio Mostly Conference: Augmented and Participatory Sound and Music Experiences: AM '17
Number of pages: 5
Place of publication: New York
Publisher: Association for Computing Machinery
Publication date: 23 Aug 2017
Article number: 17
ISBN (Electronic): 9781450353731
Status: Published - 23 Aug 2017
Event: Audio Mostly: Augmented and Participatory Sound/Music Experiences - Queen Mary University of London, London, United Kingdom
Duration: 23 Aug 2017 to 26 Aug 2017
http://audiomostly.com/

Conference

Conference: Audio Mostly
Location: Queen Mary University of London
Country/Territory: United Kingdom
City: London
Period: 23/08/2017 to 26/08/2017

Keywords

  • Faculty of Humanities
