Glottal inverse filtering analysis of human voice production—A review of estimation and parameterization methods of the glottal excitation and their applications

P Alku - Sadhana, 2011 - Springer
Glottal inverse filtering (GIF) refers to methods of estimating the source of voiced speech, the
glottal volume velocity waveform. GIF is based on the idea of inversion, in which the effects …

Electroglottography–an update

CT Herbst - Journal of Voice, 2020 - Elsevier
Electroglottography (EGG) is a low-cost, noninvasive technology for measuring changes of
relative vocal fold contact area during laryngeal voice production. EGG was introduced …

Tensor fusion network for multimodal sentiment analysis

A Zadeh, M Chen, S Poria, E Cambria… - arXiv preprint arXiv …, 2017 - arxiv.org
Multimodal sentiment analysis is an increasingly popular research area, which extends the
conventional language-based definition of sentiment analysis to a multimodal setup where …

Memory fusion network for multi-view sequential learning

A Zadeh, PP Liang, N Mazumder, S Poria… - Proceedings of the …, 2018 - ojs.aaai.org
Multi-view sequential learning is a fundamental problem in machine learning dealing with
multi-view sequences. In a multi-view sequence, there exist two forms of interactions …

Multibench: Multiscale benchmarks for multimodal representation learning

PP Liang, Y Lyu, X Fan, Z Wu, Y Cheng… - Advances in neural …, 2021 - ncbi.nlm.nih.gov
Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …

Found in translation: Learning robust joint representations by cyclic translations between modalities

H Pham, PP Liang, T Manzini, LP Morency… - Proceedings of the …, 2019 - ojs.aaai.org
Multimodal sentiment analysis is a core research area that studies speaker sentiment
expressed from the language, visual, and acoustic modalities. The central challenge in …

COVAREP—A collaborative voice analysis repository for speech technologies

G Degottex, J Kane, T Drugman… - … on acoustics, speech …, 2014 - ieeexplore.ieee.org
Speech processing algorithms are often developed demonstrating improvements over the
state-of-the-art, but sometimes at the cost of high complexity. This makes algorithm …

Multimodal language analysis with recurrent multistage fusion

PP Liang, Z Liu, A Zadeh, LP Morency - arXiv preprint arXiv:1808.03920, 2018 - arxiv.org
Computational modeling of human multimodal language is an emerging research area in
natural language processing spanning the language, visual and acoustic modalities …

Acoustic properties of different kinds of creaky voice.

PA Keating, M Garellek, J Kreiman - ICPhS, 2015 - idiom.ucsd.edu
There is not one kind, but instead several kinds, of creaky voice, or creak. There is no single
defining property shared by all kinds. Instead, each kind exhibits some properties but not …

The role of voice quality in communicating emotion, mood and attitude

C Gobl, AN Chasaide - Speech communication, 2003 - Elsevier
This paper explores the role of voice quality in the communication of emotions, moods and
attitudes. Listeners' reactions to an utterance synthesised with seven different voice qualities …