This paper investigates emotional state categorization from speech signals,
comparing a psychologically inspired computational model against human
performance under the same experimental setup. Based on psychological
studies, we propose a multistage categorization strategy which allows
establishing an automatic categorization model flexibly for a given emotional
speech categorization task. We apply the strategy to the Serbian Emotional
Speech Corpus (GEES) and the Danish Emotional Speech Corpus (DES), for which
human performance was reported in previous psychological studies. Our work is
the first attempt to apply machine learning to the GEES corpus, for which only
human recognition rates were available prior to our study. Unlike previous
work on the DES corpus, our work focuses on a comparison to human performance
under the same experimental settings. Our studies suggest that
psychology-inspired systems yield behaviours that, to a great extent, resemble
those of human listeners, and that their performance is close to that of
humans under the same experimental setup. Our work also uncovers some
differences between machines and humans in emotional state recognition from
speech.