A Pattern Mining Approach for Improving Speech Emotion Recognition

Author: Umut Avci
Dates: 2025-10-06; 2022-11
ISSN: 0218-0014; 1793-6381
DOI: 10.1142/S0218001422500458
Scopus EID: 2-s2.0-85143671883
URL: https://doi.org/10.1142/S0218001422500458
Repository: https://gcris.yasar.edu.tr/handle/123456789/6279
Language: English
Access: info:eu-repo/semantics/closedAccess
Type: Article
Keywords: Speech emotion recognition, pattern mining, feature extraction
Additional keywords: CLASSIFICATION, FEATURES, MODEL

Abstract: Speech-driven user interfaces are becoming more common in our lives. To interact with such systems naturally and effectively, machines need to recognize the emotional states of users and respond to them accordingly. At the heart of the emotion recognition research done to this end lies the emotion representation that enables machines to learn and predict emotions. Speech emotion recognition studies use a wide range of low-to-high-level acoustic features for representation purposes, such as LLDs, their functionals, and BoAW. In this paper, we present a new method for extracting a novel set of high-level features for classifying emotions. For this purpose, we (1) reduce the dimension of discrete-time speech signals, (2) perform a quantization operation on the new signals and assign a distinct symbol to each quantization level, (3) use the symbol sequences representing the signals to extract discriminative patterns that are capable of distinguishing different emotions from each other, and (4) generate a separate set of features for each emotion from the extracted patterns. Experimental results show that pattern features outperform Energy, Voicing, MFCC, Spectral, and RASTA feature sets. We also demonstrate that combining the pattern-based features and the acoustic features further improves the classification performance.
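
The four steps listed in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which algorithms the paper uses, so every concrete choice here is an assumption made for illustration only: piecewise averaging for the dimension reduction, uniform quantization over a fixed range, n-grams as the pattern type, and a simple support-difference score for "discriminative". The function names (reduce_dimension, quantize, mine_patterns, pattern_features) are hypothetical and not taken from the paper.

```python
from collections import Counter


def reduce_dimension(signal, n_segments):
    """Step 1 (assumed technique): mean of each of n_segments roughly equal chunks."""
    n = len(signal)
    if not 0 < n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(signal)")
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [sum(signal[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]


def quantize(values, n_levels, lo=-1.0, hi=1.0):
    """Step 2 (assumed technique): uniform quantization of [lo, hi]; one letter per level."""
    if not 0 < n_levels <= 26:
        raise ValueError("this sketch supports 1 to 26 levels (letters a-z)")
    symbols = []
    for v in values:
        clipped = min(max(v, lo), hi)
        level = min(int((clipped - lo) / (hi - lo) * n_levels), n_levels - 1)
        symbols.append(chr(ord("a") + level))
    return "".join(symbols)


def ngrams(seq, n):
    """Set of distinct length-n substrings of a symbol sequence."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def mine_patterns(sequences, labels, n=2, top_k=3):
    """Step 3 (assumed scoring): per emotion, keep the n-grams whose support
    (fraction of sequences containing them) most exceeds their support elsewhere."""
    patterns = {}
    for emotion in sorted(set(labels)):
        inside = [s for s, l in zip(sequences, labels) if l == emotion]
        outside = [s for s, l in zip(sequences, labels) if l != emotion]
        in_sup = Counter(g for s in inside for g in ngrams(s, n))
        out_sup = Counter(g for s in outside for g in ngrams(s, n))

        def score(g):
            other = out_sup[g] / len(outside) if outside else 0.0
            return in_sup[g] / len(inside) - other

        # Rank by descending score; ties are broken alphabetically.
        ranked = sorted((g for g in in_sup if score(g) > 0),
                        key=lambda g: (-score(g), g))
        patterns[emotion] = ranked[:top_k]
    return patterns


def count_occurrences(seq, pattern):
    """Number of (possibly overlapping) occurrences of pattern in seq."""
    k = len(pattern)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == pattern)


def pattern_features(seq, patterns):
    """Step 4: one feature list per emotion, counting that emotion's patterns in seq."""
    return {emotion: [count_occurrences(seq, p) for p in pats]
            for emotion, pats in patterns.items()}


if __name__ == "__main__":
    # Toy symbol sequences with made-up labels, purely for illustration.
    train = ["aab", "aac", "cdd", "bdd"]
    labels = ["calm", "calm", "angry", "angry"]
    mined = mine_patterns(train, labels, n=2, top_k=1)
    print(mined)
    print(pattern_features("aaadd", mined))
```

In a pipeline like the one the abstract describes, the per-emotion feature lists would be concatenated and passed to a classifier, either alone or alongside acoustic features. The toy sequences and labels above are invented and carry no experimental meaning.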