openSMILE[2] is source-available software for the automatic extraction of features from audio signals and for the classification of speech and music signals. "SMILE" stands for "Speech & Music Interpretation by Large-space Extraction". The software is mainly applied in automatic emotion recognition and is widely used in the affective computing research community. The openSMILE project has existed since 2008 and has been maintained by the German company audEERING GmbH since 2013. openSMILE is provided free of charge for research purposes and personal use under a source-available license. For commercial use, audEERING offers custom license options.

openSMILE
Developer(s): audEERING GmbH
Initial release: September 2010
Stable release: 3.0.1[1] / January 4, 2022
Written in: C++
Platform: Linux, macOS, Windows, Android, iOS
Type: Machine learning
License: Source-available, proprietary
Website: audeering.com

Application Areas

openSMILE is used in academic research as well as in commercial applications to analyze speech and music signals automatically in real time. In contrast to automatic speech recognition, which extracts the spoken content from a speech signal, openSMILE recognizes the characteristics of a given speech or music segment. Examples of such characteristics encoded in human speech are a speaker's emotion,[3] age, gender, and personality, as well as speaker states such as depression, intoxication, or vocal pathological disorders. The software further includes music classification technology for automatic music mood detection and for recognition of chorus segments, key, chords, tempo, meter, dance style, and genre.

The openSMILE toolkit serves as a benchmark in numerous research competitions such as Interspeech ComParE,[4] AVEC,[5] MediaEval,[6] and EmotiW.[7]

History

The openSMILE project was started in 2008 by Florian Eyben, Martin Wöllmer, and Björn Schuller at the Technical University of Munich within the European Union research project SEMAINE. The goal of the SEMAINE project was to develop a virtual agent with emotional and social intelligence. In this system, openSMILE was applied for real-time analysis of speech and emotion. The final SEMAINE software release is based on openSMILE version 1.0.1.

In 2009, the emotion recognition toolkit openEAR, which is based on openSMILE, was published. "EAR" stands for "Emotion and Affect Recognition".

In 2010, openSMILE version 1.0.1 was published. It was presented at the ACM Multimedia Open-Source Software Challenge, where it received an award.

Between 2011 and 2013, the technology of openSMILE was extended and improved by Florian Eyben and Felix Weninger in the context of their doctoral theses at the Technical University of Munich. The software was also applied in the ASC-Inclusion project, which was funded by the European Union. For this project, Erik Marchi extended the software to teach emotional expression to autistic children, based on automatic emotion recognition and visualization.

In 2013, the company audEERING acquired the rights to the code base from the Technical University of Munich, and version 2.0 was published under a source-available research license.

By 2016, openSMILE had been downloaded more than 50,000 times worldwide and had established itself as a standard toolkit for emotion recognition.

Awards

openSMILE received an award in 2010 at the ACM Multimedia Open Source Competition. The software is used in numerous scientific publications on automatic emotion recognition. openSMILE[8] and its extension openEAR[9] have been cited in more than 1000 scientific publications.

References

  1. ^ "Release openSMILE 3.0.1". Retrieved 5 January 2022.
  2. ^ F. Eyben, M. Wöllmer, B. Schuller: "openSMILE - The Munich Versatile and Fast Open-Source Audio Feature Extractor", in Proc. ACM Multimedia (MM), ACM, Florence, Italy, pp. 1459-1462, October 2010.
  3. ^ B. Schuller, B. Vlasenko, F. Eyben, M. Wöllmer, A. Stuhlsatz, A. Wendemuth, G. Rigoll, "Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies (Extended Abstract)," in Proc. of ACII 2015, Xi'an, China, invited for the Special Session on Most Influential Articles in IEEE Transactions on Affective Computing.
  4. ^ B. Schuller, S. Steidl, A. Batliner, J. Hirschberg, J. K. Burgoon, A. Elkins, Y. Zhang, E. Coutinho: "The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception & Sincerity", Proceedings INTERSPEECH 2016, ISCA, San Francisco, USA, 2016.
  5. ^ F. Ringeval, B. Schuller, M. Valstar, R. Cowie, M. Pantic, "AVEC 2015 - The 5th International Audio/Visual Emotion Challenge and Workshop," in Proceedings of the 23rd ACM International Conference on Multimedia, MM 2015, (Brisbane, Australia), ACM, October 2015.
  6. ^ M. Eskevich, R. Aly, D. Racca, R. Ordelman, S. Chen, G. J. Jones, "The search and hyperlinking task at MediaEval 2014".
  7. ^ F. Ringeval, S. Amiriparian, F. Eyben, K. Scherer, B. Schuller, "Emotion Recognition in the Wild: Incorporating Voice and Lip Activity in Multimodal Decision-Level Fusion," in Proceedings of the ICMI 2014 EmotiW – Emotion Recognition In The Wild Challenge and Workshop (EmotiW 2014), Satellite of the 16th ACM International Conference on Multimodal Interaction (ICMI 2014), (Istanbul, Turkey), pp. 473–480, ACM, November 2014.
  8. ^ Eyben, Florian; Wöllmer, Martin; Schuller, Björn (26 April 2018). Opensmile: the munich versatile and fast open-source audio feature extractor. ACM. pp. 1459–1462. doi:10.1145/1873951.1874246. ISBN 978-1-60558-933-6 – via Google Scholar.
  9. ^ Eyben, Florian; Wöllmer, Martin; Schuller, Björn (26 April 2018). "OpenEAR—introducing the Munich open-source emotion and affect recognition toolkit". IEEE. pp. 1–6 – via Google Scholar.