Implementation of Mel-Frequency Cepstral Coefficient as Feature Extraction using K-Nearest Neighbor for Emotion Detection Based on Voice Intonation
Abstract
Purpose: To detect emotions from voice intonation, using Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction and K-Nearest Neighbor (KNN) for classification.
Design/methodology/approach: The audio data were downloaded from several video podcasts on YouTube. The study uses pitch shifting for data augmentation; MFCC for feature extraction from the audio; basic statistics (mean, median, minimum, maximum, and standard deviation) computed for each coefficient; min-max scaling for normalization; and KNN for classification.
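The later stages of this approach (per-coefficient statistics, min-max normalization, and KNN classification) can be sketched in a few lines. The following is a minimal illustration using only the Python standard library, not the authors' implementation. It assumes the MFCC values have already been extracted by some audio library and are available as one list of per-frame values per coefficient. The function names, the toy numbers, the labels "angry" and "sad", and the choice of k = 3 are all hypothetical.

```python
import math
import statistics
from collections import Counter


def summarize_coefficients(mfcc):
    """Collapse an MFCC matrix into one fixed-length feature vector.

    `mfcc` is a list of coefficient tracks, each a list of per-frame values.
    For every coefficient we keep mean, median, min, max and standard deviation.
    """
    features = []
    for track in mfcc:
        features.extend([
            statistics.fmean(track),
            statistics.median(track),
            min(track),
            max(track),
            statistics.pstdev(track),  # population standard deviation
        ])
    return features


def fit_min_max(rows):
    """Learn the per-feature minimum and maximum from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(row, lows, highs):
    """Scale one row with the training ranges; constant features map to 0."""
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, lows, highs)
    ]


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training rows closest in Euclidean distance."""
    nearest = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two coefficients, three frames per clip.
train_mfcc = [
    [[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]],
    [[1.5, 2.5, 3.5], [11.0, 11.0, 11.0]],
    [[-3.0, -2.0, -1.0], [0.0, 0.0, 0.0]],
    [[-3.5, -2.5, -1.5], [1.0, 1.0, 1.0]],
]
train_labels = ["angry", "angry", "sad", "sad"]
query_mfcc = [[1.2, 2.2, 3.2], [10.5, 10.5, 10.5]]

raw_train = [summarize_coefficients(m) for m in train_mfcc]
lows, highs = fit_min_max(raw_train)
scaled_train = [apply_min_max(r, lows, highs) for r in raw_train]
scaled_query = apply_min_max(summarize_coefficients(query_mfcc), lows, highs)

predicted = knn_predict(scaled_train, train_labels, scaled_query, k=3)
print(predicted)
```

The scaling ranges are learned from the training rows only and then reused for the query, so the new clip is normalized the same way as the data the classifier compares it against.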
Findings/result: Because testing was carried out separately for each gender, there are two classification models. The male model reached a highest accuracy of 88.8% and is considered a good fit. The female model reached a highest accuracy of 92.5% but was unable to correctly classify emotions in new data, a condition known as overfitting. Further testing attributed this to the one-tone pitch-shifting augmentation of the female data, which could not compensate for a training set that was too small and did not contain enough samples to represent the full range of possible inputs.
Originality/value/state of the art: The data used in this study have not been used in previous studies. They were downloaded from YouTube and then processed until they were ready to be used for research.
DOI: https://doi.org/10.31315/telematika.v20i1.9518
DOI (PDF): https://doi.org/10.31315/telematika.v20i1.9518.g5400
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.