Wavelet packet transform and multilayer perceptron to identify voices with a mild degree of vocal deviation

Morikawa, Mateus; Hernane Spatti, Danilo; Dajer, María Eugenia

doi:10.46634/riics.126

Revista de investigación e innovación en ciencias de la salud

Online version ISSN 2665-2056

Abstract

MORIKAWA, Mateus; HERNANE SPATTI, Danilo and DAJER, María Eugenia. Wavelet packet transform and multilayer perceptron to identify voices with a mild degree of vocal deviation. Rev. Investig. Innov. Cienc. Salud [online]. 2022, vol.4, n.1, pp.16-25. Epub 06-Jun-2022. ISSN 2665-2056. https://doi.org/10.46634/riics.126.

Introduction:

Laryngeal disorders are characterized by a change in the vibratory pattern of the vocal folds. These disorders may have an organic origin, marked by anatomical modification of the folds, or a functional origin, caused by vocal abuse or misuse. The most common diagnostic methods rely on invasive imaging procedures that cause patient discomfort. In addition, mild voice deviations do not stop individuals from using their voice, which makes the problem difficult to identify and increases the possibility of complications.

Aim:

For those reasons, the goal of the present paper was to develop a noninvasive alternative for identifying voices with a mild degree of vocal deviation by applying the Wavelet Packet Transform (WPT) and a Multilayer Perceptron (MLP), a type of Artificial Neural Network (ANN).

Methods:

A dataset of 74 audio files was used. Shannon energy and entropy measures were extracted using the Daubechies 2 and Symlet 2 families, and the processing step was then performed with the MLP ANN.
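
The feature-extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decomposition level, the boundary handling, or the exact energy and entropy definitions, so the sketch assumes a level-3 full packet tree, periodic signal extension, energy as the sum of squared coefficients, and Shannon entropy of the normalized squared coefficients. All function names are hypothetical, and only the Daubechies 2 filter is shown.

```python
import math

# Daubechies 2 (4-tap) low-pass filter; the high-pass filter is its
# quadrature mirror: g[k] = (-1)^k * h[N-1-k].
SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
DB2_LO = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
          (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
DB2_HI = [(-1) ** k * DB2_LO[len(DB2_LO) - 1 - k] for k in range(len(DB2_LO))]


def _filter_down(x, taps):
    """Filter with periodic extension (assumption), then downsample by 2."""
    n = len(x)
    return [sum(taps[k] * x[(2 * i + k) % n] for k in range(len(taps)))
            for i in range(n // 2)]


def wavelet_packet_leaves(signal, level):
    """Return the 2**level terminal nodes of a full wavelet packet tree.

    Unlike the plain wavelet transform, both the low-pass and the
    high-pass branch are split again at every level.
    The signal length is assumed to be divisible by 2**level.
    """
    nodes = [list(signal)]
    for _ in range(level):
        nodes = [band
                 for node in nodes
                 for band in (_filter_down(node, DB2_LO),
                              _filter_down(node, DB2_HI))]
    return nodes


def node_energy(coeffs):
    """Energy of a node: sum of squared coefficients (assumed definition)."""
    return sum(c * c for c in coeffs)


def node_shannon_entropy(coeffs):
    """Shannon entropy of the normalized squared coefficients (assumed definition)."""
    total = node_energy(coeffs)
    if total == 0.0:
        return 0.0
    probs = [c * c / total for c in coeffs if c != 0.0]
    return -sum(p * math.log(p) for p in probs)


def wpt_features(signal, level=3):
    """Return (energies, entropies), one value per terminal node."""
    leaves = wavelet_packet_leaves(signal, level)
    energies = [node_energy(leaf) for leaf in leaves]
    entropies = [node_shannon_entropy(leaf) for leaf in leaves]
    return energies, entropies
```

Under these assumptions, each audio file would yield one energy vector and one entropy vector of length 2**level, and either vector could serve as the input layer of an MLP classifier. Because the filter pair is orthonormal, the node energies sum to the energy of the input signal, which gives a quick sanity check on the decomposition.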

Results:

The Symlet 2 family generalized better, obtaining 99.75% and 99.56% accuracy with the Shannon energy and entropy measures, respectively. The Daubechies 2 family obtained lower accuracy rates: 91.17% and 70.01%, respectively.

Conclusion:

The combination of WPT and MLP achieved high accuracy in identifying voices with a mild degree of vocal deviation.

Keywords: Voice; voice disorder; voice classification; voice deviation; artificial neural network; multilayer perceptron; wavelet packet transform; dysphonia; laryngeal diseases; vocal cords.
