Please use this identifier to cite or link to this item: http://earsiv.odu.edu.tr:8080/xmlui/handle/11489/4016
Full metadata record
DC Field | Value | Language
dc.contributor.author | Yucesoy, Ergun | -
dc.date.accessioned | 2024-03-15T06:47:09Z | -
dc.date.available | 2024-03-15T06:47:09Z | -
dc.date.issued | 2020 | -
dc.identifier.citation | Yücesoy, E. (2020). Speaker age and gender classification using GMM supervector and NAP channel compensation method. J. Ambient Intell. Humaniz. Comput. https://doi.org/10.1007/s12652-020-02045-4 | en_US
dc.identifier.issn | 1868-5137 | -
dc.identifier.issn | 1868-5145 | -
dc.identifier.uri | http://dx.doi.org/10.1007/s12652-020-02045-4 | -
dc.identifier.uri | https://www.webofscience.com/wos/woscc/full-record/WOS:000532649300003 | -
dc.identifier.uri | http://earsiv.odu.edu.tr:8080/xmlui/handle/11489/4016 | -
dc.description | WoS Categories: Computer Science, Artificial Intelligence; Computer Science, Information Systems; Telecommunications | en_US
dc.description | Web of Science Index: Science Citation Index Expanded (SCI-EXPANDED) | en_US
dc.description | Research Areas: Computer Science; Telecommunications | en_US
dc.description.abstract | One of the most important factors affecting the performance of speech-based recognition systems is the differences between training and test conditions. The Nuisance attribute projection (NAP) is an effective method for eliminating these differences, called channel effects. In this study, the effects of the NAP approach in determining age and gender groups are investigated. Mel-frequency cepstral coefficients and delta coefficients are used as a feature and Gaussian mixture models (GMM) adapted from the universal background model by maximum-a-posteriori method are used for the modeling of age and gender classes. After the GMMs corresponding to each speech are converted into mean supervectors, they are applied to a Support Vector Machine (SVM), and speeches are classified according to the age and gender group of the speakers. While linear GMM kernel based on Kullback-Leibler divergence is used instead of standard SVM kernels, the NAP channel subspace size is changed between 20 and 200 and the number of GMM components is changed between 32 and 512 to determine the optimum values for these parameters. In the tests on the aGender database, the optimum number of components is determined as 128, and the optimum NAP channel subspace size is determined as 45. The age and gender classification accuracy of the system, which is developed using these optimum parameters, is increased from 60.52 to 62.03% with the use of NAP. In addition, age classification accuracy is increased from 60.23 to 61.82% and gender classification accuracy is increased from 91.71 to 92.30%. | en_US
dc.language.iso | eng | en_US
dc.publisher | SPRINGER HEIDELBERG-HEIDELBERG | en_US
dc.relation.isversionof | 10.1007/s12652-020-02045-4 | en_US
dc.rights | info:eu-repo/semantics/openAccess | en_US
dc.subject | Speaker age and gender classification, Gaussian mixture model (GMM), Nuisance attribute projection (NAP), Support vector machine (SVM), Maximum-A-posteriori (MAP) | en_US
dc.subject | AUTOMATIC SPEAKER, FORECAST ENGINE, VERIFICATION | en_US
dc.title | Speaker age and gender classification using GMM supervector and NAP channel compensation method | en_US
dc.type | article | en_US
dc.relation.journal | JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING | en_US
dc.contributor.department | Ordu Üniversitesi | en_US
dc.contributor.authorID | 0000-0003-1707-384X | en_US
Appears in Collections: Makale Koleksiyonu (Article Collection)

Files in This Item:
There are no files associated with this item.

