SciELO - Scientific Electronic Library Online

 
Revista Facultad de Ingeniería

Print version ISSN 0121-1129
On-line version ISSN 2357-5328

Abstract

SOLANO-JIMENEZ, Miguel-Alexis; TOBAR-CIFUENTES, José-Julio; SIERRA-MARTINEZ PH. D, Luz-Marina and COBOS-LOZADA PH. D, Carlos-Alberto. Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem. Rev. Fac. ing. [online]. 2020, vol.29, n.54, e11762. Epub Dec 30, 2020. ISSN 0121-1129. https://doi.org/10.19053/01211129.v29.n54.2020.11762.

Part-of-Speech Tagging (POST) is a complex task in the preprocessing stage of Natural Language Processing applications. Tagging has been tackled with statistical and rule-based approaches, using a range of methods. More recently, metaheuristic algorithms have gained attention and have been applied, with good results, in a wide variety of knowledge areas. For this reason, they were applied in this research to the POST problem, in order to assign the best sequence of tags (roles) to the words of a sentence based on statistical information. The process was carried out in two cycles, each comprising four phases, which allowed four metaheuristic algorithms to be adapted to the tagging problem: Particle Swarm Optimization, Jaya, Random-Restart Hill Climbing, and a memetic algorithm that uses Global-Best Harmony Search (GBHS) as a global optimizer and Hill Climbing as a local optimizer. To consolidate each algorithm, preliminary experiments were carried out (using cross-validation) to adjust its parameters; each algorithm was then evaluated on the complete tagged corpora: IULA (Spanish), Brown (English), and Nasa Yuwe (Nasa). The results obtained by the proposed taggers were compared, and the Friedman and Wilcoxon statistical tests were applied, confirming that the proposed memetic algorithm, GBHS Tagger, obtained better results in precision. The proposed taggers make an important contribution to POST for traditional languages (English and Spanish), for non-traditional languages (Nasa Yuwe), and for their application areas.
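
To make the idea of tagging as a search problem concrete, the following is a minimal sketch of Random-Restart Hill Climbing applied to a tag sequence. It is not the authors' implementation: the bigram log-probability objective, the three-tag tagset, the probability tables, and every function name are assumptions invented for illustration, since the abstract does not specify the fitness function or the parameters used.

```python
import math
import random

# Toy statistics, invented for illustration (not taken from IULA, Brown, or Nasa Yuwe).
TAGS = ["DET", "NOUN", "VERB"]
EMISSION = {  # assumed P(word | tag)
    ("the", "DET"): 0.9,
    ("dog", "NOUN"): 0.8,
    ("dog", "VERB"): 0.1,
    ("barks", "VERB"): 0.7,
    ("barks", "NOUN"): 0.2,
}
TRANSITION = {  # assumed P(tag | previous tag); "<s>" marks the sentence start
    ("<s>", "DET"): 0.8,
    ("DET", "NOUN"): 0.9,
    ("NOUN", "VERB"): 0.8,
}
FLOOR = 1e-6  # probability assigned to pairs missing from the tables


def score(words, tags):
    """Log-probability of a tag sequence under the toy bigram model."""
    total, prev = 0.0, "<s>"
    for word, tag in zip(words, tags):
        total += math.log(TRANSITION.get((prev, tag), FLOOR))
        total += math.log(EMISSION.get((word, tag), FLOOR))
        prev = tag
    return total


def hill_climb(words, tags):
    """Local search: change one tag at a time while the score improves."""
    best = score(words, tags)
    improved = True
    while improved:
        improved = False
        for i in range(len(words)):
            for candidate in TAGS:
                if candidate == tags[i]:
                    continue
                trial = tags[:i] + [candidate] + tags[i + 1:]
                trial_score = score(words, trial)
                if trial_score > best:
                    tags, best, improved = trial, trial_score, True
    return tags, best


def random_restart_tagger(words, restarts=20, seed=0):
    """Run hill climbing from several random tag sequences; keep the best."""
    rng = random.Random(seed)
    best_tags, best_score = None, -math.inf
    for _ in range(restarts):
        start = [rng.choice(TAGS) for _ in words]
        tags, s = hill_climb(words, start)
        if s > best_score:
            best_tags, best_score = tags, s
    return best_tags, best_score
```

With these toy tables, `random_restart_tagger(["the", "dog", "barks"])` returns the sequence `["DET", "NOUN", "VERB"]`. A memetic variant in the spirit of the abstract would replace the random restarts with a population-based global optimizer and keep `hill_climb` as the local refinement step.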

Keywords: computational intelligence; computational linguistics; evolutionary computing; heuristic algorithms; natural language processing; parts of speech tagging; search methods.
