SciELO - Scientific Electronic Library Online

 

TecnoLógicas

Print version ISSN 0123-7799; On-line version ISSN 2256-5337

Abstract

LOPEZ-TRUJILLO, Sebastián and TORRES-MADRONERO, Maria C. Comparison of Text Summarization Algorithms for Processing Editorials and News in Spanish. TecnoL. [online]. 2021, vol.24, n.51, pp.120-132. Epub Oct 04, 2021. ISSN 0123-7799. https://doi.org/10.22430/22565337.1816.

Language is affected not only by grammatical rules but also by context and sociocultural differences. Therefore, automatic text summarization, an area of interest in natural language processing (NLP), faces challenges such as identifying essential fragments according to the context and establishing the type of text under analysis. Previous literature has described several automatic summarization methods; however, no studies so far have examined their effectiveness in specific contexts and on Spanish texts. In this paper, we compare three automatic summarization algorithms using news articles and editorials in Spanish. The three algorithms are extractive methods that estimate the importance of a phrase or word based on similarity or word-frequency metrics. A document database was built with 33 editorials and 27 news articles, and three summaries of each text were manually extracted employing the three algorithms. The algorithms were quantitatively compared using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric. We analyzed the algorithms’ potential to identify the main components of a text. In the case of editorials, the automatic summary should include a problem and the author’s opinion. In the case of news articles, the summary should describe the temporal and spatial characteristics of an event. In terms of word reduction percentage and accuracy, the method based on the similarity matrix produced the best results and can achieve a 70% reduction in both cases (i.e., news and editorials). However, semantics and context should be incorporated into the algorithms to improve their performance in terms of accuracy and sensitivity.
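To make the quantities mentioned in the abstract concrete, the following is a minimal Python sketch of a word-frequency extractive summarizer, a unigram ROUGE recall (ROUGE-1), and a word reduction percentage. It is an illustration under assumptions, not the authors' implementation: the abstract does not name the three algorithms compared or the ROUGE variant used, and the function names, the regex-based tokenizer and sentence splitter, and the averaging of word frequencies per sentence are all hypothetical choices made here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; \\w also matches accented Spanish letters."""
    return re.findall(r"\w+", text.lower())


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def frequency_summary(text, n_sentences=1, stopwords=frozenset()):
    """Keep the n sentences whose non-stopword tokens are most frequent on average."""
    sentences = split_sentences(text)
    freq = Counter(w for w in tokenize(text) if w not in stopwords)

    def score(sentence):
        words = [w for w in tokenize(sentence) if w not in stopwords]
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams (with multiplicity) found in the candidate."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[w]) for w, count in ref.items())
    return overlap / sum(ref.values())


def word_reduction(original, summary):
    """Percentage of words removed when going from the original to the summary."""
    n = len(tokenize(original))
    return 100.0 * (1 - len(tokenize(summary)) / n) if n else 0.0
```

In this sketch, a summary that keeps 3 of every 10 words corresponds to the 70% reduction figure quoted above. A realistic setup would pass a Spanish stopword list through the `stopwords` parameter and use a more robust sentence splitter.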

Keywords: Natural language processing; Recall-Oriented Understudy for Gisting Evaluation; Text Analysis; Text Mining; Automatic Summarization.
