<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>0124-2253</journal-id>
<journal-title><![CDATA[Revista científica]]></journal-title>
<abbrev-journal-title><![CDATA[Rev. Cient.]]></abbrev-journal-title>
<issn>0124-2253</issn>
<publisher>
<publisher-name><![CDATA[Universidad Distrital Francisco José de Caldas]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S0124-22532022000100064</article-id>
<article-id pub-id-type="doi">10.14483/23448350.18352</article-id>
<title-group>
<article-title xml:lang="es"><![CDATA[Servicio de clasificación documental multi cliente basado en técnicas de aprendizaje de máquina y Elasticsearch]]></article-title>
<article-title xml:lang="en"><![CDATA[Multi-Client Document Classification Service Based on Machine Learning Techniques and Elasticsearch]]></article-title>
<article-title xml:lang="pt"><![CDATA[Serviço de classificação documentária multi-cliente baseado em técnicas de aprendizagem de máquina e Elasticsearch]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[García-Chicangana]]></surname>
<given-names><![CDATA[David-Santiago]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Cobos-Lozada]]></surname>
<given-names><![CDATA[Carlos-Alberto]]></given-names>
</name>
<xref ref-type="aff" rid="Af2"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Mendoza-Becerra]]></surname>
<given-names><![CDATA[Martha-Eliana]]></given-names>
</name>
<xref ref-type="aff" rid="Af3"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Niño-Zambrano]]></surname>
<given-names><![CDATA[Miguel-Ángel]]></given-names>
</name>
<xref ref-type="aff" rid="Af4"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Martínez-Figueroa]]></surname>
<given-names><![CDATA[James-Mauricio]]></given-names>
</name>
<xref ref-type="aff" rid="Af5"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Nexura S.A.S.]]></institution>
<addr-line><![CDATA[Cali]]></addr-line>
<country>Colombia</country>
</aff>
<aff id="Af2">
<institution><![CDATA[Universidad del Cauca]]></institution>
<addr-line><![CDATA[Popayán]]></addr-line>
<country>Colombia</country>
</aff>
<aff id="Af3">
<institution><![CDATA[Universidad del Cauca]]></institution>
<addr-line><![CDATA[Popayán]]></addr-line>
<country>Colombia</country>
</aff>
<aff id="Af4">
<institution><![CDATA[Universidad del Cauca]]></institution>
<addr-line><![CDATA[Popayán]]></addr-line>
<country>Colombia</country>
</aff>
<aff id="Af5">
<institution><![CDATA[Nexura S.A.S.]]></institution>
<addr-line><![CDATA[Cali]]></addr-line>
<country>Colombia</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>04</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>04</month>
<year>2022</year>
</pub-date>
<numero>43</numero>
<fpage>64</fpage>
<lpage>79</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_arttext&amp;pid=S0124-22532022000100064&amp;lng=en&amp;nrm=iso"></self-uri>
<self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_abstract&amp;pid=S0124-22532022000100064&amp;lng=en&amp;nrm=iso"></self-uri>
<self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_pdf&amp;pid=S0124-22532022000100064&amp;lng=en&amp;nrm=iso"></self-uri>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[Este artículo presenta un servicio de clasificación documental que permite a los sistemas de gestión documental de múltiples clientes brindar una mayor confianza y credibilidad sobre los tipos documentales asignados a los documentos que cargan los usuarios. La investigación fue realizada a través de las fases de CRISP-DM en las que se evaluaron dos modelos de representación de documentos, bolsas de palabras con n-gramas acumulativos y BERT (propuesto recientemente por Google), y cinco técnicas de aprendizaje de máquina, perceptrón multicapa, bosques aleatorios, k vecinos más cercanos, árboles de decisión y un clasificador bayesiano ingenuo. Los experimentos se realizaron con datos de dos organizaciones y los mejores resultados fueron los obtenidos por el perceptrón multicapa, los bosques aleatorios y los k vecinos más cercanos, con resultados muy similares de exactitud general y recuerdo por clase para los tres algoritmos. Los resultados no son concluyentes para ofertar el servicio a múltiples clientes con un solo modelo, ya que esto depende de los documentos y tipos documentales de cada uno de ellos. Por lo anterior, se ofrece un servicio basado en una arquitectura de microservicios que permite a cada organización la creación de su propio modelo, el monitoreo de su rendimiento en producción y su actualización cuando el rendimiento no sea adecuado.]]></p></abstract>
<abstract abstract-type="short" xml:lang="en"><p><![CDATA[This paper presents a document classification service that allows the document management systems of multiple clients (multi-tenant) to provide greater confidence and credibility regarding the document types assigned to documents uploaded by users. The research followed the phases of CRISP-DM, in which two document representation models were evaluated (bags of words with cumulative n-grams, and BERT, recently proposed by Google) together with five machine learning techniques (multilayer perceptron, random forests, k-nearest neighbors, decision trees, and naïve Bayes). The experiments were carried out with data from two organizations, and the best results were obtained by the multilayer perceptron, random forests, and k-nearest neighbors, which showed very similar overall accuracy and per-class recall. The results are not conclusive as to whether the service can be offered to multiple clients with a single model, since this depends on each client's documents and document types. Therefore, the service is built on a microservices architecture that allows each organization to create its own model, monitor its performance in production, and update it when performance is not adequate.]]></p></abstract>
<abstract abstract-type="short" xml:lang="pt"><p><![CDATA[Este artigo apresenta um serviço de classificação de documentos que permite que sistemas de gerenciamento de documentos de múltiplos clientes (multilocatário) forneçam maior confiança e credibilidade nos tipos de documentos atribuídos aos documentos carregados pelos usuários. A pesquisa foi realizada através das fases do CRISP-DM, nas quais foram avaliados dois modelos de representação de documentos, sacos de palavras com n-gramas cumulativos e BERT (recentemente proposto pelo Google), e cinco técnicas de aprendizado de máquina, perceptron multicamadas, florestas aleatórias, k vizinhos mais próximos, árvores de decisão e bayes ingênuo. Os experimentos foram realizados com dados de duas organizações e os melhores resultados foram obtidos pelo perceptron multicamadas, as florestas aleatórias e os k vizinhos mais próximos, com resultados muito semelhantes de precisão geral e recuperação por classe para esses três algoritmos. Os resultados não são conclusivos para oferecer o serviço a vários clientes com um único modelo, pois isso depende também dos documentos e tipos de documentos de cada um deles. Portanto, um serviço é oferecido com base em uma arquitetura de microsserviços que permite a cada organização criar seu próprio modelo, monitorar seu desempenho na produção e atualizá-lo quando o desempenho não for adequado.]]></p></abstract>
<kwd-group>
<kwd lng="es"><![CDATA[analítica de datos]]></kwd>
<kwd lng="es"><![CDATA[bosques aleatorios]]></kwd>
<kwd lng="es"><![CDATA[CRISP-DM]]></kwd>
<kwd lng="es"><![CDATA[k vecinos más cercanos]]></kwd>
<kwd lng="es"><![CDATA[perceptrón multicapa]]></kwd>
<kwd lng="es"><![CDATA[sistema de gestión documental]]></kwd>
<kwd lng="es"><![CDATA[trigramas]]></kwd>
<kwd lng="en"><![CDATA[CRISP-DM]]></kwd>
<kwd lng="en"><![CDATA[data analytics]]></kwd>
<kwd lng="en"><![CDATA[document management system]]></kwd>
<kwd lng="en"><![CDATA[k-nearest neighbors]]></kwd>
<kwd lng="en"><![CDATA[multilayer perceptron]]></kwd>
<kwd lng="en"><![CDATA[random forests]]></kwd>
<kwd lng="en"><![CDATA[trigrams]]></kwd>
<kwd lng="pt"><![CDATA[análise de dados]]></kwd>
<kwd lng="pt"><![CDATA[CRISP-DM]]></kwd>
<kwd lng="pt"><![CDATA[florestas aleatórias]]></kwd>
<kwd lng="pt"><![CDATA[k-vizinhos mais próximos]]></kwd>
<kwd lng="pt"><![CDATA[perceptron multicamadas]]></kwd>
<kwd lng="pt"><![CDATA[sistema de gerenciamento de documentos]]></kwd>
<kwd lng="pt"><![CDATA[trigramas]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Aliwy]]></surname>
<given-names><![CDATA[A. H.]]></given-names>
</name>
<name>
<surname><![CDATA[Ameer]]></surname>
<given-names><![CDATA[E. A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Comparative study of five text classification algorithms with their improvements]]></article-title>
<source><![CDATA[International Journal of Applied Engineering Research]]></source>
<year>2017</year>
<volume>12</volume>
<numero>14</numero>
<issue>14</issue>
<page-range>4309-19</page-range></nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cameron-Jones]]></surname>
<given-names><![CDATA[R. M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Instance selection by encoding length heuristic with random mutation hill climbing]]></source>
<year>1995</year>
<conf-name><![CDATA[Eighth Australian Joint Conference on Artificial Intelligence]]></conf-name>
<conf-loc>Canberra</conf-loc>
<page-range>99-106</page-range></nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cañete]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Chaperon]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Fuentes]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Ho]]></surname>
<given-names><![CDATA[J.-H.]]></given-names>
</name>
<name>
<surname><![CDATA[Kang]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Pérez]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Spanish pre-trained BERT model and evaluation data]]></article-title>
<source><![CDATA[PML4DC]]></source>
<year>2020</year>
<page-range>1-10</page-range><publisher-name><![CDATA[ICLR]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cao]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhou]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Yang]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Fu]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Contextualized Word Representations with Effective Attention for Aspect-Based Sentiment Analysis]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Sun]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Huang]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Ji]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<source><![CDATA[Chinese Computational Linguistics]]></source>
<year>2019</year>
<page-range>467-78</page-range><publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Yan]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Wong]]></surname>
<given-names><![CDATA[K.-C.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Verbal aggression detection on Twitter comments: Convolutional neural network for short-text sentiment analysis]]></article-title>
<source><![CDATA[Neural Computing and Applications]]></source>
<year>2020</year>
<volume>32</volume>
<numero>15</numero>
<issue>15</issue>
<page-range>10809-18</page-range></nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Xu]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[He]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN]]></article-title>
<source><![CDATA[Expert Systems with Applications]]></source>
<year>2017</year>
<numero>72</numero>
<issue>72</issue>
<page-range>221-30</page-range></nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Devlin]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Chang]]></surname>
<given-names><![CDATA[M.-W.]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Toutanova]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<source><![CDATA[BERT: Pre-training of deep bidirectional transformers for language understanding]]></source>
<year>2019</year>
<volume>1</volume>
<conf-name><![CDATA[Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>4171-86</page-range><publisher-name><![CDATA[Association for Computational Linguistics]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dorado]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Cobos]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Torres-Jimenez]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Burra]]></surname>
<given-names><![CDATA[D. D.]]></given-names>
</name>
<name>
<surname><![CDATA[Mendoza]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Jimenez]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Wrapper for building classification models using covering arrays]]></article-title>
<source><![CDATA[IEEE Access]]></source>
<year>2019</year>
<numero>7</numero>
<issue>7</issue>
<page-range>148297-312</page-range></nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fernández-Navarro]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Hervás-Martínez]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Gutiérrez]]></surname>
<given-names><![CDATA[P. A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A dynamic over-sampling procedure based on sensitivity for multi-class problems]]></article-title>
<source><![CDATA[Pattern Recognition]]></source>
<year>2011</year>
<volume>44</volume>
<numero>8</numero>
<issue>8</issue>
<page-range>1821-33</page-range></nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gitanjali]]></surname>
</name>
<name>
<surname><![CDATA[Lakhwani]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A novel approach of sensitive data classification using convolution neural network and logistic regression]]></article-title>
<source><![CDATA[International Journal of Innovative Technology and Exploring Engineering (IJITEE)]]></source>
<year>2019</year>
<volume>8</volume>
<numero>8</numero>
<issue>8</issue>
<page-range>2883-6</page-range></nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gowda]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Krishna]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The condensed nearest neighbor rule using the concept of mutual nearest neighborhood]]></article-title>
<source><![CDATA[IEEE Transactions on Information Theory]]></source>
<year>1979</year>
<volume>25</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>488-90</page-range></nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hapsari]]></surname>
<given-names><![CDATA[D. P.]]></given-names>
</name>
<name>
<surname><![CDATA[Utoyo]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Purnami]]></surname>
<given-names><![CDATA[S. W.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Text categorization with fractional gradient descent support vector machine]]></article-title>
<source><![CDATA[Journal of Physics: Conference Series]]></source>
<year>2020</year>
<numero>1477</numero>
<issue>1477</issue>
</nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ismael]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Okumus]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Design and implementation of an electronic document management system]]></article-title>
<source><![CDATA[Journal of Applied Sciences of Mehmet Akif Ersoy University]]></source>
<year>2017</year>
<volume>1</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>9-17</page-range></nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jiang]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Liang]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Feng]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Fan]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Pei]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Xue]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Guan]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Text classification based on deep belief network and softmax regression]]></article-title>
<source><![CDATA[Neural Computing and Applications]]></source>
<year>2018</year>
<volume>29</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>61-70</page-range></nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kowsari]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Brown]]></surname>
<given-names><![CDATA[D. E.]]></given-names>
</name>
<name>
<surname><![CDATA[Heidarysafa]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Jafari Meimandi]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Gerber]]></surname>
<given-names><![CDATA[M. S.]]></given-names>
</name>
<name>
<surname><![CDATA[Barnes]]></surname>
<given-names><![CDATA[L. E.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[HDLTex: Hierarchical deep learning for text classification]]></article-title>
<source><![CDATA[16th IEEE International Conference on Machine Learning and Applications (ICMLA)]]></source>
<year>2017</year>
<page-range>364-71</page-range></nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kowsari]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Meimandi]]></surname>
<given-names><![CDATA[K. J.]]></given-names>
</name>
<name>
<surname><![CDATA[Heidarysafa]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Mendu]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Barnes]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Brown]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Text classification algorithms: A survey]]></article-title>
<source><![CDATA[Information]]></source>
<year>2019</year>
<volume>10</volume>
<numero>4</numero>
<issue>4</issue>
</nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lacunza]]></surname>
<given-names><![CDATA[A. C.]]></given-names>
</name>
</person-group>
<source><![CDATA[Implementación de un Sistema de Gestión Documental Electrónico en la Universidad Nacional de la Plata: El camino hacia el expediente electrónico]]></source>
<year>2020</year>
<publisher-loc><![CDATA[Argentina ]]></publisher-loc>
<publisher-name><![CDATA[Universidad Nacional de la Plata]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lagrari]]></surname>
<given-names><![CDATA[F.-E.]]></given-names>
</name>
<name>
<surname><![CDATA[Ziyati]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[El Kettani]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[An efficient model of text categorization based on feature selection and random forests: Case for Business documents]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Ezziyyani]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Advanced Intelligent Systems for Sustainable Development (AI2SD'2018)]]></source>
<year>2019</year>
<page-range>465-76</page-range><publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B19">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Qin]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Guo]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhao]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A novel scheme for recruitment text categorization based on KNN algorithm]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Qiu]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[SmartCom 2019: Smart Computing and Communication]]></source>
<year>2019</year>
<page-range>376-86</page-range><publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B20">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Qu]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Song]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Zheng]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Song]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
</person-group>
<source><![CDATA[Improved bayes method based on TF-IDF feature and grade factor feature for Chinese information classification]]></source>
<year>2018</year>
<conf-name><![CDATA[IEEE International Conference on Big Data and Smart Computing (BigComp)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>677-80</page-range></nlm-citation>
</ref>
<ref id="B21">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rangel Palencia]]></surname>
<given-names><![CDATA[E. L.]]></given-names>
</name>
</person-group>
<source><![CDATA[Guía de Implementación de Un Sistema de Gestión de Documentos Electrónicos de Archivo - SGDEA]]></source>
<year>2017</year>
<publisher-name><![CDATA[Archivo General de la Nación de Colombia]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B22">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rasjid]]></surname>
<given-names><![CDATA[Z. E.]]></given-names>
</name>
<name>
<surname><![CDATA[Setiawan]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Performance comparison and optimization of text document classification using k-NN and naïve bayes classification techniques]]></article-title>
<source><![CDATA[Procedia Computer Science]]></source>
<year>2017</year>
<numero>116</numero>
<issue>116</issue>
<page-range>107-12</page-range></nlm-citation>
</ref>
<ref id="B23">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rodríguez Cruz]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Castellanos Crespo]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Ramírez Peña]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Gestión documental, de información, del conocimiento e inteligencia organizacional: particularidades y convergencia para la toma de decisiones estratégicas]]></article-title>
<source><![CDATA[Revista Cubana de Información en Ciencias de la Salud]]></source>
<year>2016</year>
<volume>27</volume>
<numero>2</numero>
<issue>2</issue>
</nlm-citation>
</ref>
<ref id="B24">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Schröer]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Kruse]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Gómez]]></surname>
<given-names><![CDATA[J. M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A systematic literature review on applying CRISP-DM process model]]></article-title>
<source><![CDATA[Procedia Computer Science]]></source>
<year>2021</year>
<numero>181</numero>
<issue>181</issue>
<page-range>526-34</page-range></nlm-citation>
</ref>
<ref id="B25">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Selvi]]></surname>
<given-names><![CDATA[S. T.]]></given-names>
</name>
<name>
<surname><![CDATA[Karthikeyan]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Vincent]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Abinaya]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Neeraja]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Deepika]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[Text categorization using Rocchio algorithm and random forest algorithm]]></source>
<year>2017</year>
<conf-name><![CDATA[Eighth International Conference on Advanced Computing (ICoAC)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>7-12</page-range></nlm-citation>
</ref>
<ref id="B26">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Shah]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Willick]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Mago]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A framework for social media data analytics using Elasticsearch and Kibana]]></article-title>
<source><![CDATA[Wireless Networks]]></source>
<year>2018</year>
</nlm-citation>
</ref>
<ref id="B27">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Skalak]]></surname>
<given-names><![CDATA[D. B.]]></given-names>
</name>
<name>
<surname><![CDATA[Cohen]]></surname>
<given-names><![CDATA[W. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Hirsh]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Prototype and feature selection by sampling and random mutation hill climbing algorithms]]></source>
<year>1994</year>
<conf-name><![CDATA[Machine Learning: Proceedings of the Eleventh International Conference]]></conf-name>
<conf-loc> </conf-loc>
<publisher-name><![CDATA[Morgan Kaufmann]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B28">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Taloba]]></surname>
<given-names><![CDATA[A. I.]]></given-names>
</name>
<name>
<surname><![CDATA[Ismail]]></surname>
<given-names><![CDATA[S. S. I.]]></given-names>
</name>
</person-group>
<source><![CDATA[An intelligent hybrid technique of decision tree and genetic algorithm for e-mail spam detection]]></source>
<year>2019</year>
<conf-name><![CDATA[Ninth International Conference on Intelligent Computing and Information Systems (ICICIS)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>99-104</page-range></nlm-citation>
</ref>
<ref id="B29">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Vijayan]]></surname>
<given-names><![CDATA[V. K.]]></given-names>
</name>
<name>
<surname><![CDATA[Bindu]]></surname>
<given-names><![CDATA[K. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Parameswaran]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<source><![CDATA[A comprehensive study of text classification algorithms]]></source>
<year>2017</year>
<conf-name><![CDATA[International Conference on Advances in Computing, Communications and Informatics (ICACCI)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1109-13</page-range></nlm-citation>
</ref>
<ref id="B30">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Villegas]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Cobos]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Mendoza]]></surname>
<given-names><![CDATA[M. E.]]></given-names>
</name>
<name>
<surname><![CDATA[Herrera-Viedma]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Feature selection using sampling with replacement, covering arrays and rule-induction techniques to aid polarity detection in Twitter sentiment analysis]]></article-title>
<source><![CDATA[Lecture Notes in Computer Science]]></source>
<year>2018</year>
<numero>11238</numero>
<issue>11238</issue>
<page-range>467-80</page-range></nlm-citation>
</ref>
<ref id="B31">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Voit]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Stankus]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Magomedov]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Ivanova]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Big data processing for full-text search and visualization with Elasticsearch]]></article-title>
<source><![CDATA[International Journal of Advanced Computer Science and Applications]]></source>
<year>2017</year>
<volume>8</volume>
<numero>12</numero>
<issue>12</issue>
</nlm-citation>
</ref>
<ref id="B32">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wirth]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Hipp]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[CRISP-DM: Towards a standard process model for data mining]]></source>
<year>2000</year>
<conf-name><![CDATA[Proceedings of the Fourth International Conference on the Practical Application of Knowledge Discovery and Data Mining]]></conf-name>
<conf-loc> </conf-loc>
<page-range>29-39</page-range></nlm-citation>
</ref>
<ref id="B33">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wojciechowski]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Wilk]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Stefanowski]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Kurzynski]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Wozniak]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Burduk]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[An algorithm for selective preprocessing of multi-class imbalanced data]]></source>
<year>2018</year>
<conf-name><![CDATA[Proceedings of the 10th International Conference on Computer Recognition Systems CORES 2017]]></conf-name>
<conf-loc> </conf-loc>
<page-range>238-47</page-range><publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B34">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Yang]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Yang]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Dyer]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[He]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Smola]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Hovy]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Hierarchical attention networks for document classification]]></source>
<year>2016</year>
<conf-name><![CDATA[Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1480-9</page-range></nlm-citation>
</ref>
<ref id="B35">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zamfir]]></surname>
<given-names><![CDATA[V.-A.]]></given-names>
</name>
<name>
<surname><![CDATA[Carabas]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Carabas]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Tapus]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
</person-group>
<source><![CDATA[Systems monitoring and big data analysis using the Elasticsearch system]]></source>
<year>2019</year>
<conf-name><![CDATA[22nd International Conference on Control Systems and Computer Science (CSCS)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>188-93</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
