SciELO - Scientific Electronic Library Online

 



Tecnura

Print version ISSN 0123-921X

Abstract

CONTRERAS CONTRERAS, Ghiordy Ferney; MEDINA DELGADO, Byron; ACEVEDO JAIMES, Brayan René and GUEVARA IBARRA, Dinael. Development Methodology of Techniques for Data Clustering Using Machine Learning. Tecnura [online]. 2022, vol.26, n.72, pp.42-58. Epub 18-Jun-2022. ISSN 0123-921X. https://doi.org/10.14483/22487638.17246.

Context:

Today, the use of large amounts of data acquired from electronic, optical, or other measurement devices and equipment poses a data analysis problem when extracting the target information from the acquired samples. Grouping the data correctly is necessary to obtain relevant and accurate information that evidences the physical phenomenon under study.

Methodology:

This work presents the development and evolution of a five-stage methodology for building data grouping techniques with machine learning and artificial intelligence. Its five phases are analysis, design, development, evaluation, and distribution. It uses open-source standards and is based on unified languages for the interpretation of software in engineering.

Results:

The methodology was validated by creating two data analysis methods, with an average execution time of 20 weeks, which obtained precision values 40% and 29% higher than the classic data grouping algorithms k-means and fuzzy c-means. Additionally, a massive experimentation methodology based on automated unit tests was developed, which grouped, labeled, and validated 3.6 million samples, accumulated over a total of 100 group runs of 900 samples, in approximately 2 hours.
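To illustrate the kind of automated group, label, and validate run described above, the following is a minimal sketch of a classic k-means baseline. It is not the authors' method. The synthetic 2-D data, the cluster centers, the helper names (`make_samples`, `kmeans`, `accuracy`, `run_batch`), and the permutation-based scoring are all assumptions made for illustration; only the run size of 900 samples comes from the abstract.

```python
import random
from itertools import permutations

def make_samples(n, centers, spread, rng):
    """Draw n synthetic 2-D points around known centers (assumed test data)."""
    points, truth = [], []
    for i in range(n):
        k = i % len(centers)
        cx, cy = centers[k]
        points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
        truth.append(k)
    return points, truth

def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: (p[0] - centroids[j][0]) ** 2 + (p[1] - centroids[j][1]) ** 2)

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centroids = rng.sample(points, k)
    assign = None
    for _ in range(iters):
        new_assign = [nearest(p, centroids) for p in points]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    return assign

def accuracy(pred, truth, k):
    """Best agreement with the true labels over all cluster-to-label mappings."""
    best = max(sum(1 for p, t in zip(pred, truth) if perm[p] == t)
               for perm in permutations(range(k)))
    return best / len(truth)

def run_batch(runs, samples_per_run, seed=0):
    """Group, label, and validate several independent runs; return scores and sample count."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]  # assumed ground-truth groups
    scores, total = [], 0
    for _ in range(runs):
        points, truth = make_samples(samples_per_run, centers, 0.5, rng)
        pred = kmeans(points, len(centers), rng)
        scores.append(accuracy(pred, truth, len(centers)))
        total += len(points)
    return scores, total

accuracies, total_samples = run_batch(runs=5, samples_per_run=900, seed=0)
```

Each run is scored against known labels, so a batch of runs can be checked by assertions in a unit test without manual inspection.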

Conclusions:

Finally, the results of the research show that the methodology is intended to guide systematic development for specific problems in quantitative databases, such as the channel parameters of a communication system or the segmentation of images using the RGB values of the pixels. Even when both software and hardware are developed, the execution will be more versatile than in cases with theoretical applications.
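As a rough illustration of the image segmentation use case mentioned above, the sketch below applies the classic fuzzy c-means algorithm to a handful of RGB pixel values. It is a baseline example, not the technique developed in the paper. The pixel values, the farthest-point initialization, the fuzzifier `m = 2`, and all function names are assumptions chosen for the example.

```python
def dist2(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centers(pixels, c):
    """Deterministic farthest-point initialization (an assumed choice)."""
    centers = [pixels[0]]
    while len(centers) < c:
        centers.append(max(pixels, key=lambda p: min(dist2(p, q) for q in centers)))
    return centers

def fuzzy_cmeans(pixels, c=2, m=2.0, iters=100, tol=1e-6):
    """Classic fuzzy c-means; returns centers and per-pixel membership rows."""
    centers = init_centers(pixels, c)
    memberships = []
    for _ in range(iters):
        # Membership update: u_j is proportional to D_j^(-1/(m-1)),
        # where D_j is the squared distance to center j.
        memberships = []
        for p in pixels:
            d = [max(dist2(p, q), 1e-12) for q in centers]  # avoid division by zero
            inv = [dj ** (-1.0 / (m - 1.0)) for dj in d]
            s = sum(inv)
            memberships.append([v / s for v in inv])
        # Center update: mean of the pixels weighted by u^m.
        new_centers = []
        for j in range(c):
            w = [u[j] ** m for u in memberships]
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[ch] for wi, p in zip(w, pixels)) / total
                for ch in range(3)))
        shift = max(dist2(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # centers stopped moving
            break
    return centers, memberships

# Assumed toy data: three reddish and three bluish pixels.
pixels = [(250, 10, 10), (240, 20, 30), (230, 5, 15),
          (10, 20, 240), (25, 15, 250), (5, 30, 230)]
centers, memberships = fuzzy_cmeans(pixels, c=2)
# Hard label per pixel: the cluster with the highest membership.
labels = [max(range(2), key=lambda j: u[j]) for u in memberships]
```

Unlike k-means, each pixel receives a degree of membership in every cluster, and a hard segmentation is obtained only at the end by taking the largest membership.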

Financing:

Universidad Francisco de Paula Santander and Universidade Federal de Minas Gerais.

Keywords: data analysis; automation; algorithm; open-source software.
