SciELO - Scientific Electronic Library Online

 
Revista Facultad de Ingeniería Universidad de Antioquia

Print version ISSN 0120-6230

Abstract

JARAMILLO-GARZON, Jorge Alberto; CASTELLANOS-DOMINGUEZ, César Germán and PERERA-LLUNA, Alexandre. Applicability of semi-supervised learning assumptions for gene ontology terms prediction. Rev.fac.ing.univ. Antioquia [online]. 2016, n.79, pp.19-32. ISSN 0120-6230. https://doi.org/10.17533/udea.redin.n79a03.

Gene Ontology (GO) is one of the most important resources in bioinformatics, aiming to provide a unified framework for the biological annotation of genes and proteins across all species. Predicting GO terms is an essential task in bioinformatics, but in many cases the number of labelled proteins available is insufficient for training reliable machine learning classifiers. Semi-supervised learning methods offer a powerful solution: they exploit the information contained in unlabelled data to improve on the estimates of traditional supervised approaches. However, semi-supervised learning methods must make strong assumptions about the nature of the training data, so the performance of the predictor depends heavily on whether those assumptions hold. This paper analyses the applicability of semi-supervised learning assumptions to the specific task of GO term prediction, with the aim of providing criteria for choosing the most suitable tools for specific GO terms. The results show that semi-supervised approaches significantly outperform traditional supervised methods and that the highest performance is reached when applying the cluster assumption. In addition, the experiments show that the cluster and manifold assumptions are complementary, and an analysis is provided of which GO terms are more likely to be correctly predicted under each assumption.

Keywords: Semi-supervised learning; gene ontology; support vector machines; protein function prediction.
