Revista EIA
Print version ISSN 1794-1237
Abstract
BEDOYA, Óscar. REMOTE PROTEIN HOMOLOGY DETECTION USING PHYSICOCHEMICAL PROPERTIES. Rev.EIA.Esc.Ing.Antioq [online]. 2017, n.27, pp.111-125. ISSN 1794-1237.
A new method for remote protein homology detection, called CDA (Characteristic Distribution Analysis), is presented. For each protein, CDA computes the distributions of the physicochemical properties of its amino acids. Given the training sequences of a SCOP (Structural Classification Of Proteins) family, a characteristic distribution is obtained by averaging the distributions of its proteins. The hypothesis of this research is that each protein family F has a characteristic distribution that separates its sequences from the rest of the proteins in a dataset. Since there are many properties, close to 554 in the AAindex, a set of 72 physicochemical properties was selected to create different characteristic distributions of the same family. Each characteristic distribution is used as a classifier. Finally, a Naive Bayes classifier is trained to combine the outputs of the individual classifiers and reach a better decision. We found that each family has its own set of physicochemical properties that best discriminate its sequences. CDA achieves a True Positive (TP) rate of 0.793, a False Positive (FP) rate of 0.005, and a Receiver Operating Characteristic (ROC) area of 0.918. The CDA method outperforms some current strategies, such as SVM-PCD and SVM-RQA.
Keywords: Remote Homology Detection; Physicochemical Properties; SCOP Family.
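The idea of a characteristic distribution can be illustrated with a minimal sketch. The abstract does not say how a distribution is represented or how a sequence is compared against it, so everything concrete here is an assumption made for illustration: a distribution is taken to be a normalized histogram of property values, the comparison is an L1 distance with a hand-picked threshold, and the Kyte-Doolittle hydropathy scale stands in for one of the 72 selected properties. The function names are hypothetical, and the final Naive Bayes combination step is omitted.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale, used here as a stand-in for one
# physicochemical property (assumption: the paper's 72 properties are
# not listed in the abstract).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def property_distribution(seq, scale, bins=5, lo=-4.5, hi=4.5):
    """Normalized histogram of property values over a sequence (assumed form)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    total = 0
    for aa in seq:
        if aa not in scale:
            continue  # skip unknown or ambiguous residues
        idx = min(int((scale[aa] - lo) / width), bins - 1)
        counts[idx] += 1
        total += 1
    if total == 0:
        return [0.0] * bins
    return [c / total for c in counts]


def characteristic_distribution(train_seqs, scale, bins=5):
    """Average the per-protein distributions of a family, bin by bin."""
    dists = [property_distribution(s, scale, bins) for s in train_seqs]
    return [mean(col) for col in zip(*dists)]


def l1_distance(p, q):
    """Sum of absolute bin differences (assumed comparison measure)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def is_member(seq, char_dist, scale, threshold, bins=5):
    """Single-property classifier: accept if the sequence is close enough."""
    dist = property_distribution(seq, scale, bins)
    return l1_distance(dist, char_dist) <= threshold


# Toy usage with made-up sequences (not SCOP data).
family = ["AILV", "ILVA"]
char = characteristic_distribution(family, HYDROPATHY)
print(is_member("AILV", char, HYDROPATHY, threshold=0.5))
print(is_member("RKDE", char, HYDROPATHY, threshold=0.5))
```

In the method described above, one such classifier would be built per property, and their decisions would then be fed to a Naive Bayes classifier; that combination is not shown here.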