Abstract (EN):
Alignment-free classifiers are especially useful in the functional classification of protein classes with variable homology and different domain structures. Thus, the Topological Indices to BioPolymers (TI2BioP) methodology (Aguero-Chapin et al., 2010) inspired in both the TOPS-MODE and the MARCH-INSIDE methodologies allows the calculation of simple topological indices (TIs) as alignment-free classifiers. These indices were derived from the clustering of the amino acids into four classes of hydrophobicity and polarity revealing higher sequence-order information beyond the amino acid composition level. The predictability power of such TIs was evaluated for the first time on the RNase Ill family, due to the high diversity of its members (primary sequence and domain organization). Three non-linear models were developed for RNase Ill class prediction: Decision Tree Model (DIM), Artificial Neural Networks (ANN)-model and Hidden Markov Model (HMM). The first two are alignment-free approaches, using Its as input predictors. Their performances were compared with a non-classical HMM, modified according to our amino acid clustering strategy. The alignment-free models showed similar performances on the training and the test sets reaching values above 90% in the overall classification. The non-classical HMM showed the highest rate in the classification with values above 95% in training and 100% in test Although the higher accuracy of the HMM, the DIM showed simplicity for the RNase Ill classification with low computational cost. Such simplicity was evaluated in respect to HMM and ANN models for the functional annotation of a new bacterial RNase Ill class member, isolated and annotated by our group.
Language:
English
Type (Professor's evaluation):
Scientific
Contact:
aantunes@ciimar.up.pt
No. of pages:
12