Abstract (EN):
Predicting new proteins and enzymes is a major goal in drug development. In this chapter we introduce a new methodology for predicting enzyme subclasses, based on a new 2D approach. In this context, Randic, Liao, Nandy, Basak, and many others have developed special types of graph-based representations in which sequences undergo a pseudofolding process guided by simple heuristics. These heuristics impose geometrical constraints on node positioning (sequence pseudofolding rules) in 2D space, leading to final geometrical shapes that resemble lattice-like patterns. Lattice networks have been used in the past to visually depict DNA and protein sequences, but the approach is very flexible: the same technique can create pseudofolding lattice representations for any kind of string data. We carried out a statistical analysis of more than 50,000 cases to seek and validate a new quantitative structure-activity relationship (QSAR)-like predictor for enzyme subclasses using a machine learning approach. The model uses spectral moments, entropy, and mean potential of pseudofolding lattice graphs as inputs. We report the five best models found.
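As a rough illustration of the kind of descriptor the abstract describes, the sketch below folds a protein sequence onto a 2D lattice and computes spectral moments and an entropy value from the resulting graph. Everything specific here is an assumption, not the chapter's method: the four-way residue grouping and its move directions in `MOVES`, the function names `lattice_graph`, `spectral_moments`, and `degree_entropy`, the use of the plain adjacency matrix for the moments (the k-th moment is taken as the trace of the k-th matrix power), and the use of the degree distribution for the entropy. The mean potential input is not covered.

```python
import math

# Hypothetical pseudofolding rule: each residue class moves one step in a fixed
# direction. The abstract does not state the rules; this grouping is an assumption.
MOVES = {}
for group, step in (("AVLIMFWPGC", (1, 0)),   # nonpolar        -> right
                    ("STYNQ", (0, 1)),        # polar uncharged -> up
                    ("DE", (-1, 0)),          # acidic          -> left
                    ("KRH", (0, -1))):        # basic           -> down
    for aa in group:
        MOVES[aa] = step


def lattice_graph(seq):
    """Walk a non-empty sequence over a 2D grid and return an adjacency matrix.

    Residues that land on the same grid point share one node, which is what
    makes the final shape lattice-like rather than a simple chain.
    """
    x, y = 0, 0
    node_of = {(0, 0): 0}          # grid point -> node index (origin is node 0)
    edges = set()
    prev = 0
    for aa in seq:
        dx, dy = MOVES[aa]
        x, y = x + dx, y + dy
        node = node_of.setdefault((x, y), len(node_of))
        edges.add((min(prev, node), max(prev, node)))
        prev = node
    n = len(node_of)
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A


def spectral_moments(A, k_max=5):
    """Return [trace(A^1), ..., trace(A^k_max)] using plain matrix products."""
    n = len(A)
    P = [row[:] for row in A]      # P holds A^k for the current k
    moments = []
    for _ in range(k_max):
        moments.append(sum(P[i][i] for i in range(n)))
        P = [[sum(P[i][t] * A[t][j] for t in range(n)) for j in range(n)]
             for i in range(n)]
    return moments


def degree_entropy(A):
    """Shannon entropy (natural log) of the normalized node-degree distribution."""
    deg = [sum(row) for row in A]
    total = sum(deg)
    return -sum((d / total) * math.log(d / total) for d in deg)


if __name__ == "__main__":
    A = lattice_graph("AKDEAKSTY")  # arbitrary toy sequence, not real data
    print(spectral_moments(A, 3), degree_entropy(A))
```

In a workflow like the one the abstract outlines, values of this kind would be computed for each sequence and passed as input features to a classifier; the choice of classifier and of the exact graph matrix is left open here.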
Language:
English
Type (Professor's evaluation):
Scientific