Abstract
Background: Protein S-sulfenylation plays a critical role in both physiology and pathology, so detecting S-sulfenylated proteins in cells is of great value to the medical and life sciences. Several computational methods have been developed to predict S-sulfenylation sites, but their prediction performance remains unsatisfactory.
Method: We developed a computational method that predicts S-sulfenylation sites by representing the sequence segments around these sites with differences in physicochemical properties. We partitioned the training set with a clustering method and built an ensemble classifier on the resulting partitions.
Results: Our method achieves an overall accuracy of 69.88% on the benchmark dataset. In a comparison with other state-of-the-art methods, it performs better than all of them.
Conclusion: We propose a computational method for predicting S-sulfenylation sites that outperforms other state-of-the-art methods.
Keywords: S-sulfenylation sites, physicochemical property differences, training set partitioning, voting scheme, sequence segments, ensemble classifier.
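
The pipeline summarized in the Method paragraph (difference-based encoding, clustering-based partition of the training set, voting ensemble) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Kyte-Doolittle hydropathy index stands in for the physicochemical properties, "difference" is read as the difference between consecutive residues, only the negative samples are clustered (with plain k-means), and each ensemble member is a simple nearest-centroid rule. The function names are hypothetical.

```python
# Illustrative sketch only. The property scale, the feature encoding, the
# clustering step and the base classifier are all assumptions, not the
# method of the paper.

# Kyte-Doolittle hydropathy index, used as one example property.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def difference_features(segment):
    """Property differences between consecutive residues (assumed encoding)."""
    values = [KD[res] for res in segment]
    return [b - a for a, b in zip(values, values[1:])]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(points, k, iterations=20):
    """Plain k-means seeded with the first k points; returns non-empty clusters."""
    centroids = [list(p) for p in points[:k]]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(len(centroids))]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return [c for c in clusters if c]


def train_ensemble(positives, negatives, k):
    """One nearest-centroid member per negative cluster, sharing all positives."""
    pos_centroid = mean(positives)
    return [(pos_centroid, mean(cluster)) for cluster in kmeans(negatives, k)]


def predict(ensemble, x):
    """Majority vote: 1 = predicted site, 0 = predicted non-site (ties give 0)."""
    votes = sum(1 for pos_c, neg_c in ensemble
                if sq_dist(x, pos_c) < sq_dist(x, neg_c))
    return 1 if 2 * votes > len(ensemble) else 0
```

In this sketch, every segment must have the same length and contain only the 20 standard residues. A caller would encode each segment with `difference_features`, pass the encoded positive and negative vectors to `train_ensemble`, and classify a new encoded segment with `predict`. Partitioning only the negatives is one common way to handle class imbalance; whether the paper partitions the data this way is not stated in the abstract.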