COMPARISON OF SUPPORT VECTOR MACHINE (SVM) AND BACK PROPAGATION NETWORK (BPN) METHODS IN PREDICTING THE PROTEIN VIRULENCE FACTORS
Machine learning algorithms are important computational methods for extracting knowledge from data. Neural networks and support vector machines (SVM) are among the most widely adopted techniques for prediction on biological data. The availability of complete bacterial genome information and the complexity of determining virulence factors have created an urgent need for computational tools to predict virulence factors. In this study, the predictive capability and reliability of SVM and back propagation network (BPN) algorithms were assessed using cross-validation, a test widely used in statistics. The performance of the two classification methods was also compared across feature representations. SVM classifiers were trained and optimized with different kernel parameters and with sequence features: amino acid composition (AAC), dipeptide composition (DPC) and composite methods. BPN classifiers were trained on the same dataset. Ten-fold cross-validation was used to evaluate the performance of both the SVM and BPN classifiers, and the effect of the feature representation methods (AAC, DPC and composite) on their classification performance was evaluated. The SVM classifiers trained with AAC features achieved an accuracy of 79.13%, compared with 86.56% for BPN. With DPC and composite features, the prediction accuracy of SVM was almost 10% and 3% greater than that of BPN, respectively, whereas the specificity and sensitivity of SVM were lower than those of BPN. These results suggest that BPN is preferable to SVM as a classifier for predicting protein sequences based on their composition.
THIRUNAVUKKARASU M, DINAKARAN K, SATHISHKUMAR E.N AND GNANENDRA S
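
The three feature representations named in the abstract (AAC, DPC and composite) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the alphabetical ordering of the 20 standard residues, the skipping of non-standard residues, and the reading of "composite" as the concatenation of the AAC and DPC vectors are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    total = len(residues)
    counts = Counter(residues)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def dpc(sequence):
    """Dipeptide composition: fraction of each adjacent residue pair (400 values)."""
    seq = sequence.upper()
    # Keep only adjacent pairs in which both residues are standard (assumption).
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    total = len(pairs)
    counts = Counter(pairs)
    return [
        counts[a + b] / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


def composite(sequence):
    """Assumed composite representation: AAC followed by DPC (420 values)."""
    return aac(sequence) + dpc(sequence)
```

Each protein sequence is thereby mapped to a fixed-length numeric vector (20, 400 or 420 values under these assumptions), which is the kind of input an SVM or BPN classifier needs regardless of sequence length.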