Class
Article
College
College of Agriculture and Applied Sciences
Department
Plants, Soils, and Climate Department
Faculty Mentor
Rakesh Kaundal
Presentation Type
Oral Presentation
Abstract
Host-pathogen protein-protein interactions (HPIs) play vital roles in many biological processes. Interactions related to infectious diseases are of particular interest, because they are crucial for understanding the mechanism behind the infection process and, in turn, for identifying potential targets for therapeutic approaches. Recently, efforts have been made to collect most of the HPI data present in the literature, and algorithms to transfer that knowledge to unknown systems have been implemented. However, beyond intra-species PPIs, there is no information about modeling these wide-ranging databases using machine learning techniques, which have proven well suited to summarizing complex systems. In this study, we used support vector machines to compare the prediction of HPIs using different sequence-based features. Most of the features characterize the variability between the positive database (HPIDB) and the negative database (Negatome) reasonably well, with the autocorrelation features (Moreau-Broto and Geary) achieving the highest Matthews correlation coefficients (MCC) on the training set (0.94 and 0.93, respectively) and dipeptide composition achieving the highest on the testing set (0.90). Machine learning has proven useful for predicting HPIs; however, it is not clear which of the features best represents the attributes of protein-protein interactions. This could be due to the added complexity of working with a two-species (host-pathogen) system, which would require additional steps prior to generating the model. Those alternatives will be explored in the future.
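For readers unfamiliar with the terms, the following is a minimal sketch of one sequence-based feature named in the abstract (dipeptide composition) and of the Matthews correlation coefficient used to score the models. It is not the study's code. The function names are hypothetical, and joining the host and pathogen vectors by concatenation is an assumption, since the abstract does not say how the two proteins of a pair were combined.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dipeptide_composition(seq):
    """Return the 400 dipeptide frequencies of a protein sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def pair_features(host_seq, pathogen_seq):
    """Assumption: describe a host-pathogen pair by concatenating both vectors."""
    return dipeptide_composition(host_seq) + dipeptide_composition(pathogen_seq)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom
```

Under these assumptions, each candidate pair becomes an 800-dimensional vector that could be passed to any support vector machine implementation, with interacting pairs labeled positive and non-interacting pairs labeled negative.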
Location
Room 155
Start Date
4-11-2019 1:30 PM
End Date
4-11-2019 2:45 PM
Prediction of Host-Pathogen Protein-Protein Interactions Using Machine-Learning