Class

Article

College

College of Agriculture and Applied Sciences

Department

Plants, Soils, and Climate Department

Faculty Mentor

Rakesh Kaundal

Presentation Type

Oral Presentation

Abstract

Host-pathogen protein-protein interactions (HPIs) play vital roles in several biological processes. There is particular interest in interactions related to infectious diseases, because they are crucial for understanding the mechanism behind the infection process and, therefore, for uncovering potential targets for therapeutic approaches. Recently, efforts have been made to collect most of the HPI data available in the literature, and algorithms that transfer that knowledge to unknown systems have been implemented. However, beyond intra-species PPIs, there is no information on modeling those wide-ranging databases with machine learning techniques, which have proved well suited to summarizing complex systems. In this study, we used support vector machines to compare the prediction of HPIs using different sequence-based features. Most of the features characterize the variability between the positive database (HPIDB) and the negative database (Negatome) reasonably well: the autocorrelation features (Moreau-Broto and Geary) achieved the highest Matthews correlation coefficients (MCC) on the training set (0.94 and 0.93, respectively), and dipeptide composition achieved the highest on the testing set (0.90). Machine learning has proved useful for predicting HPIs; however, it is not clear which of the features best represents the attributes of protein-protein interactions. This could be due to the added complexity of working with a two-species (host-pathogen) system, which would require additional steps before the model is generated. Those alternatives will be explored in the future.
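
For readers unfamiliar with the terms, the following is a minimal Python sketch of two quantities named in the abstract: dipeptide composition as a sequence-based feature, and the Matthews correlation coefficient as the evaluation metric. It is illustrative only and is not the authors' pipeline. The function names are hypothetical, and concatenating the host and pathogen vectors into a single feature vector is an assumption, since the abstract does not say how the two proteins of a pair were combined.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids and the 400 ordered dipeptides built from them.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400 dipeptide frequencies of one protein sequence."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        # Pairs containing non-standard residues (e.g. X) are skipped.
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def pair_features(host_seq, pathogen_seq):
    """Assumption: represent a host-pathogen pair by concatenating both vectors."""
    return dipeptide_composition(host_seq) + dipeptide_composition(pathogen_seq)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: report 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

A vector produced by `pair_features` could then be passed to any support vector machine implementation. MCC ranges from -1 to 1, so the reported values of 0.90 to 0.94 indicate strong agreement between predicted and true labels.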

Location

Room 155

Start Date

4-11-2019 1:30 PM

End Date

4-11-2019 2:45 PM

Included in

Life Sciences Commons


Prediction of Host-Pathogen Protein-Protein Interactions Using Machine-Learning
