Open Academic Archive System

A new sequence based encoding for prediction of host-pathogen protein interactions


dc.contributor.authors Kosesoy, I; Gok, M; Oz, C;
dc.date.accessioned 2020-10-16T10:23:16Z
dc.date.available 2020-10-16T10:23:16Z
dc.date.issued 2019
dc.identifier.citation Kosesoy, I; Gok, M; Oz, C. (2019). A new sequence based encoding for prediction of host-pathogen protein interactions. COMPUTATIONAL BIOLOGY AND CHEMISTRY, 78, 170-177
dc.identifier.issn 1476-9271
dc.identifier.uri https://doi.org/10.1016/j.compbiolchem.2018.12.001
dc.identifier.uri https://hdl.handle.net/20.500.12619/69630
dc.description.abstract Pathogen-host interactions are very important to figure out the infection process at the molecular level, where pathogen proteins physically bind to human proteins to manipulate critical biological processes in the host cell. Data scarcity and data unavailability are two major problems for computational approaches in the prediction of pathogen-host interactions. Developing a computational method to predict pathogen-host interactions with high accuracy, based on protein sequences alone, is of great importance because it can eliminate these problems. In this study, we propose a novel and robust sequence based feature extraction method, named Location Based Encoding, to predict pathogen-host interactions with machine learning based algorithms. In this context, we use Bacillus anthracis and Yersinia pestis data sets as the pathogen organisms and human proteins as the host model to compare our method with sequence based protein encoding methods that are widely used in the literature, namely amino acid composition, amino acid pair, and conjoint triad. We use these encoding methods with decision tree (Random Forest, J48), statistical (Bayesian Networks, Naive Bayes), and instance based (kNN) classifiers to predict pathogen-host interactions. We conduct different experiments to evaluate the effectiveness of our method. We obtain the best results among all the experiments with the RF classifier in terms of F1, accuracy, MCC, and AUC.
dc.language English
dc.publisher ELSEVIER SCI LTD
dc.subject Computer Science
dc.title A new sequence based encoding for prediction of host-pathogen protein interactions
dc.type Article
dc.identifier.volume 78
dc.identifier.startpage 170
dc.identifier.endpage 177
dc.contributor.department Sakarya University/Faculty of Computer and Information Sciences/Department of Computer Engineering
dc.contributor.saüauthor Öz, Cemil
dc.relation.journal COMPUTATIONAL BIOLOGY AND CHEMISTRY
dc.identifier.wos WOS:000459524900019
dc.identifier.doi 10.1016/j.compbiolchem.2018.12.001
dc.identifier.eissn 1476-928X
dc.contributor.author Irfan Kosesoy
dc.contributor.author Murat Gok
dc.contributor.author Öz, Cemil
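
The abstract names amino acid composition as one of the baseline sequence encodings. A minimal sketch of that baseline follows, assuming the common definition (the fraction of each of the 20 standard amino acids in a sequence). The function names, the handling of non-standard residues, and the concatenation of pathogen and host vectors into one pair feature vector are illustrative assumptions, not details taken from the paper, and this is not the proposed Location Based Encoding.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Characters outside the 20 standard amino acids are ignored
    (an assumption made for this sketch).
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pair_features(pathogen_seq, host_seq):
    """Concatenate the two composition vectors into one 40-dimensional
    feature vector for a pathogen-host protein pair (hypothetical helper)."""
    return amino_acid_composition(pathogen_seq) + amino_acid_composition(host_seq)
```

A vector built this way could be passed to any of the classifier families listed in the abstract, such as a random forest.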

