Abstract:
HIV-1 protease has been the subject of intense research for deciphering HIV-1 virus replication process for decades. Knowledge of the substrate specificity of HIV-1 protease will enlighten the way of development of HIV-1 protease inhibitors. In the prediction of HIV-1 protease cleavage site techniques, various feature encoding techniques and machine learning algorithms have been used frequently. In this paper, a new feature amino acid encoding scheme is proposed to predict HIV-1 protease cleavage sites. In the proposed method, we combined orthonormal encoding and Taylor's venn-diagram. We used linear support vector machines as the classifier in the tests. We also analyzed our technique by comparing some feature encoding techniques. The tests are carried out on PR-1625 and PR-3261 datasets. Experimental results show that our amino acid encoding technique leads to better classification performance than other encoding techniques on a standalone classifier.