What is it about?

In recent years, there has been growing interest in using machine learning for compound-protein interaction prediction, which is a key part of many pharmacological applications. However, the interaction profiles of most compound-protein pairs are still unknown: interaction databases contain only a small number of positive pairs, and negative pairs are rarely reported. The present study offers an alternative strategy that addresses both the absence of reliable negatives and the data imbalance between ground-truth positives and unlabeled compound-protein pairs.

Why is it important?

The results show that the quasi-supervised learning algorithm can accurately predict the interaction status of unlabeled compound-protein pairs without requiring experimentally validated negative interactions, which are rarely found in the literature. Moreover, it is robust against data imbalance between ground-truth positives and unlabeled compound-protein pairs. Lastly, it can operate directly on the similarity structure between protein and compound pairs, without requiring a feature vector representation for either proteins or compounds, a common requirement of most other methods.
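To make the idea of working from similarities alone more concrete, here is a minimal Python sketch. It is a simplified illustration and not the algorithm from the paper: the function names, the product rule for combining compound and protein similarities, the nearest-neighbor voting scheme, and the toy data are all assumptions made for this example. Each round draws the same number of reference pairs from the known positives and from the unlabeled pairs, so the much larger unlabeled group cannot dominate the vote, and no feature vectors or confirmed negatives are used.

```python
import numpy as np

def pair_similarity(compound_sim, protein_sim, pairs):
    """Similarity between compound-protein pairs, taken here as the product of
    compound-compound and protein-protein similarity (an assumed combination rule)."""
    c = np.array([p[0] for p in pairs])
    t = np.array([p[1] for p in pairs])
    return compound_sim[np.ix_(c, c)] * protein_sim[np.ix_(t, t)]

def positive_scores(sim, is_positive, n_ref=2, n_rounds=200, seed=0):
    """Fraction of rounds in which a pair's most similar reference pair is a
    known positive. Hypothetical helper, not the paper's estimator."""
    is_positive = np.asarray(is_positive, dtype=bool)
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(is_positive)
    unl = np.flatnonzero(~is_positive)
    votes = np.zeros(sim.shape[0])
    for _ in range(n_rounds):
        # Balanced draw: n_ref references from each group, whatever their sizes.
        refs = np.concatenate([
            rng.choice(pos, size=n_ref, replace=False),
            rng.choice(unl, size=n_ref, replace=False),
        ])
        s = sim[:, refs].copy()
        # A reference pair must not be matched to itself.
        s[refs, np.arange(refs.size)] = -np.inf
        nearest = s.argmax(axis=1)
        # The first n_ref reference columns are the positive references.
        votes += nearest < n_ref
    return votes / n_rounds

# Toy data (made up): compounds 0-2 resemble each other, as do compounds 3-5.
compound_sim = np.full((6, 6), 0.1)
compound_sim[:3, :3] = 0.9
compound_sim[3:, 3:] = 0.9
np.fill_diagonal(compound_sim, 1.0)
protein_sim = np.array([[1.0, 0.2], [0.2, 1.0]])

pairs = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 1), (5, 1)]
is_positive = np.array([True, True, False, False, False, False])

sim = pair_similarity(compound_sim, protein_sim, pairs)
scores = positive_scores(sim, is_positive)
print(scores)
```

In this toy setting, the unlabeled pair (2, 0) closely resembles the two known positives and so receives a high score, while the pairs involving protein 1 receive low scores. Only the similarity matrices are consulted throughout.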

Read the Original

This page is a summary of: Quasi‐Supervised Strategies for Compound‐Protein Interaction Prediction, Molecular Informatics, November 2021, Wiley, DOI: 10.1002/minf.202100118.
