What is it about?
We use open-source fingerprints and a Bayesian algorithm to build thousands of computational models from data in ChEMBL, a very large public dataset. We demonstrate cross-validation of these models, make them openly accessible, and show how they can be imported into a mobile app and used for predictions.
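As a rough illustration of the kind of workflow described above, the following sketch trains a Laplacian-corrected naive Bayesian model on fingerprint bits and estimates its quality by leave-one-out cross-validation. It makes several assumptions that this summary does not confirm: the Laplacian-corrected variant is only one common form of Bayesian model for fingerprints, the fingerprints are represented as plain sets of integer feature ids, and all function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def train_laplacian_bayes(fingerprints, labels):
    """Return a per-feature contribution table (assumed model form).

    fingerprints: list of sets of integer feature ids (hypothetical format)
    labels: list of booleans, True for active
    """
    base_rate = sum(labels) / len(labels)  # fraction of actives overall
    seen = Counter()          # molecules containing each feature
    seen_active = Counter()   # active molecules containing each feature
    for fp, active in zip(fingerprints, labels):
        for bit in fp:
            seen[bit] += 1
            if active:
                seen_active[bit] += 1
    # Laplacian-corrected log ratio: rarely seen features stay close to zero.
    return {
        bit: math.log((seen_active[bit] + 1.0) / (seen[bit] * base_rate + 1.0))
        for bit in seen
    }


def predict(model, fp):
    """Sum the contributions of known features; unseen features add nothing."""
    return sum(model.get(bit, 0.0) for bit in fp)


def roc_auc(scores, labels):
    """Probability that a random active outscores a random inactive."""
    pos = [s for s, a in zip(scores, labels) if a]
    neg = [s for s, a in zip(scores, labels) if not a]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_out_auc(fingerprints, labels):
    """Score each molecule with a model trained on all the others."""
    scores = []
    for i in range(len(labels)):
        model = train_laplacian_bayes(
            fingerprints[:i] + fingerprints[i + 1:],
            labels[:i] + labels[i + 1:],
        )
        scores.append(predict(model, fingerprints[i]))
    return roc_auc(scores, labels)


# Toy data: feature 1 marks actives, feature 2 marks inactives.
toy_fps = [{1, 10}, {1, 11}, {1, 12}, {2, 10}, {2, 11}, {2, 12}]
toy_labels = [True, True, True, False, False, False]
toy_auc = leave_one_out_auc(toy_fps, toy_labels)
```

On real data the feature ids would come from a fingerprinting toolkit, and a k-fold split would usually replace leave-one-out for larger datasets.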
Why is it important?
We are not aware of anyone else using ChEMBL in this way with open-source technologies and making the resulting thousands of models accessible. In addition, we describe a novel algorithm for detecting active/inactive thresholds in continuous data. Finally, we assess the effect of folding on the fingerprints.
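To make the idea of folding concrete, here is a minimal sketch that maps sparse hashed feature ids into a smaller fixed number of bit positions. The summary does not say how folding is performed, so the bitmask approach, the power-of-two size restriction, the function name, and the example hash values are all assumptions chosen for illustration.

```python
def fold_fingerprint(hashes, n_bits):
    """Fold sparse hashed feature ids into n_bits positions (assumed scheme).

    hashes: iterable of integer feature hashes (hypothetical values)
    n_bits: target size, must be a power of two so a bitmask can be used
    """
    if n_bits <= 0 or n_bits & (n_bits - 1):
        raise ValueError("n_bits must be a positive power of two")
    mask = n_bits - 1
    # Keep only the low bits; distinct hashes may collide on the same position.
    return {h & mask for h in hashes}


example_hashes = {0x1A2B3C4D, 0x0000044D, 0x7FFF0001, 5}
folded = fold_fingerprint(example_hashes, 1024)
# Two of the four hashes share their low 10 bits, so folding loses one feature.
```

Smaller folded sizes save memory but increase collisions, which is the trade-off that an assessment of folding has to measure.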
Read the Original
This page is a summary of: Open Source Bayesian Models. 2. Mining a “Big Dataset” To Create and Validate Models with ChEMBL, Journal of Chemical Information and Modeling, June 2015, American Chemical Society (ACS),
DOI: 10.1021/acs.jcim.5b00144.
Resources
Accompanying Data
A huge collection of models extracted then built from ChEMBL, each with a graphical summary.
Mining 'Bigger' Datasets to Create, Validate and Share Machine Learning Models
Talk at EPA 2015
Mining Big datasets to create and validate machine learning models
Presentation at ACS Boston 2015
Bigger Data to Increase Drug Discovery
A talk given at the International Congress "Contrasts in Pharmacology 2.0", held in Turin in 2015
Open Source Bayesian Models (X2)
Blog post