This Week In Cheminformatics: Issue #009
New Ligand Target Prediction Models, Rules on Model Validation, and a Long List of Papers
Highlights
Polypharmacology Browser PPB3: A Web-Based Deep Learning Tool for Target Prediction Using ChEMBL Data
The latest update to the Reymond group’s polypharmacology browser, PPB3, expands the scope of web-based target prediction beyond simple protein-ligand interactions. While most tools focus strictly on single proteins, PPB3 uses ChEMBL 34 data to cover over 7,500 targets, including protein complexes, cell lines, and whole organisms, which is a crucial addition for flagging general toxicities early in the pipeline. Technically, it’s a streamlined DNN that maps 4,096-bit substructure fingerprints to a 7,546-bit target activity vector. What makes this particularly interesting for us is its performance on low-similarity queries: case studies on novel JAK inhibitors and chiral azepanes show that the consensus model often captures valid targets even when Tanimoto similarities to the training set are around 0.4. Give it a try here!
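For readers who want a feel for what "Tanimoto similarity around 0.4" means, here is a minimal sketch in plain Python. It is not PPB3's code: the fingerprints are toy sets of on-bit indices that I made up, and `nearest_training_similarity` is a hypothetical helper name. It only shows how the similarity of a query to its closest training compound would be computed.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_training_similarity(query: Set[int], training: Iterable[Set[int]]) -> float:
    """Highest Tanimoto similarity between the query and any training fingerprint."""
    return max((tanimoto(query, fp) for fp in training), default=0.0)


# Toy data (assumption): on-bit indices in a 4,096-bit fingerprint.
query = {1, 5, 9, 200}
training = [
    {1, 5, 300},           # shares 2 bits with the query, union of 5 bits
    {9, 7, 8, 3, 2, 6},    # shares 1 bit with the query, union of 9 bits
]

nearest = nearest_training_similarity(query, training)
print(f"nearest training similarity: {nearest:.2f}")
```

In this toy case the closest training compound shares only two of five distinct bits with the query, which is the low-similarity regime the case studies refer to.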
A Set of Rules for Model Validation
This paper gives five rules for model validation. They align with everything else I’ve read over the years, but I’d like to point out that model validation is a futile exercise unless it is done with the explicit goal of measuring the model’s utility for a given use case, which most often is not how it is done. The test set should be representative of the use case, yet the final model is then trained on the test set as well, which makes very little sense to people who are not from the field. Model validation is, after all, done to reduce the attrition rate. Does it work in real life? Maybe!
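The oddity about the test set is easier to see in code. This is a toy sketch, not anything from the paper: the "model" is a hypothetical stand-in that always predicts the training mean, and the numbers are made up. The error is measured on a model that never saw the test set, and then a different model, refit on everything, is the one that ships.

```python
from statistics import mean


def fit_mean(ys):
    """Hypothetical stand-in model: always predicts the mean of its training labels."""
    m = mean(ys)
    return lambda: m


def mae(model, ys):
    """Mean absolute error of the constant prediction against the labels ys."""
    pred = model()
    return mean(abs(y - pred) for y in ys)


# Made-up activity values (assumption), split into train and test.
train_y = [5.0, 6.0, 7.0]
test_y = [6.0, 8.0]

# Step 1: estimate performance with a model that has not seen the test set.
holdout_model = fit_mean(train_y)
holdout_mae = mae(holdout_model, test_y)

# Step 2: refit on all data, test set included. This model is deployed,
# but the reported error still comes from holdout_model above.
final_model = fit_mean(train_y + test_y)

print(f"reported MAE: {holdout_mae:.2f}")
print(f"holdout model predicts {holdout_model():.2f}, final model predicts {final_model():.2f}")
```

The reported number describes `holdout_model`, while `final_model` is never scored on anything it has not seen. That is only defensible if the test set really looks like the compounds the model will meet in use.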
Long List
Cheminformatics
A machine learning-based pharmacokinetics predictor (EGFR-PROPK) for EGFR-targeting PROTACs
Scientific Knowledge Graph and Ontology Generation using Open Large Language Models
Deep Set Model for the Automated NMR Fingerprinting of Unknown Mixtures
Multi-Score Reinforcement Learning for High-Tg Polyimide Design
PROTAC-Splitter: a machine learning framework for automated identification of PROTAC substructures
Cluster-Validated Graph Neural Network for P-gp Substrate Prediction from Public Data
Addressing model overcomplexity in drug-drug interaction prediction with molecular fingerprints
Toxicological Evaluation of Ionic Liquids: QSAR Approach for Acetylcholinesterase Enzyme Inhibition
EdSr: A Novel End-to-End Approach for State-Space Sampling in Molecular Dynamics Simulation
MedChem
Other
Should We Teach FAD(H2) Is an Electron Carrier or a Cocatalyst, and Why Does It Matter?
How Alkali Metal Alkoxides Initiate Organic Radical Reactions
Palate Cleanser
stay alive,
Manas



























