This Week In Cheminformatics: Issue #018
A practical guide to molecular docking, stoichiometrically-informed symbolic regression, DiffDock-Glide and a long list of papers
Highlights
A Practical Guide to Molecular Docking
My bachelor’s professor and compchem wiz, Elvis Martis, has a book coming out! Excited to read it :)
Stoichiometrically-informed symbolic regression for extracting chemical reaction mechanisms from data
This paper introduces a stoichiometrically-informed symbolic regression (SISR) method that caught my eye for how it processes time-series concentration data to extract reaction mechanisms and rate constants. What makes SISR interesting is that, unlike standard dynamical system identifiers like SINDy, it bakes physical constraints directly into the search space by encoding stoichiometric rules, and it evaluates candidate mechanisms through a genetic algorithm that balances mean squared error against expression tree complexity. This allows the model to circumvent common pitfalls of purely mathematical regressions, such as generating negative concentrations, overfitting noisy data, or artificially pruning slow processes in stiff chemical networks. It reminds me of the famous PySR package I used to use back when everyone thought Julia was the next big thing.
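To make the idea concrete, here is a minimal sketch of how a candidate mechanism could be scored. It is not the paper's implementation: the function names, the explicit-Euler integrator, the term count used as a stand-in for expression tree complexity, and the `penalty` weight are all my own assumptions for illustration.

```python
# Toy scoring of a candidate reaction mechanism (illustrative assumptions only).

def mass_action_rates(conc, reactant_orders, rate_constants):
    """Rate of each reaction under mass action: k * prod(c_i ** order_i)."""
    rates = []
    for orders, k in zip(reactant_orders, rate_constants):
        rate = k
        for c, order in zip(conc, orders):
            rate *= c ** order
        rates.append(rate)
    return rates


def simulate(c0, stoich, reactant_orders, rate_constants, dt, n_steps):
    """Explicit-Euler integration of dc_i/dt = sum_j stoich[j][i] * rate_j.

    stoich[j][i] is the net change of species i in reaction j, so every
    candidate conserves mass by construction.
    """
    conc = list(c0)
    trajectory = [list(conc)]
    for _ in range(n_steps):
        rates = mass_action_rates(conc, reactant_orders, rate_constants)
        conc = [
            c + dt * sum(stoich[j][i] * rates[j] for j in range(len(rates)))
            for i, c in enumerate(conc)
        ]
        trajectory.append(list(conc))
    return trajectory


def fitness(observed, predicted, n_terms, penalty=1e-3):
    """Mean squared error plus a complexity penalty (lower is better).

    n_terms is a hypothetical stand-in for expression tree complexity.
    """
    n_values = sum(len(row) for row in observed)
    squared_error = sum(
        (o - p) ** 2
        for obs_row, pred_row in zip(observed, predicted)
        for o, p in zip(obs_row, pred_row)
    )
    return squared_error / n_values + penalty * n_terms


# "Data": A -> B with k = 0.5, species ordered as [A, B].
observed = simulate([1.0, 0.0], [[-1, 1]], [[1, 0]], [0.5], dt=0.01, n_steps=200)

# Two candidates with the same mechanism but different rate constants.
right_k = simulate([1.0, 0.0], [[-1, 1]], [[1, 0]], [0.5], dt=0.01, n_steps=200)
wrong_k = simulate([1.0, 0.0], [[-1, 1]], [[1, 0]], [0.1], dt=0.01, n_steps=200)

print(fitness(observed, right_k, n_terms=1))
print(fitness(observed, wrong_k, n_terms=1))
```

A genetic algorithm would mutate the stoichiometric rows and rate constants and keep the candidates with the lowest score. Because each candidate is built from a stoichiometric matrix rather than free-form terms, total mass stays constant along every trajectory.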
DiffDock-Glide: A Hybrid Physics-Based and Data-Driven Approach to Molecular Docking
Herron et al. introduced DiffDock-Glide, a hybrid model addressing the out-of-distribution generalization limitations of purely data-driven docking methods. The authors modified DiffDock’s generative dynamics by applying a flat-bottomed harmonic potential exclusively to the translational component. This allows for inference-time binding site conditioning without retraining the base neural network. To resolve the geometric distortions and steric clashes typical of diffusion samples, they decoupled sampling from scoring, using Glide’s post-docking minimization pipeline under the OPLS-AA force field to relax and rank candidate poses. Benchmarking demonstrated that this hybrid strategy achieves virtual screening enrichments on the DUD-E dataset against AlphaFold2 structures that surpass both standard Glide and DiffDock alone!
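For readers who have not met a flat-bottomed harmonic potential before, here is a small sketch of the general form. The function names, the spherical site definition and the parameters `radius` and `k` are my own assumptions, and how the authors couple the restraint into DiffDock's translational update is not shown here.

```python
import math


def flat_bottom_energy(pos, center, radius, k):
    """Zero inside the sphere, harmonic in the distance beyond its surface."""
    excess = max(0.0, math.dist(pos, center) - radius)
    return 0.5 * k * excess ** 2


def flat_bottom_force(pos, center, radius, k):
    """Negative gradient of the energy: pulls back toward the sphere."""
    d = math.dist(pos, center)
    if d <= radius:
        # Inside the flat region there is no bias at all.
        return [0.0, 0.0, 0.0]
    scale = -k * (d - radius) / d
    return [scale * (p - c) for p, c in zip(pos, center)]


site_center = [0.0, 0.0, 0.0]  # hypothetical binding site centre
inside = [0.5, 0.0, 0.0]       # ligand centroid inside the site
outside = [3.0, 0.0, 0.0]      # ligand centroid 2 units past the boundary

print(flat_bottom_energy(inside, site_center, radius=1.0, k=2.0))
print(flat_bottom_energy(outside, site_center, radius=1.0, k=2.0))
print(flat_bottom_force(outside, site_center, radius=1.0, k=2.0))
```

The appeal of the flat bottom is that poses already inside the site feel no bias, so the learned model is left alone there, while poses that drift away are pulled back with a force that grows linearly with the overshoot.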
Long List
Cheminformatics
EAC-Net: Predicting Real-Space Charge Density via Equivariant Atomic Contributions
A Transformer for Reaction-Aware Compound Explorations with GFlowNet in QSAR-Guided Molecular Design
Deuterated pH Probes Based on Ketoglutarate for Magnetic Resonance
A Comparative Study of QSPR Methods on a Unique Multitask PAMPA Data Set
XRD-VisionTransformer: Effective Multiphase Identification Framework for X-ray Diffraction Patterns
Rega: A Platform for the Prediction of the Regioselectivity of C–H Functionalization Reactions
Advancing Reproducibility and Open Data in Theoretical and Computational Chemistry
SuAVE-Scatter: A Module for Integrating Simulations and SAXS/SANS Analyses
Enzymatic Strategies for Nitrogen–Nitrogen Bond Formation in Natural Product Biosynthesis
Relating Model Performance to Embedding Distributions in Molecular Machine Learning
QSAR-ME Profiler 2025: A New Software for QSA(P)R Predictions Supported by Structural Analysis
Fragment-Aware Contrastive Learning Framework for Molecular Property Prediction
Current Insights on Skin Permeability Data and Quantitative Structure‐Property Relationship Modeling
Ten common mistakes that could ruin your enrichment analysis
Phenotype Classification of Intact Cells by NMR Spectroscopy through Machine Learning Approaches
Ensemble Analyzer: An Open-Source Python Framework for Automated Conformer Ensemble Refinement
LapGAT: A Semi‐Supervised Learning Framework for Drug–Target Interaction Prediction
Developing and Benchmarking Sage 2.3.0 with the AshGC Neural Network Charge Model
CrystalX: High-Accuracy Crystal Structure Analysis Using Deep Learning
MedChem
Other
Palate Cleanser
why do I do this every week,
Manas