This Week In Cheminformatics: Issue #005
CACHE Challenge #3, Digital pipette, Bias in retrosynthesis datasets, Benchmarking self-driving labs, and a long list of papers to start the final week of January.
Highlights
CACHE Challenge #3: Targeting the Nsp3 Macrodomain of SARS-CoV-2
This paper reports the outcomes of the third Critical Assessment of Computational Hit-finding Experiments (CACHE) challenge, which tasked 23 computational teams with identifying novel inhibitors of the SARS-CoV-2 NSP3 macrodomain. The results provide a sobering but instructive benchmark for the state of in silico drug discovery: while participants successfully prioritized chemical novelty, with over 85% of the ~1,700 designed molecules being chemically distinct from known binders, the experimentally confirmed hits overwhelmingly converged back to established chemotypes. Specifically, the most potent validated hits structurally mimicked the adenine ring of the endogenous ADP-ribose substrate, preserving the canonical hydrogen-bonding motifs observed in prior crystallographic data. Technically, the challenge reinforces a trend observed in previous CACHE rounds regarding the balance between physics-based and AI methods. The most successful workflows did not rely on “black box” deep learning alone; rather, they either combined physics-based docking with ad hoc machine learning models trained for rapid library expansion, or relied exclusively on physics-based methods. Among the seven winning workflows, four were purely physics-based, and the others integrated pharmacophore searches or fragment growing, suggesting that for rigid, polar-dominated pockets like Mac1, explicit modeling of energetic interactions is more useful than pattern-matching or generative AI approaches.
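The “dock-then-learn” idea behind several winning workflows can be sketched in a few lines. This is a toy illustration only, not any CACHE team’s actual pipeline: docking scores are stubbed, molecules are hypothetical bit-vector fingerprints, and the surrogate is a simple k-nearest-neighbour regressor over Tanimoto similarity.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bits."""
    inter = len(a & b)
    union = len(a | b)
    return inter / union if union else 0.0

def knn_surrogate(train, query_fp, k=3):
    """Predict a docking score as the mean over the k most similar seeds."""
    neighbours = sorted(train, key=lambda t: -tanimoto(t[0], query_fp))[:k]
    return sum(score for _, score in neighbours) / k

# Step 1: "dock" a small seed set (scores would come from a physics engine).
seed = [
    (frozenset({1, 2, 3}), -9.1),
    (frozenset({1, 2, 4}), -8.7),
    (frozenset({5, 6, 7}), -4.2),
    (frozenset({5, 6, 8}), -3.9),
]

# Step 2: use the cheap surrogate to triage a larger virtual library.
library = [frozenset({1, 2, 5}), frozenset({5, 7, 8}), frozenset({1, 3, 4})]
ranked = sorted(library, key=lambda fp: knn_surrogate(seed, fp))

# Step 3: only the top-ranked candidates go back to expensive docking.
best = ranked[0]
```

In a real campaign the fingerprints would come from RDKit and the scores from a docking engine; the point is only that the expensive physics is spent on a seed set, then amortized over the full library via a learned surrogate.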
An exploration of dataset bias in single-step retrosynthesis prediction
This study evaluates how dataset composition, in terms of template diversity and data volume, impacts the performance of single-step retrosynthesis models. The authors benchmarked three architectures: template-based LocalRetro, graph-based MEGAN, and SMILES-based RootAligned. Analysis of the “Narrow split,” which sequentially increases the number of unique reaction templates, demonstrates that increasing training-set diversity significantly degrades top-k accuracy across all models. For instance, increasing the template count from 1,000 to 10,000 caused a drop in top-10 accuracy of between 10% and 14.5%. However, this metric is deceptive: the study finds that while the models struggle to recall the exact ground truth from a larger “vocabulary” of reactions, their chemical feasibility, measured via top-5 round-trip accuracy, improves by 14–21%. This suggests that diverse training data encourages the models to propose a broader range of chemically plausible alternatives, even if they deviate from the specific patent-literature baseline.

Conversely, the “Broad split” experiments, which increase data volume while holding template diversity constant, highlight significant differences in model robustness. LocalRetro and MEGAN proved remarkably data-efficient, showing minimal performance gains (less than 4%) despite a ninefold increase in training-set size. This implies that these models saturate quickly once the fundamental reaction rules or graph edits are learned. In contrast, the Transformer-based RootAligned model exhibited severe instability, with performance degrading by nearly 16% at intermediate data volumes before recovering. This volatility suggests that template-free sequence models may be more susceptible to overfitting or memorization when trained on smaller, chemically diverse datasets than their graph-based counterparts.
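The distinction between top-k exact-match accuracy and round-trip accuracy is worth making concrete. In the hedged sketch below, reactions are plain strings and the “forward model” is a lookup table standing in for a trained forward-prediction network; the reactant and product names are hypothetical.

```python
# Hypothetical forward model: reactant set -> product.
FORWARD = {"A.B": "P1", "C.D": "P1", "E.F": "P2"}

def top_k_accuracy(samples, k):
    """Fraction of targets whose ground-truth reactants appear in the top-k."""
    hits = sum(truth in preds[:k] for truth, preds, _ in samples)
    return hits / len(samples)

def round_trip_accuracy(samples, k):
    """Fraction of targets where ANY top-k suggestion regenerates the product."""
    hits = sum(
        any(FORWARD.get(p) == product for p in preds[:k])
        for _, preds, product in samples
    )
    return hits / len(samples)

# (ground-truth reactants, ranked model suggestions, target product)
samples = [
    ("A.B", ["C.D", "A.B"], "P1"),  # exact match only at rank 2
    ("E.F", ["C.D", "G.H"], "P2"),  # no feasible route in top-2
]

print(top_k_accuracy(samples, 1))      # 0.0 -- neither top-1 matches ground truth
print(round_trip_accuracy(samples, 1))  # 0.5 -- "C.D" still yields P1
```

The first sample is exactly the paper’s point: “C.D” is wrong by the exact-match metric but chemically valid by the round-trip one, which is why the two metrics can move in opposite directions as template diversity grows.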
Furthermore, the study provides an excellent analysis of the impact of class imbalance in open-source datasets like USPTO, where the top 10 templates account for 30% of the training data. A strong linear correlation was observed between a template’s frequency in the training set and the model’s accuracy on that template, a trend that persists even for template-free models. This indicates that models like MEGAN and RootAligned effectively learn the underlying dataset distribution rather than universal chemical principles. The findings strongly suggest that current SOTA models are limited to interpolating within the chemical space defined by their training templates, regardless of whether those templates are explicitly defined or implicitly learned.
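Checking for this kind of frequency/accuracy coupling in your own data is a one-screen exercise. The numbers below are invented for illustration (the paper’s actual per-template statistics are not reproduced here), and the log scaling of frequency is my own choice, not necessarily the paper’s:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-template stats: (training-set frequency, model accuracy).
stats = [(5000, 0.92), (1200, 0.74), (300, 0.55), (40, 0.31), (5, 0.12)]
freq = [math.log10(f) for f, _ in stats]  # frequency on a log scale
acc = [a for _, a in stats]

r = pearson(freq, acc)
```

A high `r` on held-out templates is exactly the red flag the authors describe: the model is tracking the dataset’s distribution, not chemistry.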
Benchmarking self-driving labs
This review establishes a quantitative framework for benchmarking Self-Driving Labs (SDLs) through a meta-analysis of 63 experimental and retrospective benchmarks. The authors benchmark with two metrics: the Acceleration Factor, defined as the reduction in the number of experiments needed to reach a target, and the Enhancement Factor, defined as the performance gain at a fixed experimental budget. A key counterintuitive finding is the “blessing of dimensionality”: while the median Acceleration Factor is approximately 6, this factor scales positively with the dimensionality of the parameter space, indicating that autonomous experimentation becomes increasingly efficient relative to random sampling as problem complexity grows. Analysis of performance trajectories reveals a universal heuristic for campaign planning: the utility of active learning is transient, with the Enhancement Factor consistently peaking between 10 and 20 experiments per dimension before diminishing as the reference baseline converges.
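The two metrics are easy to compute from best-so-far optimization traces. The traces below are made up, and the exact definitions in the review may differ in detail; here the Acceleration Factor is taken as a ratio of experiment counts to reach a target, and the Enhancement Factor as a ratio of best values at a fixed budget.

```python
def experiments_to_target(trace, target):
    """Number of experiments until the best-so-far value reaches target."""
    for i, best in enumerate(trace, start=1):
        if best >= target:
            return i
    return None  # target never reached

def acceleration_factor(baseline, autonomous, target):
    """How many times fewer experiments the SDL needs to hit the target."""
    return experiments_to_target(baseline, target) / experiments_to_target(autonomous, target)

def enhancement_factor(baseline, autonomous, budget):
    """Ratio of best-so-far values after a fixed number of experiments."""
    return autonomous[budget - 1] / baseline[budget - 1]

# Best-so-far objective after each experiment (invented numbers).
random_trace = [0.10, 0.15, 0.18, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80]
sdl_trace    = [0.12, 0.30, 0.55, 0.70, 0.78, 0.82, 0.85, 0.86, 0.86, 0.87]

af = acceleration_factor(random_trace, sdl_trace, target=0.55)  # baseline needs 7 runs, SDL needs 3
ef = enhancement_factor(random_trace, sdl_trace, budget=5)
```

Note how the traces also reproduce the review’s qualitative point: the SDL’s advantage is largest mid-campaign and shrinks as the random baseline eventually catches up.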
Digital pipette: open hardware for liquid transfer in self-driving laboratories
This paper introduces Digital Pipette v2, a significant upgrade to an open-source liquid handling tool designed for robotic arms in self-driving laboratories. The primary innovation is the transition from a positive displacement mechanism to an air displacement design utilizing a custom 3D-printed syringe. This change allows compatibility with standard disposable pipette tips (specifically 1–10 mL tips), effectively resolving the previous version’s limitation regarding cross-contamination when handling multiple liquids. The hardware is low-cost, totaling under $300 USD, and uses a linear actuator driven by an Arduino microcontroller to integrate with standard robot grippers. Notably, when benchmarked against experienced human operators, the digital pipette demonstrated significantly lower variance in dispensing 5 mL volumes. To support fully autonomous operation, the system includes custom 3D-printed accessories such as a tip rack and a tip remover. A novel force-feedback spiral search algorithm was implemented to facilitate precise tip attachment, reducing the need for tedious manual calibration. Pretty cool!
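The spiral-search idea is simple enough to sketch. This is a hedged toy version, not the paper’s actual controller or parameters: the arm probes points along an Archimedean spiral around the nominal tip position and stops where the measured insertion force exceeds a threshold, signalling that the tip has seated. The force sensor here is a stub function.

```python
import itertools
import math

def spiral_points(step_mm=0.5, turns=3, points_per_turn=12):
    """Yield (x, y) offsets along an Archimedean spiral around the origin."""
    for i in itertools.count():
        theta = 2 * math.pi * i / points_per_turn
        r = step_mm * theta / (2 * math.pi)  # radius grows step_mm per turn
        if r > step_mm * turns:
            return
        yield (r * math.cos(theta), r * math.sin(theta))

def find_tip(measure_force, threshold=2.0):
    """Return the first offset where the force sensor reports tip engagement."""
    for offset in spiral_points():
        if measure_force(offset) >= threshold:
            return offset
    return None  # spiral exhausted without engaging a tip

# Stub force sensor: engagement only within 0.4 mm of the true tip centre,
# which is deliberately offset from the nominal (0, 0) position.
true_centre = (0.6, 0.2)
def fake_sensor(offset):
    return 5.0 if math.dist(offset, true_centre) < 0.4 else 0.1

hit = find_tip(fake_sensor)
```

The appeal over manual calibration is that the same loop absorbs small positioning errors on every pickup, rather than requiring the offset to be measured once and hard-coded.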
Long List
Cheminformatics
ChemBERTa-3: An Open Source Training Framework for Chemical Foundation Models
Context-aware computer vision for chemical reaction state detection
Semantic repurposing model for traditional Chinese ancient formulas based on a knowledge graph
SynCat: molecule-level attention graph neural network for precise reaction classification
Democratizing machine learning in chemistry with community-engaged test sets
Mol2Raman: a graph neural network model for predicting Raman spectra from SMILES representations
Data augmentation in a triple transformer loop retrosynthesis model
HelixSide: A Comprehensive Method for Local and Global Orientational Analysis of Proteins
Understanding the Role of H-Bonds in the Stability of Molecular Glue-Induced Ternary Complexes
Understanding the Kinetic Mechanism of Ligands Stabilizing the RAS–CYPA Interaction
CompBind: Complex Guided Pretraining-Based Structure-Free Protein–Ligand Affinity Prediction
PropMolFlow: property-guided molecule generation with geometry-complete flow matching
Ligand-based machine learning models to classify active compounds for prostaglandin EP2 receptor
BIPE: Artificial Intelligence-Driven Peptide Bitterness Intensity Prediction Engine
Geometry-Enhanced Multiscale Joint Representation Learning for Drug-Target Interaction Prediction
Fine-Tuning a Transformer Model for METTL3 Lead Optimization
MedChem
Other
Palate Cleanser
Toodles,
Manas