This Week In Cheminformatics: Issue #015
A model for carbohydrate-protein interactions, AFM-based protein conformation estimation, and a long list of papers
Highlights
CLIMBS: Assessing Carbohydrate–Protein Interactions through a Graph Neural Network Classifier Using Synthetic Negative Data
Scoring protein-carbohydrate interactions with physics-based functions is non-trivial, so Luo and Parmeggiani introduce CLIMBS, a GNN classifier built specifically to evaluate these complexes. They used failed computationally designed carbohydrate binders as synthetic negative data to balance the experimentally solved positive structures. Technically, the model uses fragment-level graph pooling, which lets it learn structural motifs shared across different sugars, and it accounts for often-neglected interactions such as CH-π interactions. In their benchmarks, CLIMBS achieved a significantly higher AUPRC than standard scoring functions like Rosetta and AutoDock, while running in under a second!
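For readers who have not met fragment-level pooling before, here is a minimal sketch of the general idea: atom features are averaged within each chemical fragment, so that the same fragment type yields a comparable representation regardless of which sugar it came from. This is not the CLIMBS architecture; the function name, the toy features, and the fragment labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fragment_mean_pool(atom_features, fragment_ids):
    """Average atom feature vectors within each fragment.

    atom_features: list of equal-length feature vectors, one per atom.
    fragment_ids:  list of fragment labels, one per atom.
    Returns a dict mapping fragment label -> mean feature vector.
    """
    if len(atom_features) != len(fragment_ids):
        raise ValueError("one fragment id is required per atom")

    sums = {}
    counts = defaultdict(int)
    for feats, frag in zip(atom_features, fragment_ids):
        if frag not in sums:
            sums[frag] = [0.0] * len(feats)
        for i, value in enumerate(feats):
            sums[frag][i] += value
        counts[frag] += 1

    # Divide each accumulated sum by the number of atoms in that fragment.
    return {
        frag: [s / counts[frag] for s in total]
        for frag, total in sums.items()
    }


# Hypothetical toy input: 5 atoms with 2 features each, split into 2 fragments.
atom_features = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0], [4.0, 0.0]]
fragment_ids = ["hydroxyl", "hydroxyl", "ring", "ring", "ring"]

pooled = fragment_mean_pool(atom_features, fragment_ids)
print(pooled)
```

In a real GNN the pooled fragment vectors would feed further message-passing or readout layers; the sketch stops at the pooling step.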
Estimating Protein Conformational States from High-Speed AFM Images with Molecular Dynamics and Deep Learning
Sato et al. tackle the persistent noise and temporal-lag distortions in high-speed AFM imaging by pairing molecular dynamics with a Vision Transformer-based autoencoder, which they call DeepAFM. The authors generate robust training sets by simulating AFM images from MD snapshots, deliberately injecting Brownian motion and line-scan lag to emulate realistic experimental artifacts. When applied to the SecYAEG-nanodisc complex, the trained model focused its attention on the relevant large-scale domain motions, simultaneously denoising the images and classifying conformational states with high accuracy. Interesting read.
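To make the line-scan lag concrete: an AFM image is acquired one scan line at a time, so each row of a single image shows the molecule at a slightly later moment. The sketch below imitates that by taking each row from a later frame of a time series of height maps and optionally adding Gaussian pixel noise. It is a toy illustration under those assumptions, not the simulation pipeline from the paper; the function name, the parameters, and the noise model are hypothetical.

```python
import random


def simulate_line_scan(frames, start=0, frames_per_line=1, noise_sd=0.0, seed=0):
    """Build one pseudo-AFM image from a time series of height maps.

    frames: list of 2D height maps (lists of rows), all the same shape.
    Row r of the output is read from frame `start + r * frames_per_line`,
    clamped to the last available frame, which mimics scan-line lag.
    """
    rng = random.Random(seed)
    n_rows = len(frames[0])
    image = []
    for row in range(n_rows):
        # Later rows are imaged later, so they come from later frames.
        t = min(start + row * frames_per_line, len(frames) - 1)
        image.append([h + rng.gauss(0.0, noise_sd) for h in frames[t][row]])
    return image


# Hypothetical toy trajectory: 3 frames of a 3x2 height map,
# where every pixel in frame t has height t.
frames = [[[float(t)] * 2 for _ in range(3)] for t in range(3)]

image = simulate_line_scan(frames, frames_per_line=1, noise_sd=0.0)
print(image)
```

With noise switched off, each output row carries the height of the frame it was read from, which makes the time skew between rows easy to see.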
Long List
Cheminformatics
OPENXRD: a comprehensive benchmark framework for LLM/MLLM XRD question answering
Self-driving laboratories in Korea: a new era of autonomous discovery
Learning Potential Energy Surfaces of Hydrogen Atom Transfer Reactions in Peptides
Physics-informed machine learning for predicting temperature-dependent chemical properties
RAISE: A self-driving laboratory for interfacial property formulation discovery
NaviDiv: a web app for monitoring chemical diversity in generative molecular design
Functional Group Identification from Infrared Spectra via a Dual-Head Convolutional Network
Leveraging Auxiliary Potentials in RFDiffusion for the Design of NA-Binding Proteins
Landscape of Nucleic Acid Modifications Induced by Chemical Warfare Agents
CHAOS - A Large-scale Database for σ-Profiles and Other Molecular Descriptors
MetalKB: Predicting Metal Binding Sites on Proteins with a Knowledge-Based Graph Framework
Protein Large Language Models Can Predict Flavivirus Protease Target Specificity
IAP-CFDock: Iterative Anchor Prediction and Coarse-to-Fine Protein–Ligand Blind Docking
Energy-Guided Denoising Contrastive Learning for Molecular Property Prediction
MedChem
Other
Palate Cleanser
Theory of NMR Lectures by Prof. Keeler (cannot recommend this enough)
what,
Manas






























