Graph-based diffusion models: proteins are represented as graphs.
Nodes represent atoms (often backbone atoms such as Cα, the alpha carbon; this is the red carbon) or whole residues.
Edges connect nodes that are spatially close, typically within a distance threshold (e.g., 8-10 Å). They store the orientation and the distance between the connected atoms.
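A minimal sketch of the graph construction described above, assuming one node per residue placed at its Cα coordinate. The function name `build_contact_graph`, the 8 Å default cutoff, and the use of a unit direction vector as a stand-in for "orientation" are assumptions for illustration; real models often use richer, local-frame orientation features. It also assumes no two atoms share the same coordinates.

```python
import numpy as np

def build_contact_graph(ca_coords, cutoff=8.0):
    """Connect residues whose C-alpha atoms lie within `cutoff` angstroms.

    Returns:
        edge_index: (2, E) array of directed edges (source, target).
        edge_dist:  (E,) distance stored on each edge.
        edge_dir:   (E, 3) unit vector from source to target (simple orientation proxy).
    """
    ca_coords = np.asarray(ca_coords, dtype=float)
    # diff[i, j] = x_j - x_i for every pair of nodes
    diff = ca_coords[None, :, :] - ca_coords[:, None, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Keep pairs under the cutoff, excluding self-loops
    mask = (dist < cutoff) & ~np.eye(len(ca_coords), dtype=bool)
    src, dst = np.nonzero(mask)
    edge_dist = dist[src, dst]
    edge_dir = diff[src, dst] / edge_dist[:, None]
    return np.stack([src, dst]), edge_dist, edge_dir

# Toy example: four C-alpha atoms on a line, the last one far away
coords = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0], [20.0, 0.0, 0.0]]
edge_index, edge_dist, edge_dir = build_contact_graph(coords)
```

The distances are unchanged if the whole protein is rotated or translated, while the direction vectors rotate with it, which is why the downstream model has to handle such features equivariantly.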
Adding noise is different from the image case. Images get continuous noise, but here the graph properties need to be preserved, so discrete noise is added instead, e.g. masking the side chain (R group) of a residue, or masking nodes or edges.
Categorical diffusion: for discrete features, define a transition matrix that "diffuses" one discrete value into another (e.g., an amino acid type transitioning to a "mask" token).
Cross-entropy loss: the denoising model learns to predict the original discrete values using a cross-entropy loss, rather than predicting continuous noise.
Equivariance: the denoising model (often a GNN) is itself designed to be equivariant, so rotating or translating the input protein causes no issues.
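A rough sketch of the categorical diffusion and cross-entropy ideas above, assuming an absorbing "mask" state appended to the 20 amino acid types. The names `absorbing_transition_matrix`, `q_sample`, and `cross_entropy`, the per-step mask probability `beta`, and the choice to average the loss over all positions (rather than only masked ones) are illustrative assumptions, not a specific published implementation.

```python
import numpy as np

NUM_AA = 20          # amino acid types 0..19
MASK = NUM_AA        # index of the extra "mask" token
K = NUM_AA + 1       # total number of discrete states

def absorbing_transition_matrix(beta):
    """Q[i, j] = P(x_t = j | x_{t-1} = i).

    Each token stays as it is with probability 1 - beta and jumps to MASK
    with probability beta. MASK is absorbing: once masked, always masked.
    """
    Q = (1.0 - beta) * np.eye(K)
    Q[:, MASK] += beta
    return Q

def q_sample(x0, betas, t, rng):
    """Sample x_t given the clean sequence x0 after t noising steps."""
    Q_bar = np.eye(K)
    for beta in betas[:t]:
        Q_bar = Q_bar @ absorbing_transition_matrix(beta)
    probs = Q_bar[x0]  # (L, K): distribution over states for each position
    return np.array([rng.choice(K, p=p) for p in probs])

def cross_entropy(logits, x0):
    """Mean cross-entropy between predicted logits (L, NUM_AA) and true types x0."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(x0)), x0].mean()

rng = np.random.default_rng(0)
x0 = np.array([0, 5, 19, 7, 3])           # toy sequence of amino acid indices
betas = np.linspace(0.05, 0.5, 10)        # hypothetical noise schedule
xt = q_sample(x0, betas, t=5, rng=rng)    # some positions are now MASK
dummy_logits = np.zeros((len(x0), NUM_AA))  # stand-in for the GNN denoiser output
loss = cross_entropy(dummy_logits, x0)
```

Every position in `xt` is either its original amino acid or the mask token, so the denoiser's job is a classification problem over amino acid types, which is why cross-entropy replaces the noise-regression loss used for continuous data.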