Dynamic Learning approach for Single Domain Generalization.
A prompt-based object-centric gating module is designed to perceive object-centric features. It leverages the multi-modal features of CLIP, with text prompts that describe different domain scenes. Pipeline: Slot-Attention multi-modal fusion module → fuses the linguistic and visual features → extracts effective object-centric representations → generates gating masks → dynamically selects the relevant object-centric features to improve generalization ability.
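
The pipeline above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. It makes several assumptions: random vectors stand in for CLIP visual and text embeddings, slots are initialized directly from the prompt embeddings, the learned projections and GRU update of full Slot Attention are omitted, and the gate is a sigmoid of the best slot similarity. The names `slot_attention_fuse`, `gating_mask`, and `apply_gate` are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def slot_attention_fuse(visual, prompts, iters=3, eps=1e-8):
    """Simplified slot-attention-style fusion (assumption: no learned weights).

    visual:  N x D list of visual feature vectors (stand-in for CLIP image features)
    prompts: K x D list of text embeddings (stand-in for CLIP prompt features),
             used here to initialize the K slots.
    """
    dim = len(visual[0])
    scale = 1.0 / math.sqrt(dim)
    slots = [list(p) for p in prompts]
    attn = []
    for _ in range(iters):
        # For each visual location, softmax over slots: slots compete for inputs.
        attn = [softmax([scale * dot(v, s) for s in slots]) for v in visual]
        new_slots = []
        for k in range(len(slots)):
            weights = [row[k] for row in attn]
            total = sum(weights) + eps
            # Each slot becomes the attention-weighted mean of the visual features.
            new_slots.append(
                [sum(w * v[d] for w, v in zip(weights, visual)) / total
                 for d in range(dim)]
            )
        slots = new_slots
    return slots, attn


def gating_mask(visual, slots):
    """One gate value in (0, 1) per visual location (assumed gating form)."""
    scale = 1.0 / math.sqrt(len(visual[0]))
    mask = []
    for v in visual:
        best = max(scale * dot(v, s) for s in slots)
        mask.append(1.0 / (1.0 + math.exp(-best)))  # sigmoid
    return mask


def apply_gate(visual, mask):
    """Scale each visual feature vector by its gate value."""
    return [[m * x for x in v] for v, m in zip(visual, mask)]


# Toy usage with random stand-in embeddings (not real CLIP outputs).
random.seed(0)
N, K, D = 6, 3, 8  # visual locations, prompts/slots, feature dimension
visual = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
prompts = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]

slots, attn = slot_attention_fuse(visual, prompts)
mask = gating_mask(visual, slots)
gated = apply_gate(visual, mask)
```

In a real system the embeddings would come from the CLIP image and text encoders, and the fusion and gating steps would carry learned parameters trained end to end. The sketch only shows the data flow: prompts seed the slots, the slots aggregate visual features, and the resulting mask reweights those features.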