DG
Columns: Paper Name, Categories, Type, Recent issues, Motivations, Contributions, Target, Evaluation List, Conf./Jour., Year, Link
Exploiting Domain-specific Features to Enhance DG
Only consider domain-invariant
Ignore domain-specific
Extend beyond invariance view
Disentangle and joint learning
Confirms that domain-specific information is essential.

Preserve domain-invariant info.
t-SNE to visualize the features → domain-invariant features still make mistakes.

A simple feature augmentation for DG
Relying on image-space data aug.
Limited data diversity.
Require careful augmentation.
Where to add SFA?
Noise type for SFA.
Hyper-parameters.
Visualize using t-SNE.
Incorporating methods.
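The SFA idea above perturbs intermediate features directly instead of augmenting input images. A minimal numpy sketch, assuming element-wise multiplicative and additive Gaussian noise; the function name and the default `sigma` are illustrative, not the paper's exact configuration:

```python
import numpy as np

def stochastic_feature_augmentation(features, sigma=0.1, seed=None):
    """Perturb intermediate features with element-wise random noise:
    multiplicative scaling alpha ~ N(1, sigma) and additive noise
    beta ~ N(0, sigma). Sketch only; sigma is an assumed default."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(1.0, sigma, size=features.shape)
    beta = rng.normal(0.0, sigma, size=features.shape)
    return alpha * features + beta
```

In training this would be applied to a hidden layer's activations ("where to add SFA" is itself one of the paper's ablations).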
Domain-invariant Disentangled Network for Generalizable Object Detection
Object detection has seldom been explored.
Effectiveness of each component.
Hyper-parameters.
Visualization (illustrative only, no quantitative meaning).
Cross-domain Semantic Segmentation via Domain-invariant Iterative
Domain Generalization via Entropy Regularization
Can only guarantee features have invariant marginal distributions.
Invariance of conditional distributions is more important.
Ensure conditional invariance → entropy regularization.

Different weighting factors.
Deeper Network.
Class imbalance.
Feature visualization (t-SNE).
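The entropy-regularization idea can be illustrated on a domain classifier's outputs: maximizing the entropy of domain predictions makes features uninformative about the domain. A hedged numpy sketch; the function names and this exact form of the loss are assumptions, not the paper's formulation:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy_regularizer(domain_logits):
    """Negative mean entropy of domain predictions; minimizing this
    value maximizes entropy, pushing features toward being
    uninformative about the domain (a route to invariance)."""
    p = softmax(domain_logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    return -entropy.mean()
```

The minimum is reached when the domain classifier's output is uniform over domains.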
Model-based Domain Generalization
Capture inter-domain variation.

First learn transformations that map the data → enforce invariance.
Re-formulate the domain generalization problem → semi-infinite constrained optimization problem.
Learning to learn single Domain Generalization
Only 1 source domain, many unseen domains.
Leverage adversarial training.
Create fictitious, challenging data.
Use meta-learning scheme.
Wasserstein Autoencoder (WAE).

Features (t-SNE) visualization.
Hyper-parameters tuning.
Loss function validation.
Meta vs. Without Meta.
Domain Generalization with Mixstyle
Where to apply?
Mixing vs. Replacing
Random vs. fixed shuffle at multiple layers.
Hyper-parameters.
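The core MixStyle operation mixes per-instance channel statistics (mean/std) between samples in a batch. A minimal numpy sketch assuming (N, C, H, W) feature maps; the paper samples the mixing weight from a Beta distribution and applies this at several early layers:

```python
import numpy as np

def mixstyle(x, lam, perm):
    """Mix per-instance channel statistics between samples.
    x: (N, C, H, W) features; lam: mixing weight in [0, 1];
    perm: batch permutation supplying the second style."""
    mu = x.mean(axis=(2, 3), keepdims=True)
    sig = x.std(axis=(2, 3), keepdims=True) + 1e-6
    x_norm = (x - mu) / sig                   # strip instance style
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return x_norm * sig_mix + mu_mix          # re-apply mixed style
```

With lam = 1 the input is returned unchanged; smaller lam blends in more of the permuted samples' style.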
Learning to diversify for Single Domain Generalization
Visualize (t-SNE) target features.
Hyper-parameters.
Gradient matching for Domain Generalization
Tracking GIP.
Random grouping → groups exhibit no domain shift → no pressure to learn gradient matching? → the bigger the domain shift, the better Fish performs.
Hyper-parameters.
Ablation on pretrained-models.
A Fourier-based Framework for Domain Generalization
Phase component → high-level semantics
Magnitude component → low-level semantics
Fourier-based data augmentation.
Co-teacher regularization.

Different components impact: AM, a2o_co-teacher, o2a_co-teacher, Teacher (turn on/off components).
Other choice of Fourier-based data augmentation (AM vs. AS).
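The AM (amplitude mix) augmentation keeps an image's phase (the high-level semantics) while interpolating its Fourier amplitude toward another image's. A 2-D grayscale numpy sketch; the paper mixes amplitudes only within a low-frequency window and samples the mixing weight randomly, both omitted here:

```python
import numpy as np

def amplitude_mix(img_a, img_b, lam):
    """Interpolate img_a's Fourier amplitude toward img_b's while
    keeping img_a's phase spectrum intact."""
    fa, fb = np.fft.fft2(img_a), np.fft.fft2(img_b)
    amp = (1 - lam) * np.abs(fa) + lam * np.abs(fb)
    mixed = amp * np.exp(1j * np.angle(fa))
    return np.real(np.fft.ifft2(mixed))
```

AS (amplitude swap), the ablation alternative, corresponds to lam = 1 inside the low-frequency window.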

Progressive Domain Expansion Network for Single Domain Generalization.
Limited generalization performance gains
Lack appropriate safety and effectiveness constraints.

Domain expansion network.
Generated domain → progressively expanded.
Contrastive learning → learn cross-domain invariant representation.
Visualize (t-SNE) feature space.
Tuning hyper-parameters.

SWAD: Domain Generalization by seeking flat minima
Simply minimizing ERM over a complex, non-convex loss landscape → not sufficient.
Flat minima lead to robustness against loss-landscape shift.
Use dense stochastic weight averaging (D-SWA) → make the loss landscape flatter.

Local flatness analysis
Loss surface visualization
Validation accuracy/rounds
Different components.
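Dense weight averaging keeps a running average of weight snapshots collected at every step inside a chosen interval. A minimal sketch over dict-of-array weights; SWAD's selection of the averaging interval by monitoring validation loss is omitted:

```python
import numpy as np

class DenseWeightAverager:
    """Incremental mean of weight snapshots, updated every step
    (the 'dense' part of dense stochastic weight averaging)."""
    def __init__(self):
        self.avg, self.n = None, 0

    def update(self, weights):
        self.n += 1
        if self.avg is None:
            self.avg = {k: np.asarray(v, dtype=float).copy()
                        for k, v in weights.items()}
        else:
            for k, v in weights.items():
                self.avg[k] += (v - self.avg[k]) / self.n  # running mean
```

At the end of the interval, `avg` replaces the model weights, landing in a flatter region of the loss landscape than any single snapshot.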
Causality Inspired representation learning for DG
Remove components.
Visualize attention map
Independence of causal representation.
Representation importance.
Hyper-parameter sensitivity.
SelfReg: Self-supervised Contrastive Regularization for Domain Generalization
Require sampling of the negative data pair.
CL performance depends on quality/quantity of negative data pairs.
Only use positive data pairs → resolve problems caused by negative data pair sampling.
Self-supervised Contrastive Learning.
Class-specific domain perturbation layer → apply mixup augmentation (only positive pairs are used).
Visualize (t-SNE) the latent spaces
Different dissimilarity losses (logit only / feature only).
Used to visualize where the network focuses.
Removing each component (losses, Mixup, CDPL, SWA, IDCL).
C-Mixup: Improving Generalization in Regression
Systematic analysis of mixup in regression remained unexplored.
Can result in incorrect labels.
Adjust the sampling probability based on the similarity of the labels.
Mixing on input data and label.
Generalization gap / Epochs.
Pair-wise divergence (averaging over class / domains).
Compatibility of C-Mixup (Integrate with other algorithms).
C-Mixup vs. other distance metrics.
Different hyper-parameters (e.g., bandwidth).
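C-Mixup's sampling rule can be sketched as a Gaussian kernel on label distance: for each anchor, partners with similar labels are more likely to be mixed. The function name and the `bandwidth` default are illustrative:

```python
import numpy as np

def cmixup_sampling_probs(labels, bandwidth=1.0):
    """Row i gives the probability of picking each j as the mixing
    partner of i, proportional to exp(-||y_i - y_j||^2 / 2h^2),
    so pairs with similar labels are mixed more often."""
    y = np.asarray(labels, dtype=float).reshape(len(labels), -1)
    d2 = ((y[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    k = np.exp(-d2 / (2 * bandwidth ** 2))
    np.fill_diagonal(k, 0.0)   # never pair a sample with itself
    return k / k.sum(axis=1, keepdims=True)
```

Mixing then proceeds as in vanilla mixup on both inputs and labels, with the partner j drawn from these probabilities instead of uniformly.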
Instance-Aware Domain Generalization for Face Anti-Spoofing
Artificial domain labels are coarse-grained and subjective, which cannot reflect real domain accurately.
Focus on domain-level alignment, not fine-grained enough to ensure that learned representations are insensitive to domain-style.
Align features on instance-level.

Dynamic Kernel Generator
Categorical Style Assembly.
Asymmetric Instance Adaptive Whitening.
Remove components.
Different losses (replace).
Different style augmentation.
Different kernel designs.
RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening.
Collecting multi-domain dataset is costly and labor-intensive.
Performance highly depends on the number of source datasets.
Exploit instance normalization layers → feature covariance contains domain-specific style such as texture and color.
Whitening transformation removes feature correlation and makes each feature have unit variance → eliminates domain-specific style information → may improve DG, but this has not been fully explored.

Instance selective whitening.
Whitening loss.
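The whitening-loss idea can be sketched as penalizing the off-diagonal channel covariance of a feature map, treating channel correlations as carriers of domain-specific style. RobustNet's instance *selective* step, which chooses which covariance entries to suppress based on their sensitivity to photometric transforms, is omitted from this sketch:

```python
import numpy as np

def whitening_loss(feat):
    """Mean absolute off-diagonal entry of the channel covariance
    of a (C, H*W) feature map; driving it to zero decorrelates
    channels (style) while leaving per-channel content intact."""
    feat = feat - feat.mean(axis=1, keepdims=True)
    cov = feat @ feat.T / feat.shape[1]
    off = cov - np.diag(np.diag(cov))
    return np.abs(off).mean()
```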
Disentangled Prompt Representation for Domain Generalization
PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization
Data of source and target domain are not accessible.
Only target task definition is given.

Large-scale vision-language models could shed light on this challenging source-free domain generalization setting.

Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization
Two existing approaches for single domain generalization: data augmentation and feature disentanglement. These methods mainly rely on static networks.
Static networks lack the capability to dynamically adapt to the diverse variations in different visual scenes, which limits the representation power of the models.
Each image may have its unique characteristics (e.g., variations in lighting conditions, object appearances, scene structures).
Object-centric representations robust to variations in appearance, context, scene complexity.

Dynamic Learning approach for Single Domain Generalization.
A prompt-based object-centric gating module is designed to perceive object-centric features.
Leverage multi-modal features of CLIP (prompts describe different domain scenes).
Slot-Attention multi-modal fusion module → fuse the linguistic/visual features → extract effective object-centric representations.
→ Generate the gating masks → dynamically select relevant object-centric features to improve generalization ability.
Disentangled Prompt Representation for Domain Generalization
Large-scale pre-trained models greatly enhance domain generalization.
Pre-trained Visual Foundation Model (VFM): trained by utilizing large-scale (image, text) pairs → rich in semantic information of prior knowledge.
VFMs are able to encode semantic meanings of visual descriptions (regardless of styles).
Fine-tuning pre-trained foundation models with new datasets → achieve better results on downstream tasks with few training samples.

ISSUES:
Existing prompt tuning methods tune the foundation model to generate domain- and task-specific features, whereas domain generalization requires the model to generate domain-invariant features that work well across different unseen domains → crucial to develop prompts that can guide the foundation model in disentangling invariant features across all domains.
Fully leverage a distinctive aspect of VFM (controllable and flexible language prompt).
Text prompt plays a vital role → guide the disentanglement of image feature.
Text modality in VFM can be more easily disentangled (rich in semantic information and interpretable).

Prompt tuning framework for DG with LLM-assisted text prompt disentanglement + text-guided visual representation disentanglement model.
Domain-invariant + domain-specific descriptions are first generated with LLM (for prompt tuning to learn disentangled textual features).
Learned disentangled textual features → guide the learning of domain-invariant and domain-specific visual features.
To classify images from unseen domains → leveraging domain-specific knowledge from similar seen domains is essential → domain-specific prototypes will be selected for images from different unseen domain.
STYLIP: Multi-Scale Style-Conditioned Prompt Learning for CLIP-based Domain Generalization
Unknown Prompt, the only Lacuna: Unveiling CLIP’s Potential for Open Domain Generalization
Key research gaps in using CLIP for Open DG (the unseen domain may contain new labels/categories):
Prompt design:
Multi-class classification over one-against-all recourse for ODG.
Domain-agnostic visual embeddings.
Unify the classification of known classes and outliers using CLIP → unknown-class prompt.
Gather training data → generate pseudo-open images that are semantically distinct from existing categories → use a pre-trained conditional diffusion model.

Learning Domain Invariant Prompt for Vision-Language Models
Towards Principled Disentanglement for Domain Generalization
Spurious correlation.
First, diversify the inter-class variation by modeling potential seen/unseen variations.
Then, disentangle constrained DG.
Principled constrained learning formulation based on disentanglement → theoretical guarantees on empirical duality gap.
Promotes semantic invariance via constrained optimization setup.
Controllable/interpretable data generation.
Towards Unsupervised Domain Generalization
Manually labeled data can be costly or unavailable.
Unlabeled data can be more accessible.
Contrastive learning only learns robust representations against pre-defined perturbation (under IID).
Unsupervised learning of discriminative representations.
Select valid source of negative samples according to the similarity among domains.
Big differences: 1) domain-related features → discriminative enough, 2) boost variance across domains.
How unsupervised learning enhances the generalization ability of models:

PCL: Proxy-based Contrastive Learning for Domain Generalization
Using contrastive learning to learn domain-invariant representations.
Positive sample-to-sample pairs hinder model generalization.
Replace sample-to-sample relations with proxy-to-sample relations.
Limitations. The proxy-based method makes a trade-off between sample-to-sample relations and class-to-sample relations. The model generalization is gained by sacrificing some potential useful semantic relations.
Alleviate the positive alignment issue
Proposed a novel proxy-based contrastive learning
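The proxy-to-sample relation can be sketched as an InfoNCE-style loss in which the only positive for a sample is its class proxy, and the other proxies act as negatives. The function name, temperature default, and cosine normalization are assumptions, not the paper's exact formulation:

```python
import numpy as np

def proxy_contrastive_loss(feature, proxies, label, tau=0.1):
    """Cross-entropy over cosine similarities to class proxies:
    the positive is proxies[label]; remaining proxies serve as
    negatives (proxy-to-sample instead of sample-to-sample)."""
    f = feature / np.linalg.norm(feature)
    p = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    logits = p @ f / tau
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]
```

In the full method, other samples in the batch would contribute additional negative terms to the denominator.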
Style Neophile: Constantly Seeking Novel Styles for Domain Generalization
Domain-invariant representation learning via the styles of images.
Current style augmentation methods draw styles from a restricted set of external images or interpolate styles of the source domains.
Synthetic styles generated from both styles of source domain images and previously generated synthetic styles
Method to synthesize novel, diverse and plausible styles during training
Compound Domain Generalization via Meta-Knowledge Encoding
Ensemble of Averages: Improving Model Selection and Boosting Performance in Domain Generalization
Unsupervised Domain Generalization by Learning a Bridge Across Domains
Very little data (images and/or labels): not sufficient to train standard Unsupervised Domain Adaptation and Domain Generalization methods.
Propose a new concept of a learnable BrAD (Bridge Across Domains), an auxiliary visual "bridge" domain. It is used only during contrastive self-supervised training, aligning the representations of each domain to those of the shared BrAD.

CLIP the Gap: A Single Domain Generalization Approach for Object Detection
Performance of object detectors degrades when the test data distribution deviates from the training data distribution.
Not always possible to obtain target data

How to learn to generalize from a single source dataset

Approach to Single Domain Generalization for object detection.
Use CLIP to guide the training of an object detector → generalize to unseen target domains.
To improve the generalizability of an object detector, domain concepts expressed via text prompts are used during training to augment the diversity of the learned image features and make them more robust to an unseen target domain.
Mean Average Precision
Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization
Prior methods assume feature distributions are Gaussian, which cannot be matched accurately and incurs higher computational complexity.
Proposed to perform Exact Feature Distribution Matching by exactly matching the empirical distributions via histogram matching → Sort-Matching.
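Sort-Matching in 1-D: give the k-th smallest content value the k-th smallest style value, so the output exactly takes on the style's empirical distribution while preserving the content's ordering. A numpy sketch; the function name is mine:

```python
import numpy as np

def sort_matching(content, style):
    """Replace the sorted values of `content` with the sorted
    values of `style`: output has style's exact empirical
    distribution and content's ranks."""
    content = np.asarray(content, dtype=float)
    out = np.empty_like(content)
    out[np.argsort(content)] = np.sort(np.asarray(style, dtype=float))
    return out
```

In the method this is applied per channel to flattened feature maps, avoiding the Gaussian assumption entirely.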
Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing
Current DG methods mostly operate on the complete representation from common modules → fail to fully exploit the subtle properties of global and local image cues.
Style transfers are inefficient in large-scale training
Use a two-stream structure to extract content and style features: content information records global semantic features and physical attributes; style information preserves some discriminative information.
Proposed a novel Shuffled Style Assembly Network to extract and reassemble different content and style features
Using contrastive learning to emphasize liveness-related style information
Representations of the correct assemblies are used to distinguish between living and spoofing during inference.
DNA: Domain Generalization with Diversified Neural Averaging