Decentralized MARL


Paper Name

Categories

Type

Concept

Recent issues

Motivations

Contributions

Evaluation List

Target

Conf./Jour.

Year

Link

Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition

Centralized Training Decentralized Execution (CTDE).

The coach has a full view; the players have partial views.

The coach can distribute a limited amount of information to the agents.

Most deep CTDE methods for cooperative MARL are limited to a fixed number of homogeneous agents (C1).

Re-training the agents is computationally prohibitive (C2).

An agent can access only its own decisions and partial environmental observations at test time (C3).

Having all agents communicate with one another is too expensive in many scenarios (C4).

Generalize zero-shot to new compositions.

C3: Introducing communication

C4: Centralized coach → periodically distributes strategic information (full view).

Communication through a continuous strategy vector.

Variational objective → regularizes learning of the strategy.

Adaptive policy → the coach communicates only when needed.

The strategy vector is encoded using a VAE-based encoder.

Resource Collection

Rescue Game

StarCraft Multi-Agent Challenge (SMAC).
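
The adaptive communication noted for this paper (the coach communicates only when needed) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the Euclidean-distance rule, the `threshold` value, and the names `should_broadcast` and `run_coach` are all assumptions made for the example.

```python
import math


def should_broadcast(new_strategy, last_sent, threshold=0.5):
    """Return True when the new strategy vector differs enough from the last one sent.

    Assumption: "enough" means Euclidean distance above a fixed threshold.
    """
    if last_sent is None:
        # Nothing has been sent yet, so the first strategy is always broadcast.
        return True
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_strategy, last_sent)))
    return dist > threshold


def run_coach(strategies, threshold=0.5):
    """Walk through candidate strategy vectors and record when a message is sent."""
    last_sent = None
    sent_at = []
    for t, z in enumerate(strategies):
        if should_broadcast(z, last_sent, threshold):
            last_sent = z  # the player now holds this strategy
            sent_at.append(t)
    return sent_at
```

With a rule like this, small drifts in the strategy cost no messages, so the communication budget scales with how often the situation really changes rather than with the episode length.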

De-confounded Value Decomposition for Multi-Agent Reinforcement Learning

Centralized Training Decentralized Execution (CTDE).

- Credit Assignment: deduce contributions of individual agents from overall success.
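
To make the credit-assignment idea concrete, here is a small sketch using a plain additive value decomposition (in the style of VDN). It is not the de-confounded method of this paper; it only shows how a team value can be split into per-agent utilities so that each agent can act greedily on its own share. All function names and the toy numbers are hypothetical.

```python
from itertools import product


def q_total(per_agent_q, joint_action):
    """Team value under an additive decomposition: the sum of per-agent utilities."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))


def greedy_joint_action(per_agent_q):
    """Each agent picks its own best action using only its own utility table."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)


def brute_force_joint_action(per_agent_q):
    """Centralized reference: search every joint action for the best team value."""
    joint_actions = product(*(range(len(q)) for q in per_agent_q))
    return max(joint_actions, key=lambda ja: q_total(per_agent_q, ja))
```

Under the additive assumption, the decentralized greedy choice agrees with the centralized search, which is what makes decentralized execution possible after centralized training.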

On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning

Decentralized Training.

Sample-efficient Model-free Algorithm.

Exponential dependence on the number of agents, since the algorithm usually needs to search the joint action space exhaustively.

The computational bottleneck can be addressed through communication → distributing the workload.

Communication-based → communication overheads.

Stage-based V-Learning for General-Sum Markov Games.

Learning CCE

Learning CE

Learning NE in Markov Potential Games.
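
The exponential dependence noted for this paper comes from the size of the joint action space. The sketch below counts and enumerates joint actions for a small team. The function names are hypothetical and the code is only an illustration of the growth, not part of any algorithm in the paper.

```python
from itertools import product


def joint_action_count(num_actions_per_agent):
    """Size of the joint action space: the product of the per-agent action counts."""
    count = 1
    for n in num_actions_per_agent:
        count *= n
    return count


def enumerate_joint_actions(num_actions_per_agent):
    """List every joint action; only feasible for very small teams."""
    return list(product(*(range(n) for n in num_actions_per_agent)))
```

For example, four agents with five actions each give 5^4 = 625 joint actions, while a decentralized learner that tracks only its own actions deals with 5 per agent, or 20 across the team.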
