feature(yzj): add multi-agent and structured observation env (GoBigger) #39

jayyoung0802 · 2023-06-01T07:05:40Z

No description provided.

lzero/mcts/tree_search/mcts_ptree_sampled.py

lzero/model/gobigger/network/activation.py

lzero/model/gobigger/network/res_block.py

lzero/policy/gobigger_muzero.py

zoo/gobigger/config/gobigger_muzero_config.py

lzero/worker/gobigger_muzero_collector.py

lzero/model/gobigger/gobigger_muzero_model.py

lzero/mcts/buffer/gobigger_game_buffer_muzero.py

lzero/entry/train_muzero_gobigger.py

lzero/entry/eval_muzero_gobigger.py

lzero/entry/utils.py

lzero/mcts/buffer/gobigger_game_buffer_efficientzero.py

lzero/mcts/buffer/gobigger_game_buffer_muzero.py

lzero/model/gobigger/network/gobigger_encoder.py

lzero/policy/gobigger_random_policy.py

lzero/worker/gobigger_muzero_collector.py

zoo/gobigger/config/gobigger_eval_config.py

zoo/gobigger/env/gobigger_env.py

lzero/entry/__init__.py

puyuan1996 · 2023-08-30T03:20:18Z

lzero/model/muzero_model_mlp.py

@@ -34,6 +36,7 @@ def __init__(
        discrete_action_encoding_type: str = 'one_hot',
        norm_type: Optional[str] = 'BN',
        res_connection_in_dynamics: bool = False,
+        state_encoder=None,


Add type hints for state_encoder and a matching entry in the Arguments section of the docstring.

You can refer to the prompts at https://aicarrier.feishu.cn/wiki/N4bqwLRO5iyQcAkb4HCcflbgnpR to polish the comments.
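As a rough illustration of what this request might look like, here is a minimal sketch. The class name `MuZeroModelSketch` is hypothetical, the `Callable` hint is a dependency-free stand-in (in the real model the encoder is presumably an `nn.Module`), and the described fallback when `state_encoder` is `None` is an assumption, not something the PR confirms.

```python
from typing import Any, Callable, Optional


class MuZeroModelSketch:
    """Hypothetical stand-in for the model class; only the reviewed argument is shown."""

    def __init__(
        self,
        res_connection_in_dynamics: bool = False,
        state_encoder: Optional[Callable[[Any], Any]] = None,
    ) -> None:
        """
        Arguments:
            - res_connection_in_dynamics (:obj:`bool`): Flag taken from the reviewed signature.
            - state_encoder (:obj:`Optional[Callable]`): Optional custom encoder that maps a raw,
              possibly structured, observation to a latent state. Assumption: when it is ``None``
              the model falls back to its default encoder.
        """
        self.res_connection_in_dynamics = res_connection_in_dynamics
        self.state_encoder = state_encoder
```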

puyuan1996 · 2023-08-30T03:22:03Z

lzero/policy/efficientzero.py

+                            beg_index = observation_shape * step_i
+                            end_index = observation_shape * (step_i + self._cfg.model.frame_stack_num)
+                            obs_target_batch_new[k] = v[:, beg_index:end_index]
+                    network_output = self._learn_model.initial_inference(obs_target_batch_new)


The handling of structured observations above could perhaps be abstracted into a function.
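One possible shape for such a helper is sketched below. The function name `slice_structured_obs` and its signature are hypothetical; it assumes each value in the observation dict is a 2-D array of shape `(batch, observation_shape * num_steps)`, and it uses NumPy arrays in place of the tensors used in the policy.

```python
from typing import Dict

import numpy as np


def slice_structured_obs(
    obs_batch: Dict[str, np.ndarray],
    observation_shape: int,
    step_i: int,
    frame_stack_num: int,
) -> Dict[str, np.ndarray]:
    """Slice the columns belonging to unroll step ``step_i`` out of every key (hypothetical helper)."""
    beg_index = observation_shape * step_i
    end_index = observation_shape * (step_i + frame_stack_num)
    # Apply the same column window to each entry of the structured observation.
    return {k: v[:, beg_index:end_index] for k, v in obs_batch.items()}
```

The policy could then call the helper once per unroll step and pass the result to `initial_inference`, instead of repeating the index arithmetic inline.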

puyuan1996 · 2023-08-30T04:00:06Z

zoo/petting_zoo/model/model.py

+        self.encoder = FCEncoder(obs_shape=18, hidden_size_list=[256, 256], activation=nn.ReLU(), norm_type=None)
+
+    def forward(self, x):
+        x = x['agent_state']


Add a comment explaining why agent_state is used, which keys x contains, and what each item means.

puyuan1996 · 2023-08-30T04:01:22Z

zoo/petting_zoo/envs/petting_zoo_simple_spread_env.py

+from pettingzoo.mpe._mpe_utils.simple_env import SimpleEnv, make_env
+from pettingzoo.mpe.simple_spread.simple_spread import Scenario
+from PIL import Image
+import pygame


Optimize the imports.

puyuan1996 · 2023-08-30T04:04:03Z

zoo/petting_zoo/envs/petting_zoo_simple_spread_env.py

+                tmp[k] = v[i]
+            tmp['action_mask'] = [1 for _ in range(*self._action_dim)]
+            ret_transform.append(tmp)
+        return {'observation': ret_transform, 'action_mask': action_mask, 'to_play': to_play}


Add the detailed comments about 'observation' to the Overview of the _process_obs() method.
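To make the structure being discussed concrete, the following sketch mirrors the reviewed loop: a dict of per-agent arrays is turned into a list with one dict per agent. The function name `split_obs_per_agent` is hypothetical, `action_dim` is taken as a plain integer here (the reviewed code unpacks `self._action_dim`), and the `agent_state` key in the usage line is only illustrative.

```python
from typing import Any, Dict, List, Sequence


def split_obs_per_agent(
    obs: Dict[str, Sequence[Any]],
    agent_num: int,
    action_dim: int,
) -> List[Dict[str, Any]]:
    """
    Overview:
        Convert ``{key: [value_for_agent_0, value_for_agent_1, ...]}`` into a list whose
        i-th element is the observation dict of agent ``i`` (hypothetical helper).
    """
    per_agent = []
    for i in range(agent_num):
        # Pick agent i's slice of every observation field.
        tmp = {k: v[i] for k, v in obs.items()}
        # As in the reviewed code, every action is marked as legal.
        tmp['action_mask'] = [1 for _ in range(action_dim)]
        per_agent.append(tmp)
    return per_agent


example = split_obs_per_agent({'agent_state': [[0.1], [0.2]]}, agent_num=2, action_dim=5)
```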

puyuan1996 · 2023-08-30T04:05:54Z

lzero/worker/muzero_collector.py

+            last_game_priorities = [[None for _ in range(agent_num)] for _ in range(env_nums)]
+            # for priorities in self-play
+            search_values_lst = [[[] for _ in range(agent_num)] for _ in range(env_nums)]
+            pred_values_lst = [[[] for _ in range(agent_num)] for _ in range(env_nums)]


Code snippets like this that appear multiple times could perhaps be abstracted into a utility method of the class.
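A small helper along these lines would cover the three initializations shown in the hunk. The name `make_per_env_agent_list` is hypothetical, and it is written as a free function for brevity; in the collector it could equally be a static method.

```python
from typing import Any, Callable, List


def make_per_env_agent_list(env_nums: int, agent_num: int, factory: Callable[[], Any]) -> List[List[Any]]:
    """Build an ``env_nums x agent_num`` nested list, calling ``factory`` once per cell (hypothetical helper)."""
    # Calling the factory per cell avoids sharing one mutable object between agents.
    return [[factory() for _ in range(agent_num)] for _ in range(env_nums)]


# Equivalent to the three comprehensions in the reviewed code, for 2 envs and 3 agents.
last_game_priorities = make_per_env_agent_list(2, 3, lambda: None)
search_values_lst = make_per_env_agent_list(2, 3, list)
pred_values_lst = make_per_env_agent_list(2, 3, list)
```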

puyuan1996 · 2023-08-30T04:09:00Z

zoo/petting_zoo/config/__init__.py

@@ -0,0 +1 @@
+from .ptz_simple_spread_ez_config import main_config, create_config


Renaming every petting_zoo in LightZero to pettingzoo might be more concise.

puyuan1996 · 2023-12-03T14:24:03Z

lzero/mcts/buffer/game_buffer_efficientzero.py

@@ -44,6 +46,8 @@ def __init__(self, cfg: dict):
        self.base_idx = 0
        self.clear_time = 0

+        self.tmp_obs = None # for value obs list [46 + 4(td_step)] not < 50(game_segment)


Polish the comment; comments should be as complete and clear as possible.

puyuan1996 · 2023-12-03T14:24:43Z

lzero/mcts/buffer/game_buffer_muzero.py

+                    m_obs = value_obs_list[beg_index:end_index]
+                    m_obs = sum(m_obs, [])
+                    m_obs = default_collate(m_obs)
+                    m_obs = to_device(m_obs, self._cfg.device)


Abstract this into a data-processing function and put it in utils?
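A sketch of such a utility is given below. The name `prepare_obs_batch` is hypothetical. To keep the example dependency-free, the collate and device-transfer steps are passed in as callables; in the buffer they would presumably be `default_collate` and a wrapper around `to_device(..., self._cfg.device)`, as in the reviewed lines.

```python
from itertools import chain
from typing import Any, Callable, List, Sequence


def prepare_obs_batch(
    value_obs_list: Sequence[List[Any]],
    beg_index: int,
    end_index: int,
    collate_fn: Callable[[List[Any]], Any],
    to_device_fn: Callable[[Any], Any],
) -> Any:
    """Slice, flatten, collate and move a window of per-step observations (hypothetical helper)."""
    m_obs = value_obs_list[beg_index:end_index]
    # Flatten one nesting level; same result as ``sum(m_obs, [])`` without repeated list copies.
    m_obs = list(chain.from_iterable(m_obs))
    m_obs = collate_fn(m_obs)
    return to_device_fn(m_obs)


# Identity callables stand in for the real collate and device functions.
flat = prepare_obs_batch([[{'a': 1}], [{'a': 2}, {'a': 3}], [{'a': 4}]], 0, 2, lambda x: x, lambda x: x)
```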

puyuan1996 · 2023-12-03T14:25:44Z

lzero/model/muzero_model_mlp.py

@@ -34,6 +36,7 @@ def __init__(
        discrete_action_encoding_type: str = 'one_hot',
        norm_type: Optional[str] = 'BN',
        res_connection_in_dynamics: bool = False,
+        state_encoder=None,


You can refer to the prompts at https://aicarrier.feishu.cn/wiki/N4bqwLRO5iyQcAkb4HCcflbgnpR to polish the comments.

puyuan1996 · 2023-12-03T14:26:38Z

lzero/policy/multi_agent_efficientzero.py

+    """
+    Overview:
+        The policy class for Multi Agent EfficientZero.
+    """


Explain how the current multi-agent algorithm differs from the single-agent algorithm, and summarize how independent learning is currently implemented.

puyuan1996 · 2023-12-03T14:26:52Z

lzero/policy/multi_agent_efficientzero.py

+                    )
+                    # NOTE: Convert the ``action_index_in_legal_action_set`` to the corresponding ``action`` in the entire action set.
+                    action = np.where(action_mask[i] == 1.0)[0][action_index_in_legal_action_set]
+                output[i // agent_num]['action'].append(action)


Add comments.
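The index conversion in the hunk above can be illustrated in isolation. This is a minimal sketch with a hypothetical function name, assuming the mask is a 1-D array in which `1` marks a legal action.

```python
from typing import Sequence

import numpy as np


def legal_index_to_action(action_mask: Sequence[float], action_index_in_legal_action_set: int) -> int:
    """Map an index into the legal-action subset back to an index in the full action set."""
    # Positions in the full action set where the mask marks the action as legal.
    legal_actions = np.where(np.asarray(action_mask) == 1.0)[0]
    return int(legal_actions[action_index_in_legal_action_set])


# With mask [0, 1, 0, 1, 1] the legal actions are [1, 3, 4], so legal index 2 maps to action 4.
action = legal_index_to_action([0, 1, 0, 1, 1], 2)
```

MCTS only sees the legal actions, so the index it returns has to be mapped back before the action is sent to the environment.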

puyuan1996 · 2023-12-03T14:27:02Z

lzero/policy/multi_agent_muzero.py

+    """
+    Overview:
+        The policy class for Multi Agent MuZero.
+    """


zoo/gobigger/config/gobigger_eval_config.py

puyuan1996 · 2023-12-03T15:08:35Z

zoo/gobigger/env/gobigger_env.py

+from ding.utils import ENV_REGISTRY, deep_merge_dicts
+import math
+from easydict import EasyDict
+try:


Could you add a link to the original GoBigger repository and describe how this version differs from it?

Added the link in the try/except block.

puyuan1996 · 2023-12-03T15:10:13Z

zoo/petting_zoo/config/ptz_simple_spread_mz_config.py

+
+main_config = dict(
+    exp_name=
+    f'data_mz_ctree/{env_name}_muzero_ns{num_simulations}_upc{update_per_collect}_rr{reanalyze_ratio}_seed{seed}',


How does ptz_simple_spread_mz currently perform? If the results are not good, let's remove the ptz-related parts for now.

puyuan1996 · 2023-12-03T15:11:30Z

zoo/petting_zoo/entry/train_muzero.py

+        max_env_step: Optional[int] = int(1e10),
+) -> 'Policy':  # noqa
+    """
+    Overview:


Why was a separate entry needed for ptz?

Because the encoder needs to be passed in separately.

puyuan1996 · 2023-12-07T07:57:13Z

lzero/entry/train_muzero.py

@@ -47,12 +47,12 @@ def train_muzero(
    """


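Merge the main branch and add the MuZero (mz) and EfficientZero (ez) baseline results to the PR description. Once the code is polished, create a new branch named multi-agent, push it to opendilab/lightzero, and add a note at the end of this PR saying that the latest stable code lives on the multi-agent branch.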

jayyoung0802 added 2 commits June 1, 2023 00:21

feature(yzj): adapt multi agent env gobigger with ez

ec0ba9d

fix(yzj): fix data device bug in gobigger ez pipeline

2c29842

puyuan1996 self-assigned this Jun 1, 2023

puyuan1996 added the enhancement New feature or request label Jun 1, 2023

jayyoung0802 added 9 commits June 1, 2023 22:09

feature(yzj): add vsbot with ez pipeline and add eat-info in tensorboard

335b0fc

feature(yzj): add vsbot with mz pipeline and polish model and buffer

0875e74

polish(yzj): polish gobigger env

d88d79c

feature(yzj): adapt multi agent env gobigger with sez

17992eb

feature(yzj): add gobigger visualization and polish gobigger eval config

4925d01

fix(yzj): fix eval_episode_return and polish env

b8e044e

polish(yzj): polish gobigger env pytest

f229b6a

polish(yzj): polish gobigger env and eat info in evaluator

4bbbeb0

fix(yzj): fix np.pad bug, which need padding_num>0

7529170

puyuan1996 reviewed Jun 12, 2023

jayyoung0802 added 13 commits June 13, 2023 20:07

polish(yzj): contain raw obs only on eval mode for save memory

85aeacf

fix(yzj): fix mcts ptree sampled value/value-prefix bug

f146c4d

polish(yzj): polish gobigger encoder model

47b145e

polish(yzj): polish gobigger encoder model with ding

2772ffd

polish(yzj): polish gobigger entry evaluator

e36e752

feature(yzj): add eps_greedy and random_collect_episode in gobigger ez

7098899

fix(yzj): fix key bug in entry utils when random collect

b94deae

fix(yzj): fix gobigger encoder bn bug

dfa4671

polish(yzj): polish ez config and set eps as 1.5e4 learner iter

ff11821

polish(yzj): polish code style by format.sh

a95c19c

polish(yzj): polish code comments about gobigger in worker/policy/entry

6da2997

feature(yzj): add eps_greedy and random_collect_episode in gobigger mz

a2ca5ee

Merge branch 'main' of https://github.com/opendilab/LightZero into de…

249d88a

…v-gobigger

puyuan1996 assigned PaParaZz1 Jun 25, 2023

puyuan1996 requested changes Jun 25, 2023

puyuan1996 reviewed Jun 25, 2023

zoo/gobigger/env/gobigger_env.py (resolved)

jayyoung0802 added 2 commits August 8, 2023 19:30

polish(yzj): polish gobigger collector and config to support t2p3

b6dca69

feature(yzj): add fc encoder on ptz env instead of identity

09a4440

puyuan1996 reviewed Aug 9, 2023

lzero/entry/__init__.py (outdated, resolved)

jayyoung0802 added 18 commits August 10, 2023 17:56

polish(yzj): polish buffer name and remove ignore done in atari config

407329a

fix(yzj): fix ssl data bug and polish to_device code

592fab1

fix(yzj): fix policy utils obs batch

3392d61

fix(yzj): fix collect mode and eval mode to device

9337ce3

fix(yzj): fix to device bug on policy utils

deab811

polish(yzj): polish multi agent game buffer code

705b5f9

polish(yzj): polish code

43b2bb5

fix(yzj): fix priority bug, polish priority related config, add all a…

3d88a17

…gent obs to ptz

polish(yzj): polish train entry

a09517a

polish(yzj): polish gobigger config

714ba4b

polish(yzj): polish best gobigger config on ez/mz

0ee0122

polish(yzj): polish collector to adapt multi-agent mode

71ce58e

polish(yzj): polish evaluator conflicts

05c025d

polish(yzj): polish multi agent model

5bec18b

polish(yzj): sync main

5d310ba

polish(yzj): polish gobigger entry and evaluator

920dc38

feature(yzj): add pettingzoo visualization

1c1fde9

polish(yzj): polish ptz config and model

72c669b

puyuan1996 reviewed Aug 30, 2023

feature(yzj): add ptz simple ez config

11ef08f

puyuan1996 mentioned this pull request Sep 27, 2023

how to solve reward dropping after reaching super humain level #97

Closed

puyuan1996 mentioned this pull request Oct 19, 2023

Support for dictionaries as observations? #115

Closed

puyuan1996 reviewed Dec 3, 2023

polish(yzj): polish code base

1e143bc

puyuan1996 reviewed Dec 7, 2023

Merge remote-tracking branch 'origin' into dev-gobigger

3e1e62f
