TEP: only for comparing #426

puyuan1996 · 2025-10-09T04:15:14Z

No description provided.

…g and configurable reconstruction loss mode (#355)
* v0.2.0
* polish(pu): add final_norm_option_in_encoder
* polish(pu): polish jericho configs
* tmp
* fix(pu): fix world model init bug when use pretrained_model
* tmp
* feature(xjy): add text regularization function
* feature(xjy): add decode text regularization option and related logs (#348)
* fix(xjy): fixed some bug and add a function to output the decoder's text
* fix(pu): fix _shift_right in decode loss
* fix(xjy): add decode text function and decode_loss_mode option of reconstruction loss for jericho (#363)
* Standardized the format and fixed existing bugs
* resolved game_buffer bug and polished formatting
* polish(xjy): standardize decode text related code for jericho (#366)
* polish(xjy): delete unnecessary comments and translate CN comments into EN
* fix(xjy): merged latest main branch (#368)
* v0.2.0
* style(pu): use actions/upload-artifact@v3
* fix(pu): fix Union import in game_segment
* style(pu): use actions/upload-artifact@v4
* test(nyz): only upload cov in macos
* fix(pu): fix reanalyze_ratio compatibility with rope embed (#342)
* fix(pu): fix release.yml
* fix(pu): fix release.yml (#343)
* fix(pu): fix release.yml
* fix(pu): fix release.yml
* fix(pu): fix release.yml
* fix(pu): fix release.yml
* fix(pu): fix release.yml
* fix(pu): use actions/download-artifact@v2
* fix(pu): use actions/download-artifact@v4
* release v0.2.0
* fix(lkj): fix typo in customize_envs.md
* fix(pu): adapt atari and dmc2gym env to support shared_memory (#345)
* fix(pu): fix atari and dmc2gym env to support shared_memory
* tmp
* fix(pu): fix frame_stack_num default cfg in atari env
---------
Co-authored-by: puyuan <[email protected]>
* delete unnecessary comments and translate CN comments into EN
* delete unnecessary comment
---------
Co-authored-by: 蒲源 <[email protected]>
Co-authored-by: PaParaZz1 <[email protected]>
Co-authored-by: 蒲源 <[email protected]>
Co-authored-by: 林楷傑 <[email protected]>
Co-authored-by: puyuan <[email protected]>
* latest remove unnucessary comments
* fix(pu): fix compatibility
* polish(pu): polish readme and requirements
---------
Co-authored-by: puyuan <[email protected]>
Co-authored-by: xiongjyu <[email protected]>
Co-authored-by: PaParaZz1 <[email protected]>
Co-authored-by: 林楷傑 <[email protected]>

…el (#391)
* Qwen is tested as a policy in the jericho environment
* fixed the bug that bad reflection cannot be collected
* supports options for selecting encoder/decoder
* fixed a few bugs and standardized the format
* standardize the format again
---------
Co-authored-by: puyuan <[email protected]>

puyuan1996 and others added 30 commits June 4, 2025 00:51

fix(fir): fix timestep and non-text-based games compatibility for muzero (#372)

36fd720

fix timestep and non-text-based games for muzero

fix(pu): fix dtype bug in sez buffer

9e4cb99

fix(pu): fix timestep and reward-type compatibility (#380)

a5c1343

Co-authored-by: puyuan <[email protected]>

fix(fir): fix compatibility of stochastic muzero in collector/evaluator (#378)

8aaac01

polish(fir): polish ensure_softmax function (#389)

527d355

Co-authored-by: puyuan <[email protected]>

feature(fir): enable independent configuration for reward/value categorical representation ranges (#387)

2a66cfd

* feature(fir): controlled reward/value categorical representation
* scaling_transform.py correction

fix(fir): fix timestep compatibility in muzero_evaluator.py (#386)

3148c7e

fix(fir): fix probabilities visualization (#393)

005cea1

polish(fir): polish softmax (#394)

c2eb518

fix(pu): fix pad dtype bug (#412)

90e44a6

Co-authored-by: zjowowen <[email protected]>

fix(pu): fix pos_in_game_segment bug in buffer (#414)

5069425

Co-authored-by: zjowowen <[email protected]>

fix(pu): fix muzero_evaluator compatibility when n_evaluator_episode>evaluator_env_num (#415)

da2da95

adaptively set the config of batchsize and accumulation_steps in Jericho training (#410)

da2a62f

polish(pu): polish comments and style in entry of scalezero

bbbe505

polish(pu): polish comments and style of ctree/tree_search/buffer/common.py

bf9f965

polish(pu): polish comments and style of files in lzero.model

fb04c7a

polish(pu): polish comments and style of files in lzero.model.unizero_world_models

06148e7

polish(pu): polish comments and style of unizero_world_models

471ae6a

polish(pu): polish comments and style of files in policy/

07933a5

polish(pu): polish comments and style of files in worker

df3b644

polish(pu): polish comments and style of files in configs

4f89dcc

Merge remote-tracking branch 'origin/main' into dev-multitask-balance-clean

e7a8796

fix(pu): fix some merge typo

ab746d1

fix(pu): fix ln norm_type, fix kv_cache rewrite bug, add value_priority, fix _reset_collect/eval, add adaptive policy entropy control

0476aca

fix(pu): fix unizero_mt

2c0a965

polish(pu): add LN in head, polish init_weight, polish adamw

84e6094

weight-decay

fix(pu): fix configure_optimizer_unizero in unizero_mt

05da638

feature(pu): add encoder-clip, label smooth, analyze_latent_representation option in unizero.py

06ad080

tAnGjIa520 added 5 commits October 9, 2025 13:32

feature(pu): add encoder-clip, label smooth option in unizero_multitask.py

9f69f5a

fix(pu): fix tb log when gpu_num<task_num, fix total_loss += bug, polish

af99278

alpha_loss

polish(pu):polish config

bf91ca2

fix(pu): fix encoder-clip bug and num_channel/res bug

b18f892

polish(pu): polish scale_factor in DPS

bf3cd12
