Custom DataCollator Bug in RewardTrainer #3101

@pasztorb

Description

Reproduction

If you use a custom data_collator with the RewardTrainer class defined in trl/trainer/reward_trainer.py, the variable max_length is never assigned at Line 177, which causes an undefined-variable error at Line 220 (and at Line 236 if an eval_dataset is used).
Is there a specific reason why the max_length variable is defined only inside the if data_collator is None branch? If not, I would suggest moving it outside of that statement. Currently, I have to subclass RewardTrainer and override the __init__ method if I want to use a custom data collator.
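A minimal sketch of the bug pattern, independent of trl (the function name and defaults below are illustrative, not the actual RewardTrainer code): a variable assigned only inside the `data_collator is None` branch is read unconditionally later, so passing a custom collator raises `UnboundLocalError`.

```python
def init_trainer(data_collator=None, args_max_length=None):
    """Toy stand-in for the __init__ logic; not the real trl implementation."""
    if data_collator is None:
        # max_length is assigned only in this branch (mirrors Line 177)
        max_length = args_max_length or 512
        data_collator = object()  # stand-in for the default collator
    # ...later, max_length is used unconditionally (mirrors Lines 220/236):
    return max_length  # UnboundLocalError when a custom collator was passed


init_trainer()  # default collator: works, max_length falls back to 512
try:
    init_trainer(data_collator=lambda batch: batch)  # custom collator
except UnboundLocalError as err:
    print("reproduced:", err)
```

Moving the `max_length = ...` assignment above the `if` would make both paths work, which is the change suggested above.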
Thanks!

System Info

  • Platform: Linux-5.15.0-131-generic-x86_64-with-glibc2.35
  • Python version: 3.11.6
  • PyTorch version: 2.3.1
  • CUDA device(s): NVIDIA TITAN RTX
  • Transformers version: 4.48.3
  • Accelerate version: 1.3.0
  • Accelerate config: not found
  • Datasets version: 3.2.0
  • HF Hub version: 0.28.1
  • TRL version: 0.14.0
  • bitsandbytes version: 0.45.1
  • DeepSpeed version: not installed
  • Diffusers version: not installed
  • Liger-Kernel version: not installed
  • LLM-Blender version: not installed
  • OpenAI version: not installed
  • PEFT version: 0.14.0

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks (no screenshots, more on code blocks)
  • Any traceback provided is complete

Metadata

Labels

  • 🏋 Reward (Related to Reward modelling)
  • 🐛 bug (Something isn't working)