@bsagevedant

Save Intermediate Results During vf-eval

This PR addresses issue #251 by adding support for saving intermediate results during evaluation and enabling interleaved reward computation.

Changes

  • Added configuration options to Environment class:

    • save_intermediate: Enable saving intermediate results during rollout
    • interleave_rewards: Enable computing rewards after each rollout instead of batching
  • Modified run_rollouts method to:

    • Support saving intermediate results after each rollout
    • Support interleaving reward computation
    • Make both features optional and configurable
  • Added comprehensive tests in test_intermediate_results.py
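A rough sketch of how the two options could interact inside a rollout loop is shown here. It is not the code from this PR: `RolloutConfig`, `run_rollouts_sketch`, `rollout_fn`, `reward_fn`, and the JSONL `save_path` are all hypothetical names chosen for illustration, and the real `run_rollouts` signature may well differ.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class RolloutConfig:
    # Hypothetical container for the two flags described above.
    save_intermediate: bool = False
    interleave_rewards: bool = False


def run_rollouts_sketch(
    prompts: Iterable[str],
    rollout_fn: Callable[[str], str],
    reward_fn: Callable[[str, str], float],
    config: RolloutConfig,
    save_path: Optional[str] = None,
) -> list:
    """Run one rollout per prompt, optionally scoring and saving as we go."""
    results = []
    for prompt in prompts:
        completion = rollout_fn(prompt)
        record = {"prompt": prompt, "completion": completion, "reward": None}

        # Interleaved mode: score this rollout before starting the next one.
        if config.interleave_rewards:
            record["reward"] = reward_fn(prompt, completion)
        results.append(record)

        # Append one JSON line per rollout so partial runs are recoverable.
        # In batched mode the saved reward is still None at this point.
        if config.save_intermediate and save_path is not None:
            with open(save_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(record) + "\n")

    # Batched mode: score everything after all rollouts have finished.
    if not config.interleave_rewards:
        for record in results:
            record["reward"] = reward_fn(record["prompt"], record["completion"])
    return results
```

With both flags left at their defaults, the sketch falls back to batched scoring and writes nothing to disk, which matches the "optional and configurable" intent above.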

Testing

Added new test cases that verify:

  • Intermediate results saving functionality
  • Interleaved reward computation
  • Configuration options
  • Integration with existing evaluation methods

Notes

  • The interleaved reward computation is optional as it's not fully compatible with some pairwise reward strategies
  • Intermediate results are logged using the environment's logger, which can be customized by the user
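To illustrate the pairwise caveat, here is a toy example of a reward that depends on the other rollouts in the batch. `pairwise_win_rate` is a hypothetical function written for this note, not one of the library's reward strategies; it only shows why a reward of this shape cannot be computed one rollout at a time.

```python
from typing import List


def pairwise_win_rate(scores: List[float]) -> List[float]:
    """Reward each rollout by the fraction of the other rollouts it beats."""
    n = len(scores)
    if n < 2:
        # With a single rollout there is nothing to compare against.
        return [0.0] * n
    return [
        sum(s > other for j, other in enumerate(scores) if j != i) / (n - 1)
        for i, s in enumerate(scores)
    ]


# Batched: every rollout is compared against the full group.
batched = pairwise_win_rate([1.0, 3.0, 2.0])  # [0.0, 1.0, 0.5]

# Interleaved: each call sees only one rollout, so the comparison is empty.
interleaved = [pairwise_win_rate([s])[0] for s in [1.0, 3.0, 2.0]]  # [0.0, 0.0, 0.0]
```

Rewards that score each rollout independently give the same result in either mode, which is why interleaving is left off by default rather than removed.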

…omputation

- Add save_intermediate and interleave_rewards configuration options
- Modify run_rollouts to support saving intermediate results
- Add support for interleaving reward computation
- Add comprehensive tests for new functionality
@CLAassistant

CLAassistant commented Sep 24, 2025

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ willccbb
❌ Your GitHub Username


Your GitHub Username seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.

Your GitHub Username and others added 2 commits September 24, 2025 09:26
- Keep our implementation of intermediate results saving
- Adapt to upstream's interleaved reward computation changes
@willccbb
Member

Nice! Looks pretty good. I updated the branch to merge with latest main. I'll probably make some other edits before merging: our logic for saving vf-eval JSON outputs has drifted a bit from make_dataset, and ideally we bring these back in sync so that intermediate saving handles vf-eval -s directly.
