Commit 39f972b

Add thumbnail image to bitwise consistent RL blog post (#114)
Signed-off-by: Michael Goin <[email protected]>
1 parent 8b24f2c commit 39f972b

1 file changed, +1 -0 lines changed

_posts/2025-11-10-bitwise-consistent-train-inference.md

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 layout: post
 title: "No More Train-Inference Mismatch: Bitwise Consistent On-Policy Reinforcement Learning with vLLM and TorchTitan"
 author: "vLLM and TorchTitan Teams"
+image: /assets/figures/2025-11-10-bitwise-exact-rl/reward-comparison.png
 ---

 We demonstrate an open-source bitwise consistent on-policy RL run with [TorchTitan](https://github.com/pytorch/torchtitan) as the training engine and [vLLM](https://github.com/vllm-project/vllm) as the inference engine. Built on top of [vLLM's recent work on batch-invariant inference](https://docs.vllm.ai/en/latest/features/batch_invariance/), we show how to run an RL fine-tune of Qwen3 1.7B with bitwise matching training and inference numerics in [our open-sourced instructions](https://github.com/pytorch/torchtitan/tree/main/torchtitan/experiments/deterministic_vllm_rl):
