@nikita-savelyevv
Collaborator

@nikita-savelyevv nikita-savelyevv commented Oct 31, 2025

What does this PR do?

Changes

Add an extra entry to the input dict collected when wrapping the infer request of a stateful model, recording whether the model state was reset before that inference. This feature is not part of the latest NNCF release (v2.19) and depends on the NNCF develop version.

Reason for changes

Quantization quality of stateful Whisper models is poor because, during calibration, the model state must be reset on the same schedule as it was reset while the calibration input data was being collected.

Ticket CVS-172705

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@nikita-savelyevv nikita-savelyevv changed the title from "[OpenVINO] Update InferRequestWrapper to control stateful model state resetting" to "[OpenVINO] Update InferRequestWrapper to collect samples taking into account state of stateful models" Oct 31, 2025
AlexanderDokuchaev pushed a commit to openvinotoolkit/nncf that referenced this pull request Dec 2, 2025
…ful OV models (#3714)

### Changes

Added `nncf.definitions.NNCF_DATASET_RESET_STATE_KEY` constant to
specify when to reset model state. This constant is used by OpenVINO
backend to control resetting of internal model state between model
inferences. This key can be added to a dataset sample input dictionary
with a value of either `True` or `False`. When it is `True`, the model
state is reset before inference on the corresponding sample; when it is
`False`, the state is not reset.

For an example of usage please see
huggingface/optimum-intel#1505.
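
As a rough illustration of the behavior described above, here is a minimal, self-contained sketch. It does not use NNCF or OpenVINO: `ToyStatefulModel`, `run_calibration`, and the `RESET_STATE_KEY` string are hypothetical stand-ins (the real constant is `nncf.definitions.NNCF_DATASET_RESET_STATE_KEY`, and its actual value is not reproduced here). Removing the flag from the sample before inference is also an assumption of this sketch, not a statement about NNCF internals.

```python
from typing import Any, Dict, List

# Hypothetical placeholder for nncf.definitions.NNCF_DATASET_RESET_STATE_KEY.
RESET_STATE_KEY = "__reset_state__"


class ToyStatefulModel:
    """Toy model whose output depends on state accumulated across calls."""

    def __init__(self) -> None:
        self.state = 0

    def reset_state(self) -> None:
        self.state = 0

    def infer(self, inputs: Dict[str, Any]) -> int:
        self.state += inputs["x"]
        return self.state


def run_calibration(model: ToyStatefulModel, samples: List[Dict[str, Any]]) -> List[int]:
    """Run the model over samples, resetting state where a sample asks for it."""
    outputs = []
    for sample in samples:
        sample = dict(sample)  # work on a copy so the caller's dict is untouched
        # Assumption: the flag is removed so it is not fed to the model as an input.
        if sample.pop(RESET_STATE_KEY, False):
            model.reset_state()
        outputs.append(model.infer(sample))
    return outputs


samples = [
    {"x": 1, RESET_STATE_KEY: True},   # start of a sequence: reset first
    {"x": 2, RESET_STATE_KEY: False},  # continuation: keep accumulated state
    {"x": 3, RESET_STATE_KEY: True},   # new sequence: reset again
    {"x": 4, RESET_STATE_KEY: False},
]
outputs = run_calibration(ToyStatefulModel(), samples)
```

In this toy setup the second sequence starts from a clean state rather than inheriting the first sequence's accumulated value, which is the property the reset schedule is meant to preserve.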

### Reason for changes

Without this logic, static quantization quality of stateful Whisper
models is poor, because the model state must be reset on the same
schedule during calibration as it was during calibration input data
collection.

### Related tickets

172705

### Tests

Added
`tests/openvino/native/test_engine.py::test_stateful_model_inference_with_controlled_resetting`.

Copilot AI left a comment

Pull request overview

This PR enhances the InferRequestWrapper class to properly handle state management for stateful OpenVINO models during quantization calibration. The changes ensure that state reset operations are tracked and communicated to NNCF's calibration process, which is critical for accurate quantization of stateful models like Whisper.

Key changes:

  • Added state tracking mechanism to detect and record when model state is reset
  • Modified input collection to include state reset information for NNCF calibration (version 2.19+)
  • Implemented reset_state() method wrapper to track state reset calls
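
The tracking idea in the list above could be sketched roughly as follows. This is not the actual `InferRequestWrapper` code from the PR: `ToyInferRequest`, `RecordingRequestWrapper`, and the `RESET_STATE_KEY` string are hypothetical names for illustration, and starting with no reset pending is an arbitrary choice of this sketch.

```python
from typing import Any, Dict, List

# Hypothetical placeholder for nncf.definitions.NNCF_DATASET_RESET_STATE_KEY.
RESET_STATE_KEY = "__reset_state__"


class ToyInferRequest:
    """Toy stand-in for a stateful infer request."""

    def __init__(self) -> None:
        self.state = 0

    def reset_state(self) -> None:
        self.state = 0

    def infer(self, inputs: Dict[str, Any]) -> int:
        self.state += inputs["x"]
        return self.state


class RecordingRequestWrapper:
    """Records every input and whether a state reset preceded it."""

    def __init__(self, request: ToyInferRequest, collected: List[Dict[str, Any]]) -> None:
        self.request = request
        self.collected = collected
        self._reset_pending = False  # sketch assumption: nothing pending at start

    def reset_state(self) -> None:
        # Forward the call and remember that the next sample follows a reset.
        self.request.reset_state()
        self._reset_pending = True

    def infer(self, inputs: Dict[str, Any]) -> int:
        sample = dict(inputs)  # copy so the recorded sample is independent
        sample[RESET_STATE_KEY] = self._reset_pending
        self.collected.append(sample)
        self._reset_pending = False
        return self.request.infer(inputs)


collected: List[Dict[str, Any]] = []
wrapper = RecordingRequestWrapper(ToyInferRequest(), collected)
wrapper.reset_state()
wrapper.infer({"x": 1})
wrapper.infer({"x": 2})
wrapper.reset_state()
wrapper.infer({"x": 3})
flags = [sample[RESET_STATE_KEY] for sample in collected]
```

Each recorded sample carries its own flag, so the collected list can later be replayed with resets at the same points where the original pipeline performed them.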

@nikita-savelyevv nikita-savelyevv added the openvino-nightly Runs OpenVINO nightly and NNCF develop tests label Dec 2, 2025
@nikita-savelyevv nikita-savelyevv force-pushed the ns/stateful-models-quantization branch from 76a11e5 to 03ce84c Compare December 2, 2025 17:56
@nikita-savelyevv nikita-savelyevv marked this pull request as ready for review December 3, 2025 08:24
Collaborator

@echarlaix echarlaix left a comment

LGTM!

Collaborator

@echarlaix echarlaix left a comment

Thanks for iterating, feel free to merge!

@nikita-savelyevv nikita-savelyevv merged commit f2fa597 into main Dec 9, 2025
38 of 39 checks passed
@nikita-savelyevv nikita-savelyevv deleted the ns/stateful-models-quantization branch December 9, 2025 08:36

Labels

openvino-nightly Runs OpenVINO nightly and NNCF develop tests

4 participants