@daidahao
Contributor

@daidahao daidahao commented Nov 6, 2025

Checklist before merging this PR:

  • Mentioned all issues that this PR fixes or addresses.
  • Summarized the updates of this PR under Summary.
  • Added an entry under Unreleased in the Changelog.

Fixes #2943.

Summary

This PR adds base FoundationModel and Chronos2Model to Darts forecasting models.

Major Changes
  • Add FoundationModel as a base class for all foundation forecasting models. All foundation models like Chronos-2 should inherit the base to make use of PyTorch datasets, optimized historical forecasting, Lightning APIs for model training (fine-tuning), checkpointing, etc.
  • Add HuggingFaceModelMixin as a mixin class for foundation models that require downloading model configuration and weights from HuggingFace Hub. The class provides methods for downloading model config, weight files, and loading them into a PLForecastingModule instance.
  • Add Chronos2Model for zero-shot forecasting using Amazon's pre-trained checkpoint. It supports past and future covariates and can convert quantiles from Chronos-2 to the Darts QuantileRegression likelihood model. By default, it is deterministic and outputs only the median quantile. It can be made probabilistic with user-selected quantiles from the pre-trained quantiles. Fine-tuning is not supported for now.
  • Add dependencies on huggingface-hub and safetensors, the former for downloading model files and the latter for loading model weights. Both should be lightweight enough to ship with Darts.
  • Raise PyTorch to >=2.0.0 for SDPA and other performance benefits.
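
To illustrate the quantile handling described for Chronos2Model above, here is a minimal, library-free sketch of how user-selected quantiles could be mapped onto the quantiles a checkpoint was trained on, falling back to the median when none are requested. The names `PRETRAINED_QUANTILES`, `select_quantile_indices` and `select_outputs` are hypothetical and the listed levels are placeholders; this is not the code in this PR.

```python
from typing import List, Optional, Sequence

# Placeholder quantile levels (an assumption for illustration only);
# the levels of the actual pre-trained checkpoint may differ.
PRETRAINED_QUANTILES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def select_quantile_indices(
    requested: Sequence[float],
    pretrained: Sequence[float] = PRETRAINED_QUANTILES,
    tol: float = 1e-9,
) -> List[int]:
    """Map each requested quantile to its position among the pre-trained ones."""
    indices = []
    for q in requested:
        matches = [i for i, p in enumerate(pretrained) if abs(p - q) <= tol]
        if not matches:
            # Only quantiles the checkpoint was trained on can be selected.
            raise ValueError(f"quantile {q} is not among the pre-trained quantiles")
        indices.append(matches[0])
    return indices


def select_outputs(
    row: Sequence[float], requested: Optional[Sequence[float]] = None
) -> List[float]:
    """Pick the outputs for the requested quantiles; default to the median only."""
    if requested is None:
        requested = [0.5]  # deterministic mode: median quantile
    return [row[i] for i in select_quantile_indices(requested)]
```

Here `row` stands for one time step of model output with one value per pre-trained quantile; requesting a level outside the pre-trained set raises an error rather than interpolating.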
Quickstart chronos2
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

from darts.datasets import ElectricityConsumptionZurichDataset
from darts.models import Chronos2Model
from darts.utils.likelihood_models import QuantileRegression

# maximum prediction length w/o triggering auto-regression where the results
# would diverge from the original implementation due to different sampling methods
prediction_length = 1024
# whether to produce probabilistic forecasts
probabilistic = True
# quantiles to be used for probabilistic forecasts
quantiles = [0.1, 0.5, 0.9]

# convert to float32 due to MPS not supporting float64
ts_energy = ElectricityConsumptionZurichDataset().load().astype(np.float32)

# extract temperature, solar irradiation and rain duration
ts_weather = ts_energy[["T [°C]", "StrGlo [W/m2]", "RainDur [min]"]]

# extract households energy consumption
ts_energy = ts_energy[["Value_NE5", "Value_NE7"]]

# create train and validation splits
validation_cutoff = pd.Timestamp("2022-07-01")
ts_energy_train, ts_energy_val = ts_energy.split_after(validation_cutoff)

# load model
model = Chronos2Model(
    input_chunk_length=8192,  # maximum context length
    output_chunk_length=prediction_length,  # maximum prediction length w/o AR
    likelihood=(
        QuantileRegression(quantiles) if probabilistic else None
    ),
)
# fit model w/o fine-tuning
model.fit(ts_energy_train, future_covariates=ts_weather)

# predict on the validation inputs w/ covariates
pred = model.predict(
    n=prediction_length,
    future_covariates=ts_weather,
    predict_likelihood_parameters=probabilistic,
)

# plot results
plt.style.use('ggplot')
fig, ax = plt.subplots(figsize=(15, 5))
ts_energy_val[:prediction_length].plot(label='actual', ax=ax, c='k')
if pred.n_components == ts_energy_val.n_components:
    pred.plot(label='forecast', ax=ax)
else:
    for component in ts_energy_val.components:
        for q in quantiles:
            pred[f"{component}_q{q:.3f}"].plot(label=f'forecast_{component}_{int(q*100)}%', ax=ax)

plt.show()
Chronos-2 Adaptation

Chronos-2 was ported from amazon-science/chronos-forecasting@c23d34c, and I have since made some changes to integrate it within Darts.

  • Remove dependencies on transformers and einops libraries.
  • Load model config and weights from HuggingFace Hub using HuggingFaceModelMixin.
  • Remove output_attentions option from forward pass.
  • Integrate likelihood model and loss computation with Darts QuantileRegression, and remove original loss computation in forward pass.
  • Replace *Output return type with direct torch.Tensor to comply with Darts PLForecastingModule interface.
  • Replace einops rearrange operations with native PyTorch tensor operations.

The key principle here is to introduce as few dependencies as possible and implement it fully in PyTorch. I think using the chronos-forecasting library is convenient but risks conflicts with Darts dependencies in the future.
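
Since the list above moves loss computation to the Darts QuantileRegression likelihood, a short reminder of what a quantile (pinball) loss computes may help. The sketch below is the textbook form in plain Python; the function names are made up for illustration, and the averaging used here is an assumption that may differ from the reduction Darts applies.

```python
from typing import Sequence


def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Quantile (pinball) loss of one prediction at quantile level q."""
    error = y_true - y_pred
    # Under-prediction (error > 0) costs q * error,
    # over-prediction (error < 0) costs (1 - q) * |error|.
    return max(q * error, (q - 1.0) * error)


def mean_quantile_loss(
    y_true: Sequence[float],
    preds_per_step: Sequence[Sequence[float]],
    quantiles: Sequence[float],
) -> float:
    """Average pinball loss over all time steps and quantile levels.

    preds_per_step[t][k] is the prediction at step t for quantiles[k].
    """
    total = 0.0
    count = 0
    for y, preds in zip(y_true, preds_per_step):
        for q, p in zip(quantiles, preds):
            total += pinball_loss(y, p, q)
            count += 1
    return total / count
```

For a high quantile such as 0.9, predicting too low is penalised nine times more than predicting too high by the same amount, which is what pushes that output toward the upper tail.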

Fidelity Tests

For validation, I use the ElectricityConsumptionZurichDataset to generate forecasts with the Darts implementation and the original one. See test_chronos2.py for details. Because TSFMs might be slower than other torch models, I limit the fidelity tests to two (probabilistic and median).

Important Note: Due to differences in probabilistic sampling methods, zero-shot forecasts obtained here will differ from those obtained using the original implementation when the prediction horizon n is larger than 1024.
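
The 1024-step threshold in the note above follows from chunked prediction: a horizon longer than the output chunk length has to be covered by feeding earlier predictions back in. The helper below is a rough sketch of that arithmetic only; the function names are hypothetical, and the default of 1024 simply mirrors the output_chunk_length used in the quickstart.

```python
def autoregressive_passes(n: int, output_chunk_length: int = 1024) -> int:
    """Number of forward passes needed to cover a horizon of n steps."""
    if n < 1 or output_chunk_length < 1:
        raise ValueError("n and output_chunk_length must be positive")
    # Ceiling division without going through floating point.
    return -(-n // output_chunk_length)


def uses_autoregression(n: int, output_chunk_length: int = 1024) -> bool:
    """True when the horizon cannot be produced in a single forward pass."""
    return autoregressive_passes(n, output_chunk_length) > 1
```

With these defaults, n = 1024 needs a single pass and should match the original implementation, while n = 1025 already needs a second pass and falls under the caveat above.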

What is Missing
  • Documentation for FoundationModel (using Chronos2Model as prime example) either as part of Quickstart or User Guide.
  • Example notebook for Chronos2Model to showcase Chronos-2 capabilities.
  • CHANGELOG.
  • README model catalogue update.
  • Unit tests for Chronos2Model, maybe using a tiny mock Chronos-2 configuration.
  • Unit tests for FoundationModel.
  • Example code in Chronos2Model docstring.
  • (Optional) more fidelity tests for Chronos2Model using different datasets?
  • Some more I haven't thought of...
Discussions

There are a few points I would like to discuss here, following the discussions in #2933:

  • FoundationModel inherits from MixedCovariatesTorchModel and thus TorchForecastingModel. This allows optimized historical forecasts, PyTorch data loaders, and Lightning APIs to be used. Would that be confusing since it also introduces a lot more parameters from TorchForecastingModel and PLForecastingModule? -> Using TorchForecastingModel as base for FoundationModel.
  • Chronos-2 has its own RINorm routine, which differs from the Darts RINorm in io_processor(). See my notes in _Chronos2Module.forward() for details. What would be the best way to integrate both and to ensure fine-tuning support w/ normalized loss? -> Not integrated for now until fine-tuning is supported.
  • What should be included as test cases for Chronos-2, like target-only, target+past, target+future, etc.?
  • I understand fine-tuning is not a priority for now. But since Chronos-2 can be trained using the quantile loss, should we allow fine-tuning Chronos-2 for our users? -> Fine-tuning not supported for now.
  • SDPA is used by default in Chronos-2 but was only introduced in PyTorch 2.0.0. Should we raise torch to >=2.0.0 or differentiate between torch<2.0.0 and torch>=2.0.0? -> Raised to >=2.0.0.
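
For readers unfamiliar with the RINorm point raised above, the general idea of reversible instance normalization can be sketched as follows. This is a generic, plain-Python illustration with made-up function names; it is neither the Chronos-2 routine nor the Darts RINorm implementation, both of which differ in detail.

```python
import math
from typing import List, Sequence, Tuple


def instance_normalize(
    series: Sequence[float], eps: float = 1e-5
) -> Tuple[List[float], float, float]:
    """Normalize one series by its own mean and standard deviation."""
    if len(series) == 0:
        raise ValueError("series must not be empty")
    mean = sum(series) / len(series)
    variance = sum((x - mean) ** 2 for x in series) / len(series)
    scale = math.sqrt(variance + eps)  # eps guards against constant series
    normalized = [(x - mean) / scale for x in series]
    # The statistics are returned so the transform can be reversed later.
    return normalized, mean, scale


def instance_denormalize(
    normalized: Sequence[float], mean: float, scale: float
) -> List[float]:
    """Map values in normalized space back to the original scale."""
    return [x * scale + mean for x in normalized]
```

The model would see the normalized context and its outputs would be passed through `instance_denormalize` with the same statistics. The open question in the bullet above is in which space the loss should be computed once fine-tuning is supported.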

Other Information

Co-authored-by: Zhihao Dai <[email protected]>
- Docstring for `FoundationModel`.
- Docstring for `HuggingFaceModelMixin`.
- Add `probabilistic` parameter for converting probabilistic TSFMs into
  deterministic (might not be supported by all TSFMs).

Co-authored-by: Zhihao Dai <[email protected]>
- Docstring for `Chronos2Model`.
- Docstring for `_Chronos2Module`.
- Add `probabilistic` parameter to convert Chronos2 into deterministic
  model by taking the median quantile.

Co-authored-by: Zhihao Dai <[email protected]>
@daidahao daidahao requested a review from dennisbader as a code owner November 6, 2025 19:51
@codecov

codecov bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 96.31148% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.48%. Comparing base (ab24853) to head (f0eb77c).
⚠️ Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| darts/models/components/chronos2_submodels.py | 95.14% | 12 Missing ⚠️ |
| darts/models/forecasting/chronos2_model.py | 98.20% | 3 Missing ⚠️ |
| darts/models/__init__.py | 50.00% | 2 Missing ⚠️ |
| darts/models/forecasting/foundation_model.py | 93.33% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2944      +/-   ##
==========================================
- Coverage   95.52%   95.48%   -0.05%     
==========================================
  Files         146      150       +4     
  Lines       15710    16198     +488     
==========================================
+ Hits        15007    15466     +459     
- Misses        703      732      +29     


@daidahao daidahao changed the title from "Feature/ Foundation model and Chronos 2" to "Feature/ Foundation model and Chronos-2" Nov 6, 2025
@daidahao
Contributor Author

daidahao commented Nov 7, 2025

@dennisbader Could you please add @abdulfatir as a reviewer? The PR is not fully ready yet due to missing parts and discussion points (see above), but we can ask @abdulfatir for review once those are completed.

@dennisbader
Collaborator

@daidahao, is @abdulfatir not able to review already? I believe every user should be able to add a review by default.

@daidahao
Contributor Author

daidahao commented Nov 7, 2025

> @daidahao, is @abdulfatir not able to review already? I believe every user should be able to add a review by default.

Aha, my mistake. I thought we would need to add someone explicitly to review the changes.

@abdulfatir abdulfatir left a comment

@daidahao This looks great! Thank you for your effort. I mainly reviewed the Chronos-2 part and only have minor comments on that. Regarding the design and Darts-specific stuff, the maintainers may provide better feedback.

layer_norm_epsilon: float = 1e-6,
feed_forward_proj: str = "relu",
rope_theta: float = 10000.0,
attn_implementation: Literal["eager", "sdpa"] | None = None,

I saw discussion somewhere on whether SDPA would require changes to torch versions. Note that the benefit that SDPA provides for Chronos-2 is relatively minor, so if needed the default may be changed to eager. See: amazon-science/chronos-forecasting#331

Contributor Author

@daidahao daidahao Nov 7, 2025

@abdulfatir Great insight from amazon-science/chronos-forecasting#331, which I also followed when it was first posted. Regarding the torch version, I am in favour of raising torch to >=2.0.0 since it was released more than two years ago and could bring performance benefits for Darts users, including sdpa, torch.compile(), etc. I will leave it as is for now and hear what @dennisbader has to say.

Collaborator

Yes, you can go ahead and raise torch to >=2.0.0 👍

Contributor Author

Done by 32ae5c0

Contributor Author

Now that we are raising torch to >=2.0.0, would it be okay to remove attn_implementation option (and any eager implementation) as the user would not set it from Chronos2Model and it is always defaulting to sdpa anyway? @abdulfatir

Collaborator

@dennisbader dennisbader left a comment

Really great stuff @daidahao, thanks a lot 🚀 It's really nice to see how nicely it can be integrated. And I think you laid a very strong foundation for future work 🌟

I added a couple of suggestions. Let me know what you think.

@abdulfatir

@daidahao notebook looks great!

A small comment: the example only shows past covariates, and for this specific split, using past-only covariates actually worsens the results slightly. This will be very confusing for the reader. I would recommend showing the known-future covariates scenario in the notebook instead. That is generally the more popular setting. Of course, you should add a caveat about which things are known in the future or can be reasonably predicted (e.g., weather) and can therefore be supplied as future covariates.

@daidahao
Contributor Author

@abdulfatir Thank you for the suggestions. I noticed the worse results too when creating the notebook. Unfortunately, there are no future covariates available in this particular dataset, as the weather measurements came from a weather station and not from a forecast. Creating future covariates like time of day is possible but may not be as interesting as using covariates from the data source.

For the sake of demonstration, I will update the notebook to use weather measurements as future covariates instead. But I will also add a caveat that one should use weather forecasts in practice. What do you think?

@abdulfatir

> For the sake of demonstration, I will update the notebook to use weather measurements as future covariates instead. But I will also add a caveat that one should use weather forecasts in practice. What do you think?

Yes, this makes sense and is also a common approach in practice.

Collaborator

@dennisbader dennisbader left a comment

Really nice updates, thanks a lot @daidahao
The example notebook looks great as well!

I added a few last minor suggestions; after that, everything should be good to go 🚀

@daidahao
Contributor Author

@dennisbader Can I suggest excluding files like CHANGELOG.md from triggering workflows? No need to waste money on my silly typos 😊

name: darts PR workflow

on:
  pull_request:
    branches:
      - master
    paths-ignore:
      - 'CHANGELOG.md'

@dennisbader
Collaborator

> @dennisbader Can I suggest excluding files like CHANGELOG.md from triggering workflows? No need to waste money on my silly typos 😊
>
> name: darts PR workflow
>
> on:
>   pull_request:
>     branches:
>       - master
>     paths-ignore:
>       - 'CHANGELOG.md'

Yes that sounds like a good idea :) Could we do this in a separate PR?

@dennisbader
Collaborator

Beautiful 😍 Thanks again for this great work, and also impressive how quickly you could set this up! 🚀

This will allow us to add new foundation models efficiently! Really nice to see this, kudos 🔥

@dennisbader dennisbader merged commit a49c5bd into unit8co:master Nov 16, 2025
9 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in darts Nov 16, 2025
@daidahao daidahao deleted the feature/chronos-2 branch November 16, 2025 16:55
@abdulfatir

Indeed, great work @daidahao!

@daidahao
Contributor Author

@abdulfatir @dennisbader Thank you both for your code reviews. Learnt a lot from this experience 😊

@dennisbader
Collaborator

Yes, thank you too @abdulfatir for the helpful reviews!

@daidahao daidahao mentioned this pull request Nov 16, 2025
3 tasks
@dennisbader dennisbader moved this from Done to Released in darts Dec 2, 2025

Labels

None yet

Projects

Status: Released

Development

Successfully merging this pull request may close these issues.

[New Model] Chronos 2 zero shot forecasting

3 participants