[930][evaluation] implement CSVReader #932

iluise · 2025-09-19T14:23:06Z

Description

Add a reader to retrieve scores from CSV files generated with Quaver. @Jubeku

Quaver scores can be plotted in the FastEvaluation package by adding the following to the config (the score files must be available locally first):

  nhem_son2022_24h_ifs_oper_an:
    type: "csv"
    label: "IFS Quaver ERA5"
    csv_path: "./scores/scores_nhem_son2022_24h_ifs_oper_an.csv" 
    metrics_dir: "./scores/"
    metric: "rmse"
    region: "nhem"
    streams: 
      ERA5:
        channels: ["2t", "10ff", "q_850", "t_850", "z_500"]
        evaluation: 
          forecast_step: "all"
          sample: "all"
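
For orientation, the entry above can be pictured as a plain dictionary whose keys the reader consumes. The following is a minimal sketch, not the code in this PR: `CsvRunConfig` and `parse_csv_run` are hypothetical names introduced for illustration, and only the keys are taken from the config above.

```python
from dataclasses import dataclass, field


@dataclass
class CsvRunConfig:
    """Hypothetical container for one CSV-backed run entry."""

    run_id: str
    label: str
    csv_path: str
    metric: str
    region: str
    streams: dict = field(default_factory=dict)


def parse_csv_run(run_id: str, run: dict) -> CsvRunConfig:
    """Check that a run entry is of type "csv" and extract the fields shown above."""
    # "zarr" as the fallback mirrors the default used in run_evaluation.py.
    if run.get("type", "zarr") != "csv":
        raise ValueError(f"run {run_id!r} is not a CSV run")
    csv_path = run.get("csv_path")
    if csv_path is None:
        raise ValueError("CSV path must be provided in the config.")
    return CsvRunConfig(
        run_id=run_id,
        label=run.get("label", run_id),
        csv_path=csv_path,
        metric=run["metric"],
        region=run["region"],
        streams=run.get("streams", {}),
    )


# The same entry as above, written as a Python dict.
example_run = {
    "type": "csv",
    "label": "IFS Quaver ERA5",
    "csv_path": "./scores/scores_nhem_son2022_24h_ifs_oper_an.csv",
    "metrics_dir": "./scores/",
    "metric": "rmse",
    "region": "nhem",
    "streams": {
        "ERA5": {
            "channels": ["2t", "10ff", "q_850", "t_850", "z_500"],
            "evaluation": {"forecast_step": "all", "sample": "all"},
        }
    },
}

cfg = parse_csv_run("nhem_son2022_24h_ifs_oper_an", example_run)
```

Raising on a missing `csv_path` instead of asserting is a design choice of this sketch; the PR itself uses an assert.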

Issue Number

Closes #930

Is this PR a draft? Mark it as draft.

Checklist before asking for review

I have performed a self-review of my code
My changes comply with basic sanity checks:
- I have fixed formatting issues with ./scripts/actions.sh lint
- I have run unit tests with ./scripts/actions.sh unit-test
- I have documented my code and I have updated the docstrings.
- I have added unit tests, if relevant
I have tried my changes with data and code:
- I have run the integration tests with ./scripts/actions.sh integration-test
- (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
- (bigger changes and experiments) I have shared a hedgedoc in the github issue with all the configurations and runs for these experiments
I have informed and aligned with people impacted by my change:
- for config changes: the MatterMost channels and/or a design doc
- for changes of dependencies: the MatterMost software development channel

tjhunter

@iluise I have not tried it, but I trust you did. Approved, with a couple of small comments.

tjhunter · 2025-09-24T15:01:40Z

packages/evaluate/src/weathergen/evaluate/io_reader.py

+        self.metric = eval_cfg.get("metric")
+        self.region = eval_cfg.get("region")
+
+    def rename_channels(self) -> str:


The return annotation says str, but you return a pd.DataFrame.

Also, personally, I would write this as a little helper function:

  class CSVReader:
      ...
          pd_data = pd.read_csv(self.csv_path, index_col=0)
          self.data = _rename_channels(pd_data)

  def _rename_channels(data) -> pd.DataFrame:
      # No need for self.data here
      ...
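
A minimal sketch of what such a module-level helper could look like, following the examples in the docstring ('z500' -> 'z_500', '2t' -> '2t'). It assumes that the channel names sit in the DataFrame index (the first CSV column, given `index_col=0`); that layout and the regular expression are assumptions, not the PR's implementation.

```python
import re

import pandas as pd

# Letters followed by digits, e.g. "z500". Names that start with a digit
# ("2t", "10ff") do not match and are left unchanged.
_LETTERS_THEN_DIGITS = re.compile(r"^([A-Za-z]+)(\d+)$")


def _rename_channels(data: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of data with '_' inserted between letters and digits in the index labels."""
    return data.rename(index=lambda name: _LETTERS_THEN_DIGITS.sub(r"\1_\2", str(name)))


# Toy table: one row per channel, one column per forecast step (assumed layout).
scores = pd.DataFrame({"24": [1.0, 2.0, 3.0, 4.0]}, index=["z500", "t850", "2t", "10ff"])
renamed = _rename_channels(scores)
```

Because the helper takes the table as an argument and returns a new one, it can be tested without constructing a reader.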

tjhunter · 2025-09-24T15:03:21Z

packages/evaluate/src/weathergen/evaluate/io_reader.py

+    def rename_channels(self) -> str:
+        """
+        Rename channel names to include underscore between letters and digits.
+        E.g., 'z500' -> 'z_500', 't850' -> 't_850', '2t' -> '2t', '10ff' -> '10ff'


Also, why do we need this renaming? I trust that it was necessary; just add a line to the docstring explaining why.

tjhunter · 2025-09-24T15:03:53Z

packages/evaluate/src/weathergen/evaluate/run_evaluation.py

        _logger.info(f"RUN {run_id}: Getting data...")

-        reader = WeatherGenReader(run, run_id, private_paths)
+        type = run.get("type", "zarr")


Thank you for putting a sensible default!

tjhunter · 2025-09-24T15:07:30Z

packages/evaluate/src/weathergen/evaluate/utils.py

+            and region == reader.region
+            and stream == reader.stream
+        ):
+            data = reader.data.values[np.newaxis, :, :, np.newaxis].T


Style note: you could have

  else:
      data = np.full(
          (
              len(available_data.samples),
              len(available_data.fsteps),
              len(available_data.channels),
              1,
          ),
          np.nan,
      )
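
To make the shapes concrete, here is a small sketch of the reshaping in the diff next to the NaN-filled fallback. It assumes the CSV table has one row per channel and one column per forecast step, and that the target layout is (sample, forecast step, channel, 1), as the `np.full` shape above suggests; the channel names and step values are made up.

```python
import numpy as np

channels = ["2t", "z_500", "t_850"]  # rows of the table (assumed)
fsteps = [6, 12]                     # columns of the table (assumed)

# Toy score table with shape (n_channels, n_fsteps).
table = np.arange(6, dtype=np.float32).reshape(len(channels), len(fsteps))

# Add a leading and a trailing axis: (1, n_channels, n_fsteps, 1).
# .T then reverses all axes: (1, n_fsteps, n_channels, 1).
data = table[np.newaxis, :, :, np.newaxis].T

# Fallback with the same layout when no scores are available.
missing = np.full((1, len(fsteps), len(channels), 1), np.nan)
```

Note that `.T` on a 4-D array reverses every axis, not only the two middle ones; it works here because the outer axes both have length 1.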

tjhunter · 2025-09-24T15:13:43Z

packages/evaluate/src/weathergen/evaluate/utils.py

-        / f"{reader.run_id}_{stream}_{region}_{metric}_epoch{reader.epoch:05d}.json"
-    )
-    _logger.debug(f"Looking for: {score_path}")
+    if hasattr(reader, "data") and reader.data is not None:


hasattr is not a good habit because it makes it really hard for humans and type checkers to figure out whether a Python object has an attribute. Here is what you can do instead, so that VS Code can rename and check it for you:

  class Reader:
      data: pd.DataFrame | None  # Data attributes (if specified)

      def __init__(self, eval_cfg: dict, run_id: str, private_paths: dict | None = None):
          ...
          self.data = None
          ...

  # No change to WG reader or CSVReader
  # Now you can directly use:
  if reader.data is not None:
      ...

tjhunter · 2025-09-24T15:15:20Z

packages/evaluate/src/weathergen/evaluate/io_reader.py

+        self.csv_path = eval_cfg.get("csv_path")
+        assert self.csv_path is not None, "CSV path must be provided in the config."
+
+        self.data = pd.read_csv(self.csv_path, index_col=0)


One thing I would do is cast all the values to np.float32 (or float). pandas tries to be very clever and would, for example, use an integer dtype (int64) if the data allows. I am not sure whether xarray can deal with that later.
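
A minimal sketch of that cast, using an in-memory CSV so it runs without a file. The column and row names are made up; `astype` assumes every remaining column is numeric once the first column has become the index.

```python
import io

import numpy as np
import pandas as pd

# Toy CSV whose values all happen to be whole numbers.
csv_text = "channel,6,12\n2t,1,2\nz_500,3,4\n"

# pandas infers an integer dtype for these columns.
raw = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Cast every column to float32 so downstream code sees one consistent dtype.
scores = raw.astype(np.float32)
```

Casting right after `read_csv` keeps the dtype decision in one place instead of relying on what the file contents happen to look like.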

iluise added 4 commits September 18, 2025 16:34

first version of quaver reader

8e05a4b

working version

a87444e

update to develop

a492192

add CSVReader

4d3a63d

iluise self-assigned this Sep 19, 2025

iluise added the evaluation label (anything related to the model evaluation pipeline) Sep 19, 2025

iluise added this to WeatherGen-dev Sep 19, 2025

clessig requested a review from tjhunter September 19, 2025 20:39

tjhunter approved these changes Sep 24, 2025
