Skip to content

Problems during parallel computing with Dask #4

@QianqianHan96

Description

@QianqianHan96

Based on the progress we got on June 27th, we managed to predict 100 timesteps with dask. I also managed to reduce the size of trained RF model from 15 GB to 245 MB. I can predict 200 timesteps (with 240GB memory).
However, when I tried to predict 1500 timesteps (the data of 1500 timesteps is 297 MB.), always hit the worker memory limit (240 GB memory), no matter how many workers I use (4/32/64), the threads_per_worker is always 1. When I requested 960 GB, still hit the worker memory limit. When I use my python script (https://github.com/EcoExtreML/Emulator/blob/main/1computationBlockTest/2read10kminput-halfhourly-0616.py) without Dask to predict the whole year 17000 steps, it used 100 GB memory. I do not understand why Dask need so much memory. Could you help give some advice in terms of this problem? The script is at https://github.com/EcoExtreML/Emulator/blob/main/1computationBlockTest/2read10kminput-halfhourly-0628.ipynb.
image

image

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions