Description
Based on the progress we made on June 27th, we managed to predict 100 timesteps with Dask. I also managed to reduce the size of the trained RF model from 15 GB to 245 MB, and I can now predict 200 timesteps (with 240 GB of memory).
However, when I try to predict 1500 timesteps (the data for 1500 timesteps is 297 MB), I always hit the worker memory limit (240 GB), no matter how many workers I use (4/32/64); threads_per_worker is always 1. Even when I requested 960 GB, I still hit the worker memory limit. When I use my Python script (https://github.com/EcoExtreML/Emulator/blob/main/1computationBlockTest/2read10kminput-halfhourly-0616.py) without Dask to predict the whole year (17000 steps), it used 100 GB of memory. I do not understand why Dask needs so much memory. Could you give some advice on this problem? The script is at https://github.com/EcoExtreML/Emulator/blob/main/1computationBlockTest/2read10kminput-halfhourly-0628.ipynb.
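For reference, below is a minimal sketch of one pattern that tends to keep per-worker memory flat: load the model at most once per worker (rather than embedding it in the task graph), predict block by block with `map_blocks`, and stream the results to disk instead of gathering them in memory. The file name `rf_model.joblib`, the array shape, and the chunk sizes are illustrative placeholders, not taken from the scripts above.

```python
# Sketch: blockwise RF prediction with Dask, assuming a scikit-learn model
# saved with joblib. All names, shapes, and chunk sizes are placeholders.
from functools import lru_cache

import dask.array as da
import joblib
import numpy as np
from dask.distributed import Client


@lru_cache(maxsize=1)
def get_model():
    # Loaded at most once per worker process, so the 245 MB model is not
    # serialized into the task graph or re-sent with every task.
    return joblib.load("rf_model.joblib")


def predict_block(block):
    return get_model().predict(block)


if __name__ == "__main__":
    client = Client(n_workers=4, threads_per_worker=1, memory_limit="60GB")

    # Hypothetical input: all timesteps flattened to (n_samples, n_features),
    # chunked along the sample axis so each task holds only one small block.
    X = da.random.random((50_000_000, 10), chunks=(1_000_000, 10))

    y = X.map_blocks(
        predict_block,
        dtype=np.float64,
        drop_axis=1,  # predict() returns a 1-D array per 2-D block
    )

    # Write results straight to disk (requires the zarr package) rather than
    # calling y.compute(), which would gather everything into one process.
    y.to_zarr("predictions.zarr")
```

One common cause of this symptom is that the input is first loaded fully into NumPy before being wrapped in Dask, or that `.compute()`/`.persist()` is called on the predictions, so the entire result materializes in a single worker's memory at once; whether that applies here depends on what the notebook does.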

