Description
What happened + What you expected to happen
When I use Pool from ray.util.multiprocessing, it seems that plasma memory is not cleaned up after the pool is closed. My code is structured like this:
from ray.util.multiprocessing import Pool

def calculate_data(i):
    # returns a big numpy array containing data as tensors
    ...

def parallel_data_acquisition():
    my_pool = Pool(2)
    new_data = my_pool.map(calculate_data, [1, 2, 3, 4])
    my_pool.close()
    my_pool.join()
    return new_data
Here I have used the list [1, 2, 3, 4] only because I need something to pass to Pool.map().
for i in range(5):
    my_data = []
    my_data += parallel_data_acquisition()
    process(my_data)  # some random function to process this data
After running this, plasma memory usage seems to increase gradually with every new iteration (I observed this using the ray memory command). Is there any way to fix this behavior? After I gather the data and append it to my_data, I want to release all the plasma memory that was used. Also, my_data is reset to an empty list on every iteration of the for loop, so there shouldn't be any references left hanging.
Does anyone have solution, or advice?
EDIT: I have also noticed that plasma memory is released when the pool is closed, but when a pool is opened again, usage increases drastically, as if the previous pool had never been closed.
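Since the jump appears when a new pool is opened, one variation worth trying is to create the pool once and reuse it across iterations, dropping each batch of results explicitly before the next one. The sketch below is only an assumption about what might help, not a confirmed fix. To keep it runnable without a Ray cluster, it uses the standard library's ThreadPool as a stand-in, and the bodies of calculate_data and process, as well as the run helper and its pool_cls parameter, are hypothetical placeholders.

```python
import gc
from multiprocessing.pool import ThreadPool  # stand-in so the sketch runs without Ray


def calculate_data(i):
    # Hypothetical placeholder: the real function returns a big numpy array.
    return [float(i)] * 1000


def process(data):
    # Hypothetical placeholder for the real processing step.
    return sum(len(chunk) for chunk in data)


def run(pool_cls=ThreadPool, iterations=5):
    """Reuse one pool for all iterations instead of opening a new one each time."""
    summaries = []
    pool = pool_cls(2)  # created once, outside the loop
    try:
        for _ in range(iterations):
            my_data = pool.map(calculate_data, [1, 2, 3, 4])
            summaries.append(process(my_data))
            # Drop the only reference to this batch before the next iteration,
            # then ask Python to collect anything kept alive by reference cycles.
            del my_data
            gc.collect()
    finally:
        # Same shutdown sequence as in the original snippet.
        pool.close()
        pool.join()
    return summaries
```

With Ray, the idea would be to call run(pool_cls=Pool) with Pool imported from ray.util.multiprocessing and then compare the output of ray memory between iterations. Whether this actually keeps plasma usage flat has not been verified here.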
Versions / Dependencies
I have installed ray[default].
Reproduction script
.
Issue Severity
High: It blocks me from completing my task.