Memory leakage using ray.utils.multiprocessing.Pool #39733

@VuKe77

Description

What happened + What you expected to happen

When I use Pool from ray.util.multiprocessing, it seems that plasma memory is not cleaned up after the pool is closed. My code structure is like this:
    from ray.util.multiprocessing import Pool


    def calculate_data(i):
        # Returns a big NumPy array containing data as tensors.
        ...


    def parallel_data_acquisition():
        my_pool = Pool(2)
        new_data = my_pool.map(calculate_data, [1, 2, 3, 4])
        my_pool.close()
        my_pool.join()
        return new_data

Here I use the list [1, 2, 3, 4] only because Pool.map() needs something to iterate over.

    for i in range(5):
        my_data = []
        my_data += parallel_data_acquisition()
        process(my_data)  # some function that processes this data
After running this, plasma memory usage seems to increase gradually with every new iteration (I observed this using the ray memory command). Is there any way to fix this behavior? After I gather the data and append it to my_data, I want to release all the plasma memory that was used. Also, my_data is empty at the start of every iteration of the for loop, so there shouldn't be any references left hanging.
Does anyone have a solution or advice?
EDIT: I have also noticed that plasma memory is released when the pool is closed, but when a pool is opened again, usage increases drastically, as if the previous pool had never been closed.
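
One variation that might be worth trying, given that usage jumps each time a pool is reopened, is to create the pool once, reuse it for every iteration, and explicitly drop the results before the next round. The following is only a minimal sketch of that experiment, not a confirmed fix. The bodies of `calculate_data` and `process` are hypothetical stand-ins, `run` is a name introduced here for illustration, and the fallback to a standard-library thread pool is an assumption added so the sketch also runs where Ray is not installed.

```python
import gc

try:
    from ray.util.multiprocessing import Pool  # same import as in the report
except ImportError:
    # Assumption: fall back to a stdlib pool so the sketch runs without Ray.
    from multiprocessing.pool import ThreadPool as Pool


def calculate_data(i):
    # Hypothetical stand-in; the real function returns a large NumPy array.
    return [float(i)] * 1000


def process(batch):
    # Hypothetical stand-in for the processing step; returns a small summary.
    return sum(len(item) for item in batch)


def run(iterations=5):
    summaries = []
    pool = Pool(2)  # created once instead of once per iteration
    try:
        for _ in range(iterations):
            my_data = pool.map(calculate_data, [1, 2, 3, 4])
            summaries.append(process(my_data))
            # Drop the only reference to the results before the next round.
            del my_data
            gc.collect()
    finally:
        pool.close()
        pool.join()
    return summaries
```

Checking the output of ray memory between iterations, as in the report, would show whether this pattern changes the plasma usage at all; the sketch itself makes no claim about that.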

Versions / Dependencies

I have installed ray[default].

Reproduction script

.

Issue Severity

High: It blocks me from completing my task.
