[Core] Custom temp-dir does not work correctly in Ray 2.3, affecting log retrieval for ray job submit, etc. #33661

@Basasuya

Description

What happened + What you expected to happen

With Ray 2.3, a custom temp-dir does not work correctly in ray job submit mode.
The ray job submit and ray job logs commands do not return the job's logs to stdout.

My analysis:

  • The call chain is dashboard/modules/job/job_agent.py:get_job_logs -> dashboard/modules/job/job_manager.py:get_log_file_path -> python/ray/_private/node.py:get_logs_dir_path

  • _log_dir is updated by python/ray/_private/node.py:_init_temp.
    I suspect commit e1a8796#diff-e486e78dca43662c19b202692ed44f45c7d49405a56b9ef58c56d85ab1ae2cc8 causes this problem: self._ray_params.temp_dir is no longer None, so internal_kv_get_with_retry is never triggered to query the correct temp-dir.

This problem also causes logs in Ray Client mode to be written to /tmp/ray instead of the specified temp-dir.
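
The suspected mechanism can be illustrated with a small, self-contained model. This is only a sketch of the logic described above, not Ray's actual code: resolve_temp_dir, head_kv_lookup, and DEFAULT_TEMP_DIR are hypothetical names, and the callback merely stands in for the internal_kv_get_with_retry query.

```python
from typing import Callable, Optional

DEFAULT_TEMP_DIR = "/tmp/ray"  # the default location mentioned in this report


def resolve_temp_dir(
    configured: Optional[str],
    is_head: bool,
    fetch_from_head: Callable[[], str],
) -> str:
    """Simplified, hypothetical model of temp-dir resolution (not Ray's code)."""
    if is_head:
        # The head node owns the setting: use --temp-dir or fall back to the default.
        return configured or DEFAULT_TEMP_DIR
    if configured is None:
        # Only this branch asks the head node for the real temp-dir.
        return fetch_from_head()
    # A non-None value is trusted as-is, even if it is just a pre-filled default.
    return configured


def head_kv_lookup() -> str:
    # Stand-in for the key-value query; returns what the head was started with.
    return "/opt/ray"


# Expected behaviour: temp_dir is left unset, so the head is queried.
print(resolve_temp_dir(None, is_head=False, fetch_from_head=head_kv_lookup))

# Suspected regression: temp_dir is already populated with the default,
# so the lookup is skipped and the process looks under /tmp/ray.
print(resolve_temp_dir(DEFAULT_TEMP_DIR, is_head=False, fetch_from_head=head_kv_lookup))
```

In this model the first call resolves to /opt/ray and the second to /tmp/ray, which matches the symptom: a process that should read logs from the custom temp-dir ends up looking in the default one.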

Versions / Dependencies

pip install ray==2.3

Reproduction script

# on the physical node
ray start --head --block --disable-usage-stats --node-ip-address={NODE_IP} --dashboard-host='' --temp-dir=/opt/ray
# on your laptop
export RAY_ADDRESS=http://{NODE_IP}:8265
ray job submit --runtime-env-json='{"working_dir": "./"}' -- python3 counter.py

No job logs are printed while the job runs; output appears only at the end, with Job 'raysubmit_MqJfn5Ns8f2krGFQ' succeeded.

ray job logs raysubmit_MqJfn5Ns8f2krGFQ behaves the same way.
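
To check on the head node which directory the driver log actually landed in, a small helper like the following may be useful. It is a sketch under assumptions: find_job_driver_logs is a hypothetical helper, and the layout <temp_dir>/session_*/logs/job-driver-<submission_id>.log is assumed rather than taken from this report, so adjust the pattern if your installation differs.

```python
from pathlib import Path
from typing import List


def find_job_driver_logs(submission_id: str, temp_dirs: List[str]) -> List[Path]:
    """Return existing driver-log files for a job under each candidate temp dir."""
    found: List[Path] = []
    for temp_dir in temp_dirs:
        # Assumed layout: <temp_dir>/session_*/logs/job-driver-<submission_id>.log
        pattern = f"session_*/logs/job-driver-{submission_id}.log"
        # glob() on a missing directory simply yields nothing.
        found.extend(sorted(Path(temp_dir).glob(pattern)))
    return found


if __name__ == "__main__":
    # Compare the custom temp-dir from the reproduction with the default one.
    candidates = ["/opt/ray", "/tmp/ray"]
    for path in find_job_driver_logs("raysubmit_MqJfn5Ns8f2krGFQ", candidates):
        print(path)
```

If the file shows up under one directory while the dashboard reads from the other, that would be consistent with the analysis above. The same file may be listed twice when a session symlink points at the real session directory.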

Issue Severity

Medium: It is a significant difficulty but I can work around it.

Metadata

Labels

P2 (Important issue, but not time-critical), bug (Something that is supposed to be working; but isn't), core (Issues that should be addressed in Ray Core), core-api, core-gcs (Ray core global control service), stability
