Process on GPU killed after long run and/or restart #348

@Qcellaris

Dear developers,

I was using PySPH on my old GPU (GeForce GTX 680) for a while and I recently started using it on a newer GPU (NVIDIA TITAN V) as well. Unfortunately, there are some strange issues showing up, so hopefully someone can help me out here:

  1. When I run a simulation for a long time, e.g. 8 hours or so, it suddenly gets killed. The resulting error log can be found in the attachment.
  2. When I try to restart the simulation from any of the previous restart files, it gets killed again right away with the same error messages.
  3. The issue also shows up when I want to restart a simulation that ran for only a short time, say 30 minutes, and finished correctly.

This problem doesn't appear on my GTX 680, but on that machine I am using an older version of PyOpenCL. I tried using the same older version of PyOpenCL on the TITAN V, but it didn't solve the issue.

Best,

Stephan

gpu_err.log
