R-GAT Offline Test failed with file not found in the dataset directory #2329

@arjunsuresh

Description

Discussed in #2041

Originally posted by dineshchitlangia January 16, 2025
I set up R-GAT following the README and downloaded the full dataset.
Verified the dataset directory is 2.2TB as expected.

What am I missing?

(gnn) $:/mlperf/inference/graph/R-GAT$ python3 main.py --dataset igbh-dgl --dataset-path igbh/ --profile rgat-dgl-full --model-path $MODEL_PATH --device cpu --dtype fp32 --scenario Offline

(gnn) $:/mlperf/inference/graph/R-GAT$ INFO:main:Namespace(dataset='igbh-dgl', dataset_path='igbh/', in_memory=False, layout='COO', profile='rgat-dgl-full', scenario='Offline', max_batchsize=1, threads=1, accuracy=False, find_peak_performance=False, backend='dgl', model_name='rgat', output='output', qps=None, model_path='/mlperf/inference/graph/R-GAT/model/', dtype='fp32', device='cpu', user_conf='user.conf', audit_conf='audit.config', time=None, count=None, debug=False, performance_sample_count=5000, max_latency=None, samples_per_query=8)
/mlperf/inference/graph/R-GAT/dgl_utilities/feature_fetching.py:231: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:206.)
  return edge, torch.from_numpy(
Traceback (most recent call last):
  File "/mlperf/inference/graph/R-GAT/main.py", line 510, in <module>
    main()
  File "/mlperf/inference/graph/R-GAT/main.py", line 363, in main
    ds = dataset_class(
  File "/mlperf/inference/graph/R-GAT/dgl_utilities/feature_fetching.py", line 131, in __init__
    self.igbh_dataset = IGBHeteroGraphStructure(
  File "/mlperf/inference/graph/R-GAT/dgl_utilities/feature_fetching.py", line 203, in __init__
    self.edge_dict = self.load_edge_dict()
  File "/mlperf/inference/graph/R-GAT/dgl_utilities/feature_fetching.py", line 237, in load_edge_dict
    loaded_edges = {
  File "/mlperf/inference/graph/R-GAT/dgl_utilities/feature_fetching.py", line 237, in <dictcomp>
    loaded_edges = {
  File "/home/amd/miniconda3/envs/gnn/lib/python3.10/concurrent/futures/_base.py", line 621, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "/home/amd/miniconda3/envs/gnn/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "/home/amd/miniconda3/envs/gnn/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/home/amd/miniconda3/envs/gnn/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/amd/miniconda3/envs/gnn/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/mlperf/inference/graph/R-GAT/dgl_utilities/feature_fetching.py", line 232, in load_edge
    np.load(osp.join(parent_path, edge, "edge_index.npy"), mmap_mode=mmap))
  File "/home/amd/miniconda3/envs/gnn/lib/python3.10/site-packages/numpy/lib/npyio.py", line 427, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'igbh/full/processed/paper__written_by__author/edge_index.npy'

Contents of the dataset directory:

(gnn)$:/mlperf/inference/graph/R-GAT$ ls -R igbh/full/processed/
igbh/full/processed/:
author  paper  paper__cites__paper  train_idx.pt  val_idx.pt

igbh/full/processed/author:
author_id_index_mapping.npy  node_feat.npy

igbh/full/processed/paper:
node_feat.npy  node_label_19.npy  node_label_2K.npy  paper_id_index_mapping.npy

igbh/full/processed/paper__cites__paper:
edge_index.npy

On investigating the stack trace further, the loader in dgl_utilities/feature_fetching.py expects these edge types:

        edges = [
            "paper__cites__paper",
            "paper__written_by__author",
            "author__affiliated_to__institute",
            "paper__topic__fos"]

But it seems the dataset does not contain any edges other than "paper__cites__paper".
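
To confirm which files are missing before launching the benchmark, a quick check like the one below may help. It is only a sketch: the edge names are copied from the list above and the `<dataset-path>/full/processed/<edge>/edge_index.npy` layout is taken from the error message, while the helper name `missing_edge_files` and its parameters are hypothetical and not part of the R-GAT code.

```python
import os.path as osp

# Edge types copied from the list quoted above.
EXPECTED_EDGES = [
    "paper__cites__paper",
    "paper__written_by__author",
    "author__affiliated_to__institute",
    "paper__topic__fos",
]


def missing_edge_files(dataset_path, dataset_size="full"):
    """Return the expected edge_index.npy paths that do not exist on disk.

    Hypothetical helper: the directory layout is assumed from the
    FileNotFoundError path in the traceback, not from the loader source.
    """
    processed = osp.join(dataset_path, dataset_size, "processed")
    candidates = [
        osp.join(processed, edge, "edge_index.npy") for edge in EXPECTED_EDGES
    ]
    return [path for path in candidates if not osp.isfile(path)]


if __name__ == "__main__":
    # Same relative dataset path as passed via --dataset-path above.
    for path in missing_edge_files("igbh/"):
        print("missing:", path)
```

With the directory listing shown above, this would be expected to report the three edge types other than paper__cites__paper.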
