Simple Example with Planetoid class Fails at Data Fetch #22

Description

@ajpowelsnl

Example

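A simple example that loads the Cora dataset with the Planetoid class fails: the raw files appear to download, but the processing step then raises an UnpicklingError.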

Versions

$ conda -V
conda 24.11.2

$ conda list pytorch
# packages in environment at /opt/anaconda3:
#
# Name                    Version                   Build  Channel
pytorch                   2.4.1                  py3.12_0    pytorch
pytorch-scatter           2.1.2           py312_torch_2.4.0_cpu    pyg
pytorch-sparse            0.6.18          py312_torch_2.4.0_cpu    pyg
pytorch_geometric         2.5.3              pyhd8ed1ab_0    conda-forge

Failing Command

dataset = Planetoid(root='data/Planetoid', name='Cora', transform=NormalizeFeatures())

Error

In [32]: dataset = Planetoid(root="data/Planetoid", name="Cora", split="public", transform=NormalizeFeatures())
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.x
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.tx
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.allx
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.y
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.ty
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.ally
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.graph
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.test.index
Processing...
---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
Cell In[32], line 1
----> 1 dataset = Planetoid(root="data/Planetoid", name="Cora", split="public", transform=NormalizeFeatures())

File /opt/anaconda3/lib/python3.12/site-packages/torch_geometric/datasets/planetoid.py:102, in Planetoid.__init__(self, root, name, split, num_train_per_class, num_val, num_test, transform, pre_transform, force_reload)
     99 self.split = split.lower()
    100 assert self.split in ['public', 'full', 'geom-gcn', 'random']
--> 102 super().__init__(root, transform, pre_transform,
    103                  force_reload=force_reload)
    104 self.load(self.processed_paths[0])
    106 if split == 'full':

File /opt/anaconda3/lib/python3.12/site-packages/torch_geometric/data/in_memory_dataset.py:81, in InMemoryDataset.__init__(self, root, transform, pre_transform, pre_filter, log, force_reload)
     72 def __init__(
     73     self,
     74     root: Optional[str] = None,
   (...)
     79     force_reload: bool = False,
     80 ) -> None:
---> 81     super().__init__(root, transform, pre_transform, pre_filter, log,
     82                      force_reload)
     84     self._data: Optional[BaseData] = None
     85     self.slices: Optional[Dict[str, Tensor]] = None

File /opt/anaconda3/lib/python3.12/site-packages/torch_geometric/data/dataset.py:115, in Dataset.__init__(self, root, transform, pre_transform, pre_filter, log, force_reload)
    112     self._download()
    114 if self.has_process:
--> 115     self._process()

File /opt/anaconda3/lib/python3.12/site-packages/torch_geometric/data/dataset.py:260, in Dataset._process(self)
    257     print('Processing...', file=sys.stderr)
    259 fs.makedirs(self.processed_dir, exist_ok=True)
--> 260 self.process()
    262 path = osp.join(self.processed_dir, 'pre_transform.pt')
    263 fs.torch_save(_repr(self.pre_transform), path)

File /opt/anaconda3/lib/python3.12/site-packages/torch_geometric/datasets/planetoid.py:161, in Planetoid.process(self)
    160 def process(self) -> None:
--> 161     data = read_planetoid_data(self.raw_dir, self.name)
    163     if self.split == 'geom-gcn':
    164         train_masks, val_masks, test_masks = [], [], []

File /opt/anaconda3/lib/python3.12/site-packages/torch_geometric/io/planetoid.py:27, in read_planetoid_data(folder, prefix)
     25 def read_planetoid_data(folder: str, prefix: str) -> Data:
     26     names = ['x', 'tx', 'allx', 'y', 'ty', 'ally', 'graph', 'test.index']
---> 27     items = [read_file(folder, prefix, name) for name in names]
     28     x, tx, allx, y, ty, ally, graph, test_index = items
     29     train_index = torch.arange(y.size(0), dtype=torch.long)

File /opt/anaconda3/lib/python3.12/site-packages/torch_geometric/io/planetoid.py:107, in read_file(folder, prefix, name)
    105 with fsspec.open(path, 'rb') as f:
    106     warnings.filterwarnings('ignore', '.*`scipy.sparse.csr` name.*')
--> 107     out = pickle.load(f, encoding='latin1')
    109 if name == 'graph':
    110     return out

UnpicklingError: invalid load key, '<'.
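
The load key '<' means the first byte that pickle read from the file was '<'. One possible explanation, not confirmed here, is that an HTML page (for example an error or proxy page) was saved in place of the pickled data. Below is a minimal sketch for checking this. The helper name sniff_raw_files is hypothetical, and the path data/Planetoid/Cora/raw is an assumption about where the raw files end up for root='data/Planetoid' and name='Cora'.

```python
from pathlib import Path


def sniff_raw_files(raw_dir):
    """Return {filename: first bytes} for raw files whose content starts with '<'.

    Hypothetical helper, not part of torch_geometric. A pickled file should
    not begin with '<', so any hit here is worth opening in a text editor.
    """
    root = Path(raw_dir)
    if not root.is_dir():
        return {}

    suspicious = {}
    for path in sorted(root.glob("ind.*")):
        if not path.is_file():
            continue
        with open(path, "rb") as f:
            head = f.read(32)  # enough to recognise e.g. b"<!DOCTYPE html>"
        if head.startswith(b"<"):
            suspicious[path.name] = head
    return suspicious


# Assumed location of the downloaded files; adjust to the actual root/name.
for name, head in sniff_raw_files("data/Planetoid/Cora/raw").items():
    print(f"{name}: starts with {head!r}")
```

If any file is reported, deleting the raw directory and retrying the download (possibly from a different network) may help, though that is a guess rather than a confirmed fix.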

Many thanks for any guidance you can offer.
