Seems unable to utilize multiple GPUs

Hi there. 

I tried running this code on one of my machines with four RTX 3090 GPUs (24 GB of GPU memory each):
```
python -m torch.distributed.launch --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3]
```
I did not change any other part of this repo. However, I hit a CUDA error saying that I need more GPU memory. I then changed the command as follows:
```
python script/run.py -c config/inductive/wn18rr.yaml --gpus [0]
```
and ran it on a machine with one A100 GPU with 40 GB of GPU memory. The code runs successfully and uses roughly 32 GB of GPU memory. I am really puzzled by this: why does the code not make use of the total 4 × 24 GB = 96 GB of GPU memory in the multi-GPU run, and still report a memory issue? Is there something wrong with my setup?
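
One thing I may be missing: if the multi-GPU launch is data-parallel, each process keeps its own full copy of the model and its batch on its own GPU, so memory is not pooled across devices. Below is a rough sketch of that arithmetic in plain Python. It assumes (I have not verified this) that the per-process footprint in the 4-GPU run is about the same 32 GB I measured on the A100; the function names are hypothetical and not part of this repo.

```python
def fits_per_gpu(per_process_gb, gpu_capacity_gb, num_gpus):
    """Data-parallel view: every GPU must hold the full per-process footprint on its own."""
    return [per_process_gb <= gpu_capacity_gb for _ in range(num_gpus)]


def fits_if_pooled(per_process_gb, gpu_capacity_gb, num_gpus):
    """Naive view: treat all GPUs as one shared pool of memory."""
    return per_process_gb <= gpu_capacity_gb * num_gpus


# Numbers from this issue; 32 GB is the usage observed on the single A100 (assumed per-process cost).
observed_gb = 32

print(fits_if_pooled(observed_gb, 24, 4))  # pooled view of 4 x RTX 3090
print(fits_per_gpu(observed_gb, 24, 4))    # per-device view of 4 x RTX 3090
print(fits_per_gpu(observed_gb, 40, 1))    # single A100
```

If that assumption holds, the pooled check passes but every per-device check on the RTX 3090s fails, which would match the error I see. In that case I would presumably need a smaller per-GPU batch size rather than more GPUs.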
