[BUG] Does ibverbs transport support transferring large volumes of data? #474

@irexyc

It seems that the ibverbs transport doesn't support transferring large volumes of data.

The max_msg_sz of my device is 1 GB (0x40000000). When I transfer more than 1 GB of data (for example, 1 GB + 4 bytes), the error below occurs.

$ ibv_devinfo -v | grep max_msg_sz
			max_msg_sz:		0x40000000
			max_msg_sz:		0x40000000
			max_msg_sz:		0x40000000
			max_msg_sz:		0x40000000
			max_msg_sz:		0x40000000
			max_msg_sz:		0x40000000
			max_msg_sz:		0x40000000
			max_msg_sz:		0x40000000

command

./gloo/benchmark/benchmark -s 2 -r 0 -h 110.110.8.158 -p 6379 -x 123 -t ibverbs --no-verify broadcast --messages 1 --elements 268435457 
./gloo/benchmark/benchmark -s 2 -r 1 -h 110.110.8.158 -p 6379 -x 123 -t ibverbs --no-verify broadcast --messages 1 --elements 268435457 
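For reference, the element count above maps to the "1GB + 4bytes" figure as follows. This is a minimal sketch that assumes each benchmark element is 4 bytes wide (inferred from the sizes quoted in this report, not checked against the benchmark source); the constant and function names are illustrative only.

```cpp
#include <cstdint>

// Device limit reported by `ibv_devinfo -v` in this report (1 GiB).
constexpr std::uint64_t kMaxMsgSz = 0x40000000ULL;

// Assumption: each benchmark element occupies 4 bytes.
constexpr std::uint64_t kAssumedElementSize = 4;

// Value passed to --elements in the commands above.
constexpr std::uint64_t kElements = 268435457ULL;

// Total payload size in bytes for a given element count and element width.
constexpr std::uint64_t payloadBytes(std::uint64_t elements,
                                     std::uint64_t elementSize) {
  return elements * elementSize;
}

// 268435457 * 4 = 1073741828 bytes, which is max_msg_sz + 4.
static_assert(payloadBytes(kElements, kAssumedElementSize) == kMaxMsgSz + 4,
              "payload exceeds max_msg_sz by exactly 4 bytes");

// One element fewer lands exactly on the device limit.
static_assert(payloadBytes(kElements - 1, kAssumedElementSize) == kMaxMsgSz,
              "268435456 elements fit in a single max_msg_sz message");
```

Under that assumption, `--elements 268435456` would be the largest broadcast that still fits in one message of max_msg_sz bytes.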

error

[/nvme1/chenxin/ws/test/gloo/gloo/transport/ibverbs/pair.cc:587] ERROR LID: 0 QPN: 5728 PSN: 3360883->LID: 0 QPN: 5727 PSN: 3360883: Exception in handleCompletion: [enforce fail at /nvme1/chenxin/ws/test/gloo/gloo/transport/ibverbs/pair.cc:681] wc->status == IBV_WC_SUCCESS. 1 vs 0. Memory region recv for slot 0: local length error
[/nvme1/chenxin/ws/test/gloo/gloo/transport/ibverbs/device.cc:230] ERROR Exception while handling completion event: [enforce fail at /nvme1/chenxin/ws/test/gloo/gloo/transport/ibverbs/pair.cc:681] wc->status == IBV_WC_SUCCESS. 1 vs 0. Memory region recv for slot 0: local length error
terminate called after throwing an instance of 'gloo::EnforceNotMet'
  what():  [enforce fail at /nvme1/chenxin/ws/test/gloo/gloo/transport/ibverbs/pair.cc:681] wc->status == IBV_WC_SUCCESS. 1 vs 0. Memory region recv for slot 0: local length error
Aborted (core dumped)
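The "1 vs 0" in the enforce message matches `IBV_WC_LOC_LEN_ERR` (value 1) being compared against `IBV_WC_SUCCESS` (value 0). One possible way to stay under the limit would be to split a large buffer into pieces no larger than max_msg_sz. The sketch below only computes such a split. It is not how gloo implements or intends to handle this, and `Chunk` and `splitIntoChunks` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical description of one piece of a larger transfer.
struct Chunk {
  std::uint64_t offset;  // byte offset into the original buffer
  std::uint64_t length;  // number of bytes in this piece
};

// Split totalBytes into consecutive chunks of at most maxMsgSz bytes each.
// Returns an empty vector when maxMsgSz is 0 or totalBytes is 0.
std::vector<Chunk> splitIntoChunks(std::uint64_t totalBytes,
                                   std::uint64_t maxMsgSz) {
  std::vector<Chunk> chunks;
  if (maxMsgSz == 0) {
    return chunks;
  }
  std::uint64_t offset = 0;
  while (offset < totalBytes) {
    const std::uint64_t length = std::min(maxMsgSz, totalBytes - offset);
    chunks.push_back({offset, length});
    offset += length;
  }
  return chunks;
}
```

For the failing case in this report, `splitIntoChunks(1073741828, 0x40000000)` yields two chunks: one of 0x40000000 bytes at offset 0 and one of 4 bytes at offset 0x40000000.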
