Multi-gpu training by qinjian623 · Pull Request #12 · sel118/LaneAF

qinjian623 · 2021-05-07T07:26:30Z

Adds a new file, train_culane_mp.py, which is a single-node, multi-GPU training script.

The original train_culane.py was deleted while developing the new script; maybe we should restore it.

The default batch_size is still 2, which is probably too small for multi-GPU training. A batch size of 64 works on a node with four V100s.

A new variable, metric_skips=10, is introduced. It can lower how often the F1 metric is computed, since that computation uses the CPU only.
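The effect of metric_skips can be illustrated with a minimal sketch. It assumes that metric_skips means "compute the metric once every metric_skips iterations"; the PR text does not spell out the exact semantics, and the helper names `should_compute_metric`, `run_epoch`, and `compute_f1` are hypothetical, not taken from train_culane_mp.py.

```python
def should_compute_metric(step, metric_skips=10):
    """Return True on every `metric_skips`-th step (steps counted from 0).

    Assumption: a value of 1 or less means "compute on every step".
    """
    if metric_skips <= 1:
        return True
    return step % metric_skips == 0


def run_epoch(num_steps, metric_skips=10, compute_f1=lambda step: 0.0):
    """Placeholder loop; `compute_f1` stands in for the CPU-only F1 evaluation."""
    scores = []
    for step in range(num_steps):
        # ... the forward/backward pass on the GPU would go here ...
        if should_compute_metric(step, metric_skips):
            # Only pay the CPU cost on the selected steps.
            scores.append(compute_f1(step))
    return scores
```

With metric_skips=10, a 25-step epoch would evaluate the metric on steps 0, 10, and 20 only, so the GPUs spend less time waiting on the CPU.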

Start training with:

CUDA_VISIBLE_DEVICES=4,5,6,7 python train_culane_mp.py ...

Set CUDA_VISIBLE_DEVICES to control which GPUs, and how many, take part in training.
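As a rough illustration of how the device list relates to the batch size, the sketch below parses CUDA_VISIBLE_DEVICES and splits a total batch evenly across the visible GPUs. This is an assumption-laden example: the PR does not say whether batch_size in train_culane_mp.py is the total or the per-GPU value, the parsing is a simplification of what the CUDA runtime does, and both helper names are hypothetical.

```python
import os


def visible_gpu_ids(env=None):
    """Return the device IDs listed in CUDA_VISIBLE_DEVICES, or None if unset.

    None means the variable is absent, in which case every GPU is visible and
    the count cannot be derived from the environment alone.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def per_gpu_batch(total_batch, num_gpus):
    """Split a total batch evenly across GPUs (assumed policy, not the PR's)."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if total_batch % num_gpus != 0:
        raise ValueError("total_batch must be divisible by num_gpus")
    return total_batch // num_gpus
```

Under these assumptions, `CUDA_VISIBLE_DEVICES=4,5,6,7` exposes four devices, and a total batch of 64 would give each GPU 16 samples per step.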

qinjian623 · 2021-05-07T07:36:10Z

Sorry, I forgot to mention this:
In the file ./datasets/culane.py, lines 115 to 124,
the differences between l[1:] and l[:] come from an unofficial version of the CULane dataset, in which we fixed some of the links in the list file.

So some minor changes may be needed to use the official CULane dataset.

if self.image_set == 'test':
    # l[1:] drops the leading '/' so that os.path.join keeps data_dir_path
    self.img_list.append(os.path.join(self.data_dir_path, l[1:]))
else:
    self.img_list.append(os.path.join(self.data_dir_path, l))

if self.image_set == 'test':
    self.seg_list.append(os.path.join(self.data_dir_path, 'laneseg_label_w16_test', l[1:-3] + 'png'))
else:
    self.seg_list.append(os.path.join(self.data_dir_path, 'laneseg_label_w16', l[:-3] + 'png'))
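One possible way to make the path handling independent of the list-file variant is to strip any leading '/' before joining, so the same code accepts entries with or without it. This is only a sketch: `build_paths` is a hypothetical helper, not part of datasets/culane.py, and it assumes each list entry is a single relative image path.

```python
import os


def build_paths(data_dir, entry, image_set):
    """Return (image_path, seg_label_path) for one list-file entry.

    Leading slashes are removed because os.path.join discards every earlier
    component when a later one is an absolute path.
    """
    rel = entry.strip().lstrip("/")
    # Label directory names are taken from the snippet in this comment.
    if image_set == "test":
        label_dir = "laneseg_label_w16_test"
    else:
        label_dir = "laneseg_label_w16"
    stem, _ext = os.path.splitext(rel)  # replaces the l[:-3] + 'png' slicing
    img_path = os.path.join(data_dir, rel)
    seg_path = os.path.join(data_dir, label_dir, stem + ".png")
    return img_path, seg_path
```

Because the slash is stripped for every image set, the l[1:] versus l[:] distinction no longer depends on which list file is in use.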

qinjian623 added 5 commits April 28, 2021 10:43

Branch switch.

760390a

Multi-gpu training DONE.

1d110e9

.

0e20ca5

Remove old training script.

dad5742

Better another file.

43fdee1

qinjian623 mentioned this pull request May 7, 2021

Training time of CULane is too long #8

Open

Fix output_dir is None error.

fef2f8e

Multi-gpu training #12

qinjian623 wants to merge 6 commits into sel118:main from qinjian623:feature_multi_gpu_training
