Training Speed on Multi-GPU

Want to understand the training process better? Got tips for which model to use and when? This is the place for you.


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for discussing tips and understanding the process involved with Training a Faceswap model.

If you have found a bug or are having issues with the Training process not working, then you should post in the Training Support forum.

Please mark any answers that fixed your problems so others can find the solutions.

dheinz70
Posts: 43
Joined: Sat Aug 15, 2020 2:43 am
Has thanked: 4 times

Training Speed on Multi-GPU

Post by dheinz70 »

I also noticed this the other day.

I ran Distributed with a batch size of 14, and then only GPU 1 with a batch size of 7.

Shouldn't the distributed batch of 14 give roughly 2x the EG/s of the single GPU with a batch of 7?

Attachment: Screenshot from 2020-10-12 17-39-26.png
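
For reference, a rough back-of-envelope sketch of the expectation in the question, assuming EG/s means examples (faces) processed per second; the per-iteration time below is a made-up figure, not taken from the screenshot:

    # Hypothetical numbers to illustrate the expected ~2x scaling.
    per_gpu_batch = 7
    step_time_s = 0.5                              # assumed seconds per iteration

    single_gpu_egs = per_gpu_batch / step_time_s            # 7 / 0.5 = 14 EG/s
    two_gpu_egs = (2 * per_gpu_batch) / step_time_s         # 14 / 0.5 = 28 EG/s
    print(single_gpu_egs, two_gpu_egs)                      # ~2x, before sync overhead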
torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 622 times

Re: Training Speed on Multi-GPU

Post by torzdf »

You get a speed-up by increasing your batch size.

The same batch size on a single GPU or on multiple GPUs is likely to run at about the same speed, or maybe slightly slower.

EDIT

I misread your message. I don't have a multi-GPU setup, so I can't compare, and I don't know where the "sweet spot" is. Hopefully someone who does have a multi-GPU setup can offer some insight.
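
For anyone comparing, here is a minimal sketch of how TensorFlow's MirroredStrategy, which as far as I know is what the Distributed option uses under the hood, splits one global batch across the visible GPUs. The model and numbers are placeholders, not Faceswap's actual trainer:

    import tensorflow as tf

    strategy = tf.distribute.MirroredStrategy()   # one replica per visible GPU
    print("Replicas in sync:", strategy.num_replicas_in_sync)

    GLOBAL_BATCH = 14   # the batch size set in the GUI; split across replicas each step
    # With 2 GPUs, each replica sees 7 examples per step, so the per-GPU work
    # matches a single GPU running at batch 7.

    with strategy.scope():
        model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
        model.compile(optimizer="adam", loss="mse")

    # Dummy data purely to make the sketch self-contained.
    xs = tf.random.normal((1400, 8))
    ys = tf.random.normal((1400, 1))
    model.fit(xs, ys, batch_size=GLOBAL_BATCH, epochs=1)

This is why the same batch size on one or two GPUs runs at roughly the same speed: the per-step time barely changes, but each step still processes the same global batch. The throughput gain comes from raising the global batch size once the extra GPU memory allows it.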

My word is final
