a problem with multi gpu training_Very low EGs/sec


Post by a1-kh »

Hi there, I am facing a strange problem: the speed (EGs/sec) on my multiple GPUs is worse than on a single GPU.
I know, of course, that you need to raise the batch size to benefit from multiple GPUs, but it is performing strangely slowly.
Here are the best values I got from my tests:

1 GPU:
batch = 8
EGs/sec = 10.3

2 GPUs:
batch = 16
EGs/sec = 11.3

2 GPUs:
batch = 21
EGs/sec = 12.7

Specs: Win10 64-bit, 2x 980 Ti, Ryzen 1700X.
The test: Dfl-H128
I do believe there is something wrong here.
Of course no one can expect a linear gain in speed with multiple GPUs, but shouldn't it be at least 70-80% faster than 1 GPU?

I hope someone can help me figure out what the problem is.

PS:
There is also a problem in this version when trying to convert DeepFaceLab image data sets into an alignments file. I tried an older version with a "Reformat" option in the "Alignments" tab and it worked, but this version with the "dfl" option isn't working properly.


Post by bryanlyon »

The only thing that matters is the EGs/sec, since a single batch will actually be slower on multiple GPUs. There is some loss from syncing the GPUs, and multi-GPU training works best with larger batch sizes (BS). You also used an odd BS on an even number of GPUs. That is not ideal: if the GPUs match (your 2x 980 Tis do), one GPU will be waiting on nothing while the other works on that last EG.
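To illustrate the odd-BS point, here is a minimal Python sketch. It assumes the batch is divided as evenly as possible between identical GPUs; `split_batch` is a hypothetical helper written for this post, not a Faceswap function, and the exact split depends on the framework.

```python
from typing import List


def split_batch(batch_size: int, n_gpus: int) -> List[int]:
    """Divide a batch as evenly as possible across identical GPUs.

    Assumption for illustration only: the real split is framework-dependent.
    """
    base, extra = divmod(batch_size, n_gpus)
    # The first `extra` GPUs each take one additional example.
    return [base + 1 if i < extra else base for i in range(n_gpus)]


for bs in (16, 21):
    print(f"BS {bs}: per-GPU shares {split_batch(bs, 2)}")
```

With a BS of 21 the shares come out as 11 and 10, so one card sits idle while the other finishes its extra example. A BS of 16 or 22 splits cleanly.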

The basic way to find the ideal BS is to follow the steps in viewtopic.php?f=6&t=124 for one GPU, then double the result when you enable the 2 GPUs. The larger the BS, the better the multi-GPU scaling will be. A small BS like 8 per GPU is going to give a rather small improvement.
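For a sense of scale, the figures from the first post can be turned into a speedup and a scaling efficiency. This is only arithmetic on the posted numbers; `speedup` is a hypothetical helper, and "efficiency" here simply means speedup divided by the number of GPUs.

```python
SINGLE_GPU_EGS = 10.3          # 1 GPU, BS 8 (from the first post)
DUAL_GPU_RUNS = {16: 11.3,     # BS -> EGs/sec on 2 GPUs (from the first post)
                 21: 12.7}
N_GPUS = 2


def speedup(multi_egs: float, single_egs: float) -> float:
    """Throughput ratio of the multi-GPU run to the single-GPU run."""
    return multi_egs / single_egs


for bs, egs in DUAL_GPU_RUNS.items():
    s = speedup(egs, SINGLE_GPU_EGS)
    print(f"BS {bs}: {s:.2f}x speedup, {s / N_GPUS:.0%} scaling efficiency")
```

That works out to roughly 1.10x at BS 16 and 1.23x at BS 21, well short of the 1.7-1.8x hoped for, which is consistent with the BS per GPU being too small for the sync overhead to pay off.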

Also, depending on available PCI-E bandwidth or CPU speed, your bottleneck may actually lie elsewhere. Your 1700X should be fine, so I don't believe the CPU is your bottleneck, but PCI-E I/O may still be a factor in your setup. Finally, if you have SLI enabled on your two cards, I'd actually recommend DISABLING it, as SLI slows down multi-GPU CUDA operations due to driver and firmware overhead with the SLI tech (this problem was resolved in the 20xx series cards, which use SLI over NVLink and work differently).


Post by a1-kh »

Thanks for the tips, bryanlyon, I will try them.

For the time being, though, is there a way to choose a specific GPU to train on?
I do believe this is possible. I tried digging through the scripts but couldn't find the option. Could you give me a little help here, please?

Best regards


Post by torzdf »

At the moment, not within Faceswap itself (but this is a feature we do want to add).

You can force which GPU is used by setting the environment variable CUDA_VISIBLE_DEVICES:

https://stackoverflow.com/questions/497 ... indows-cmd
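As a rough sketch of the same idea from Python: the variable has to be set before CUDA is initialised, so in practice before the deep learning framework is imported. `restrict_to_gpu` is a hypothetical helper for illustration, not part of Faceswap, and the index 0 is just an example.

```python
import os


def restrict_to_gpu(index: int) -> None:
    """Expose only the GPU with the given index to CUDA in this process.

    Must run before any CUDA-using library initialises the GPU,
    otherwise the setting has no effect on that library.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


restrict_to_gpu(0)  # example index; use 1 for the second card
print("CUDA_VISIBLE_DEVICES =", os.environ["CUDA_VISIBLE_DEVICES"])
# Import the framework only after this point.
```

From a Windows command prompt the equivalent is to run `set CUDA_VISIBLE_DEVICES=0` in the same window before launching training. The selected card then appears to CUDA as device 0, whatever its physical index.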
