Multi GPU
Posted: Sun Sep 15, 2019 4:27 pm
by orkblutt
Hello all,
I'm trying to use faceswap on a 6-GPU rig (6 x NVIDIA GTX 1070, 8 GB each) with a 9th-gen Intel i7 CPU and 24 GB of RAM.
The results are really strange. Here are the average speeds as I change the number of GPUs, training with a batch size of 100:
1 GPU: 2.5 iterations/s
2 GPUs: 1.2 iterations/s
3 GPUs: 0.4 iterations/s
4 GPUs: process hangs
5 GPUs: same (hangs)
6 GPUs: not even tested ...
I wasn't expecting these kinds of results. Any clue about what's happening here?
Regards,
Orkblutt
Re: Multi GPU
Posted: Sun Sep 15, 2019 11:06 pm
by bryanlyon
Iterations/sec will be the same (or slower), but if you increase the batch size the egs/sec will be higher.
Essentially, multiply your batch size by the number of GPUs used.
There is some loss from synchronizing the GPUs, and you'll also have losses due to less bandwidth being available per GPU, so you won't see linear growth.
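A quick way to see why raising the batch size matters: throughput in examples/sec is just iterations/sec times the global batch size. A minimal sketch using the numbers reported above (the helper is illustrative only, not part of faceswap):

```python
# Illustrative arithmetic only; this helper is not faceswap code.
# Throughput in examples/sec = (iterations/sec) * (global batch size).

def egs_per_sec(iterations_per_sec, batch_size):
    """Images trained per second."""
    return iterations_per_sec * batch_size

# Reported single-GPU run: batch size 100 at 2.5 it/s.
single_gpu = egs_per_sec(2.5, 100)   # 250 examples/sec

# With 2 GPUs, keep ~100 examples per GPU: global batch size 200.
# Even if iteration speed drops to 1.2 it/s, throughput nearly matches:
dual_gpu = egs_per_sec(1.2, 200)     # 240 examples/sec
```

Keeping the batch size at 100 while adding GPUs, as in the timings above, only adds synchronization overhead without feeding the extra cards.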
Re: Multi GPU
Posted: Mon Sep 16, 2019 1:42 pm
by orkblutt
Hi Bryan,
thank you for these explanations. I'll try this evening.
By the way, what does egs/s stand for?
All the best,
Orkblutt
Re: Multi GPU
Posted: Mon Sep 16, 2019 2:04 pm
by bryanlyon
Examples/sec.
Basically, how many images are trained per second.
Re: Multi GPU
Posted: Mon May 18, 2020 8:04 am
by koroep
So if I understood this correctly, multiple GPUs are only useful when you also increase the batch size, right?
In the training guide the following is stated:
Higher batch sizes will train faster, but will lead to higher generalization. Lower batch sizes will train slower, but will distinguish differences between faces better. Adjusting the batch size at various stages of training can help.
So basically you can achieve better results using lower batch sizes; it will just be slower? Has anyone experimented with different batch sizes and could perhaps offer some guidance on how long to train at each batch size to get optimal results? Of course this might be affected by a number of things, but any idea what a baseline might be?
Re: Multi GPU
Posted: Mon May 18, 2020 6:37 pm
by bryanlyon
Yes, experimentation has been done. We recommend using the largest batch size you can until you're no longer seeing improvements, then doing a fit train at a lower batch size. See other discussions here on how to do fit training.
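The schedule described here can be sketched generically: train at the largest batch size until loss stops improving, then drop the batch size for the fit-train phase. The plateau test below is an illustrative stand-in, not faceswap code:

```python
# Generic sketch of "largest batch size until improvement stalls,
# then fit-train at a lower batch size". The plateau heuristic and
# its thresholds are illustrative assumptions, not faceswap internals.

from collections import deque

def has_plateaued(losses, window=1000, tolerance=1e-4):
    """True once mean loss over the last `window` iterations has
    improved by less than `tolerance` versus the window before it."""
    if len(losses) < 2 * window:
        return False
    recent = list(losses)[-window:]
    earlier = list(losses)[-2 * window:-window]
    return (sum(earlier) - sum(recent)) / window < tolerance

# Usage idea: append each iteration's loss to a bounded deque while
# training at the large batch size; once has_plateaued() fires,
# restart training at a smaller batch size for the fit-train phase.
loss_history = deque(maxlen=2000)
```

The window and tolerance are placeholders; in practice people eyeball the loss graph or previews rather than automate the switch.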