Multi GPU

Talk about Hardware used for Deep Learning



Multi GPU

Post by orkblutt »

Hello all,

I'm trying to use faceswap on a 6-GPU rig (6 x NVIDIA GTX 1070, 8 GB each) with an i7 9th-gen CPU and 24 GB of RAM.
The results are really strange. Here are the average training speeds as I change the number of GPUs, with a batch size of 100:

1 GPU: 2.5 iterations/s
2 GPUs: 1.2 iterations/s
3 GPUs: 0.4 iterations/s
4 GPUs: process hangs
5 GPUs: same (process hangs)
6 GPUs: not even tested ...

I wasn't expecting those kinds of results. Any clue about what's happening here?

Regards,

Orkblutt


Re: Multi GPU

Post by bryanlyon »

Iterations per second will be the same (or slower), but you need to increase the batch size so that the EGs/sec will be higher.

Essentially, multiply your batch size by the number of GPUs used.

There is some loss from synchronizing the GPUs, and you'll also have losses due to less bandwidth available per GPU, so you won't see linear growth.
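Purely as an illustration of the idea (this is not faceswap's actual multi-GPU code, which may differ), a minimal data-parallel sketch in TensorFlow/Keras looks like this: the global batch size is the single-GPU batch size multiplied by the number of replicas, so each GPU keeps roughly the same per-step load while more examples are processed per iteration.

Code: Select all

# Minimal data-parallel sketch with TensorFlow's MirroredStrategy.
# Illustrative only -- faceswap's own multi-GPU handling may differ.
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()      # one replica per visible GPU
n_gpus = strategy.num_replicas_in_sync

per_gpu_batch = 100                              # the batch size you used on 1 GPU
global_batch = per_gpu_batch * n_gpus            # multiply batch size by number of GPUs

with strategy.scope():
    # Any Keras model would do here; a tiny stand-in model for illustration.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(128,)),
        tf.keras.layers.Dense(128),
    ])
    model.compile(optimizer="adam", loss="mse")

# Batch the dataset at the *global* batch size; the strategy splits it across GPUs.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal((10_000, 128)), tf.random.normal((10_000, 128)))
).batch(global_batch)

model.fit(dataset, epochs=1)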


Re: Multi GPU

Post by orkblutt »

Hi Bryan,

Thank you for the explanation. I'll try this evening.
By the way, what does EGs/s stand for?

All the best,

Orkblutt


Re: Multi GPU

Post by bryanlyon »

Examples/sec

Basically, how many images are trained per second.
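To put numbers on it (the single-GPU figures are from the first post; the 2-GPU iteration rate below is just an assumed value for illustration):

Code: Select all

# EGs/s = iterations per second * batch size.
# Single-GPU numbers are taken from the first post in this thread.
egs_1gpu = 2.5 * 100      # 2.5 it/s at batch size 100 -> 250 EGs/s

# Hypothetical 2-GPU run with the batch size doubled: even if it/s drops a
# little from sync overhead, EGs/s can still come out well ahead.
egs_2gpu = 1.8 * 200      # assumed 1.8 it/s at batch size 200 -> 360 EGs/s

print(egs_1gpu, egs_2gpu)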


Re: Multi GPU

Post by koroep »

So if I understood this correctly, multiple GPUs are only useful when increasing the batch size, right?

In the training guide the following is stated:

Higher batch sizes will train faster, but will lead to higher generalization. Lower batch sizes will train slower, but will distinguish differences between faces better. Adjusting the batch size at various stages of training can help.

So basically you can achieve better results using lower batch sizes; it will just be slower? Has anyone experimented with different batch sizes and could perhaps offer some guidance on how long to train at each batch size to get optimal results? Of course this may be affected by a number of things, but any idea what a baseline might be?


Re: Multi GPU

Post by bryanlyon »

Yes, experimentation has been done. We recommend using the largest batch size you can until you're no longer seeing any improvements, then you can do a fit train at a lower batch size. See other discussions here on how to do fit training.
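Purely as a sketch of that schedule, assuming a made-up training step and plateau check (the batch sizes here are placeholders, not recommended settings), it amounts to something like:

Code: Select all

# Sketch of the "largest batch size first, then fit-train smaller" schedule.
# train_one_iteration and the plateau check are stand-ins, and the batch
# sizes are placeholder assumptions, not recommended faceswap settings.
import random

def train_one_iteration(batch_size):
    """Stand-in for one real training step; returns a fake loss value."""
    return random.random()

def plateaued(losses, window=100):
    """Crude check: the best loss so far did not occur in the last `window` steps."""
    return len(losses) > 2 * window and min(losses[-window:]) > min(losses[:-window])

for batch_size in (512, 256, 64):   # start big, then drop to a lower batch for fit training
    losses = []
    while not plateaued(losses):
        losses.append(train_one_iteration(batch_size))
    print(f"batch size {batch_size}: stopped after {len(losses)} iterations")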
