Question on the use of multiple GPUs (8+)

Talk about Hardware used for Deep Learning


mjam
Posts: 5
Joined: Tue Dec 28, 2021 8:30 pm

Question on the use of multiple GPUs (8+)

Post by mjam »

I am working on a way to get the maximum out of $80,000. I am looking to build a GPU cluster or GPU blade server that can be used with FaceSwap. I am at an early stage here and need to make sure that FaceSwap can handle more than 8 GPUs for training.

So some of the questions that I have are as follows:

  1. What is the maximum GPU count for FaceSwap?
  2. What is the best GPU for training?
  3. Is there a way to run FaceSwap training from multiple (separate) PCs? [This one is sort of a backup plan in case the GPU cluster or GPU blade is not compatible with FaceSwap.]
bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: Question on the use of multiple GPUs (8+)

Post by bryanlyon »

There is no maximum number of GPUs, but each additional GPU adds more latency and provides less gain. With 8 GPUs you would definitely be past the point of diminishing returns, and you would get better results by reducing the number of GPUs.

The GPU I normally recommend is a 3060. It has a lot of VRAM for its tier. You may want to consider whether that is the balance you want, however; a 3090 might be worth it for the extra speed in your use case.
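
For a bit of context on where that latency comes from: FaceSwap trains on TensorFlow, and multi-GPU training there is data-parallel, so every card works on its slice of the batch and then all the cards have to exchange and average their gradients before the next step can start. That all-reduce is the part that gets more expensive with every card you add. Here is a rough, minimal sketch of the pattern in plain TensorFlow; the toy model and numbers are made up for illustration, and this is not FaceSwap's actual code:

import numpy as np
import tensorflow as tf

# MirroredStrategy copies the model to every visible GPU and splits each
# batch across them; gradients are all-reduced between the cards every step.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

global_batch = 64  # split evenly across the replicas each step

with strategy.scope():
    # Tiny stand-in model; FaceSwap's real models are much larger autoencoders.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# Dummy data just to run one step; every step pays the gradient-sync cost.
x = np.random.rand(global_batch, 64, 64, 3).astype("float32")
y = np.random.randint(0, 10, size=(global_batch,))
model.fit(x, y, batch_size=global_batch, epochs=1)

The more replicas in that sync, the more time each step spends waiting on the slowest card and the interconnect instead of computing, which is why the gains flatten out.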

mjam
Posts: 5
Joined: Tue Dec 28, 2021 8:30 pm

Re: Question on the use of multiple GPUs (8+)

Post by mjam »

Thanks for the fast reply; I did not expect one that quickly, and it helps a lot. So the only question I have now is: is it possible to run one training session over multiple computers, say 12 PCs each running RTX 2090s?

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: Question on the use of multiple GPUs (8+)

Post by bryanlyon »

No. Latency is bad enough when the GPUs are all in one box. If you were to sync over a network, you would lose anything you could possibly gain from the extra GPUs.

mjam
Posts: 5
Joined: Tue Dec 28, 2021 8:30 pm

Re: Question on the use of multiple GPUs (8+)

Post by mjam »

Thanks for your time. :)

mjam
Posts: 5
Joined: Tue Dec 28, 2021 8:30 pm

Re: Question on the use of multiple GPUs (8+)

Post by mjam »

OK, I have some ideas here. I am looking at an 8-GPU server with all of them being Nvidia A100s. I realize this is extreme overkill for a face-swap system, but in my situation it is not only for face swapping. Still, so as not to waste anything on any of my projects, I just want to check whether a system like this would be compatible with FaceSwap. I am very new to deep learning, so I am not sure about the A100 working with FaceSwap. So, my main question now is: will FaceSwap work with 8 Nvidia A100 GPUs?

abigflea
Posts: 182
Joined: Sat Feb 22, 2020 10:59 pm
Answers: 2
Has thanked: 20 times
Been thanked: 62 times

Re: Question on the use of multiple GPUs (8+)

Post by abigflea »

I think it will be compatible, although I wouldn't use more than 2 or 3 cards.
As Bryanlyon said, the latency between cards will slow things down.
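
If you want a quick sanity check on the box itself: FaceSwap runs on TensorFlow, so the A100s should show up like any other CUDA card. A minimal check in plain TensorFlow (not a FaceSwap command) would be something like this:

import tensorflow as tf

# List every CUDA device TensorFlow can see; all eight A100s should appear.
gpus = tf.config.list_physical_devices("GPU")
print(f"TensorFlow sees {len(gpus)} GPU(s)")
for gpu in gpus:
    print(" ", gpu.name)

If all eight cards show up there, FaceSwap should be able to use them; whether you actually want all eight in one training session is the latency question above.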

The cards need to communicate with each other; the more cards, the more communication.

For sure, each card would need to be connected to a PCIe Gen 3 or Gen 4 slot running the full x16 lanes.
I had 4x 3070s connected and it was trash: one card was communicating at Gen 3 x1 speed, and the entire 4-card setup was slower than a single card!
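
If you want to see what link each card has actually negotiated, something like the sketch below works (it assumes the pynvml package and the NVIDIA driver are installed; a full "nvidia-smi -q" dump shows the same link info):

import pynvml

# Report the PCIe generation and lane width each card is currently running at.
pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):  # older pynvml versions return bytes
        name = name.decode()
    gen = pynvml.nvmlDeviceGetCurrPcieLinkGeneration(handle)
    width = pynvml.nvmlDeviceGetCurrPcieLinkWidth(handle)
    print(f"GPU {i} ({name}): PCIe Gen {gen} x{width}")
pynvml.nvmlShutdown()

Anything sitting at x1 or x4 will drag the whole group down, which is exactly what happened with my 3070 setup.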

One can make the argument that more cards allow a larger batch size, which is better, but in my experience the models quit learning, or simply crash, at batch sizes above 80-100.
I can get a batch size of 110 on Villain with 2x 3090s, and I reduce the batch size to 20 by the end of training, so those two are plenty.

:o I dunno what I'm doing :shock:
2x RTX 3090 : RTX 3080 : RTX 2060 : 2x RTX 2080 Super : Ghetto 1060
