
question on the use of multiple gpus 8+

Posted: Tue Dec 28, 2021 8:45 pm
by mjam

I am working on getting the most out of an $80,000 budget. I am looking to build a GPU cluster or GPU blade server that can be used with FaceSwap. I am in the early stages here and need to make sure that FaceSwap can handle more than 8 GPUs for training.

So some of the questions that I have are as follows:

  1. What is the maximum GPU count for FaceSwap?
  2. What is the best GPU for training?
  3. Is there a way to run FaceSwap training from multiple (separate) PCs? [This one is sort of a backup plan if the GPU Cluster or GPU blade is not compatible with FaceSwap.]

Re: question on the use of multiple gpus 8+

Posted: Tue Dec 28, 2021 8:48 pm
by bryanlyon

There is no maximum number of GPUs, but each additional GPU adds more latency and provides less gain. At 8 GPUs you would definitely be past the point of diminishing returns, and you would get better results by reducing the number of GPUs.

The GPU I normally recommend is a 3060; it has a lot of VRAM for its tier. You may want to consider whether that is the balance you want, though. A 3090 might be worth it for the extra speed in your use case.
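
For context, FaceSwap trains on TensorFlow, and multi-GPU training there is data parallel: each card works on a slice of the batch and the gradients get synchronised every step. Here is a minimal illustrative sketch of that pattern using tf.distribute.MirroredStrategy (a generic example, not FaceSwap's actual code), mostly to show where the per-step sync cost that grows with card count comes from:

```python
# Illustrative only: a generic MirroredStrategy sketch, not FaceSwap's code.
# Each visible GPU becomes a replica; the global batch is split across the
# replicas and gradients are all-reduced after every step. That all-reduce
# is the sync cost that grows as you add cards.
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # one replica per visible GPU
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Any Keras model built inside the scope is mirrored across the GPUs.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu", input_shape=(1024,)),
        tf.keras.layers.Dense(1024),
    ])
    model.compile(optimizer="adam", loss="mse")

# Dummy data just to make the sketch runnable.
x = np.random.rand(2048, 1024).astype("float32")
y = np.random.rand(2048, 1024).astype("float32")

# 8 GPUs at a per-card batch of 16 is a global batch of 128, but every
# step still waits for the slowest card and for the gradient all-reduce.
model.fit(x, y, batch_size=128, epochs=1)
```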


Re: question on the use of multiple gpus 8+

Posted: Tue Dec 28, 2021 9:15 pm
by mjam

Thanks for the fast reply, I did not expect to get one that quickly, and this helps a lot. The only question I have now is: is it possible to run one training session across multiple computers, say 12 PCs each running an RTX 2090?


Re: question on the use of multiple gpus 8+

Posted: Tue Dec 28, 2021 9:16 pm
by bryanlyon

No. Latency is bad enough if the GPUs are all in one box. If you were to sync over a network you'd lose anything you could possibly gain from more GPUs.
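
Rough numbers make the point. The gradient size and link speeds below are assumed round figures for illustration, not FaceSwap measurements:

```python
# Back-of-envelope only: the gradient size and link speeds are assumed
# round numbers for illustration, not measurements of FaceSwap.
gradient_bytes = 400e6  # assume ~100M float32 parameters, roughly 400 MB

links_bytes_per_sec = {
    "PCIe 3.0 x16 (inside one box)": 16e9,    # ~16 GB/s per direction
    "10 GbE network": 1.25e9,                 # ~1.25 GB/s theoretical
    "1 GbE network": 0.125e9,
}

for name, bw in links_bytes_per_sec.items():
    # Time to move one full set of gradients once, ignoring protocol
    # overhead, which only makes the network numbers worse.
    print(f"{name}: ~{gradient_bytes / bw:.2f} s per sync")
```

Milliseconds of sync inside one box versus whole seconds over Ethernet, paid on every single training step. The extra compute never earns that back.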


Re: question on the use of multiple gpus 8+

Posted: Tue Dec 28, 2021 9:25 pm
by mjam

Thanks for your time. :)


Re: question on the use of multiple gpus 8+

Posted: Tue Jan 04, 2022 9:56 pm
by mjam

OK, I have some ideas here. I am looking at an 8-GPU server with all of them being Nvidia A100s. I realize this is extreme overkill for a face-swapping system, but in my situation it is not only for face swapping. Still, so as not to waste money on any of my projects, I just want to confirm that a system like this would be compatible with FaceSwap. I am very new to deep learning, so I am not sure about the A100 working with FaceSwap. My main question now is: will FaceSwap work with 8 Nvidia A100 GPUs?


Re: question on the use of multiple gpus 8+

Posted: Tue Jan 18, 2022 5:26 pm
by abigflea

I think it will be compatible, although I wouldn't use more than 2 or 3 cards.
As Bryanlyon said, the latency between cards will slow things down.

The cards need to communicate with each other; the more cards, the more communication.

For sure, each card would need to be connected to a PCIe Gen 3 or Gen 4 slot running the full x16.
I had 4x 3070s connected and it was trash: one card was communicating at Gen 3 x1 speed, and the entire 4-card setup was slower than a single card!
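
If you want to check what link each card actually negotiated, nvidia-smi will report it. A small sketch that just shells out to it (it assumes the NVIDIA driver, and therefore nvidia-smi, is installed; the reported generation can also read lower while a card is sitting idle because of power saving):

```python
# Prints the negotiated PCIe generation and lane width for each GPU.
# Assumes nvidia-smi (installed with the NVIDIA driver) is on the PATH.
import subprocess

result = subprocess.run(
    ["nvidia-smi",
     "--query-gpu=index,name,pcie.link.gen.current,pcie.link.width.current",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
for line in result.stdout.strip().splitlines():
    print(line)  # e.g. "0, NVIDIA GeForce RTX 3090, 4, 16"
```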

One can make the argument that more cards allow a larger batch size, which is better, but in my experience a batch size above 80-100 makes the models quit learning, or they simply crash.
I can get a batch size of 110 on Villain with 2x 3090s, and I reduce the batch size to 20 by the end of training, so those two cards are plenty.