Faceswap Forum

Posted: **Mon Mar 15, 2021 6:34 am**

abigflea wrote: ↑Sun Mar 14, 2021 1:33 am
@freedzy is this a single card install, or do you have other cards installed? Like a 20x0 series as well?

I have an old 1060 card, but I haven't tried running it in parallel. I could test that in a couple of hours. But this old card has only 3 GB of VRAM, so not many models will fit (the VRAM will not be shared, if I'm not wrong). The Original model was OK, so I will test with that.

Posted: **Mon Mar 15, 2021 7:53 am**

I gave it a try, but my *#!? computer won't let me use my 3070 as the primary card (I will try swapping the slot order on the motherboard, since the BIOS has no option for it).

So, with the RealFace model and no "lowmemory" option, to be sure that VRAM is shared:

Code:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0  On |                  N/A |
| 21%   50C    P8     8W / 120W |    396MiB /  3019MiB |     14%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 3070    Off  | 00000000:02:00.0 Off |                  N/A |
| 33%   63C    P2   114W / 220W |   7960MiB /  7982MiB |     81%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1793      G   /usr/libexec/Xorg                 124MiB |
|    0   N/A  N/A      1945      G   /usr/bin/gnome-shell               32MiB |
|    0   N/A  N/A      2539      G   kitty                               3MiB |
|    0   N/A  N/A      3809      C   .../envs/faceswap/bin/python      171MiB |
|    0   N/A  N/A     16176      G   ...AAAAAAAAA= --shared-files       59MiB |
|    1   N/A  N/A      3809      C   .../envs/faceswap/bin/python     7957MiB |
+-----------------------------------------------------------------------------+

That helps, not a lot, but that's OK: VRAM is shared and both GPUs are computing. It runs at 5 to 8 iterations/second with batch size = 16 (no tweaks to the model).

I had 4 to 6 iterations/second with my 3070 alone and 1 iteration/second with my 1060 alone. So, yes, it works.

I will switch the cards to make my 3070 the default card, hoping it will work.

EDIT: Switch done, and it works again.
One thing: I tried to activate the "Distributed" checkbox, and it fails with RealFace:

Code:

tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:GPU:0 to /job:localhost/replica:0/task:0/device:GPU:1 in order to run Identity: Dst tensor is not initialized. [Op:Identity]

I didn't try with other models.

Posted: **Mon Mar 15, 2021 8:27 am**

I was just curious.
Linux wasn't working very well for me, and even with the correct driver I couldn't use my 30xx. I had a suspicion that it was because I have two mixed generations.
I was going to try to devote some time on Tuesday evening to this idea; I need to do some rearranging myself.

Posted: **Mon Mar 15, 2021 9:12 am**

It seems that I misunderstood what I saw in nvidia-smi.

I tried to "continue training" a Dfl-128H model that I had started with only my 3070 card. I can see that the Python script is using both GPUs, but the 1060 sticks at 0%.

Code:

Mon Mar 15 10:12:05 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3070    Off  | 00000000:01:00.0  On |                  N/A |
| 43%   67C    P2   143W / 220W |   5684MiB /  7981MiB |     85%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 106...  Off  | 00000000:02:00.0 Off |                  N/A |
| 10%   46C    P8     4W / 120W |    175MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1788      G   /usr/libexec/Xorg                 154MiB |
|    0   N/A  N/A      1939      G   /usr/bin/gnome-shell               34MiB |
|    0   N/A  N/A      2624      G   kitty                               4MiB |
|    0   N/A  N/A     12915      G   ...AAAAAAAAA= --shared-files       55MiB |
|    0   N/A  N/A     60380      C   .../envs/faceswap/bin/python     5431MiB |
|    1   N/A  N/A     60380      C   .../envs/faceswap/bin/python      171MiB |
+-----------------------------------------------------------------------------+

It's an "old" model that I am continuing to train, so maybe the "Mixed Precision" setting is kept and, going by the warning I got at the start of training, maybe my old 1060 is not used (only the VRAM is shared?).

I need to do more tests, but for now my 3070 is working great.

Note: I tried distributed training earlier, and both cards had 2 GB loaded, but... nothing happened afterward.
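
For anyone who wants to check this without reading the full nvidia-smi table, here is a minimal sketch that parses nvidia-smi's CSV query output and flags cards that hold memory but do almost no compute. The helper names (`parse_gpu_csv`, `idle_gpus`, `query_gpus`) and the 5% threshold are made up for illustration, and the sample text is just the numbers from the table above retyped as CSV, not real captured output.

```python
import subprocess

# Fields requested from nvidia-smi, in the order they are parsed below.
QUERY = "index,name,utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` output."""
    gpus = []
    for line in text.strip().splitlines():
        index, name, util, used, total = [field.strip() for field in line.split(",")]
        gpus.append({
            "index": int(index),
            "name": name,
            "util_pct": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
        })
    return gpus


def idle_gpus(gpus, threshold_pct=5):
    """Return GPUs that have memory allocated but are (almost) not computing."""
    return [g for g in gpus if g["mem_used_mib"] > 0 and g["util_pct"] < threshold_pct]


def query_gpus():
    """Run nvidia-smi once; requires the NVIDIA driver tools on PATH."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)


# Illustrative sample: the values from the table above, retyped as CSV.
sample = "0, GeForce RTX 3070, 85, 5684, 7981\n1, GeForce GTX 106..., 0, 175, 3019\n"
gpus = parse_gpu_csv(sample)
for gpu in idle_gpus(gpus):
    print(f"GPU {gpu['index']} ({gpu['name']}) holds {gpu['mem_used_mib']} MiB but is idle")
```

On a machine with the driver installed, calling `idle_gpus(query_gpus())` a few times during training should show whether the second card ever does any work.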

Posted: **Mon Mar 15, 2021 5:37 pm**

freedzy wrote: ↑Sun Mar 14, 2021 12:42 am
So the solution is:

install Cuda >= 11.1

install CuDnn >= 8.1

make the installation for "CPU"

conda activate faceswap

remove tensorflow with "conda remove tensorflow"

install tensorflow-gpu==2.4.1

remove config/.faceswap*

start faceswap

go in Settings > Configure Settings

in Extract: Allow Growth checked

in Train, set Allow Growth and optionally Mixed Precision

I've got a 3060 Ti and I finally got the installation to work with the Windows versions of CUDA etc.

I've launched the extraction part but it is very slow as it still uses the CPU instead of the GPU. During installation I chose CPU, deleted config/.faceswap and selected NVIDIA during the first launch. Is there another setting I'm missing where I have to change it from CPU to GPU?
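
One quick way to narrow this down is to ask TensorFlow directly which GPUs it can see. The following is a minimal sketch, assuming it is run with the Python from the activated `faceswap` conda environment; the helper name `visible_gpus` is made up for illustration. It only shows what TensorFlow detects, not what Faceswap is configured to use.

```python
def visible_gpus():
    """Return the GPU device names TensorFlow can see, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow could not be imported in this environment")
elif not gpus:
    print("TensorFlow is installed but sees no GPU, so work will run on the CPU")
else:
    print("TensorFlow sees:", ", ".join(gpus))
```

If the list comes back empty, the usual suspects in this thread are a CPU-only TensorFlow build or CUDA/cuDNN files that TensorFlow cannot find.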

Posted: **Mon Mar 15, 2021 10:52 pm**

This might help someone, as it was a nightmare for me to make it work on an RTX3080, and I finally got it working. After many tries with conda, updating, upgrading, reinstalling, etc., here's what I did (Windows 10):

Install the latest version of CUDA (in my case 11.2);
Install the latest version of cuDNN (in my case 8.1.1) - tip: don't forget there's an asterisk (*), so you actually need to copy all files from the installation folders;
Restart;
Remove the faceswap folder (if you have installed it before);
Remove miniconda if you have installed it before;
Install faceswap and choose NVIDIA during installation (not CPU!);
Open the Anaconda Prompt window and enter these commands one by one (in the case of brotli I just tried all the options, since some don't work):
conda activate faceswap
conda remove tensorflow
conda install brotli
conda install urllib3
conda install -c anaconda urllib3
pip install tensorflow-gpu==2.4.1
Run Faceswap and choose NVIDIA.

Works for me! In my case the difference is pretty huge. On CPU I was getting around 3-4 EGs/sec, now I'm getting much more (50-70 EGs/sec with 16-32 batch) - though I'm also using a bigger image file (1024) to train my model so I think this can have an influence on the speed of the process since it takes more memory. Anyway I can confirm it's possible to make it work on the RTX3080 on Win10.
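
The version requirements quoted in this thread (CUDA >= 11.1, cuDNN >= 8.1) can be checked with a few lines of Python. This is only a sketch: the function names and the `MINIMUMS` table are made up for illustration, and the installed versions have to be typed in by hand because nothing here reads them from the system.

```python
# Minimum versions as quoted earlier in this thread.
MINIMUMS = {"cuda": "11.1", "cudnn": "8.1"}


def parse_version(text):
    """Turn '11.2' or '8.1.1' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def too_old(installed):
    """Return the components that are missing or older than the thread's minimums."""
    failures = []
    for name, minimum in MINIMUMS.items():
        have = installed.get(name)
        if have is None or parse_version(have) < parse_version(minimum):
            failures.append(name)
    return failures


# Versions from the post above (CUDA 11.2, cuDNN 8.1.1).
print(too_old({"cuda": "11.2", "cudnn": "8.1.1"}))
# A hypothetical older install, for comparison.
print(too_old({"cuda": "11.0", "cudnn": "8.0.5"}))
```

Comparing tuples of integers avoids the string-comparison trap where "11.10" would sort before "11.2".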

Posted: **Wed Mar 17, 2021 3:59 am**

Glad to see you got it working, with some modified settings.
Good to know what will and won't work.
Mine worked in Win10 with the instructions from the first post, although not till the 3rd attempt.
Gotta love officially unsupported configurations

Posted: **Wed Mar 17, 2021 8:00 am**

gregormax wrote: ↑Mon Mar 15, 2021 10:52 pm
Install faceswap and choose NVIDIA during installation (not CPU!);

Thanks, now it's also working for me.

Posted: **Sun Mar 28, 2021 8:00 am**

Hi Everyone, I am trying to fake my friend's face as a birthday present but am a total noob so any help would be really appreciated.

I have got to the training stage and am getting extremely slow speeds.

GPU: 3080
CPU: Ryzen 5 3600
batch size: 16
Villain: 1.3 EG/s

People with comparable hardware seem to get much faster speeds. Are there some obvious mistakes I might be making?

Thanks for reading

Posted: **Sun Mar 28, 2021 11:49 am**

noobynoobnoob wrote: ↑Sun Mar 28, 2021 8:00 am
Hi Everyone, I am trying to fake my friend's face as a birthday present but am a total noob so any help would be really appreciated.

I have got to the training stage and am getting extremely slow speeds.

GPU: 3080
CPU: Ryzen 5 3600
batch size: 16
Villain: 1.3 EG/s

People with comparable hardware seem to get much faster speeds. Are there some obvious mistakes I might be making?

Thanks for reading

30xx cards are not currently directly supported. I recommend checking through this thread for similar issues/making sure you have everything set up correctly.

Posted: **Sun Mar 28, 2021 11:28 pm**

I tried reinstalling everything, and now when I try to train it fails at "Loading trainer from Original Plugin".

Did I cock something up?

Posted: **Mon Mar 29, 2021 2:44 am**

Possibly so.
Torzdf and Bryanlyon can't stress enough that this is not exactly a supported GPU yet. It took 3 attempts for me.

I suggest removing the CUDA toolkit, running DDU on your Nvidia drivers, removing the Conda environment, reinstalling fresh drivers, then following the first post again.

If it fails, we'll need to see your crash report.

Posted: **Mon Mar 29, 2021 12:09 pm**

torzdf wrote: ↑Thu Feb 11, 2021 10:04 am
Change the "Log Level" option to "Verbose"

Where is this in the GUI?

Posted: **Mon Mar 29, 2021 2:27 pm**

gregormax wrote: ↑Mon Mar 15, 2021 10:52 pm
This might help someone, as it was a nightmare for me to make it work on an RTX3080, and I finally got it working. After many tries with conda, updating, upgrading, reinstalling, etc., here's what I did (Windows 10):

Install the latest version of CUDA (in my case 11.2);

Install the latest version of cuDNN (in my case 8.1.1) - tip: don't forget there's an asterisk (*), so you actually need to copy all files from the installation folders;

Restart;

Remove the faceswap folder (if you have installed it before);

Remove miniconda if you have installed it before;

Install faceswap and choose NVIDIA during installation (not CPU!);

Open the Anaconda Prompt window and enter these commands one by one (in the case of brotli I just tried all the options, since some don't work):
conda activate faceswap
conda remove tensorflow
conda install brotli
conda install urllib3
conda install -c anaconda urllib3
pip install tensorflow-gpu==2.4.1

Run Faceswap and choose NVIDIA.

Works for me! In my case the difference is pretty huge. On CPU I was getting around 3-4 EGs/sec, now I'm getting much more (50-70 EGs/sec with 16-32 batch) - though I'm also using a bigger image file (1024) to train my model so I think this can have an influence on the speed of the process since it takes more memory. Anyway I can confirm it's possible to make it work on the RTX3080 on Win10.

Can confirm this works! Many many thanks to you.

The problem I was having was that I only copied across the cuDNN files listed in the Nvidia install instructions, but I needed to copy them all (as it says in gregor's instructions! Duh me.)

Posted: **Mon Mar 29, 2021 3:10 pm**

G'day folks - hope everyone here is keeping safe and well

I'll list my hardware below, but my primary question is: why can't I enable both cards and use the full power of each to extract and train? If I check the 3090 along with the 2080ti, everything simply hangs. Things only work when I check just the 2080ti. Even if I check JUST the 3090, it hangs (or is EXTREMELY slow).

Yes, I'm using the latest FaceSwap (just updated this morning again).

Hardware Specs:

MOTHERBOARD:
ASUS Pro WS X299 SAGE II

CPU:
Intel Core i9 i9-10980XE

GPU 1:
3090

GPU 2:
2080ti

(both optimised to use full PCIe 3.0 bandwidth)

RAM: 128GB

Running on SSDs (boot drive and also drive to save faceswap on)

No background tasks running. Windows "de-bloated"

Posted: **Mon Mar 29, 2021 3:17 pm**

Have you followed the instructions here?
viewtopic.php?f=4&t=1226

If you intend to use both cards at the same time, there are some things to note.
30x0 cards are currently unsupported, although it seems most people can get them to work with the instructions above.

I personally haven't managed to get any 30XX card to work in distributed mode in Win10, nor to load 2 instances of Faceswap in Win10 and train 2 models simultaneously. I really could have used that speed the last few months :-/ Your mileage may vary.

In Linux I have managed to train 2 models simultaneously on 2x 30x0 cards,
although loading drivers for a 20x0 card paired with a 30x0 card apparently breaks the support.

Even if you get it all to work, I am unsure whether a 2080ti paired with a 3090 (distributed training) would do much other than maybe slow you down. In the best case, I would expect your performance to be about the equivalent of 2x 2080ti.
The faster card will just be waiting on the slower one.

According to this https://lambdalabs.com/gpu-benchmarks

2x 2080ti gets a score of 569 on ResNet-50
a single 3090 gets 554

Trust me, I WANT it to work. I'm waiting on upstream software to support 30x0: TensorFlow and, I believe, the Nvidia drivers.
If you figure out a way, I'll send ya a beer.
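
The "faster card waits on the slower" estimate can be written out as arithmetic. This is a rough sketch, not a benchmark: it assumes synchronous training where every step waits for the slowest card, and it assumes the 2x 2080ti score splits evenly between the two cards. The function name `sync_throughput` is made up for illustration.

```python
def sync_throughput(per_card_rates):
    """Estimate synchronous multi-GPU throughput: each step is as slow as the slowest card."""
    return len(per_card_rates) * min(per_card_rates)


score_2x_2080ti = 569  # benchmark score quoted above
score_3090 = 554       # benchmark score quoted above

# Assumption: perfect linear scaling, so one 2080ti contributes half the pair's score.
per_2080ti = score_2x_2080ti / 2

mixed = sync_throughput([per_2080ti, score_3090])
print(f"one 2080ti: {per_2080ti}, 2080ti + 3090 in sync: {mixed}, 3090 alone: {score_3090}")
```

Under these assumptions the mixed pair lands at 569.0, the same as 2x 2080ti and only slightly above the 554 of the 3090 on its own, before any multi-GPU overhead is counted.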

Posted: **Mon Mar 29, 2021 3:37 pm**

Thanks, mate - but that guide clearly says that it is ONLY for 30xx cards... so I wasn't sure whether, if I followed it, things would still work with my 2080ti.

Posted: **Mon Mar 29, 2021 3:38 pm**

You have a 3090. If you want it to work with that 3090 you must follow that thread.

Posted: **Mon Mar 29, 2021 3:41 pm**

Your 2080 should still work; my 2070 did, no problem.

Posted: **Mon Mar 29, 2021 5:44 pm**

OK - so I followed this specific guide by @gregormax:

viewtopic.php?p=5123#p5123

Training started. 3 mins down and my avg EGs/sec is 47.9

When I used to just use the 2080ti, I was getting an average of 85ish!

[Guide] Using Faceswap on Nvidia RTX 30xx cards

Troubleshoot slow training speeds

Current hardware recco + 3090 advice needed
