Slower training since the last update...

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here.


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for reporting errors with the Training process. If you want to get tips, or better understand the Training process, then you should look in the Training Discussion forum.

Please mark any answers that fixed your problems so others can find the solutions.

Barnuble
Posts: 17
Joined: Tue Jan 19, 2021 2:42 pm
Been thanked: 1 time

Slower training since the last update...

Post by Barnuble »

Hi everybody,

I updated faceswap two days ago on both of my machines, and after that faceswap stopped starting :(
So I decided to completely reinstall faceswap (importing the same models and settings as before), and since then training speed has slowed on both!

My configs:

  • First machine: RTX 4090 / using model PhazeA-256 @ Batchsize=8; before reinstalling = 24.4 fps, after = 16.5 fps (-47% speed)
  • Second machine: RTX 3090 / using model PhazeA-256 @ Batchsize=8; before reinstalling = 14.7 fps, after = 12.4 fps (-18% speed)

I should clarify that both the models and the images are the same, and my RTX 4090 is now barely faster than my RTX 3090... while it was 60% faster before the last update.

Has anyone else noticed a slowdown since the last update?
The RTX 4090 seems more affected by the slowdown than the RTX 3090!
Really strange. If anyone has experienced the same kind of slowdown, I would be interested to read your observations...

Thanks in advance!

Last edited by Barnuble on Wed Jun 28, 2023 5:57 pm, edited 1 time in total.
torzdf
Posts: 2687
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 135 times
Been thanked: 628 times

Re: Slower training since the last update...

Post by torzdf »

If anything, most users have reported a slight speed up. The update didn't change any of the code. It just updated minimum versions of Python and Tensorflow, now that we no longer need to support the outdated PlaidML.

You may want to make sure all your settings are as they were.

For other users updating: Latest Faceswap will require a full re-install from the latest installer: https://github.com/deepfakes/faceswap/r ... ag/v2.10.0

You can keep your configuration settings by copying the .ini files from the faceswap/.config folder, and then placing them in the new folder when you reinstall.
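
For example, on Windows this might look something like the following (the paths here are just placeholders; substitute your actual faceswap install location and a backup folder of your choice):

Code:

rem Back up the existing config files before removing the old install (paths are examples only)
xcopy "C:\faceswap\.config\*.ini" "C:\faceswap_config_backup\" /I /Y

rem After reinstalling, copy the backed-up files into the new install's .config folder
xcopy "C:\faceswap_config_backup\*.ini" "C:\faceswap\.config\" /I /Y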

My word is final

Barnuble
Posts: 17
Joined: Tue Jan 19, 2021 2:42 pm
Been thanked: 1 time

Re: Slower training since the last update...

Post by Barnuble »

Hi,

Thanks for answering so fast.

  • My settings are as they were: I re-imported my .ini files, found my settings again and checked everything...
  • I downgraded the Nvidia drivers (536.23 -> 535.98): not better
  • Uninstalled miniconda and faceswap, cleaned the registry... and reinstalled everything: not better
  • Checked CPU and GPU clocks and many other things...

Note: when trying to check for updates, Faceswap tells me: "Git is not installed or you are not running a cloned repo. Unable to check for updates".
I'm still trying to find out why performance is so reduced (-47% on my RTX 4090).

There must be another reason that I can't explain at the moment...

Thanks again for your help, I will report back when I find it ;)

torzdf
Posts: 2687
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 135 times
Been thanked: 628 times

Re: Slower training since the last update...

Post by torzdf »

Hope it resolves itself, and I would be interested to know if others have issues. Unfortunately, there isn't a lot I can do to fix it, as if there is an issue it will be in the underlying libraries.

Re: the git issue. It looks like MS has made some changes in how git folders are trusted on Windows. I'm looking for a permanent workaround, but in the meantime you can follow the instructions I posted here (substituting the folder with the one on your system).

viewtopic.php?p=9165#p9165
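
(For reference: if the message you get is git's newer "dubious ownership" / untrusted-directory safeguard, the usual workaround is to mark the faceswap folder as safe. The path below is only an example; use your own faceswap folder:)

Code:

rem Tell git to trust the faceswap repo folder (example path - change to yours)
git config --global --add safe.directory "C:\Users\YourName\faceswap"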

My word is final

HoloByteus
Posts: 19
Joined: Mon Apr 12, 2021 11:31 pm
Been thanked: 3 times

Re: Slower training since the last update...

Post by HoloByteus »

Nice speed bump for me so far. I've been using the same Phaze-A StoJo/SubPixel model for months, getting a pretty consistent 8.6 EGs/s @ BS8, which I use start to finish. The first model I'm training since the update, with the same settings, is now doing 13.5 EGs/s, 27K iterations in.

Also noticed a bit of a bump with conversion: where I previously observed a pretty consistent 15 it/s when a face was in frame, the same frames now show 25 it/s. I also updated FFMPEG in miniconda and thought this might have been the reason. Unfortunately, I didn't capture the encoding performance line prior to the update for a more accurate comparison.
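
(For anyone wanting to do the same, updating ffmpeg inside the faceswap environment would look roughly like this, assuming it was installed from conda-forge:)

Code:

conda activate faceswap
conda update -c conda-forge ffmpeg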

Last edited by HoloByteus on Sat Jul 01, 2023 4:31 pm, edited 3 times in total.

HTPC: Windows 11 Pro, 12700K, 32GB, MSI Z690 Carbon, Strix 3060, 2x2TB FireCuda 530, 2x18TB Seagate EXO.
AV: Denon X3800H, Outlaw 7000X Amp, Hisense 65H9F, Panamax MR4300

Barnuble
Posts: 17
Joined: Tue Jan 19, 2021 2:42 pm
Been thanked: 1 time

Re: Slower training since the last update...

Post by Barnuble »

Hi,

On June 28, I reported a slowdown of 18% to 47% in training speed following a FaceSwap update.
I've spent a lot of time over the last few weeks doing various hardware, software and configuration tests, and I've reinstalled Faceswap several times with different distributions, drivers and settings.
It didn't solve anything!
I then modified the FaceSwap parameters one by one, re-running the tests each time, and I finally found (last night) the origin of the problem!

The slowdown comes from the loss functions, or from combining them (SSIM + MAE + LPIPS-Alex + SSL).

After resetting the loss settings (back to the default SSIM + MSE), performance returned on both of my machines.
I think LPIPS-Alex or SSL is the culprit! Were they updated in June?
My machines had been using these 4 functions for months with no problems, and following the June update a constant slowdown appeared.
After reinstalling FaceSwap, another form of slowdown appeared: performance increases for the first 20 minutes and then degrades hour by hour (20 EGs/s at the start, 14 EGs/s after one hour and 8 EGs/s after 8 hours).
I've watched Python's memory consumption increase over time, and I think one (or more) of these functions has a memory leak.
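
(If you want to check this yourself, one quick way on Windows is to snapshot the python process's memory use every so often while training runs and compare the values over time. This is just a generic monitoring sketch, not a faceswap feature:)

Code:

rem Show memory usage of running python processes; run this periodically
rem (e.g. once an hour) and compare the "Mem Usage" column over time
tasklist /fi "imagename eq python.exe"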

Since I no longer use these 4 functions simultaneously, the slowdown has disappeared.
The settings were: SSIM + MAE@25% + LPIPS_Alex@5% + FFL@100%

I wanted to give you this feedback...
For users: if you have slowdowns, consider looking at your loss settings!

Thanks again!
PS: I am now using SSIM+MAE@100

Last edited by Barnuble on Tue Jul 25, 2023 11:02 am, edited 2 times in total.
torzdf
Posts: 2687
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 135 times
Been thanked: 628 times

Re: Slower training since the last update...

Post by torzdf »

Thanks for the in-depth analysis and the feedback. I'm sorry you are having issues.

Ultimately, no, changes have not been made to the loss functions. However, as part of the recent update, the versions of CUDA/cuDNN were updated too. For most people this has meant a speed bump; for some it has meant a speed decrease (maybe including you); and others have been getting segmentation faults whilst training (myself included).

If you hit any of these issues, this should resolve it:

Start > Anaconda Prompt (cmd)

Code:

conda activate faceswap
conda remove cudatoolkit cudnn
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1

Close out of Anaconda Prompt, relaunch Faceswap, and you should be back to the old speeds.
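
(If you want to confirm the downgrade took effect, a quick sanity check is to list the installed versions afterwards:)

Code:

rem Confirm the cudatoolkit / cudnn versions now installed in the faceswap environment
conda activate faceswap
conda list cudatoolkit
conda list cudnn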

Last edited by torzdf on Thu Jul 27, 2023 11:12 pm, edited 1 time in total.

My word is final
