LR optimizer first impressions (yay new stuff)


User avatar
Ryzen1988
Posts: 57
Joined: Thu Aug 11, 2022 8:31 am
Location: Netherlands
Has thanked: 8 times
Been thanked: 28 times

LR optimizer first impressions (yay new stuff)

Post by Ryzen1988 »

So, how blindly can you trust this graph, I wonder? Not much in life is simple enough to be captured in a 2D graph, but it is probably still a super-cool tool to guide you.

I have already drawn some interesting first conclusions. The pics show the LR optimizer for AdaBelief with and without mixed precision.
As you can see, with mixed precision the learning rate should be much lower, and the curve rises much sooner and more steeply. No wonder AdaBelief with mixed precision was always thought to be tough to get right.

clipvlearning_rate_finder_2023-08-27_17.29.11.png
learning_rate_finder_2023-08-27_17.38.01.png
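For context, what a learning-rate finder like this typically does under the hood is an LR range test: ramp the learning rate up exponentially, take one training step per value, record the loss, and suggest a rate somewhat below the point where the loss bottoms out. Below is a minimal, framework-free sketch on a toy quadratic loss; the function names and the divide-by-10 safety heuristic are illustrative assumptions, not Faceswap's actual implementation.

```python
def lr_range_test(loss_and_grad, w0, lr_min=1e-7, lr_max=10.0, steps=100):
    """Toy LR range test: the learning rate grows exponentially from
    lr_min to lr_max, with one SGD step taken at each value."""
    factor = (lr_max / lr_min) ** (1.0 / (steps - 1))
    w, lr, history = w0, lr_min, []
    for _ in range(steps):
        loss, grad = loss_and_grad(w)
        history.append((lr, loss))   # loss observed at this rate
        w -= lr * grad               # one SGD step
        lr *= factor                 # exponential (log-linear) ramp
    return history

def pick_lr(history, divisor=10.0):
    """Heuristic: the rate at the lowest loss, divided by a safety factor."""
    best_lr, _ = min(history, key=lambda p: p[1])
    return best_lr / divisor

# Toy convex loss f(w) = (w - 3)^2 with gradient 2(w - 3): the loss
# shrinks while the rate is stable, then blows up once it is too high.
quad = lambda w: ((w - 3.0) ** 2, 2.0 * (w - 3.0))
suggested = pick_lr(lr_range_test(quad, w0=0.0))
```

The curves in the screenshots are the `history` list plotted on a log-x axis; the sharp rise at the end is the divergence region, which is exactly where mixed precision appears to kick in earlier.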

Re: LR optimizer first impressions (yay new stuff)

Post by Ryzen1988 »

As far as I can see, when the LR optimizer is used, the value sticks in the model file, right?
Would it be possible to make it so that you can retest models during training?
For example, I have noticed the curve looks really different when the encoder is frozen in the early stages of training compared to when it is unfrozen and in the training loop.
Batch size also seems to influence the learning rate, so it would be really useful to be able to retest the LR curve when new parameters are applied.

User avatar
torzdf
Posts: 2687
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 135 times
Been thanked: 628 times

Re: LR optimizer first impressions (yay new stuff)

Post by torzdf »

Ryzen1988 wrote: Mon Aug 28, 2023 8:50 am

Would it be possible to make it so that you can retest models during training?

No, for the reason stated in the training documentation:

https://forum.faceswap.dev/viewtopic.php?t=146#settings wrote:

For new models, this Learning Rate will be discovered prior to commencing training. For resuming saved models, the Learning Rate discovered when first creating the model will be used. It is not possible to use the Learning Rate Finder to discover a new optimal Learning Rate when resuming saved models, as the loss update is too small between each iteration, however this option still needs to be enabled if you wish to use the Learning Rate that was found during the initial discovery phase.

Ryzen1988 wrote: Mon Aug 28, 2023 8:50 am

Batchsize also seem to influence the LR rate, so would seem really useful to be able to retest the LR curve when new parameters are applied.

Yes, batch size has a huge impact on Learning Rate. This is discussed further here:
viewtopic.php?p=7432#p7432
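As a rough illustration of that relationship (a common heuristic, not necessarily what Faceswap applies internally): the linear scaling rule multiplies the learning rate by the same factor as the batch size, with a square-root variant sometimes preferred for Adam-family optimizers. A sketch:

```python
def scale_lr(base_lr: float, base_batch: int, new_batch: int,
             rule: str = "linear") -> float:
    """Rescale a learning rate found at one batch size for another.

    'linear' is the linear scaling rule; 'sqrt' is the more
    conservative square-root variant. Both are heuristics, not
    guarantees -- a fresh range test is still the safer check.
    """
    ratio = new_batch / base_batch
    if rule == "linear":
        return base_lr * ratio
    if rule == "sqrt":
        return base_lr * ratio ** 0.5
    raise ValueError(f"unknown rule: {rule}")

# LR discovered at batch size 32, rescaled for batch size 24
print(scale_lr(1e-4, 32, 24, "linear"))   # 7.5e-05
```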

My word is final


Re: LR optimizer first impressions (yay new stuff)

Post by Ryzen1988 »

Am I that late to notice this function, or did you already update the guide?

I was wondering about the lack of a changelog to read, but from your reaction I guess the guide also integrates all the new functions as they arrive, so I will check that out first.

I understand that the LR updates are too small to properly retest.
Here is what I did for testing; please comment on whether this makes sense:
Load the model with the encoder frozen, batch size 32 (initial training stage), get the LR curve, and delete the model.
Load the model with the encoder unfrozen, batch size 24 (main training stage), get the LR curve, and delete the model.
Load the model with the encoder unfrozen and a small batch size (final training without warp), get the LR curve, and delete the model.
Then start training step one with its LR value, and you already have the other two values for the later training stages.
:geek:
Maybe it's a bit overdone, or I'm overcomplicating it, but the reported values do change a lot across the three modes an average model passes through.
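That three-stage probing routine could be sketched as a small config sweep. Here `find_lr` is a hypothetical stand-in for whatever finder entry point you script against, and the config fields are illustrative, not Faceswap's actual options:

```python
from dataclasses import dataclass

@dataclass
class StageConfig:
    """One training stage to probe with a fresh model."""
    name: str
    batch_size: int
    freeze_encoder: bool
    warp: bool

# The three stages described above: initial (frozen encoder),
# main (unfrozen), and final (small batch, no warp).
STAGES = [
    StageConfig("initial", batch_size=32, freeze_encoder=True,  warp=True),
    StageConfig("main",    batch_size=24, freeze_encoder=False, warp=True),
    StageConfig("final",   batch_size=8,  freeze_encoder=False, warp=False),
]

def probe_stages(find_lr):
    """Run the finder once per stage config; return {stage name: LR}."""
    return {cfg.name: find_lr(cfg) for cfg in STAGES}
```

One caveat with this approach (raised in the reply below this post): each probe starts from a different random initialization, which itself shifts the discovered rate.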


Re: LR optimizer first impressions (yay new stuff)

Post by torzdf »

The main issue with this approach is that the random initialization has a huge impact on the Learning Rate. There is a link in the training guide that gives some information about what the lr_finder does and some things to think about.

I did not consider the use-case of wanting to test different batch sizes (I tend to train at the batch size I can fit, so this value is always small), so there is no easy way to test that kind of thing with the current implementation.

When I add significant new features, I always update the guides either before, or very soon after, release.

We don't have a changelog because Faceswap is rolling release. The easiest way to see what is added and when is to look at the commit history: https://github.com/deepfakes/faceswap/commits/master


User avatar
MaxHunter
Posts: 194
Joined: Thu May 26, 2022 6:02 am
Has thanked: 177 times
Been thanked: 13 times

Re: LR optimizer first impressions (yay new stuff)

Post by MaxHunter »

@Ryzen1988 I've asked for the same, but I found the easiest and quickest way is to do a "system information" check and it'll give you a brief description of what's been changed. If it's significant I go to the GitHub. 🙂


Re: LR optimizer first impressions (yay new stuff)

Post by Ryzen1988 »

So I have noticed with the optimizer that the first 20% and the last 20% are never where the useful part is.
It takes a while before the best rate starts ticking up.
Would it be hard to make the LR steps wider until the curve starts dropping? You would have dense, small steps in the center of the curve, but as soon as the best LR stops moving (plus some margin) it could accelerate again in bigger steps. As it stands, a good third of the LR optimizer run feels like waiting for nothing; in the last part you know it will not move again.
Often I just close it manually, enter the rate, and go train, which feels more logical than letting it finish.

Alternatively, having a way to set the min and max values for the optimizer would also save time,
especially when, like me, you are testing a whole lot of configurations before actually launching the training :geek: :ugeek:
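The early-exit idea suggested here could look something like the following sketch (not part of Faceswap): walk the ramp, but stop once the best loss has not improved for a set number of steps. The `patience` mechanism is borrowed from early stopping; all names are illustrative.

```python
def run_with_early_exit(step_fn, lrs, patience=20):
    """Walk the LR ramp, stopping once the best loss hasn't improved
    for `patience` consecutive steps -- the tail of the curve where
    you already know it will not move again.

    step_fn(lr) -> loss for one training step at that rate.
    Returns the learning rate that produced the lowest loss.
    """
    best_loss = float("inf")
    best_lr = None
    stale = 0
    for lr in lrs:
        loss = step_fn(lr)
        if loss < best_loss:
            best_loss, best_lr, stale = loss, lr, 0
        else:
            stale += 1
            if stale >= patience:
                break                # tail of the curve: give up early
    return best_lr
```

A real finder would still need a few stale steps to confirm the minimum is genuine rather than noise, which is one reason the stock implementation may simply run the full ramp.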

Look at the beauty of this curve. I wonder if it will reflect the potential of the network :mrgreen: :geek:

Attachments
learning_rate_finder_2023-08-31_18.05.06.png
Last edited by Ryzen1988 on Thu Aug 31, 2023 4:15 pm, edited 1 time in total.

Re: LR optimizer first impressions (yay new stuff)

Post by torzdf »

What you are asking for is theoretically possible (min and max values), but I didn't implement it for a couple of reasons: it would be yet more config options added to the already myriad options, and I don't trust users not to input insane values and then complain that it doesn't work.

As for the step size, it rises logarithmically through each step. I'm unlikely to change that part, as I deliberately implemented it as per the original implementation.
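That log-spaced ramp can be written in a few lines (equivalent to `numpy.geomspace`); the function name here is illustrative:

```python
def lr_schedule(lr_min: float, lr_max: float, steps: int):
    """Log-spaced learning rates from lr_min to lr_max, inclusive.

    Each step multiplies the rate by a constant factor, so the values
    are evenly spaced on a log axis -- small absolute steps at the low
    end of the ramp, large absolute steps at the high end.
    """
    factor = (lr_max / lr_min) ** (1.0 / (steps - 1))
    return [lr_min * factor ** i for i in range(steps)]

rates = lr_schedule(1e-7, 1.0, steps=8)
# One decade per step: 1e-7, 1e-6, ..., 1e-1, 1e0
```

Note this is why the early part of the run feels slow: equal log-spacing means the ramp spends as many steps crossing 1e-7 to 1e-6 as it does crossing 1e-2 to 1e-1.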



Re: LR optimizer first impressions (yay new stuff)

Post by Ryzen1988 »

Another interesting curve :geek: :ugeek:

Attachments
learning_rate_finder_2023-09-04_18.35.15.png

Re: LR optimizer first impressions (yay new stuff)

Post by Ryzen1988 »

learning_rate_finder_2023-09-25_14.34.57.png

That's what you would call a really hyperactive geek that really wants to learn :ugeek: :geek:


Re: LR optimizer first impressions (yay new stuff)

Post by torzdf »

Yeah, it has definitely put the points in the wrong place there!

