Overtraining question

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for reporting errors with the Training process. If you want to get tips, or better understand the Training process, then you should look in the Training Discussion forum.

Please mark any answers that fixed your problems so others can find the solutions.

Fed
Posts: 16
Joined: Sat Jun 18, 2022 3:31 pm
Been thanked: 1 time

Overtraining question

Post by Fed »

I recently found out that overtraining is a thing that can happen.
(I know it's in the guide, but when you read the guide for the first time, you filter out everything that doesn't answer the question "how do you make this thing work at all?")
I searched the forum and, unless I'm using the search wrong, there are about 10 mentions of this topic.

I figured that it could happen if you train the model for too long: once the model has learned more or less everything it can, it stops responding to the loss rate and starts getting worse? I have no idea how it works (or even if it works the way I described).

Also I figured there's no silver bullet for deciding whether you've reached the point where overtraining starts. You need to watch the preview. I'm not good at that. The random changes between iterations are way bigger than the incremental improvement over thousands of iterations. Sometimes the preview in the timeline looks worse than it did 100k iterations before. And then it gets better again. At least to my eyes.
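The same noise-versus-trend problem shows up in the raw loss numbers, by the way. Just to illustrate what I mean (a plain-Python sketch with made-up numbers, nothing to do with Faceswap's own loss graph): smoothing the per-iteration loss with an exponential moving average makes the slow trend easier to see than the raw values.

```python
# Plain-Python sketch with made-up numbers (not Faceswap's own loss graph):
# an exponential moving average makes the slow trend in a noisy
# per-iteration loss series easier to see than the raw values.
def ema(values, alpha=0.01):
    """Return an exponentially smoothed copy of `values`."""
    smoothed = []
    current = values[0]
    for value in values:
        current = alpha * value + (1 - alpha) * current
        smoothed.append(current)
    return smoothed

raw_losses = [0.045, 0.051, 0.038, 0.049, 0.036, 0.047]   # hypothetical values
print(ema(raw_losses, alpha=0.3))  # large alpha only because the list is tiny
```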

I also figured that to somewhat prevent overtraining you can add new data, so that the model has valid things to learn. And that's a separate question from whether you already have enough data for a decent model.
So I guess if you are worried about overtraining, you could keep some of your training data in a stash, start with less (but still enough), and add batches of images to the model's training data gradually, say a batch every few 100k iterations? It would probably make the process a bit less efficient, but it would protect against overtraining because there would more or less always be some new valid data for the model to learn. Something like that?

I guess my questions are:

  1. Did I get everything right?
  2. Are there any more pointers on when overtraining can happen? Like for example "You don't need to worry about overtraining until you hit way beyond 10m iterations".
  3. Are there good ways to prevent overtraining other than "don't train it after it's already good enough"? I mean, it looks like good advice, but I have trouble evaluating whether the model is already good enough (as I described above).
torzdf
Posts: 2671
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 131 times
Been thanked: 625 times

Re: Overtraining question

Post by torzdf »

Thanks for this, it's a good post.

Overtraining is definitely a thing; however, I have never seen it happen in Faceswap (and I have trained models a VERY long way). That is not to say it isn't a thing, just that I've never hit it (I train with a LOT of data).

Overtraining is generally when the model performs well on data it has seen, but badly on new data. The steps you have shown will help mitigate this.
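To make that concrete, here is a rough sketch (plain Python with invented numbers, not something Faceswap reports directly): keep a few faces out of training and compare the loss on those against the loss on the training faces. When training loss keeps falling but the held-out loss climbs, the model is memorising its training data rather than generalising.

```python
# Rough sketch with invented numbers (not something Faceswap reports directly):
# overtraining shows up as loss on held-out faces rising while loss on the
# training faces keeps falling.
train_loss    = [0.060, 0.045, 0.035, 0.028, 0.024]  # faces the model trains on
held_out_loss = [0.065, 0.052, 0.048, 0.051, 0.058]  # faces kept out of training

best_so_far = float("inf")
for step, (train, held_out) in enumerate(zip(train_loss, held_out_loss)):
    best_so_far = min(best_so_far, held_out)
    if held_out > best_so_far * 1.15:
        print(f"checkpoint {step}: training loss {train:.3f} keeps falling, but "
              f"held-out loss {held_out:.3f} is well above its best "
              f"({best_so_far:.3f}) -- possible overtraining")
```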

My word is final

MaxHunter
Posts: 194
Joined: Thu May 26, 2022 6:02 am
Has thanked: 177 times
Been thanked: 13 times

Re: Overtraining question

Post by MaxHunter »

Great post and questions.

I've been at this for about a year so I am far from an expert.

I've had one recent model turn out terrible, and the only thing I could attribute it to is "overtraining." I use a specific B model a lot, trying to get it down to under 0.02 loss, and I think you're right, adding more pics/faces in increments might help.

I wish there was a way to implement an alarm noting that loss isn't dropping, or "your loss drop is slowing, consider adding more data to avoid overtraining," etc.
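Something along these lines is what I mean (a hypothetical sketch in plain Python, not an existing Faceswap option, and the numbers are made up):

```python
# Hypothetical sketch of the kind of alarm I mean (not an existing Faceswap
# feature): warn when the smoothed loss has stopped improving.
def loss_plateau_alarm(losses, window=3, min_improvement=0.001):
    """Warn if the mean of the last `window` losses is not better than the
    mean of the `window` before it by at least `min_improvement`."""
    if len(losses) < 2 * window:
        return
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    if previous - recent < min_improvement:
        print("Your loss drop is slowing -- consider adding more data "
              "to avoid overtraining.")

loss_plateau_alarm([0.040, 0.039, 0.0388, 0.0386, 0.0385, 0.0385])  # made-up values
```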

Last edited by MaxHunter on Thu Apr 27, 2023 8:35 pm, edited 1 time in total.
bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: Overtraining question

Post by bryanlyon »

Fed wrote:

So I guess if you are worried about overtraining, you could keep some of your training data in a stash, start with less (but still enough), and add batches of images to the model's training data gradually, say a batch every few 100k iterations? It would probably make the process a bit less efficient, but it would protect against overtraining because there would more or less always be some new valid data for the model to learn. Something like that?

No, you shouldn't. The fix for overtraining is giving it new data, but the prevention is to give it that data all along. In other words, by keeping that data in you're keeping the model from overtraining from the start. You should not restrict data for overtraining reasons, just give it all to the model so it can do the best job it can getting you a quality deepfake.

FaceSwap has been thoroughly engineered to minimize the chance of overtraining. This is why things like augmentation, flip, and other options were created and added to FS.
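The rough idea, as a standalone sketch (this is not Faceswap's actual augmentation code): random flips and small brightness jitter mean the network almost never sees exactly the same pixels twice, which works against memorising individual images.

```python
import numpy as np

# Standalone illustration (not Faceswap's actual augmentation pipeline):
# random flips and small brightness jitter mean the network rarely sees the
# exact same pixels twice, which works against memorising individual images.
def augment(face, rng):
    """Return a randomly perturbed copy of an HxWx3 float image in [0, 1]."""
    if rng.random() < 0.5:                       # random horizontal flip
        face = face[:, ::-1, :]
    face = face * rng.uniform(0.9, 1.1)          # small brightness jitter
    return np.clip(face, 0.0, 1.0)

rng = np.random.default_rng(0)
dummy_face = rng.random((64, 64, 3))             # made-up stand-in for a face crop
print(augment(dummy_face, rng).shape)
```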

Fed
Posts: 16
Joined: Sat Jun 18, 2022 3:31 pm
Been thanked: 1 time

Re: Overtraining question

Post by Fed »

torzdf wrote: Thu Apr 27, 2023 12:39 pm

I have never seen it happen in Faceswap (and I have trained models a VERY long way). That is not to say it isn't a thing, just that I've never hit it (I train with a LOT of data).

bryanlyon wrote: Thu Apr 27, 2023 8:36 pm

The fix for overtraining is giving it new data, but the prevention is to give it that data all along. In other words, by keeping that data in you're keeping the model from overtraining from the start. You should not restrict data for overtraining reasons, just give it all to the model so it can do the best job it can getting you a quality deepfake.

FaceSwap has been thoroughly engineered to minimize the chance of overtraining. This is why things like augmentation, flip, and other options were created and added to FS.

Ah. Good to know.
