Model collapse while Training Phaze-A

hyoer
Posts: 1
Joined: Sat Jun 19, 2021 2:22 am

Model collapse while Training Phaze-A

Post by hyoer »

Hi,

I am trying to train Phaze-A on a decent amount of data (2k B images) using the StoJo preset, a batch size of 8, mixed precision enabled, and default settings for everything else, but something weird is going on. The output images are always a pale bluish color. I only managed to train the model for 10k steps before it got corrupted. Is that expected for Phaze-A?

I also tried training Villain on the exact same dataset, and it worked, so I don't think the problem is with the dataset.

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: Difficulty Training Phaze-A

Post by bryanlyon »

Unfortunately, a model can collapse at any time. I'd suggest trying again and, if you still have problems, reducing your Learning Rate.

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: Model collapse while Training Phaze-A

Post by torzdf »

Yeah, basically reduce learning rate. More complex models need lower learning rates.

If you are using Mixed Precision with the StoJo preset, then I would also suggest raising the Epsilon Exponent to -5 (that model has a nasty habit of hitting NaNs with Mixed Precision otherwise, which is INCREDIBLY frustrating).
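
To give a rough feel for what the Epsilon Exponent suggestion changes, here is a minimal sketch of a single textbook Adam update in plain Python. It assumes the setting maps to epsilon = 10 ** exponent in an Adam-style optimizer, and it compares -7 against the suggested -5. The function name, the learning rate of 5e-5, the -7 comparison value, and the gradient values are all illustrative placeholders, not values taken from Faceswap's code.

```python
import math


def adam_first_step(grad, lr=5e-5, eps=1e-7):
    """Size of the first bias-corrected Adam update for one weight.

    Textbook Adam only; this is not Faceswap's optimizer code.
    """
    beta1, beta2 = 0.9, 0.999
    # Moment estimates after one step, starting from zero.
    m = (1 - beta1) * grad
    v = (1 - beta2) * grad * grad
    # Bias correction for step 1.
    m_hat = m / (1 - beta1)
    v_hat = v / (1 - beta2)
    # Epsilon sits in the denominator, so it matters most
    # when sqrt(v_hat) is of similar size or smaller.
    return lr * m_hat / (math.sqrt(v_hat) + eps)


# Assumption: "Epsilon Exponent" n means epsilon = 10 ** n.
for exponent in (-7, -5):
    eps = 10.0 ** exponent
    for grad in (1e-3, 1e-8):
        step = adam_first_step(grad, eps=eps)
        print(f"exponent={exponent} grad={grad:g} step={step:.3e}")
```

For an ordinary-sized gradient the two settings give almost the same step, while for a very small gradient the larger epsilon shrinks the step considerably. That damping of updates driven by tiny gradient estimates is one plausible reason the change helps stability. It is not a confirmed explanation of the NaNs seen with this preset.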
