Model collapse while Training Phaze-A

Posted: Sun Jun 20, 2021 3:27 pm
by hyoer

Hi,

I am trying to train Phaze-A with a decent amount of data (2k B images), using the StoJo preset, a batch size of 8, mixed precision enabled, and default settings for everything else, but something weird is going on. The output images are always pale bluish. I only managed to train the model for 10k steps before it got corrupted. Is that expected for Phaze-A?

I also tried training Villain using the exact same dataset, and it worked, so I don't think the problem is with the dataset.


Re: Difficulty Training Phaze-A

Posted: Mon Jun 21, 2021 3:05 am
by bryanlyon

Unfortunately a model can collapse at any time. I'd suggest trying again, and if you still have problems, try reducing your Learning Rate.


Re: Model collapse while Training Phaze-A

Posted: Sat Jun 26, 2021 9:30 am
by torzdf

Yeah, basically reduce learning rate. More complex models need lower learning rates.

If you are using Mixed Precision with the StoJo preset, then I would also suggest raising the Epsilon Exponent to -5. That model has a nasty habit of hitting NaNs with Mixed Precision otherwise, which is INCREDIBLY frustrating.
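
As a rough illustration of why a larger epsilon can matter under Mixed Precision, here is a minimal sketch that rounds a few epsilon values to half precision (float16). It assumes the Epsilon Exponent setting is used as 10 ** exponent, which is not confirmed in this thread, and the helper names `to_float16` and `epsilon_from_exponent` are made up for the example. It only shows how coarse float16 is for tiny values; it does not reproduce what the trainer does internally.

```python
import struct


def to_float16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The "e" format code packs a float as a 2-byte half-precision number.
    return struct.unpack("<e", struct.pack("<e", value))[0]


def epsilon_from_exponent(exponent: int) -> float:
    """Hypothetical mapping from an exponent setting to an epsilon value."""
    # Assumption: the setting is interpreted as 10 ** exponent.
    return 10.0 ** exponent


for exponent in (-8, -7, -5):
    eps = epsilon_from_exponent(exponent)
    eps16 = to_float16(eps)
    # Relative error introduced by storing epsilon in half precision.
    rel_err = abs(eps16 - eps) / eps
    print(f"exponent {exponent}: {eps:.3e} -> float16 {eps16:.3e} "
          f"(relative error {rel_err:.1%})")
```

In half precision, 1e-8 rounds to zero and 1e-7 lands on a coarse subnormal value, while 1e-5 is stored with well under 1% error. If any part of the computation ends up dividing by a value of that size in float16, a larger epsilon leaves more headroom before a division by (nearly) zero turns into an Inf or NaN.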