Mixed Precision Questions

363LS2GTO
Posts: 30
Joined: Fri Jul 08, 2022 7:06 pm
Has thanked: 1 time
Been thanked: 4 times

Mixed Precision Questions

Post by 363LS2GTO »

I have read that mixed precision will free some VRAM so I can run a larger batch size. I have also read that it can cause errors.

I have read varying information on its usefulness depending upon the graphics card used.

I have also read some conflicting information on whether it can be enabled or disabled during training. Which is correct?

My RTX 3050 is a little low on VRAM at 8GB, but it has the same compute capability as all of the other 30xx cards (8.6). Is mixed precision recommended for these cards because of their high compute capability?

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: Mixed Precision Questions

Post by torzdf »

Mixed precision will always save you VRAM, and it can be enabled on any Nvidia card. You will only get additional speed benefits on a 20xx card or newer (so your 30xx card will see those speed benefits too).
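As a rough illustration (not Faceswap's own code), this is approximately what enabling mixed precision looks like at the TensorFlow/Keras level. The toy model below is purely hypothetical; it just shows that computations drop to float16 while the weights stay in float32:

```python
import tensorflow as tf
from tensorflow.keras import layers, mixed_precision

# "mixed_float16" runs layer computations in float16 but keeps the variables
# (weights) in float32. The VRAM saving applies on any Nvidia card; the extra
# speed comes from Tensor Cores, which need compute capability 7.0+ (20xx and newer).
mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    layers.Dense(256, activation="relu", input_shape=(128,)),
    layers.Dense(128),
    # Keep the final activation in float32 to avoid precision issues in the loss.
    layers.Activation("linear", dtype="float32"),
])

print(model.layers[0].compute_dtype)  # float16 -> computations
print(model.layers[0].dtype)          # float32 -> variables
```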

You can switch between full precision and mixed precision on any existing model. This is a relatively new addition to Faceswap: there is no fundamental reason why you can't switch precision, but Tensorflow does not allow it out of the box, so I had to implement our own solution. In the past, once you chose a precision you were stuck with it.
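Faceswap's own switching logic is more involved, but here is a minimal sketch of why switching is even possible: under Keras' mixed_float16 policy the variables are still stored in float32, so weights saved from a full-precision model can be loaded straight into a mixed-precision build of the same architecture. The tiny model and file name below are just placeholders:

```python
import tensorflow as tf
from tensorflow.keras import layers, mixed_precision

def build_model():
    # Tiny stand-in network; the principle is the same for any Keras model.
    return tf.keras.Sequential([
        layers.Dense(64, activation="relu", input_shape=(32,)),
        layers.Dense(32),
        layers.Activation("linear", dtype="float32"),
    ])

# Weights saved from a full-precision model ...
mixed_precision.set_global_policy("float32")
fp32_model = build_model()
fp32_model.save_weights("checkpoint.weights.h5")

# ... can be loaded straight into the same architecture built under mixed
# precision, because mixed_float16 still stores every variable in float32 and
# only casts the computations to float16.
mixed_precision.set_global_policy("mixed_float16")
mp_model = build_model()
mp_model.load_weights("checkpoint.weights.h5")
print([w.dtype for w in mp_model.weights])  # all float32
```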

Mixed precision can increase the risk of numerical instability, that is, NaNs appearing in the model. This will not destroy your model, but it can be very frustrating to have to roll back 50k iterations and lower the learning rate. The cause is that the numerical range of fp16 (the format mixed precision does its calculations in) is much smaller than the numerical range of fp32 (the format used by full precision). Tensorflow implements something called "dynamic loss scaling" which should mitigate this, but from experience it is not perfect. In fact, if you Google "mixed precision NaN" you will find numerous posts across multiple ML libraries reporting this issue.
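To make the range difference concrete, here is a small, purely illustrative sketch of fp16's limits and of how TensorFlow's loss-scale optimizer is set up (the Adam learning rate is just a placeholder value):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import mixed_precision

# float16 tops out at ~65504; float32 goes up to ~3.4e38.
print(np.finfo(np.float16).max)   # 65504.0
print(np.finfo(np.float32).max)   # 3.4028235e+38

# A value that is perfectly representable in float32 overflows to inf in
# float16 -- and an inf in an activation or gradient quickly becomes NaN.
print(np.float16(np.float32(1e5)))  # inf

# Dynamic loss scaling is TensorFlow's mitigation: the loss is multiplied by a
# scale factor before backprop, the gradients are divided by it afterwards, and
# the scale is lowered automatically whenever inf/NaN gradients are detected.
optimizer = mixed_precision.LossScaleOptimizer(tf.keras.optimizers.Adam(5e-5))
print(optimizer.dynamic)             # True
```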

The bottom line, though, is that for some models you will have no choice but to enable mixed precision just to fit the model into VRAM. You will get speed benefits from it, but you may need to learn to live with the drawbacks.

My word is final

363LS2GTO
Posts: 30
Joined: Fri Jul 08, 2022 7:06 pm
Has thanked: 1 time
Been thanked: 4 times

Re: Mixed Precision Questions

Post by 363LS2GTO »

Thank you for the reply. That answers my questions.

I will stick with full precision as long as I can train at a batch size of 10-16, and try mixed precision once I get into single-digit batch sizes.
