Is MP always a VRAM saver / performance enhancer?

Post by **couleurs** » Fri Jan 13, 2023 6:37 am

There appears to be a complex relationship between model structure and how MP will behave.

Some models run significantly faster or with higher batch sizes, but others don't really see any improvement at all.
And in the worst case, I even built a model that ran at batch size 4 on FP but OOMed at that same batch size on MP.

Is it possible to determine how model structure will affect the MP/FP ratio of performance or VRAM use? (without building/running the model, I mean)

I'm at a loss to explain why MP would run slower or use more VRAM than FP in some configurations... has anyone else experienced this? I'm running a 3060, so it should fully support MP afaik.

Post by **bryanlyon** » Fri Jan 13, 2023 6:44 am

Mixed precision will ALWAYS use less VRAM. It will not always be faster, however. That depends on your GPU and system.

If you're on Windows a LOT of things can use your VRAM, even just opening the start menu takes some. In the end, it's very possible that your web browser reserved some memory and caused your MP to OOM.

Post by **couleurs** » Fri Jan 13, 2023 8:55 pm

bryanlyon wrote: ↑Fri Jan 13, 2023 6:44 am
Mixed precision will ALWAYS use less VRAM. It will not always be faster, however. That depends on your GPU and system.

Thank you for the clarification

bryanlyon wrote: ↑Fri Jan 13, 2023 6:44 am
If you're on Windows a LOT of things can use your VRAM, even just opening the start menu takes some. In the end, it's very possible that your web browser reserved some memory and caused your MP to OOM.

I'll make sure to close everything and check idle VRAM to see if I can confirm the inverted MP vs FP usage behavior, but you're probably right
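One way to check idle VRAM before a run is to ask `nvidia-smi` directly. The snippet below is only a sketch: it assumes an NVIDIA driver with `nvidia-smi` on the PATH, and the helper names `parse_memory_csv` and `query_idle_vram` are made up for illustration.

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' lines (values in MiB) into (used, total) tuples, one per GPU."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_idle_vram():
    """Ask nvidia-smi for current memory use on every visible GPU.

    Assumes nvidia-smi is installed and on the PATH; raises if it is not.
    """
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_csv(result.stdout)
```

Calling `query_idle_vram()` right before the FP run and again before the MP run shows whether the two runs really start from the same amount of free memory.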

For estimating the %VRAM saved for any given model going FP->MP, am I correct in reasoning that it's roughly
(16bits/32bits = 0.5) * (#params in MP-compatible layers) / (#params in total layers)
or are there other factors - in the model, not GPU/OS/system-level - that affect it?
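Written out as code, the rough estimate above would look something like this. It only restates the guess in this post, it is not a confirmed formula, and the function name and parameter counts are invented for illustration.

```python
def naive_vram_saving(mp_params, total_params):
    """Fraction of VRAM saved under the naive assumption that every parameter
    in an MP-compatible layer shrinks from 32 bits to 16 bits and nothing
    else in the model changes."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    if not 0 <= mp_params <= total_params:
        raise ValueError("mp_params must be between 0 and total_params")
    bit_ratio = 16 / 32  # the 0.5 factor from the estimate
    return bit_ratio * mp_params / total_params


# Hypothetical model: 8M of 10M parameters sit in MP-compatible layers.
saving = naive_vram_saving(8_000_000, 10_000_000)
print(f"naive estimate: {saving:.0%} of VRAM saved")
```

Under this estimate the saving can never exceed 50%, and it only counts parameters, so activations and optimizer state are left out entirely.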

Post by **bryanlyon** » Mon Jan 16, 2023 12:44 am

Mixed precision unfortunately does not simply halve VRAM use. It uses "mixed" precision, which means some parts are kept in fp32 and some are in fp16. In fact, recent versions of TensorFlow have even moved some operations to TF32, which is a weird Nvidia-only 19-bit float format (see https://blogs.nvidia.com/blog/2020/05/1 ... on-format/ )

In the end, even the TensorFlow version can change the savings. But MP will NEVER make VRAM use larger.
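A back-of-the-envelope sketch of why the saving is not a flat half: it assumes weights stay in fp32 in both modes while activations drop to fp16 under MP, which is how the Keras `mixed_float16` policy is documented to behave. The element counts are made up, and optimizer state, framework workspace, and fragmentation are ignored.

```python
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2}


def estimate_bytes(n_weights, n_activations, mixed):
    """Rough memory estimate for weights plus stored activations.

    Assumption: weights are held in float32 in both modes; only the
    activations switch to float16 when mixed precision is on.
    """
    weight_bytes = n_weights * BYTES_PER_ELEMENT["float32"]
    activation_dtype = "float16" if mixed else "float32"
    activation_bytes = n_activations * BYTES_PER_ELEMENT[activation_dtype]
    return weight_bytes + activation_bytes


# Hypothetical model: 50M weights, 150M activation values per batch.
full = estimate_bytes(50_000_000, 150_000_000, mixed=False)
mixed = estimate_bytes(50_000_000, 150_000_000, mixed=True)
print(f"FP: {full / 1e6:.0f} MB, MP: {mixed / 1e6:.0f} MB, "
      f"saved: {1 - mixed / full:.1%}")
```

In this toy model the saving depends on how much of the footprint is activations rather than weights, so two models with the same parameter count can see quite different FP to MP ratios.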
