While training a model, which is the main driver of VRAM usage: input size or output size?
For example, I want to train a RealFace model. Which of the following cases will use more VRAM?
Case A) Input size = 64px, Output size = 128px
Case B) Input size = 128px, Output size = 64px
And is the answer the same or different for other models?
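For reference, here is a rough sketch of how large the input and output image tensors themselves would be in the two cases. The 3 colour channels, float32 values, and batch size of 1 are my assumptions, not settings taken from RealFace, and `tensor_bytes` is just a helper name I made up. It only counts the two image tensors, not the intermediate activations, weights, or optimizer state inside the network, which is the part I am unsure about.

```python
def tensor_bytes(side_px, channels=3, bytes_per_value=4, batch_size=1):
    """Size in bytes of one square image tensor (assumed RGB, float32)."""
    return batch_size * channels * side_px * side_px * bytes_per_value


# (input size in px, output size in px) for the two cases in the question
cases = {"A": (64, 128), "B": (128, 64)}

io_bytes = {}
for name, (input_px, output_px) in cases.items():
    in_b = tensor_bytes(input_px)
    out_b = tensor_bytes(output_px)
    io_bytes[name] = in_b + out_b
    print(f"Case {name}: input {in_b} B + output {out_b} B = {io_bytes[name]} B")
```

Under these assumptions both cases come to the same total for the image tensors alone, so any difference in VRAM usage would have to come from what the model does between input and output.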