After training with dfl-H128 for 1,000,000 iterations, the model performed well when the subject's face was distant from the camera, but performed terribly when the face was close to the camera. I am curious why this is the case, and I am currently trying to switch to the Dlight model to train a higher-resolution model. Can this improve the situation? Does training a higher-resolution model become necessary when the face is relatively close to the camera? Additionally, what is the impact of the output size on the transformation results?
Re: what is the impact of the output size on the transformation results?
Short answer is, yes. Higher resolution will perform better on close-up shots. If you have a 1080p image (1920 x 1080 pixels) and a model trained with an output size of 128px (for example), then when the face completely fills the frame, that 128px output needs to be enlarged more than 8x. That is not going to give you a great result on almost any image which needs to be blown up that much.
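To make the arithmetic concrete, here is a back-of-envelope sketch (the function name and numbers are illustrative, not part of Faceswap itself) of the upscale factor needed to paste a model's output back over a face in the frame:

```python
# Rough upscale factor when a swapped face is resized back into the frame.
# Assumes the face fills the frame vertically (1080px tall in a 1080p clip).
def upscale_factor(face_height_px: int, model_output_px: int) -> float:
    """Ratio by which the model's output must be enlarged to fit the face."""
    return face_height_px / model_output_px

# A face filling a 1080p frame vs. a 128px model output:
print(upscale_factor(1080, 128))  # 8.4375 -> the output is blown up ~8.4x
# The same face with a 256px model output:
print(upscale_factor(1080, 256))  # 4.21875 -> roughly half the enlargement
```

The further that ratio climbs above 1, the softer and blurrier the pasted-in face will look, which is why close-ups expose a low-resolution model so badly.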
There should be no impact on the transformation results beyond a better final image, so the question becomes: why don't we just always train at higher-res? The answer to that is VRAM and time, pure and simple.
All things being equal, every doubling of resolution roughly quadruples the VRAM required to train the model and quadruples the training time, which, more often than not, makes it unrealistic for the end user to train. However, if you can get to 256px (depending on the model), you should get more satisfactory (though still not perfect) results. The below example is trained at 256px on Phaze-A: