what is the impact of the output size on the transformation results?

Want to understand the training process better? Got tips for which model to use and when? This is the place for you


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for discussing tips and understanding the process involved with Training a Faceswap model.

If you have found a bug are having issues with the Training process not working, then you should post in the Training Support forum.

Please mark any answers that fixed your problems so others can find the solutions.

Locked
xzx_0018
Posts: 1
Joined: Sat Apr 08, 2023 9:44 am

what is the impact of the output size on the transformation results?

Post by xzx_0018 »

After training dfl-H128 for 1,000,000 iterations, the model performed well when the subject's face was far from the camera, but performed terribly when the face was close to the camera. I am curious why this is the case. I am currently trying to switch to the Dlight model to train a higher-resolution model. Can this improve the situation? Does training a higher-resolution model become necessary when the face is relatively close to the camera? Additionally, what is the impact of the output size on the transformation results?

torzdf
Posts: 2681
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 133 times
Been thanked: 625 times

Re: what is the impact of the output size on the transformation results?

Post by torzdf »

xzx_0018 wrote: Sat Apr 08, 2023 9:54 am

I am currently trying to switch to the Dlight model to train a higher-resolution model. Can this improve the situation? Does training a higher-resolution model become necessary when the face is relatively close to the camera?

Short answer is, yes. Higher resolution will perform better on close-up shots. If you have a 1080p frame (1920 x 1080 pixels) and a model trained with an output size of 128px (for example), then when the face completely fills the frame, that 128px image needs to be enlarged almost 10x. That is not going to give you a great result on almost any image which needs to be blown up that much.
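To make that concrete, here is a quick back-of-the-envelope sketch (the resolutions are illustrative) showing how much a model's output face must be enlarged when the swapped face fills the height of a 1080p frame:

```python
# Enlargement factor needed for a model's output face to fill
# the height of a 1080p frame. Illustrative numbers only.
frame_height = 1080  # 1080p frame (1920 x 1080 pixels)

for output_size in (128, 256, 512):
    scale = frame_height / output_size
    print(f"{output_size}px output -> enlarged ~{scale:.1f}x to fill the frame")
```

At 128px the face is stretched roughly 8.4x, which is why close-ups look soft; at 256px it is only about 4.2x.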

xzx_0018 wrote: Sat Apr 08, 2023 9:54 am

Additionally, what is the impact of the output size on the transformation results?

There should be no impact on the transformation results beyond a better final image, so the question becomes: why don't we just always train at higher-res? The answer to that is VRAM and time, pure and simple.

All things being equal, for every doubling of resolution, you quadruple the amount of VRAM required to train that model, and quadruple the amount of time required to train it, which, more often than not, becomes unrealistic for the end user. However, if you can get to 256px (depending on the model) you should get more satisfactory (though still not perfect) results. The below example is trained at 256px on Phaze-A:
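The scaling rule above can be sketched numerically. The baseline VRAM and time figures below are made-up placeholders, not measurements; the point is the 4x growth per doubling (pixel count grows with the square of the resolution):

```python
# Sketch of the rule: doubling resolution roughly quadruples VRAM and
# training time, because pixel count grows with the square of resolution.
base_res = 128
base_vram_gb = 4.0   # hypothetical baseline VRAM
base_days = 7.0      # hypothetical baseline training time

for doublings in range(3):
    res = base_res * 2 ** doublings
    factor = 4 ** doublings  # (2x resolution)^2 = 4x pixels
    print(f"{res}px: ~{base_vram_gb * factor:.0f} GB VRAM, "
          f"~{base_days * factor:.0f} days")
```

So going from 128px to 512px means roughly 16x the VRAM and 16x the training time, which is why "just train higher-res" is rarely practical.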

My word is final
