My final RealFace model settings.
I hope this helps someone.
My main use of FaceSwap has always been PHOTO swaps rather than VIDEO swaps, since:
1) They are easy to do & get completed quickly
2) My current hardware is limited (I train on the i5 CPU of a 7-year-old laptop)
I have successfully done basic Photoshop-style edits through FaceSwap, such as changing
skin tone or increasing skin brightness.
There are many more fun things, such as adding/removing a beard, swapping a face
onto a baby or an old man etc., which I have done semi-perfectly.
UPDATE
The last time I shared the 5 settings here, I was not happy with them, as they were based on pure
guesswork without any logic. They did give me good results in swapping, but it seems that almost any value
you increase from the DEFAULT RealFace model values will give good results.
However, as [mention]torzdf[/mention] said above:
"Generally higher = better, but that is a massive over-simplification and is only true to a point"
So I want to increase the values only up to the point where they actually improve quality.
I don't want to go so far that it only increases my training time & not the quality.
Then I thought the best person to ask about the RealFace model settings would be
[mention]andenixa[/mention] (creator of the RealFace model), so I sent him a PM,
but it looks like he is a very busy person & is probably never going to reply.
So this time, the settings I want to share are calculated on the basis of the RATIOS
used in the DEFAULT setting values.
Default Values of RealFace Model
Input size = 64
Output size = 128
Dense Nodes = 1536
Complexity Encoder = 128
Complexity Decoder = 512
Calculating the Ratios
Output size = 2 x Input size,
so by this ratio:
Input size = 128 when you want Output size = 256
Input size = 96 when you want Output size = 192
Input size = 80 when you want Output size = 160
But as you can read above, [mention]bryanlyon[/mention] said he would not recommend increasing the
input size, so it's up to you whether to follow this Input/Output size ratio.
Complexity Decoder = 512 = 4 x 128 = 4 x Complexity Encoder
and
Dense Nodes = 1536 = 3 x 512 = 3 x Complexity Decoder
also
Dense Nodes = 1536 = 12 x 128 = 12 x Complexity Encoder
The maximum value you can take the Complexity Decoder to is 544.
So, if we set the new Complexity Decoder = 544, then:
new Complexity Encoder = new Complexity Decoder / 4 = 544 / 4 = 136
new Dense Nodes = 3 x new Complexity Decoder = 3 x 544 = 1632
and consistently,
new Dense Nodes = 12 x new Complexity Encoder = 12 x 136 = 1632
So, to sum up, our new values would be:
New Values of RealFace model
Input size = 64 (up to you, whatever size you want)
Output size = 128 (up to you, whatever size you want)
Dense Nodes = 1632
Complexity Encoder = 136 (also falls in the sensible range of 128 to 150 as said in the tooltip)
Complexity Decoder = 544
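The ratio arithmetic above can be sketched as a small script. This is just my own illustration (the function name and dictionary keys are mine, not FaceSwap's config names); it derives the encoder & dense-node values from any decoder value while keeping the default 4:1 and 3:1 ratios:

```python
# Sketch: scale the RealFace complexity settings while preserving the
# default ratios (Decoder = 4 x Encoder, Dense Nodes = 3 x Decoder).
# The 544 cap on the decoder comes from the discussion above.

def scaled_realface_settings(complexity_decoder: int = 544) -> dict:
    """Return settings derived from the default RealFace ratios."""
    assert complexity_decoder % 4 == 0, "decoder must divide evenly by 4"
    complexity_encoder = complexity_decoder // 4  # default: 512 / 4 = 128
    dense_nodes = 3 * complexity_decoder          # default: 3 x 512 = 1536
    return {
        "complexity_encoder": complexity_encoder,
        "complexity_decoder": complexity_decoder,
        "dense_nodes": dense_nodes,
    }

print(scaled_realface_settings(544))
# {'complexity_encoder': 136, 'complexity_decoder': 544, 'dense_nodes': 1632}
```

Plugging in the default decoder value of 512 reproduces the default 128 / 1536, which is a quick sanity check that the ratios are right.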
To be honest, I still don't know if the above settings are GOOD compared to the ORIGINAL settings,
but at least this time I have some logical reasons for these values, although it is still a logical
guess & not TRUE KNOWLEDGE.
I use FACE in the "face centering" setting with 80% coverage, which is why I have never faced the double-eyebrow issue.
During all my training, the Eye & Mouth multipliers were set to 1 & the learning rate was 6.5e.
You can use the default learning rate, but set the Eye & Mouth multipliers to 1, as I think it
helps in getting greater skin detail.
I got a lot of information from this forum, but now I will remain inactive for some time.
I have to complete my PET PROJECT of making a MEGA PRE-TRAINED RealFace model, for
which I have already collected pictures of more than 2000 different people from
more than 25 different countries
(India, China, Japan, S. Korea, Thailand, US, UK, Canada, France, Russia, Germany, UAE ...etc).
I personally call it "The World Face Library".
Each person has only one image, but the images cover different lighting
conditions, facial expressions, angles & features, and more than 85%
of them are in HD to SuperHD quality.
I want my model to MASTER each of these faces to become a super pre-trained model.
By "mastering" I mean: whenever I have to master a face, I make 25 copies of that
single image, set that folder as the common location for both the Train A & B inputs, & train
until the swap preview becomes exactly like the original image.
I use 25 copies because that is the minimum number of images FaceSwap needs to start training.
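The "25 copies" step above is easy to automate. Here is a small sketch of how I would do it; the paths and file naming are placeholders, not a fixed layout:

```python
# Sketch of the "25 copies" trick: duplicate one face image enough times
# to meet FaceSwap's minimum image count, into a folder that can be used
# as both the Train A & Train B input. Paths here are placeholders.
import shutil
from pathlib import Path


def make_copies(src_image: str, dest_dir: str, n: int = 25) -> list:
    """Copy src_image into dest_dir n times with numbered names."""
    src = Path(src_image)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copies = []
    for i in range(n):
        target = dest / f"{src.stem}_{i:02d}{src.suffix}"
        shutil.copy2(src, target)  # copy2 preserves file metadata
        copies.append(target)
    return copies
```

You would then point both the A and B input folders at `dest_dir` and train until the preview matches the original.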
People can argue that pre-trained models are bad because they can lead to identity bleed,
but as far as I have seen with the RealFace model, I mastered 25 different faces
at 64px in/out & used that model for other photo swaps & never had an identity-bleed
issue. Instead, it helped me train FASTER & fill the gaps when I had an incomplete
data set. The only thing you have to understand is to let it train for a long time; the
model can then remove all the identity bleed (in most cases, given a good variety in the data set).
If I can own a powerful machine in the near future & am able to complete this project,
I would love to share that pre-trained model here & on other forums, but it looks like that
is not going to happen in the next 2-3 years.
Thank you everyone here