[Guide] Training best practices.

[Guide] Training best practices.

Post by bryanlyon »

Training is the most intensive -- and most confusing -- part of Faceswap. Unfortunately, this step is still more art than science, and there is a ton of conflicting information out there. For this reason, we thought we'd collate all of the current "best practices" for using Faceswap and give everyone a basic guide for how to ensure they get the best results.

The following topics are listed roughly in order of importance. This may vary a bit (and individuals may disagree with my subjective ordering), but the advice here is the best we currently know.

Use Multiple Sources

This one is critical: you really need to train from multiple sources. Don't just use one video each for A and B (in this guide we'll also call A "Original" and B "Swap"), since a single video won't give the AI enough information to perform a proper swap. Ignore this advice and the AI won't learn proper color, probably won't see enough data to handle all expressions, and will almost certainly fail to give you a quality result.
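If you script your data prep, pooling the extracted faces is straightforward. Here's a minimal sketch -- the folder names are hypothetical, so point it at wherever your own extractions live, and note it only handles the image files (depending on your Faceswap version you may need to manage alignments data separately):

Code: Select all

# Pool faces extracted from several videos into one training folder
# per identity. Folder names are hypothetical -- adjust to your setup.
from pathlib import Path
import shutil

SOURCES_A = [Path("extract/ford_starwars"), Path("extract/ford_bladerunner")]
TRAIN_A = Path("faces/A")
TRAIN_A.mkdir(parents=True, exist_ok=True)

for src in SOURCES_A:
    for img in src.glob("*.png"):
        # Prefix with the source folder name so faces from different
        # videos can't collide on identical frame numbers.
        shutil.copy2(img, TRAIN_A / f"{src.name}_{img.name}")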

Use the Previews, not Loss or Iterations to decide when you're done

The only reliable way to determine whether the AI will give quality results is to look at the previews. The third column of each set of photos shows the swapped face, letting you compare it against both the original face and its reconstruction. Ideally the swap should show exactly the same detail and quality as the original. It never quite will: the swap is usually slightly blurrier and less detailed, because some degree of loss is inherent to how these AIs work.

If one face's recreation is significantly worse than the other's (judge by the face being put back onto itself, not by the swap), you need to address that. It is a sign of a data problem, and getting more and higher-quality data for the lagging face will be required to get a good result. Even if it's the Original face that is lagging, a significant gap must be fixed or the swap will suffer.

Some people say you should watch for the loss leveling out before you stop training. This is not recommended: loss only measures how well the model recreates the original faces (A to A and B to B) and cannot measure the quality of the swap. Worse, there is something known as "overtraining", which loss won't reveal and which actually leads to worse results as time goes on. The only way to catch it is to use the previews. There is, however, a "treatment" that can help a mildly overtrained model (severely overtrained models are never recoverable): it's called fit training and is covered below.

Use Quality Sources

Youtube videos are quite low quality. They're heavily compressed, usually shot with bad lighting, and NOT recommended unless there is simply no other way to get data. Some Youtube channels are better than others and may offer 4K or well-filmed videos, which can mitigate these problems, but you're still working with a sub-standard data source.

Ideally you'd rip Blu-rays (or possibly very carefully selected DVDs) yourself. You don't have to get 4K quality from these, but you do want the cleanest, best data you can. Ripping it yourself is the only way to guarantee you aren't dealing with substandard material. (Ripping software is outside the scope of Faceswap and we don't officially endorse any, but you can check out https://www.makemkv.com/ as one option.)

If you're filming yourself, try to capture a variety of lighting, expressions, and poses, and do everything you can to avoid motion blur and video noise. For these reasons we recommend using more lighting than you might think you need, a fast shutter speed, and a tripod. If at all possible, film the same scene multiple times with different lighting colors; when that can't be done, expect to spend a lot more time postprocessing the results to get a reasonable output.

Part of this is trusting your source. Don't just remove every image that has a bit of blur or isn't perfect: the AI will encounter those conditions while swapping, so it needs to learn them during training if you want it to be robust later. By all means delete images that are severely misaligned or show no usable face, but leave most of what the extractor gives you. The only exception would be lighting so different that it is hurting your training, in which case you might want to go back to the drawing board on which sources you select.
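If you want help finding the genuinely unusable images without trawling the whole folder, a standard variance-of-the-Laplacian sharpness score can surface the worst candidates for manual review. A rough sketch using OpenCV follows -- the folder path and the cutoff of 20 images are arbitrary; review by eye, don't bulk-delete:

Code: Select all

# Rank extracted faces by variance-of-the-Laplacian sharpness so you
# can review only the worst offenders by hand. Do NOT bulk-delete
# merely blurry faces -- the model needs to see some blur in training.
from pathlib import Path
import cv2

def sharpness(path: Path) -> float:
    """Variance of the Laplacian -- higher means sharper."""
    img = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(img, cv2.CV_64F).var()

scored = sorted((sharpness(p), p) for p in Path("faces/A").glob("*.png"))
for score, path in scored[:20]:  # the 20 least-sharp faces
    print(f"{score:8.1f}  {path.name}")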

Select your Original and Swap faces carefully

Be careful about face shape

You're not going to get good results putting a thin, angular face onto a wide, rounded face; the data just isn't going to give you a good result. For this reason we recommend finding faces that are fairly similar in shape and dimensions (relative, not absolute). Try not to swap a short face for a long one or make excessive changes to the shape. The AI will dutifully swap them, but you won't get a good result without manually painting out the original face.

If you're shooting yourself and can't avoid swapping a smaller face onto a larger one, make sure to capture background plates so you can place the background back over the original face and apply the swap on top of that.

Watch out for skin tones

The AI can and SHOULD adjust for minor differences in skin tone; in these cases the Swap will be recolored to better match the Original. However, there is a limit, and the further the result strays from the Swap's natural skin tone, the worse it will look. My golden rule applies here: don't make the AI do any more than it has to. Try to keep the skin tones similar. Multiple sources with varied lighting will help the AI learn how to change skin tone.

If you are changing skin tone, I think it's very important to leave Color Augmentation ON. You'll get more accurate results.

Avoid beards that go outside the face area

Mustaches and subtle goatees are fine; they'll be swapped just like any other facial detail. Fuller beards, however, really need to be avoided. Anything that extends beyond the face area will only be swapped inside the face area, and even if you're training on two people with similar beards, they won't line up after the swap. There is unfortunately no way around this at this time, so we highly recommend avoiding any sources with beards.

Use Fit Training

Fit training is a technique where you first train your model on data that it won't see in the final swap, then do a short "fit" train with the actual video you're swapping in order to get the best results. This technique eliminates the danger of "overtraining" and produces clearer final results. To use it, you MUST follow the advice in the first section about getting multiple sources; it's impossible to fit train from a single source of data.

An example: if you were going to swap Nick Cage onto Harrison Ford as Indiana Jones, you'd first train on data from (for example) Star Wars and Blade Runner (and maybe a scene or two of Indiana Jones that you're not swapping, ideally from a different movie). You'd continue until the swap in the previews looked good, then stop that training and give the model the actual video you're planning to swap. This second phase runs for a much shorter time (less than 10% of the original training) and should pick up the subtleties of the new scene. Changing the Swap data for fit training is not required, but it is recommended to avoid the same overtraining problems.
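In CLI terms, the two phases might look like the sketch below. The -A/-B/-m flags are from the faceswap train command, but verify them against `python faceswap.py train -h` on your install; the paths and model name are hypothetical:

Code: Select all

# Two-phase fit training as faceswap CLI calls. Verify flag names with
# `python faceswap.py train -h`; paths and model name are hypothetical.
import subprocess

MODEL = "model/cage_to_ford"

# Phase 1: long, general training on footage you will NOT be swapping.
subprocess.run(["python", "faceswap.py", "train",
                "-A", "faces/ford_general", "-B", "faces/cage",
                "-m", MODEL], check=True)

# Phase 2: short fit train -- A is replaced with faces from the actual
# target scene. Stop at roughly <10% of the phase-1 iterations.
subprocess.run(["python", "faceswap.py", "train",
                "-A", "faces/ford_target_scene", "-B", "faces/cage",
                "-m", MODEL], check=True)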

This works even for a reused model with the same Original and Swap faces, since you can run multiple sessions of fit training on one model for different scenes. If you decide to do this, I recommend keeping a backup before each fit-training session so you can return to the model that learned the swap best.
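A quick way to keep those backups is to snapshot the model folder before each session, e.g.:

Code: Select all

# Snapshot the model folder before each fit-training session so you
# can roll back to whichever copy learned the swap best.
import shutil
import time
from pathlib import Path

def backup_model(model_dir: str = "model/cage_to_ford") -> Path:
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(f"{model_dir}_backup_{stamp}")
    shutil.copytree(model_dir, dest)
    return dest

backup_model()  # run before pointing the trainer at a new scene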

Don't Pretrain or Reuse models

You may have seen pretraining, or reusing the same model for different faces, proposed by other users; unfortunately, doing so will only give worse results than a properly trained model. The benefit of pretraining or reusing a model is that it can speed up training, but it comes at the cost of "identity leakage", where the output simply doesn't match the Swap's face properly. The AI continuously refines the face it has, but it is incentivized only to ADD to its knowledge, so no matter whether you retrain a model that held different faces or use a pretrained model, you will get odd face mixing that taints your results and makes the output look like a different person.

There is a "safe" way to do pretraining/model reuse that avoids this particular drawback, unfortunately it requires a significant effort to get it working properly. To do this, you want to start a new model that matches the pretrained model, then replace the Encoder with the pretrained encoder. Unfortunately, this is not a perfect solution and will still take longer than a full pretrained model, but it does prevent the identity leakage situation.

EDIT: This is now officially supported by using the "Load Weights" option. Use of this option is detailed in the Training guide here: viewtopic.php?t=146#:~:text=Load%20Weights

Don't use HDR videos

HDR videos use a non-linear colorspace, so the model won't be able to learn the face properly: the colors and lighting will be unreliable.

It's important to know that MOST, but not ALL, 4K movies are HDR. There is also no way to reliably un-HDR a movie for faceswapping (at least without manually regrading the entire source). However, you can generally find a non-HDR version by getting the non-4K Blu-ray, which generally won't have HDR.
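You can check a candidate source for HDR before extracting by using ffprobe (part of FFmpeg). PQ ("smpte2084") and HLG ("arib-std-b67") transfer characteristics are the usual HDR signatures; plain SDR sources typically report "bt709". A small sketch, with a hypothetical filename:

Code: Select all

# Check a candidate source for HDR with ffprobe before extracting.
import json
import subprocess

def is_probably_hdr(video: str) -> bool:
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=color_transfer", "-of", "json", video],
        capture_output=True, text=True, check=True).stdout
    transfer = json.loads(out)["streams"][0].get("color_transfer", "")
    # PQ (smpte2084) and HLG (arib-std-b67) are the usual HDR transfers;
    # plain SDR sources typically report bt709.
    return transfer in ("smpte2084", "arib-std-b67")

print(is_probably_hdr("my_source.mkv"))  # hypothetical filename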

Last edited by bryanlyon on Mon May 16, 2022 5:54 am, edited 1 time in total.
Reason: Added link for Load Weights

How to do fit training?

Post by congo »

Hi everybody,

The best practices guide recommends using fit training. How exactly is this done?
After training the initial model, do I need to ADD the faces of the actual swap video (i.e. copy those faces in with the initial ones and merge the alignments files), or do I need to REPLACE the faces (continue training using only the faces of the swap video and the corresponding alignments file)?
And what does "Changing the Swap data for Fit training is not required but recommended" mean?


Re: [Tip] Training best practices.

Post by bryanlyon »

Fit training is replacing all your data for A with the set you're actually going to swap. Changing B is unnecessary but can also help avoid overtraining problems.


Re: [Tip] Training best practices.

Post by abigflea »

A clarifying question:
So I train the A & B sets for 100,000 iterations and feel it's pretty good.
Then, using that same model, I change the A data set to one that contains more of what I'm actually going to be swapping, and less of all the other sources?
Then train that for another 5-10K iterations and see?

:o I dunno what I'm doing :shock:
2x RTX 3090 : RTX 3080 : RTX 2060 : 2x RTX 2080 Super : Ghetto 1060


Re: [Tip] Training best practices.

Post by torzdf »

That sounds about right.

My word is final


Re: [Tip] Training best practices.

Post by AaronLaw »

Dear Faceswap,

A question about training models:

For example, I used Nicholas Cage's face to replace Harrison Ford's in footage from one of his movies. After training, I obtained the model file.

A few days later, I found footage from another Harrison Ford movie. Can I use the model I trained before to train on Harrison Ford's face in this new movie?

Can a model reused this way accelerate the learning of other Harrison Ford footage?

Looking forward to your response, thank you.


Re: [Tip] Training best practices.

Post by bryanlyon »

Models are 1:1, identity to identity -- in your example, Cage:Ford. Using that model you can always convert Harrison Ford into Nicholas Cage, but it will always work best when the actor looks the same as the one you originally trained on, i.e. old Harrison Ford is probably different from young Harrison Ford.


Re: [Tip] Training best practices.

Post by liucfy »

bryanlyon wrote: Sun Feb 09, 2020 10:34 pm

Fit training is replacing all your data for A with the set you're actually going to swap. Changing B is unnecessary but can also help avoid overtraining problems.

When I was using the Dlight model, I found that the B face seemed to train faster.
But what I actually want is to replace the A face with the B face.
Although I know the "Swap Model" option can be used, it seems to be slower,
so I switched A and B (at about 200K iterations). I know I made a mistake...
Now it's at 500K, and there seems to be no problem.

Dlight Model:
Iterations 598K


I want to know: if I continue to train, will the model slowly forget those 200K iterations?

The result of the conversion seems okay, but if I continue training like this, will it become overtrained?

The loss seems too high... although I know this number isn't very meaningful.


The source quality of the pictures looks very important.

512px



Re: [Guide] Training best practices.

Post by tokafondo »

bryanlyon wrote: Fri Aug 23, 2019 6:26 pm

Don't Pretrain or Reuse models

[...]
There is a "safe" way to do pretraining/model reuse that avoids this particular drawback, unfortunately it requires a significant effort to get it working properly. To do this, you want to start a new model that matches the pretrained model, then replace the Encoder with the pretrained encoder. Unfortunately, this is not a perfect solution and will still take longer than a full pretrained model, but it does prevent the identity leakage situation.
[...]

I would like to test this. I have found an HDF file explorer/editor, and I would like to know which files/folders I should transfer from one model to the other to make this work. Thanks for any replies.


Re: [Guide] Training best practices.

Post by jpebcac »

I have to admit, this has me very confused.

For Model A (the face I want to put on Model B), I took several good-quality videos I have and turned them into about 4k shots, merged together as shown in another area. So now I have all of those in one folder as a set.

For the target I want to move the face to, I took four videos that person was in, extracted the faces, merged them, and that became my Model B set.

I set up the trainer (Villain, mostly) to go A-B and just let it run. I get, hmm, fair results on all videos after a bit.

The only part I'm really struggling with comes later, in the convert: even though I extract every single frame, it still misses or misaligns a few, and I'm having to go back and manually fix a lot of frames. But that is at least time I have.

This seems to be saying I'm doing it all wrong.


is it worth trying with another race?

Post by Erick Lestrange »

New to the GUI, etc. Before starting extensive training, I was wondering whether it would be worth putting Obama or someone of another race onto my white face, and how realistic or awkward it would look?


Re: [Guide] Training best practices.

Post by bryanlyon »

Generally, it's not worth it; it's much better to find an actor who matches more closely. If that's simply not possible, you may be able to do it: as noted in the main post, get lots of data and leave color augmentation on.
