Can I get a little advice on how to avoid complete potato faces?

Post by SirDeepsALot »

I think I'm about 2 months into using this program now, with zero successful faceswaps to show for it. I'm really hoping a little advice can help solve my major problem, which is what I lovingly refer to as 'complete potato faces' during training.

Initially I thought the issue was the extraction / mask capture, so I kept trying different settings and variations. Then it became clear that the automatic capture of the facial area was often radically off, so weeks of training were wasted on bad masks. I spent weeks after that editing mask captures, thinking that would fix it, which is when it became clear that maybe the videos I was trying to use were the issue: too dark, frequent movement, poor angles...

So for the last few weeks I have instead been editing all the source videos to only include frames that Faceswap can easily create a mask for automatically. If Faceswap struggles with any part of a video, I edit it down and re-enter the source. So rather than editing via Faceswap, I'm editing the source to make Faceswap's job, theoretically, easy.

Still, in training I seem to be getting a good chunk of potato faces, even though every mask / face captured during the extraction process was made perfectly by Faceswap with zero need for manual mask editing on my part.

I thought maybe the issue was the Dfaker trainer. I went through the forums and Realface seemed to be getting good reviews, so I tried that, but my system apparently can't handle Realface with anything else open at the same time, not even a web browser. Realface either won't start at all or crashes partway through training when I try to do work on anything else. So now I'm trying Unbalanced as my trainer, but still the potatoes persist. For instance, look at this preview frame: why is it trying to paste a profile shot over the top of a frontal video?

Any advice on settings, trainer type, etc. that I should be using would be hugely appreciated. I am using high-quality videos, and both the source and the swap video are almost exclusively edited down to only full frontal faces.

Post by torzdf »

Your data is nowhere near varied enough.

Good (varied) data:

[Image: varied.jpg]

Bad (unvaried) data:

[Image: unvaried.jpg]

You can only get away with very unvaried data if you have complete control over every aspect of pose, lighting, etc., and you know how to control it.

The models feed on and learn from highly varied data. They cannot imagine what they have not seen.
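
One rough way to sanity-check variety before training is to bucket your extracted faces by head angle and look for empty buckets. The snippet below is only a sketch and is not part of Faceswap: it assumes you already have a yaw estimate in degrees for each face from some pose estimator, and the names `yaw_coverage` and `empty_buckets` are made up for illustration. The same idea would apply to pitch or brightness.

```python
from collections import Counter


def yaw_coverage(yaws_deg, bin_width=15, limit=90):
    """Count faces per yaw bucket between -limit and +limit degrees.

    yaws_deg is a hypothetical list of per-face yaw estimates in degrees.
    Values outside the range are clamped into the outermost buckets.
    """
    counts = Counter()
    for yaw in yaws_deg:
        # Clamp so that +limit falls into the last bucket, not past it.
        clamped = max(-limit, min(limit - 1e-9, yaw))
        start = int((clamped + limit) // bin_width) * bin_width - limit
        counts[start] += 1
    # Report every bucket, including those with no faces at all.
    return {b: counts.get(b, 0) for b in range(-limit, limit, bin_width)}


def empty_buckets(coverage):
    """Return the bucket start angles that contain no faces."""
    return [b for b, n in coverage.items() if n == 0]


if __name__ == "__main__":
    # A face set cut down to frontal shots only: almost every bucket is empty.
    frontal_only = [-3.0, 0.5, 1.2, 2.0, 4.1]
    print(empty_buckets(yaw_coverage(frontal_only)))
```

A set edited down to frontal faces only will show nearly every bucket empty, which is the "unvaried" case described above.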

My word is final

Post by SirDeepsALot »

Are you saying to just do a data dump into the A and B output folders even if I'm only going to use a fraction of that source material in the conversion process?

E.g., all I want to do is create a 30-second video, faceswapping Actress B onto Actress A. So you're saying to include the 30-second video in the data dump, but that it should not be the only source material I use for extraction or training? I need to do a data dump for extraction, including the 30-second video I plan to eventually convert, use that extracted data dump for training, and then select only the 30-second video as the "reference video" for conversion?

If that's the case, why is everyone against using training presets, like an Arnold Schwarzenegger pre-trained model that you then add your own Arnold Schwarzenegger video extractions to?

Post by torzdf »

SirDeepsALot wrote: Tue Aug 30, 2022 7:50 am

"Are you saying to just do a data dump into the A and B output folders even if I'm only going to use a fraction of that source material in the conversion process?"

Yes, absolutely. You are not training a model for a scene. You are training a model to be able to swap the features of 2 faces. To do this, the model needs as much data as possible.

"If that's the case, why is everyone against using training presets, like an Arnold Schwarzenegger pre-trained model that you then add your own Arnold Schwarzenegger video extractions to?"

I'm not sure anyone is against it. We don't provide pre-trained models, but we are not against anyone else sharing them. The main problem is that a model is trained for an A/B pairing, so you can't just share "Schwarzenegger". You have to share Schwarzenegger + another.
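
To make the split between training data and conversion data concrete, here is a small sketch of how the clips could be organised. It does not call Faceswap at all; the `Clip` class, the `plan_swap` function, and every file name are hypothetical. The point it illustrates is that every clip of a person goes into that person's training pool, while only the clip you actually want to swap is marked for conversion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    name: str                      # hypothetical file name
    person: str                    # "A" (face to replace) or "B" (face to swap in)
    convert_target: bool = False   # True only for the clip you want to convert


def plan_swap(clips):
    """Split clips into per-person training pools and a conversion list."""
    plan = {"train_A": [], "train_B": [], "convert": []}
    for clip in clips:
        if clip.person not in ("A", "B"):
            raise ValueError(f"unknown person {clip.person!r} for {clip.name}")
        # Every clip feeds training, including the one to be converted.
        plan[f"train_{clip.person}"].append(clip.name)
        if clip.convert_target:
            plan["convert"].append(clip.name)
    return plan


clips = [
    Clip("a_scene_30s.mp4", "A", convert_target=True),
    Clip("a_interview.mp4", "A"),
    Clip("a_outdoor.mp4", "A"),
    Clip("b_interview.mp4", "B"),
    Clip("b_film.mp4", "B"),
]
plan = plan_swap(clips)

if __name__ == "__main__":
    for key, names in plan.items():
        print(key, names)
```

Here the 30-second scene appears in both the training pool for A and the conversion list, while the other clips are used for training only.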

Post by SirDeepsALot »

OK, thanks for that. This is an open-source project, right, so if people contribute to it, you're not against it? What's Faceswap coded in? It sounds like the fastest way to get things going is to data dump from multiple sources, then have a "delete all" function for extractions that have no faces, misaligned faces, and/or multiple faces, and train on what is left. That way you only manually edit the portion you plan to convert, but you still have multiple sources.
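
The "delete all" rule proposed above could be sketched roughly like this. This is an illustration only, not Faceswap code: it assumes you already have a face count per frame and a set of frames flagged as misaligned by some separate check, and the name `split_frames` is made up.

```python
def split_frames(faces_per_frame, misaligned=frozenset()):
    """Separate usable frames from ones to discard.

    faces_per_frame: hypothetical dict of frame name -> detected face count.
    misaligned: hypothetical set of frame names flagged by an external check.
    Returns (keep, discard), where discard maps frame name -> reason.
    """
    keep, discard = [], {}
    for frame, count in sorted(faces_per_frame.items()):
        if count == 0:
            discard[frame] = "no face"
        elif count > 1:
            discard[frame] = "multiple faces"
        elif frame in misaligned:
            discard[frame] = "misaligned"
        else:
            keep.append(frame)
    return keep, discard


if __name__ == "__main__":
    counts = {"f001.png": 1, "f002.png": 0, "f003.png": 2, "f004.png": 1}
    keep, discard = split_frames(counts, misaligned={"f004.png"})
    print("keep:", keep)
    print("discard:", discard)
```

Returning the reasons rather than deleting outright leaves room to review what would be dropped before removing anything.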
