unbalanced model best practice

afterparty

Post by afterparty »

Hi,

I'm looking for any experienced advice!

I'm setting up for training using the unbalanced model with the hope of getting the highest possible resolution for my face. The face swap is being used for a visual effects sequence in a movie where an actor is working alongside herself and I am swapping her face onto her double. I've tested with great results using the Realface model, but now I'm hoping to improve resolution.

If I train on a photoset that is 512 x 512, what settings should I use for the input size (training) and for the encoder and decoder "complexity"? What specifically does the encoder / decoder value relate to?

I'm training on 4 Tesla T4 GPUs on an AWS g4dn.12xlarge instance.

Thanks for any insights!

David

bryanlyon
Site Admin

Re: unbalanced model best practice

Post by bryanlyon »

It's best to think of it in terms of compression/decompression. The Encoder "compresses" the face into an intermediate form and the Decoder re-creates the original from it. This is the basic idea of an autoencoder. In Faceswap we use a shared encoder with 2 decoders, and by switching decoders we switch the output face.
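To make the encoder/decoder idea concrete, here is a minimal toy sketch in plain NumPy. It is not Faceswap's code: the sizes (`FACE_DIM`, `LATENT_DIM`), the single-layer "networks" and the untrained random weights are all assumptions for illustration. It only shows the data flow: one encoder feeding two decoders, with the swap done by picking the other decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the toy small.
FACE_DIM = 64 * 64 * 3   # a flattened 64x64 RGB "face"
LATENT_DIM = 128         # size of the compressed intermediate form

# One shared encoder and two decoders (A and B).
# Weights are random, i.e. untrained; a real model learns them.
W_enc = rng.normal(scale=0.01, size=(FACE_DIM, LATENT_DIM))
W_dec_a = rng.normal(scale=0.01, size=(LATENT_DIM, FACE_DIM))
W_dec_b = rng.normal(scale=0.01, size=(LATENT_DIM, FACE_DIM))


def encode(face):
    """Compress a flattened face into the intermediate (latent) form."""
    return np.tanh(face @ W_enc)


def decode(latent, decoder_weights):
    """Re-create a flattened face from the latent form with one decoder."""
    return latent @ decoder_weights


face_a = rng.random(FACE_DIM)          # stand-in for a face of person A
latent = encode(face_a)                # shared encoder
recreated = decode(latent, W_dec_a)    # decoder A: re-create person A
swapped = decode(latent, W_dec_b)      # decoder B: same latent, other face

print(latent.shape, recreated.shape, swapped.shape)
```

In this picture, larger encoder and decoder settings correspond to more capacity in `encode` and `decode` respectively; the exact mapping to the model's options is not something this sketch claims to reproduce.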

Higher encoder dims allow more data to be stored, and to be stored more intelligently.

Higher decoder dims allow better re-creation from what the encoder stored.

You actually need both to get good results. I'd suggest tweaking the decoder up a bit while leaving the encoder at default. Remember that increasing the resolution increases training time non-linearly: doubling the resolution quadruples the number of pixels, so expect roughly 4x the time. Having 4 T4 GPUs does help with that, though.
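The "double the resolution is 4x the time" rule of thumb is just pixel-count arithmetic, which can be sketched as below. The function name `relative_pixel_cost` and the 64 px baseline are assumptions for illustration; real training time also depends on the model, batch size and hardware, so treat the numbers as a rough lower bound on the scaling, not a prediction.

```python
def relative_pixel_cost(base_size: int, new_size: int) -> float:
    """Ratio of pixels per image between two square resolutions."""
    return (new_size / base_size) ** 2


# Hypothetical baseline of 64 px, compared against larger square sizes.
for size in (64, 128, 256, 512):
    print(f"{size} px: {relative_pixel_cost(64, size):.0f}x the pixels of 64 px")
```

By this estimate a 512 px input carries 64 times the pixels of a 64 px one, which is why higher resolutions get expensive quickly even with several GPUs.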
