bugs on training

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here.


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for reporting errors with the Training process. If you want to get tips, or better understand the Training process, then you should look in the Training Discussion forum.

Please mark any answers that fixed your problems so others can find the solutions.

raven_soul
Posts: 4
Joined: Mon Feb 26, 2024 8:23 pm

bugs on training

Post by raven_soul »

Can someone help me with the training process? I just want to swap faces from the series "A Series of Unfortunate Events" to the movie of the same name (the series would be face A and the movie face B). I tried 5-second scenes and scenes where only one actor is visible, and I even cropped scenes to show a single face and trimmed them to make them shorter. I extracted faces at every angle and from every scene, and tried every tool for the extract, train and convert processes, but nothing works. It always says there's an error, or the images come out blurry or horrible. I spent a month waiting (and trying over again) to convert, only to get a jumpscare video. What steps should I follow to do it right?

Answer by torzdf » Wed Mar 20, 2024 12:36 pm (full post below)
torzdf
Posts: 2687
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 135 times
Been thanked: 628 times

Re: bugs on training

Post by torzdf »

You're going to need to be a lot more specific about the steps you have taken to get to this point.

Most likely you have missed some crucial steps, have bad data, have a low res model or have not trained enough.

If you haven't been following the guides, then follow the guides. They will help you a lot.

My word is final

raven_soul
Posts: 4
Joined: Mon Feb 26, 2024 8:23 pm

Re: bugs on training

Post by raven_soul »

torzdf wrote: Tue Mar 19, 2024 4:04 pm

You're going to need to be a lot more specific about the steps you have taken to get to this point.

Most likely you have missed some crucial steps, have bad data, have a low res model or have not trained enough.

Hi!
This is the process I've done:
First, I extracted the faces from the movie scenes. Then I copied the faces to another folder (girl face b movie) and took the other actor's faces from the first folder to put them in another one (boy face b movie). I previously did the same with the faces from the series (girl a series and boy a series). Now I'm training with those folders ("girl face a series" folder: input a and "girl face b movie" folder: input b). I used the original trainer with batch size as 16 and 100000 iterations, no folders for the timelapse, and no augments chosen.
This is what appeared in the text:

Code: Select all

Loading...
Setting Faceswap backend to CPU
03/19/2024 21:36:32 INFO Log level set to: INFO
03/19/2024 21:36:42 INFO Model A Directory: 'E:\asoe\vio face a se' (733 images)
03/19/2024 21:36:42 INFO Model B Directory: 'E:\asoe\viole face b movie' (510 images)
03/19/2024 21:36:42 INFO Training data directory: E:\asoe\modelviomotose
03/19/2024 21:36:42 INFO ===================================================
03/19/2024 21:36:42 INFO Starting
03/19/2024 21:36:42 INFO ===================================================
03/19/2024 21:36:42 INFO Loading data, this may take a while...
03/19/2024 21:36:42 INFO Loading Model from Original plugin...
03/19/2024 21:36:43 INFO No existing state file found. Generating.
03/19/2024 21:36:43 INFO Storing Mixed Precision compatible layers. Please ignore any following warnings about using mixed precision.
03/19/2024 21:36:43 WARNING Mixed precision compatibility check (mixed_float16): WARNING
The dtype policy mixed_float16 may run slowly because this machine does not have a GPU. Only Nvidia GPUs with compute capability of at least 7.0 run quickly with mixed_float16.
If you will use compatible GPU(s) not attached to this host, e.g. by running a multi-worker model, you can ignore this warning. This message will only be logged once
03/19/2024 21:36:45 INFO Loading Trainer from Original plugin...

03/19/2024 21:37:07 INFO [Saved model] - Average loss since last save: face_a: 0.28607, face_b: 0.39576

03/19/2024 21:37:14 INFO [Preview Updated]

In training settings I have this:

Code: Select all

centering: face
coverage: 87.5 
optimizer: adam
learning rate: 5e-5
epsilon exponent: -7
save optimizer: exit
Lr finder iterations: 1000
Lr finder mode: set
Lr finder strength: default
Network: nan protection
Convert batchsize: 16

And I have extended mask in mask type.

The problem is that the program sometimes reports unexpected crashes or it reports "Caught exception in thread: '_training'" several times. I also got another report about the image size, something about an exception in file 325. I tried eliminating the images with big faces, blurry faces and faces with obstructions, but I keep getting the same reports. Also, the faces I get are always blurry and dark, with very little detail.

Last edited by torzdf on Wed Mar 20, 2024 12:19 pm, edited 1 time in total.
torzdf
Posts: 2687
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 135 times
Been thanked: 628 times

Re: bugs on training

Post by torzdf »

Ok, almost everything here could be improved and is covered in the training guide... in short though...

raven_soul wrote: Wed Mar 20, 2024 4:32 am

"girl face b movie" folder: input b). I used the original trainer with batch size as 16 and 100000 iterations, no folders for the timelapse, and no augments chosen.

The Original model is 64px. You are never going to get great output if faces are filling the frame in HD sources. See: "I've trained the model for ages, the previews look good, so why is the swapped face blurry?"
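To put rough numbers on the resolution point, the sketch below works out how far a 64px output has to be stretched to cover a face in the final frame. The 64px figure comes from the reply above; the 700px face height is a made-up example value, and `upscale_factor` is a hypothetical helper, not part of faceswap.

```python
def upscale_factor(model_output_px: int, face_px_in_frame: int) -> float:
    """Return how many times the model output must be enlarged to cover the face."""
    return face_px_in_frame / model_output_px


# Hypothetical close-up: the face is roughly 700 px tall in an HD frame,
# while the Original model outputs 64 px.
factor = upscale_factor(64, 700)
print(f"The swapped face is stretched about {factor:.1f}x")  # about 10.9x
```

The larger this factor, the softer the swapped face will look once it is placed back into the frame, regardless of how long the model has trained.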

raven_soul wrote: Wed Mar 20, 2024 4:32 am

Code: Select all

03/19/2024 21:36:42 INFO Model A Directory: 'E:\asoe\vio face a se' (733 images)
03/19/2024 21:36:42 INFO Model B Directory: 'E:\asoe\viole face b movie' (510 images)

This number of images may be low... It may not be, but I'd generally go higher. More important than quantity is quality/variety.

raven_soul wrote: Wed Mar 20, 2024 4:32 am

Code: Select all

The dtype policy mixed_float16 may run slowly because this machine does not have a GPU. Only Nvidia GPUs with compute capability of at least 7.0 run quickly with mixed_float16.
If you will use compatible GPU(s) not attached to this host, e.g. by running a multi-worker model, you can ignore this warning. This message will only be logged once

You don't have a GPU? Training on a CPU is not a great idea, and is likely to take you weeks/months to even get close to half-decent results.
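A quick way to see whether an Nvidia driver is visible on the machine at all is to look for the `nvidia-smi` tool on the PATH. This is only a minimal sketch using the Python standard library: `nvidia_smi_available` is a hypothetical helper, and finding the tool does not by itself mean faceswap will train on the GPU.

```python
import shutil


def nvidia_smi_available() -> bool:
    """Return True if the nvidia-smi executable can be found on the PATH."""
    return shutil.which("nvidia-smi") is not None


if nvidia_smi_available():
    print("nvidia-smi found: an Nvidia driver appears to be installed.")
else:
    print("nvidia-smi not found: training will most likely run on the CPU.")
```

If the tool is missing, the "Setting Faceswap backend to CPU" line in the log above is the expected outcome.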

raven_soul wrote: Wed Mar 20, 2024 4:32 am

In training settings I have this:

Code: Select all

coverage: 87.5 

Change this to 100 for Face centering.

raven_soul wrote: Wed Mar 20, 2024 4:32 am

And I have extended mask in mask type.

Use BiSeNet-FP (as explained in the guides).

raven_soul wrote: Wed Mar 20, 2024 4:32 am

The problem is that the program sometimes reports unexpected crashes or it reports "Caught exception in thread: '_training'" several times.

When the program crashes, the error message will tell you where the crash report is stored (hint: it will be in your faceswap folder). You should provide this report, as it contains essential information for diagnosing the problem.
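If the error message has scrolled away, something like the following sketch can pick out the most recent report in the faceswap folder. The file name pattern `crash_report*.log` is an assumption about how the reports are named, and `newest_crash_report` is a hypothetical helper; adjust the pattern to match whatever the error message actually shows.

```python
from pathlib import Path
from typing import Optional


def newest_crash_report(faceswap_dir: str,
                        pattern: str = "crash_report*.log") -> Optional[Path]:
    """Return the most recently modified file matching pattern, or None.

    The default pattern is an assumed naming scheme, not a confirmed one.
    """
    reports = list(Path(faceswap_dir).glob(pattern))
    if not reports:
        return None
    # Pick the file with the latest modification time.
    return max(reports, key=lambda p: p.stat().st_mtime)
```

Call it with the path to your own faceswap folder and attach the file it returns to your post.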

My word is final
