Let me see if I have this right regarding training time...

martinf
Posts: 27
Joined: Thu Sep 29, 2022 7:58 pm
Been thanked: 3 times

Let me see if I have this right regarding training time...

Post by martinf »

Is training a function of epochs or iterations? That is, are these two statements correct?

  1. Given a small sample of A and B images (say, 100), training will be fast... but likely poor.

  2. Given a large sample of A and B images (say, 5000), training will be slow... but likely of much higher quality.
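For scale (my own back-of-the-envelope arithmetic; the batch size of 16 is just an example value): one "epoch" means one pass over every image, which costs dataset_size / batch_size iterations, so the iteration counter alone doesn't tell you how often each face has been seen.

    # Rough epoch-vs-iteration arithmetic; batch size 16 is only an example.
    images = 5000                               # faces in one side's training set
    batch_size = 16                             # faces fed to the model per iteration
    iterations_per_epoch = images / batch_size  # 312.5 iterations to show every face once
    print(f"{iterations_per_epoch:.1f} iterations per pass over the data")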

I ask because I am training a model now that has 1500 A images and 4000 B images (all with good variance and good quality), but the training is taking forever (DFL-SAE at 256). The last time I trained this model it was with 1200 images on each side, and it was much further along at this point than I am now. Is the large number of B images in the current model (4000) the reason this one is training more slowly?

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: Let me see if I have this right regarding training time...

Post by bryanlyon »

No. Or, more accurately, mostly no. The model will train at the same speed if the data is good. That's not the whole story, though.

If you have very few images, then you'll "overtrain" before the model reaches its maximum quality. That may show you what look like high-quality faces, but the model will be terrible at swapping and will fail to generalize to new faces.

That said, if you have very poor data, training will also take longer, even if you have a lot of images. Poor data may be very dark, have a lot of obstructions, or be in HDR (which will likely fail to train no matter what).

All this to say: no, training won't be faster with fewer images, but it could take longer if your data is poor.

martinf
Posts: 27
Joined: Thu Sep 29, 2022 7:58 pm
Been thanked: 3 times

Re: Let me see if I have this right regarding training time...

Post by martinf »

Thanks for the clarification. It seems my issue was mostly that my image sets had a plethora of straight-on to 3/4 angles in yaw but a much lower percentage of profiles. The profiles were there, but because they were only about 5% of the totals, they seemed to be lagging badly.

Brings up a likely very impractical suggestion...

It WOULD be cool to be able to 'scatter plot' the image sets being used on an XY plot: pitch and yaw for the respective Y and X values, red for one set and blue for the other. Or even a numeric output would help.
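To make that concrete, here's a minimal sketch of the plot, assuming the per-image pitch/yaw angles have already been dumped to two-column "pitch,yaw" CSV files (pose_a.csv and pose_b.csv are hypothetical names, not anything faceswap writes out for you):

    # Minimal sketch only: scatter the pitch/yaw coverage of two face sets.
    # Assumes each line of the (hypothetical) CSVs is "pitch,yaw" in degrees.
    import csv

    import matplotlib.pyplot as plt

    def load_angles(path):
        """Return (pitches, yaws) read from a two-column pitch,yaw CSV."""
        pitches, yaws = [], []
        with open(path, newline="") as handle:
            for pitch, yaw in csv.reader(handle):
                pitches.append(float(pitch))
                yaws.append(float(yaw))
        return pitches, yaws

    pitch_a, yaw_a = load_angles("pose_a.csv")  # hypothetical dump of the A set
    pitch_b, yaw_b = load_angles("pose_b.csv")  # hypothetical dump of the B set

    # Yaw on X, pitch on Y, red vs blue as suggested; alpha shows density.
    plt.scatter(yaw_a, pitch_a, c="red", s=4, alpha=0.4, label="A set")
    plt.scatter(yaw_b, pitch_b, c="blue", s=4, alpha=0.4, label="B set")
    plt.xlabel("yaw (degrees)")
    plt.ylabel("pitch (degrees)")
    plt.legend()
    plt.show()

A sparse patch out at the yaw extremes would show a profile shortage like mine at a glance.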

And one more wild idea...

It is said that the lack of blinking is one of the 'tells' in deepfakes. How cool would it be to be able to introduce procedural blinking into a conversion? I know... totally NOT machine learning, but it would be cool just the same. I wonder if a blink is short enough to get away with some procedural trickery?

torzdf
Posts: 2671
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 131 times
Been thanked: 625 times

Re: Let me see if I have this right regarding training time...

Post by torzdf »

martinf wrote: Thu Oct 20, 2022 12:36 pm

Brings up a likely very impractical suggestion...

It WOULD be cool to be able to 'scatter plot' the image sets being used on an XY plot. Pitch and yaw for the respective Y and X values. Use red for one set and blue for another. Or even a numeric output would help.

It's not that impractical. I'm not sure it's something I would look to implement, though, as the nature of the available material means that forward-facing shots will nearly always be more heavily represented than more extreme angles.

However, you can go some of the way by sorting/binning by yaw/pitch and analyzing your data that way. If you bin your training set by yaw (for example), you will fairly quickly find out where the gaps in your data are.
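If you just want the numbers rather than files moved into folders, a rough stand-in for that binning might look like this (separate from faceswap's own sort tool; the yaw values here are made-up placeholders):

    # Rough sketch of the binning idea, not the faceswap sort tool itself:
    # count how many faces land in each 15-degree yaw bucket.
    from collections import Counter

    yaws = [-72.0, -18.5, -3.2, 0.4, 2.9, 11.7, 34.0, 41.5, 88.2]  # placeholder angles

    BIN_WIDTH = 15  # degrees per bucket

    def bucket(yaw):
        """Map a yaw angle to the lower edge of its bucket."""
        return int(yaw // BIN_WIDTH) * BIN_WIDTH

    counts = Counter(bucket(y) for y in yaws)
    for edge in sorted(counts):
        print(f"{edge:+4d} to {edge + BIN_WIDTH:+4d} degrees: {counts[edge]} faces")

Buckets that come back empty or tiny are exactly the angles the model will struggle with.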

martinf wrote: Thu Oct 20, 2022 12:36 pm

It is said that the lack of blinking is one of the 'tells' in deepfakes. How cool would it be to be able to introduce procedural blinking into a conversion? I know... totally NOT machine learning, but it would be cool just the same. I wonder if a blink is short enough to get away with some procedural trickery?

This 'tell' is nonsense, imho. I have seen it mentioned a lot since the early days of deepfakes. I believe it came about because a lot of fakes were generated from posed photographs (which, by their nature, are never taken while the subject is blinking). Getting data from a wide variety of video clips will include blinking. At that point it becomes an issue of data collection rather than anything intrinsic to deepfakes themselves.

My word is final
