Dynamic algorithmic input source trimming while training


Post by artisan »

I'm not sure whether the training algorithm already does what I describe below, or has an alternative with the same effect, or whether this is simply a bad idea. Sharing for discussion.

Background: While training, I notice in the preview that some face swaps look fantastic, but others among them are not just uncanny but Frankenstein-ish.

Basic idea: As training proceeds, each input source face tracks its average loss when compared with the processed swap. Every n iterations (e.g. 10,000), starting at x iterations (e.g. 50,000), the algorithm trims out the y% of input images with the highest relative loss, either deleting them from the input source A folder or moving them into a designated folder, possibly renaming them to indicate at which iteration they were removed.
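A minimal sketch of the bookkeeping this idea would need, for discussion only. It assumes a loss value can be recorded per image, which may not hold for Faceswap's trainer. The class name `LossPruner` and its parameters are hypothetical and not part of Faceswap.

```python
import math
from collections import defaultdict


class LossPruner:
    """Hypothetical helper: keeps a running mean loss per image and proposes
    the worst y% for removal every n iterations, starting at iteration x."""

    def __init__(self, start_at=50_000, every=10_000, trim_fraction=0.02):
        self.start_at = start_at            # x: first iteration at which pruning may run
        self.every = every                  # n: iterations between pruning passes
        self.trim_fraction = trim_fraction  # y as a fraction (0.02 == 2%)
        self._sum = defaultdict(float)      # image id -> summed loss
        self._count = defaultdict(int)      # image id -> number of observations

    def record(self, image_id, loss):
        """Add one loss observation for one image."""
        self._sum[image_id] += loss
        self._count[image_id] += 1

    def mean_loss(self, image_id):
        return self._sum[image_id] / self._count[image_id]

    def should_prune(self, iteration):
        """True at iteration x, x + n, x + 2n, ..."""
        return (iteration >= self.start_at
                and (iteration - self.start_at) % self.every == 0)

    def prune(self):
        """Return the ids with the highest mean loss and stop tracking them."""
        ranked = sorted(self._sum, key=self.mean_loss, reverse=True)
        n_trim = math.floor(len(ranked) * self.trim_fraction)
        worst = ranked[:n_trim]
        for image_id in worst:
            del self._sum[image_id]
            del self._count[image_id]
        return worst
```

Because loss generally falls as training proceeds, a lifetime average only compares images fairly if all of them are seen about equally often; a windowed average would be another option.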

Why?

The presumption would be that either through inadequate input source preparation (by us), or even well thought out input source preparation and pruning (by us, with help from the various tools), there may be some faces that don't train well and may have an adverse effect on the weights (model effectiveness) as training proceeds.

Could the algorithm dynamically prune the input source when the loss calculated simply isn't keeping up with the progress of the rest of the model's success, indicating that the model could benefit by removing such input source faces from the sample?

Apologies in advance to the fs veterans if my lack of understanding of the complexities of the algorithm makes this idea ill-advised, useless, or redundant. ;)



Post by torzdf »

There are many potential experiments that could be run on data feeding. How useful they would or would not be is an open question. It wouldn't be possible to track score per image (unless validating on a batch size of 1), but scores could be kept at batch level and tracked over time, with a view to re-feeding the faces that score worst.

Of course, this assumes a perfect training set, as otherwise you would just end up re-feeding poorly aligned images back into the model.

My word is final



Post by artisan »

Do batches always contain the same group of face images?

I'm making the mistake of brainstorming without intricate knowledge of the software under the hood. A caveat born of enthusiasm—and I know how annoying that can be. :)

Recording the following so my hours of looking confused while browsing the code and reading the forums don't feel like completely lost time. Ha ha

During training, track top 5 highest loss "scores" encountered, for 5,000 iterations.
During training, for the next 1,000 iterations, log an identifier for a batch that encounters a loss score higher than the lowest of the top 5 highest encountered in the previous 5,000 iterations.

(Note: Loss is always trending downward, so flagging for high loss scores must not be done from the start of all training, only a recent epoch of training iterations.
The highest loss by batch log is used to impose penalties later.)

Repeat while training in 5,000/1,000 cycles.

Let's say that after about 100,000 iterations, a batch containing a problematic face that skews its loss score should have been encountered and flagged multiple times (many flags suggest the high loss scores are more than just random variance). In batches of 16, presuming the batch always contains the same faces and they can be identified, the faces could be manually inspected to determine visually whether they are bad input and should be deleted.
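The 5,000/1,000 cycle described above could be sketched roughly as follows. This is only an illustration of the proposal, not anything in Faceswap: the class `HighLossFlagger` is hypothetical, and `batch_id` is assumed to be some stable identifier for a batch (for example a tuple of filenames), which only helps if batch membership is fixed.

```python
import heapq
from collections import Counter


class HighLossFlagger:
    """Hypothetical sketch: observe for `observe` iterations, then flag for
    `flag` iterations, and repeat."""

    def __init__(self, observe=5_000, flag=1_000, top_k=5):
        self.observe = observe
        self.flag = flag
        self.top_k = top_k
        self._top = []           # min-heap holding the top_k highest losses seen
        self._threshold = None   # lowest of those top_k losses
        self._step = 0
        self.flags = Counter()   # batch_id -> number of times flagged

    def update(self, batch_id, loss):
        pos = self._step % (self.observe + self.flag)
        if pos == 0:
            # New cycle: forget old losses, since loss trends down over time.
            self._top = []
            self._threshold = None
        if pos < self.observe:
            # Observation phase: keep only the top_k highest losses.
            if len(self._top) < self.top_k:
                heapq.heappush(self._top, loss)
            elif loss > self._top[0]:
                heapq.heapreplace(self._top, loss)
            if pos == self.observe - 1:
                self._threshold = self._top[0]
        elif loss > self._threshold:
            # Flagging phase: log batches that beat the lowest of the top_k.
            self.flags[batch_id] += 1
        self._step += 1
```

After many cycles, `flagger.flags.most_common()` would list the most frequently flagged batches first, which are the candidates for manual inspection.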



Post by torzdf »

artisan wrote: Sat Sep 17, 2022 2:34 pm

Do batches always contain the same group of face images?

No, they are random. After a certain number of epochs you should be able to predict, with a fair degree of certainty, which individual images impact batch scores the most.
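One way this prediction could work, sketched under stated assumptions: log which images were in each batch together with the batch loss, then score each image by the mean loss of every batch it appeared in. With random batches, a consistently bad image should drift to the top. The function names and the toy loss values below are invented for illustration and are not Faceswap code.

```python
import random
from collections import defaultdict


def attribute_batch_losses(batch_log):
    """Rank images by the mean loss of every batch they appeared in.

    batch_log: iterable of (image_ids, batch_loss) pairs.
    Returns (ranked_ids, scores), worst-scoring image first.
    """
    total = defaultdict(float)
    seen = defaultdict(int)
    for image_ids, batch_loss in batch_log:
        for image_id in image_ids:
            total[image_id] += batch_loss
            seen[image_id] += 1
    scores = {image_id: total[image_id] / seen[image_id] for image_id in total}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked, scores


def simulate(num_epochs=200, batch_size=4, seed=0):
    """Toy data: 19 ordinary faces plus one bad one, reshuffled every epoch."""
    rng = random.Random(seed)
    per_image = {f"face_{i:02d}": 0.1 for i in range(19)}
    per_image["bad_face"] = 1.0  # made-up loss values for illustration
    ids = list(per_image)
    log = []
    for _ in range(num_epochs):
        rng.shuffle(ids)
        for start in range(0, len(ids), batch_size):
            batch = ids[start:start + batch_size]
            loss = sum(per_image[i] for i in batch) / len(batch)
            log.append((tuple(batch), loss))
    return log


ranked, scores = attribute_batch_losses(simulate())
print(ranked[:3])
```

This is a crude estimate. Real losses are noisy, so more epochs (or a proper regression over batch membership) would be needed before trusting the ranking.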

artisan wrote: Sat Sep 17, 2022 2:34 pm

I'm making the mistake of brainstorming without intricate knowledge of the software under the hood. A caveat born of enthusiasm—and I know how annoying that can be. :)

I'm all for this, tbh ;)

During training, track top 5 highest loss "scores" encountered, for 5,000 iterations.
During training, for the next 1,000 iterations, log an identifier for a batch that encounters a loss score higher than the lowest of the top 5 highest encountered in the previous 5,000 iterations.

(Note: Loss is always trending downward, so flagging for high loss scores must not be done from the start of all training, only a recent epoch of training iterations.
The highest loss by batch log is used to impose penalties later.)

Repeat while training in 5,000/1,000 cycles.

Let's say that after about 100,000 iterations, a batch containing a problematic face that skews its loss score should have been encountered and flagged multiple times (many flags suggest the high loss scores are more than just random variance). In batches of 16, presuming the batch always contains the same faces and they can be identified, the faces could be manually inspected to determine visually whether they are bad input and should be deleted.

Unfortunately this falls down at the last steps, as the data is reshuffled every epoch.

However, it may be as simple as re-feeding the worst-performing batch(es) per epoch or every x iterations, although that feels slightly less refined than tracking which individual images are contributing most to poor loss scores. It could even be factored into some kind of manual intervention that says "these images are performing badly... do you want to remove them from the dataset or try to get them performing better?"
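A rough sketch of the re-feeding option, assuming a hypothetical `train_step(batch)` callable that trains on one batch and returns its loss. Nothing here reflects Faceswap's actual training loop; it only shows the shape of the idea.

```python
import heapq
import random


def epoch_with_refeed(image_ids, batch_size, train_step, refeed_k=2, rng=random):
    """Run one epoch, then re-feed the refeed_k worst-scoring batches.

    train_step is a hypothetical callable: it takes a list of image ids,
    trains on them, and returns the batch loss as a float.
    Returns (scored, refed), each a list of (loss, batch) pairs.
    """
    ids = list(image_ids)
    rng.shuffle(ids)  # data is reshuffled every epoch
    batches = [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]
    scored = [(train_step(batch), batch) for batch in batches]
    # Pick the batches with the highest loss before the next reshuffle.
    worst = heapq.nlargest(refeed_k, scored, key=lambda pair: pair[0])
    refed = [(train_step(batch), batch) for _, batch in worst]
    return scored, refed
```

Comparing each re-fed batch's second loss with its first gives a cheap check on whether a bad score was just variance or something persistent.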




Post by MaxHunter »

This is actually a fascinating suggestion. Not only would I love to see this implemented, I'm also curious to know, from a research perspective, whether it works. Just last night I was preparing for a new model with a single subject, and no matter how hard I tried to weed out bad examples, there are always one or two I don't catch. I've often wondered whether the AI, given enough training, will recognize them and dismiss them as inappropriate examples of the actual subject.



Post by artisan »

torzdf wrote: Sun Sep 18, 2022 5:15 pm

Unfortunately this falls down at the last steps, as the data is reshuffled every epoch.

However, it may be as simple as re-feeding the worst-performing batch(es) per epoch or every x iterations, although that feels slightly less refined than tracking which individual images are contributing most to poor loss scores. It could even be factored into some kind of manual intervention that says "these images are performing badly... do you want to remove them from the dataset or try to get them performing better?"

Note: I used "epoch" imprecisely, but I understand what you mean per the Faceswap meaning (model has encountered all faces).

I'm not 100% clear on the "re-feeding", but I believe it means sending a poorly performing batch (bad loss score) back for processing again to see if it's just variance or really a bad image skewing things—before the epoch ends and batches are reshuffled.

Some kind of manual evaluation seems like a reasonable constraint. I've often thought while looking at the preview images that I'd like to just click one of the A source images and select "Delete" to 86 it right from there. Haha
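For the removal step itself, moving a face aside is safer than deleting it, since it can be restored later. A minimal sketch using only the Python standard library; the function name `quarantine_face` and the `iter<N>_` filename prefix are made up for illustration, and whether Faceswap would need any extra bookkeeping when a face leaves the folder is not covered here.

```python
import shutil
from pathlib import Path


def quarantine_face(image_path, quarantine_dir, iteration):
    """Move one face image into quarantine_dir, prefixing its filename
    with the iteration at which it was removed. Returns the new path."""
    src = Path(image_path)
    dest_dir = Path(quarantine_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: iter0060000_<original name>
    dest = dest_dir / f"iter{iteration:07d}_{src.name}"
    shutil.move(str(src), str(dest))
    return dest
```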

In any event, if there's a feasible method there, I'm sure you'll be able to zero in on it. ;)

