converted video bad even with many training frames and 800k iterations, how can I tell if there's an issue with training

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for reporting errors with the Training process. If you want to get tips, or better understand the Training process, then you should look in the Training Discussion forum.

Please mark any answers that fixed your problems so others can find the solutions.

Locked
CProdDigital
Posts: 14
Joined: Fri Nov 04, 2022 4:22 pm
Has thanked: 1 time

converted video bad even with many training frames and 800k iterations, how can I tell if there's an issue with training

Post by CProdDigital »

I've got thousands of frames of front-on, high-res head-mounted camera footage, as well as lots of other pictures and slightly lower-res video from other angles for both actors.
My goal was to put actor B onto HMC footage of actor A, which I thought would be easy since I have so much HMC footage of both actors. I supplemented some extra footage and pictures to flesh out the model just in case, but even after 800K+ iterations, the resulting convert is very blurry and shaky.

There are no obstructions in my training frames for either actor, and the bulk of the pictures are clear, up close, and fairly high res. Is it possible to have TOO much training data? I've seen better results using a single video of B and a single video of A at only 75k iterations, so clearly something is wrong. What might be causing blurry output despite so much data and so many iterations? Do I need to train for longer if I have more training images?

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: converted video bad even with many training frames and 800k iterations, how can I tell if there's an issue with training

Post by torzdf »

There is nowhere near enough variety in your data.

If you are working with limited variety, then both sides must be matched for all expressions, lighting, color, etc. Matching data in this way is an advanced technique that I would never recommend except to the most experienced of ML/VFX users.

The encoder benefits from finding similarities between the A and B sides, but those similarities need to exist. They simply do not in your training set.
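As a rough, hypothetical sanity check (this is not part of faceswap itself, and the frame data below is faked for illustration), you can compare a simple statistic such as mean brightness across both sides to see whether their distributions overlap at all. If the ranges barely intersect, the shared encoder has little common ground to learn from:

```python
# Hypothetical variety check: compare the mean-brightness ranges of the
# A and B training sets. Real code would load actual image pixels; here
# the frames are toy lists of 0-255 values.

def mean_brightness(frame):
    """Average pixel value of a frame (a flat list of 0-255 ints)."""
    return sum(frame) / len(frame)

def brightness_range(frames):
    """(min, max) of per-frame mean brightness across a training set."""
    values = [mean_brightness(f) for f in frames]
    return min(values), max(values)

def ranges_overlap(a_range, b_range):
    """True if the two (min, max) intervals intersect."""
    return a_range[0] <= b_range[1] and b_range[0] <= a_range[1]

# Toy stand-ins: dim, evenly lit HMC footage vs. much brighter footage.
side_a = [[40, 50, 60], [45, 55, 65]]
side_b = [[180, 190, 200], [170, 185, 195]]

print(ranges_overlap(brightness_range(side_a),
                     brightness_range(side_b)))  # prints False: little common ground
```

This only probes one dimension (lighting); the same idea applies to pose, expression, and color, which is why a single well-matched pair of videos can outperform a large but mismatched set.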

This is an example of highly varied data:
[image]

My word is final
