Putting single photo B face on model?

Posted: Mon Jun 14, 2021 6:59 pm
by Replicon

tl;dr: What would the best model/settings be to put just a single face/expression onto a video? I don't care about changing expressions... just rotate and stretch the B photo until the alignment landmarks are "close enough", and render it.

So, I was reading the "why you need more sources" thread, and thought to myself... that 8-photo example actually looks pretty good if you don't care about fluid motion... like if you're just trying to put someone's face on someone else, and it's meant to be a kind of caricature rather than a convincing fake.

... So I tried training some models with a single face... like, I have a full-on "A" video, and the "B" video is just a 30+ frame video of a single photo (in your face, "your model has too few photos" check hehe).

Just to get quicker results, I trained these with the Lightweight model at batch size 1 (with only 1 photo, I can't imagine batching does anything useful here) for 200K iterations.

It sort of works, but the result comes out surprisingly faint/blurry. I would have thought it'd converge to something sharper more quickly, given there are so few possible outputs. Has anyone experimented with this?

Note: I know there are easier ways of doing something like this. I could just cut out a PNG and use Kdenlive's "auto-motion-tracking mask" feature... but that's nowhere near as detailed as something that does the tracking frame-by-frame using the alignment landmarks, rather than just x-y positioning an image.
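For what it's worth, the "rotate and stretch the B photo until the landmarks are close enough" idea from the tl;dr boils down to fitting a per-frame affine transform from B's landmarks to A's landmarks. Here's a minimal numpy sketch of that fit (function names are mine, not from Faceswap; a real pipeline would also blend/mask the warped face):

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine transform mapping src landmarks onto dst.

    src, dst: (N, 2) arrays of matching (x, y) landmark coordinates.
    Returns M such that a point p maps to M[:, :2] @ p + M[:, 2].
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Augment src with a ones column so translation is part of the solve.
    A = np.hstack([src, np.ones((len(src), 1))])   # (N, 3)
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)    # (3, 2)
    return X.T                                      # (2, 3) affine matrix

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ M[:, :2].T + M[:, 2]
```

You'd fit this once per A frame (B's landmarks stay fixed since it's a single photo) and warp the B image with the resulting matrix, e.g. via `cv2.warpAffine`.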


Re: Putting single photo B face on model?

Posted: Mon Jun 21, 2021 4:21 am
by bryanlyon

A single image just isn't how FS works; it will never sharpen up with only one image to learn from. If that's what you want to do, I'd suggest checking out First Order Model or other "one-shot" solutions.


Re: Putting single photo B face on model?

Posted: Mon Jun 21, 2021 6:16 pm
by Replicon

Got it, thanks! I figured that'd be the case... just trying to cut corners here and there :)

BTW, if you have a small handful of photos, you can probably get some mileage out of them by using Deep Nostalgia's model to animate them. The results look a little bit off, so I'm not sure how convincingly they'd translate.