So I got probably a stupid question


Locked
kruszkush
Posts: 9
Joined: Thu May 07, 2020 10:36 pm
Has thanked: 3 times

Post by kruszkush »

It's like I've got two different scenes and I make the AI learn to distinguish them from scratch. But there are a lot of people doing this, so shouldn't there be a database or something that makes the AI smarter, so that when you start a deepfake it already knows a lot from other projects?

Tekniklee
Posts: 37
Joined: Fri Jan 31, 2020 6:03 pm
Has thanked: 7 times
Been thanked: 3 times

Post by Tekniklee »

I'm a bit confused by your use of the word "scenes". FaceSwap doesn't look at anything in a scene except the face. Extraction pulls out only faces, training trains only on faces, and swapping only glues the face from the model on top of the face in the original video. So as far as face swapping programs are concerned, the only "scenes" that matter are faces. Everything else can be tossed, which is why the frames are only used to extract faces from and to stuff converted faces onto. Even there, the faces are just dropped onto the frame at a particular location, and then some blending is done to match the surrounding skin.
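To make the "drop the face onto the frame, then blend" step concrete, here is a minimal sketch in Python with NumPy. It is not FaceSwap's actual code: the function names `feathered_mask` and `paste_face`, the rectangular patch, and the linear edge feathering are all assumptions made for illustration. A real converter would use a face-shaped mask and colour matching.

```python
import numpy as np

def feathered_mask(height, width, feather):
    """Mask that is 1.0 in the middle and ramps down to 0.0 at the patch edges."""
    ys = np.arange(height, dtype=np.float64)
    xs = np.arange(width, dtype=np.float64)
    # Distance in pixels from each row/column to the nearest patch edge.
    dy = np.minimum(ys, height - 1 - ys)
    dx = np.minimum(xs, width - 1 - xs)
    dist = np.minimum(dy[:, None], dx[None, :])
    return np.clip(dist / feather, 0.0, 1.0)

def paste_face(frame, face, top, left, feather=4):
    """Alpha-blend `face` onto a copy of `frame` at (top, left). Hypothetical helper."""
    h, w = face.shape[:2]
    mask = feathered_mask(h, w, feather)[..., None]  # broadcast over colour channels
    out = frame.astype(np.float64)
    region = out[top:top + h, left:left + w]
    # Weighted mix: the swapped face dominates in the centre, the original frame at the edges.
    out[top:top + h, left:left + w] = mask * face + (1.0 - mask) * region
    return out.round().astype(frame.dtype)

# Toy data: a black frame and a flat grey "face" patch.
frame = np.zeros((64, 64, 3), dtype=np.uint8)
face = np.full((32, 32, 3), 200, dtype=np.uint8)
result = paste_face(frame, face, top=16, left=16)
```

Nothing outside the pasted rectangle is touched, which is the point: the rest of the "scene" never enters the picture.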

However, if by "scenes" you actually mean "faces", there is a lot of that sharing going on already. For example, some functions such as aligners or maskers (Unet-Dfl, Vgg-Clear, etc) have been "user community trained" against large data sets for feature recognition, or for things such as recognition of hair or hands. What COULD be improved, dramatically I think, are profile (side image) alignments. Also, faces that are tilted too far back or forward. Back tilted noses are especially notorious for being hard to align reliably. Unfortunately, these are capabilities that are specifically excluded from most of those "user trained" aligners, which note that "Profiles may result in sub-par performance". No shit.

It would also be nice to have an "aligner improvement" capability. Face recognition is a tough job when you have to tell an aligner to "look at this whole mess in the frame and see if there is a face in there". It does pretty well considering the job being asked of it. In the case of training, as the model improves, it should be possible to use that information to improve alignments. I think something like this is used for mask learning during training, but masks are still kind of new to me.

In the case of conversion, however, every single frame needs to have as accurate a landmark alignment as possible. If you skip this you get Marty Feldman eyes and other strange and awful looking artifacts. You may have to tweak 30 alignments for every second of output video. That's a tall order, and you usually end up spending the vast majority of your hands-on time getting those alignments right prior to conversion. On top of that, you don't get the option of only using good faces like you can with training sets. If the face is in a clip, you have to make it work even if it is blurry or obstructed.

However, a trained model already knows what the faces involved look like and where the features are, and it uses warping to align faces during training. If extraction were able to use information from a trained model to better position the conversion extractions, you would get mostly perfect extractions. That would be a major process improvement, IMHO.
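For a sense of scale on the "30 alignments for every second" figure, here is a back-of-the-envelope sketch in Python. The 30 fps rate comes from the post above; the function name `alignments_needed` and the one-face-per-frame assumption are mine, purely for illustration.

```python
def alignments_needed(duration_seconds, fps=30):
    """One landmark alignment per frame, assuming a single face to swap per frame."""
    return int(round(duration_seconds * fps))

clip_seconds = 60
total = alignments_needed(clip_seconds)
print(f"{clip_seconds}s clip at 30 fps: {total} alignments to check")
```

Even a one-minute clip means well over a thousand alignments that may need a look, which is why any automatic help from the trained model would pay off quickly.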

djandg
Posts: 43
Joined: Mon Dec 09, 2019 7:00 pm
Has thanked: 4 times
Been thanked: 2 times

Post by djandg »

I think kruszkush is expecting there to be a repository or database of pre-trained models that can just be tapped into - like Getty Images.

Tekniklee
Posts: 37
Joined: Fri Jan 31, 2020 6:03 pm
Has thanked: 7 times
Been thanked: 3 times

Post by Tekniklee »

Hmmmm. Well, they do have that for Keanu, Nicholas and The Donald, but honestly, attempting to grok what makes a person's face unique during the culling is part of the artistic fun.
