Obscure close - Fish Eye Training are there any Z axis training models

Want to understand the training process better? Got tips for which model to use and when? This is the place for you


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for discussing tips and understanding the process involved with Training a Faceswap model.

If you have found a bug or are having issues with the Training process not working, then you should post in the Training Support forum.

Please mark any answers that fixed your problems so others can find the solutions.

redson
Posts: 6
Joined: Wed Jun 22, 2022 2:56 am

Obscure close - Fish Eye Training are there any Z axis training models

Post by redson »

Is there a good method of training when you have a series of images taken from different perspectives in real-world 3D, not 2D?

For example, if you mix fisheye or obscurely close images with another set of images, the model can't seem to map between them, and over more iterations it can even get worse: it pushes toward a generalized 2D model and the outliers get worse, not better. Training peaks at around 150k-300k iterations; once I pass that, the outliers - which make up over 1/4 of my training material - get pushed to the curb, and none of the models follow them.

So for example, say you put a GoPro 3 inches from someone's face. The lens may distort the face, producing a much bigger nose, a smaller lower chin and a bigger forehead, while the entire face is still masked and in frame of the fisheye lens. You have many other shots to compensate for this outlying perspective, such as a shot from a distance, a side view, etc., but you want the model to work in 3D space like reality.
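The "big nose, small chin" effect you describe follows from the lens projection geometry. A standard rectilinear lens maps an angle θ off the optical axis to an image radius r = f·tan(θ), while a typical fisheye uses the equidistant model r = f·θ, so off-axis features are compressed relative to whatever sits near the lens centre (the nose, at 3 inches). A minimal numeric sketch of that comparison (pure Python; the focal length and angles are made-up illustration values, nothing from Faceswap itself):

```python
import math

def rectilinear_radius(f, theta):
    """Standard pinhole/rectilinear projection: r = f * tan(theta)."""
    return f * math.tan(theta)

def fisheye_radius(f, theta):
    """Equidistant fisheye projection: r = f * theta."""
    return f * theta

f = 1.0  # arbitrary focal length for illustration
for deg in (5, 30, 60):
    theta = math.radians(deg)
    r_rect = rectilinear_radius(f, theta)
    r_fish = fisheye_radius(f, theta)
    # The ratio shows how much the fisheye compresses off-axis
    # features compared to a rectilinear lens: ~1.0 near the centre,
    # shrinking as the angle grows.
    print(f"{deg:>2} deg  rect={r_rect:.3f}  fisheye={r_fish:.3f}  "
          f"ratio={r_fish / r_rect:.2f}")
```

Near the axis the two projections agree, which is why distant, centred shots mix fine; at wide angles they diverge sharply, which is why the close fisheye shots behave like outliers to a model that assumes one field of view.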

I don't know what training approach to try. From what I have reviewed, building a face model tends to flip on the x and y axes (2D); I don't know of any that flip on the Z axis in 3D, the way the real world works.

Are there settings that would prevent this, or stabilize the model toward a 3D perspective so it follows the face / eyes / mouth rather than mis-training to 2D only?

VR setup, Nvidia GPU card with 4 GB (i.e. lower memory).
I can push this higher, but there's no point if it ends up with the same result.

nice python faceswap.py train -A /mnt/f/V/FaceSwap/FACEA -B /mnt/f/V/FaceSwap/FACEB -m /mnt/f/V/FaceSwap/MODEL -t original -bs 12 -it 520250 -s 250 -ss 5000 -nl -wl -L INFO

also tried
nice python faceswap.py train -B /mnt/f/V/FaceSwap/FACEA -A /mnt/f/V/FaceSwap/FACEB -m /mnt/f/V/FaceSwap/MODEL -t original -bs 12 -it 520250 -s 250 -ss 5000 -nl -L INFO


torzdf
Posts: 1855
Joined: Fri Jul 12, 2019 12:53 am
Answers: 136
Has thanked: 80 times
Been thanked: 376 times

Re: Obscure close - Fish Eye Training are there any Z axis training models

Post by torzdf »

The short answer is no. Images should all have a similar field of view, or you will run into the issues that you see.

Ultimately, nothing is aligned in the 3D space in Faceswap. We do use some rough 3D modelling for different training types (i.e. legacy/face/head), but again, it makes assumptions that field of view remains the same.

As your use-case is so niche, this is something we're unlikely to ever support, sadly.

My word is final


redson
Posts: 6
Joined: Wed Jun 22, 2022 2:56 am

Re: Obscure close - Fish Eye Training are there any Z axis training models

Post by redson »

Thanks for telling me.

It explains why many things can never be built to swap in all cases the way a true 3D model would. It also explains why the 2D result sometimes only gets worse with certain shots: if the swap doesn't understand perspective, foreground, or the Z axis in any model, that would account for several of my models' training problems. I'll work harder at getting zero-perspective, forward/side-facing images. It also means I have to swap similar faces, since you can't swap a big nose onto a small one without a 3D model.

This explains why a shot of a person looking at a cloud, or down at the swimming pool before diving, or showing teeth gaps will never work: there is only a 2D flat image being twisted on the Z axis based off color. The 2D clipping is not using even a basic 3D human skull (nose / eye sockets / cheeks / chin). It's odd, because in the debug images I can see the 2D model floating on the Z axis, so it's aware of face positions in 3D and can adjust for missing points of the face; but I can now see how those points could simply be clipped off in 2D if the face is turned too far on one of the three axes. It explains why the model doesn't know where the nose / chin / forehead would truly be in an image, and why hair clipped in front of the face, hands, or nose alignment are difficult: it doesn't try to build the face in 3D, it simply clips or removes points in 2D.

For some reason, I thought the deep learning was building some convex 3D model in layers as it trained; the descriptions of connected nodes, plus the many angles, side views and perspectives it has access to, suggested that. But this explains why a fully 2D model ignores all that, or even produces worse results if you include perspective photos. I don't know why I assumed it worked like other 2D image software that takes pictures from front and side shots and aligns them into a 3D composite. I also thought the reason building a face model takes so long was that it was learning the Z axis too, since a human face has a very distinct shape - nose out, lips out, eye sockets in, cheeks halfway between nose and eyes on the Z axis, etc. - and that it was slowly building a model of the human face from a basic one, moving and overlaying it.

But I should have realized, when reviewing some of the examples showing the clipping, that it was a 2D mask of alignment nodes, not 3D alignment nodes or a human skull; in that case the algorithm would have to create 2D models and overlay them on the Z axis to build the face in 3D. With 3D photos you could almost get real perspective too, since the camera baseline is preset to a generic human inter-eye distance, and because you get two simultaneous images in the same frame, perspective could be learned quickly. And if the sample images never show the subject turning their head, the perspective would stay almost 2D anyway; with all-2D inputs, the Z axis would be almost immaterial, so in some cases it would act the way it does now.

Thanks for the info. It helps me know what to use as a movie sample, and what not to use where clipping from 2D could be an issue.



ianstephens
Posts: 97
Joined: Sun Feb 14, 2021 7:20 pm
Has thanked: 11 times
Been thanked: 7 times

Re: Obscure close - Fish Eye Training are there any Z axis training models

Post by ianstephens »

Are you trying to train models based on 3D VR sources? I have experience with this.

We have worked with the side to be swapped (A-Side) in "fisheye" VR format and the side to be replaced (B-Side) with standard flat lens non-fisheye images/video sources.

If this is the case, it's not perfect, but the best results we had always came from using warp to landmarks and literally running the model for a few million iterations - it gets there in the end. However, if the image is too distorted the AI can end up creating something alien-like - but we have found this is rare.

We've had full close-up fisheye sources where the nose is literally taking up 70% of the shot and the eyes are small and distorted and have managed to produce a nice swap successfully after a lot of training (and always with warp to landmarks enabled).
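For anyone following along, warp to landmarks is the `-wl` flag on the training command line (it appears in the OP's first command). A sketch of a run along the lines suggested here - the paths and iteration cap are placeholders, not a recommendation:

```shell
# Warp-to-landmarks (-wl) enabled, with a much higher iteration cap
# to match the "few million iterations" suggestion above.
# All paths below are placeholders for your own extract/model folders.
python faceswap.py train \
    -A /path/to/FACEA \
    -B /path/to/FACEB \
    -m /path/to/MODEL \
    -t original \
    -bs 12 \
    -wl \
    -it 2000000 \
    -s 250 -ss 5000
```

Note that `-it` is only a stopping point; you can also stop and resume training against the same model folder until the previews look right.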


MaxHunter
Posts: 23
Joined: Thu May 26, 2022 6:02 am
Has thanked: 21 times

Re: Obscure close - Fish Eye Training are there any Z axis training models

Post by MaxHunter »

Thanks for the tip. I've been working with VR as well, with some success, but without warp to landmarks and at about 150,000 iterations.

It makes sense to use warp-to-landmarks; I don't know why I didn't think of this.

