I saw the recent post comparing the different models, and the quality difference between them – which has been a great help! Since then, I’ve been trying to wrap my head around input and output sizes compared to the resolution of the final video – and I did the rough diagram below to help me decide which sizes I should use for the best results.
My go-to model at the moment is DFL-SAE (on 2x RTX 2070 Supers, 128GB system RAM and a Ryzen 3900X CPU). So far I have only gone up to 192 input/output, and got better results than previously, thanks to the info provided in the aforementioned post. But I am determined to get at least a decent-quality medium close-up shot (if not a perfect one), rather than the blurry result I seem to be stuck with so far...
I’m aware that the quality of the B face source is paramount here – the better the resolution, the more varied the sources we feed it, and the more examples we can provide from all the angles needed, the better. That, combined with which model we use and how long we are prepared to let it train, largely determines the final quality.
Let’s say that I have a good source for training, and I am prepared to let it train for a minimum of 70-100 hours – preferably a week, possibly more (depending on the Face A footage that Face B will be swapped onto: the angles seen in it, how many close-ups or medium shots it contains, and so on). I realise that close-ups are not something Faceswap is good at – unless it is left to run for a very long time, with great training sources and a high-res model.
But in my quest for better results, it suddenly occurred to me: if I were to extract from a 4K source at, say, an input/output of 256, and train that using great sources – but then do the final conversion onto 720p footage – surely those close-ups would look better than if I tried to convert them onto 4K footage?
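For what it's worth, the intuition can be checked with some back-of-the-envelope arithmetic: the model always emits a fixed-size face patch (256px here), which then gets scaled up to cover the face region in the target frame. A quick sketch (not Faceswap code – the 50% face-height figure for a medium close-up is just an assumption for illustration):

```python
# Rough estimate of how much a fixed-size model output must be upscaled
# when pasted back into frames of different resolutions.
# Assumption (hypothetical): a medium close-up face spans ~50% of frame height.

MODEL_OUTPUT = 256    # trained output size in pixels
FACE_FRACTION = 0.5   # assumed fraction of frame height the face occupies

for name, height in [("720p", 720), ("1080p", 1080), ("4K", 2160)]:
    face_px = height * FACE_FRACTION          # face height in the target frame
    upscale = face_px / MODEL_OUTPUT          # how far the 256px output is stretched
    print(f"{name}: face ~{face_px:.0f}px tall, output upscaled ~{upscale:.1f}x")
```

Under that assumption, the 256px output only needs a ~1.4x stretch on 720p, but a ~4.2x stretch on 4K – which would explain why the same model looks noticeably softer on close-ups in higher-resolution footage.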
(There are options for upscaling the footage afterwards: I’ve been dabbling with Topaz Labs’ Video Enhance AI, with mixed results…)
Or is it best to convert onto the 4K footage – keeping both the A and B sides at the same resolution – and downscale the final result to 1080p HD afterwards?
Obviously, if I had a GPU with more VRAM (oh, for an RTX 3090...!!), I’d be going for a model with higher input/output sizes and longer training times to get better results – but I’m trying to stick within my current setup’s limitations without killing my system, AND still get better results...!
So, would this actually work? (To my tiny brain, looking at the diagram below, logically, it seems like this might work - at least, it makes sense to me…!)
Hopefully, my diagram will make more sense of what I am trying to ask…!
Or does it make no difference, ultimately, whether I convert to a 720p shot, or a 4K one, and that the final result will just be purely down to which model I use and how long I let it train for?
(Apologies for the long post - just trying to make sure I had covered everything...!)