I'd like advice on the best configuration for high-resolution face swaps. I'll have roughly 15K images of each face, extracted at 1024px from hi-res photos. I did prep some of the photos using GFP-GAN. I assume Phaze-A is the model to use, but which training preset should I start with, and which settings should I toy with? The end result is what matters, not the time it takes to train the models. I should have 40GB of VRAM to play with, so the model can be pretty large.
Do a smaller batch size and a shallower learning curve trade training time for a better result? I've read that increasing filters in Phaze-A is the best use of VRAM, but I'm not sure where to start. Also, since photos don't present a consistent "range" of faces, should I use instance normalization rather than batch normalization?
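To make the normalization question concrete, here is my rough understanding of the difference as a NumPy sketch. This is not Faceswap code: the function names are made up for illustration, and I've left out the learnable scale/shift and the running statistics that real layers keep.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # x has shape (N, H, W, C). Statistics are computed per channel,
    # shared across every image in the batch.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    # Statistics are computed per channel, separately for each image.
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# Two fake "photos": one dark, one bright (made-up values).
rng = np.random.default_rng(0)
dark = rng.normal(0.2, 0.05, size=(1, 8, 8, 3))
bright = rng.normal(0.8, 0.05, size=(1, 8, 8, 3))
batch = np.concatenate([dark, bright], axis=0)

bn = batch_norm(batch)
inn = instance_norm(batch)

# Batch norm keeps the dark/bright difference between images;
# instance norm centres each image on its own statistics.
print("batch norm per-image means:   ", bn.mean(axis=(1, 2, 3)))
print("instance norm per-image means:", inn.mean(axis=(1, 2, 3)))
```

If I have this right, batch norm depends on whatever else happens to be in the batch, while instance norm treats each photo on its own, which is why I'm wondering whether it suits a mixed set of photos better.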
There are so many settings in Phaze-A that my shot glass of a brain overflows.