I can't upload the .json file, so the preset is pasted below.
I call this Phaze-A setting "Max-512", because it's about as far as you can go on a 24GB card with Mixed Precision turned on.
Size: 23.25GB (according to System Output)
Batch: 1
Recommended Learning Rate: 4.625e-6/-7 (with MS_SSIM@100; Logcosh@50; FFL@100; LPIPS_VGG16@25); EGs/sec ~2.3
Alt Learning Rate: 6.25e-6/-7 (with MS_SSIM@100; Logcosh@50)
Code:
{
    "output_size": 512,
    "shared_fc": "none",
    "enable_gblock": true,
    "split_fc": true,
    "split_gblock": false,
    "split_decoders": true,
    "enc_architecture": "efficientnet_v2_l",
    "enc_scaling": 100,
    "enc_load_weights": true,
    "bottleneck_type": "dense",
    "bottleneck_norm": "none",
    "bottleneck_size": 512,
    "bottleneck_in_encoder": true,
    "fc_depth": 1,
    "fc_min_filters": 1280,
    "fc_max_filters": 1280,
    "fc_dimensions": 8,
    "fc_filter_slope": -0.5,
    "fc_dropout": 0.0,
    "fc_upsampler": "upscale_hybrid",
    "fc_upsamples": 1,
    "fc_upsample_filters": 512,
    "fc_gblock_depth": 3,
    "fc_gblock_min_nodes": 512,
    "fc_gblock_max_nodes": 512,
    "fc_gblock_filter_slope": -0.5,
    "fc_gblock_dropout": 0.0,
    "dec_upscale_method": "upscale_hybrid",
    "dec_upscales_in_fc": 0,
    "dec_norm": "none",
    "dec_min_filters": 160,
    "dec_max_filters": 640,
    "dec_slope_mode": "full",
    "dec_filter_slope": -0.33,
    "dec_res_blocks": 1,
    "dec_output_kernel": 3,
    "dec_gaussian": true,
    "dec_skip_last_residual": false,
    "freeze_layers": "keras_encoder",
    "load_layers": "encoder",
    "fs_original_depth": 4,
    "fs_original_min_filters": 128,
    "fs_original_max_filters": 1024,
    "fs_original_use_alt": false,
    "mobilenet_width": 1.0,
    "mobilenet_depth": 1,
    "mobilenet_dropout": 0.001,
    "mobilenet_minimalistic": false,
    "__filetype": "faceswap_preset",
    "__section": "train|model|phaze_a"
}
Explanation and a few thoughts:
This was based on the STOJO setting with some of @Icarus's modifications, plus some modifications of my own. It uses EfficientNetV2-L @100, and instead of the Subpixel upscaler that Icarus likes to use, I used Upscale Hybrid to save some VRAM.
The learning rate was based on a ratio formula suggested by @couleurs and @torzdf's original 5e-5. If you would like to use it as a basis for your own learning rates, it looks like this: √(batch size) ÷ 8 × 5 (the square root of your batch size, divided by 8, times 5), in units of 1e-5. So, in this instance, √1 ÷ 8 × 5 = 0.625, i.e. 0.625e-5, or 6.25e-6.
I don't know if I'd have this peer reviewed, but it worked for me. For further reading, see viewtopic.php?t=2083&start=20.
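For anyone who prefers it spelled out in code, here is a minimal sketch of that formula (my own illustration; the function name is made up and this isn't Faceswap code):

Code:
from math import sqrt

# Sketch of the learning-rate formula above: sqrt(batch size) / 8 * 5,
# in units of 1e-5 (the 5 comes from the original 5e-5 baseline).
def base_learning_rate(batch_size):
    return sqrt(batch_size) / 8 * 5 * 1e-5

print(base_learning_rate(1))  # ~6.25e-06, the Alt Learning Rate above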
After finding your base learning rate, you then adjust it by the percent difference in your EGs/sec when adding different losses.
For example, when I added FFL and LPIPS_VGG16 (on top of my MS_SSIM & Logcosh learning rate), there was a rough 26% drop in EGs/sec, so I subtracted 26% from 6.25e-6, which is how I arrived at 4.625e-6. I am not saying this is ideal, just that it's stable on my RTX 3090. It's possible you can push this learning rate higher. (Please report back if you've found better learning rates so everyone can benefit.)
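Put as code, that adjustment looks something like this (again, just an illustrative sketch with a made-up function name):

Code:
# Reduce the base learning rate by the percent drop in EGs/sec observed
# after adding extra loss functions. Illustrative only.
def adjusted_learning_rate(base_lr, eg_drop_percent):
    return base_lr * (1 - eg_drop_percent / 100)

print(adjusted_learning_rate(6.25e-6, 26))  # ~4.625e-06, the Recommended Learning Rate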
Again, I'm not sure this formula is something I'd bring to a PhD in computer science, but it worked for this mathematically challenged person. Maybe it will help you.
I had the G-Block split originally but turned it off due to "googly eyes". Let me know if you have the same problem, and/or how you fixed it.
As a comparison, it took me around 2.1 million iterations (over 9 million EGs) with a slightly modified DNY512 using fs_original, at a (struggling) LR of 1e-5, to reach losses of face_a: 0.05342 / face_b: 0.03503. (The last 100K had No Warp.)
This setting, with EfficientNetV2-L @100, took roughly 700K iterations (1.75 million EGs) to reach the same losses, with IMHO better visual fidelity (the last 100K had No Warp), and the LR gave no warnings, OOMs, or other problems.
If anyone has suggestions for the settings or learning rate, please post a reply. Nothing is set in stone, we are all learning, and we're all building off each other's suggestions. What may seem obvious and silly to you may save hours for newbies and others.