[Guide] Introducing - Phaze-A

Want to understand the training process better? Got tips for which model to use and when? This is the place for you


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for discussing tips and understanding the process involved with Training a Faceswap model.

If you have found a bug or are having issues with the Training process not working, then you should post in the Training Support forum.

Please mark any answers that fixed your problems so others can find the solutions.

MaxHunter
Posts: 193
Joined: Thu May 26, 2022 6:02 am
Has thanked: 176 times
Been thanked: 13 times

Re: [Guide] Introducing - Phaze-A

Post by MaxHunter »

I can't upload the .JSON.

I call this Phaze-A setting "Max-512" (because it's about as far as you can go on a 24GB card), with Mixed Precision turned on.

Size: 23.25GB (according to System Output)

Batch: 1

Recommended Learning Rate: 4.625e-6/-7 (with MS_SSIM@100; LogCosh@50; FFL@100; LPIPS_VGG16@25). EGs/sec: ~2.3

Alt Learning Rate: 6.25e-6/-7 (with MS_SSIM@100; LogCosh@50)

Code:


{
  "output_size": 512,
  "shared_fc": "none",
  "enable_gblock": true,
  "split_fc": true,
  "split_gblock": false,
  "split_decoders": true,
  "enc_architecture": "efficientnet_v2_l",
  "enc_scaling": 100,
  "enc_load_weights": true,
  "bottleneck_type": "dense",
  "bottleneck_norm": "none",
  "bottleneck_size": 512,
  "bottleneck_in_encoder": true,
  "fc_depth": 1,
  "fc_min_filters": 1280,
  "fc_max_filters": 1280,
  "fc_dimensions": 8,
  "fc_filter_slope": -0.5,
  "fc_dropout": 0.0,
  "fc_upsampler": "upscale_hybrid",
  "fc_upsamples": 1,
  "fc_upsample_filters": 512,
  "fc_gblock_depth": 3,
  "fc_gblock_min_nodes": 512,
  "fc_gblock_max_nodes": 512,
  "fc_gblock_filter_slope": -0.5,
  "fc_gblock_dropout": 0.0,
  "dec_upscale_method": "upscale_hybrid",
  "dec_upscales_in_fc": 0,
  "dec_norm": "none",
  "dec_min_filters": 160,
  "dec_max_filters": 640,
  "dec_slope_mode": "full",
  "dec_filter_slope": -0.33,
  "dec_res_blocks": 1,
  "dec_output_kernel": 3,
  "dec_gaussian": true,
  "dec_skip_last_residual": false,
  "freeze_layers": "keras_encoder",
  "load_layers": "encoder",
  "fs_original_depth": 4,
  "fs_original_min_filters": 128,
  "fs_original_max_filters": 1024,
  "fs_original_use_alt": false,
  "mobilenet_width": 1.0,
  "mobilenet_depth": 1,
  "mobilenet_dropout": 0.001,
  "mobilenet_minimalistic": false,
  "__filetype": "faceswap_preset",
  "__section": "train|model|phaze_a"
}

Explanation and a few thoughts:

This was based on the STOJO preset with some of @Icarus's modifications, plus additional modifications of my own. It uses EfficientNetV2-L @100, and instead of the Subpixel upscaler that Icarus likes to use, I used Upscale Hybrid to save some VRAM.

The learning rate was based on a ratio formula suggested by @couleurs, applied to @torzdf's original 5e-5. If you would like to use it as a basis for your own learning rates, it looks like this: √(batch size) ÷ 8 × 5 (the square root of your batch size, divided by 8, times 5). So, in this instance, √1 ÷ 8 × 5 = 0.625, i.e. 0.625e-5 or 6.25e-6.
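As a quick sanity check, here is that rule of thumb as a tiny Python helper (a sketch only: the function name is mine, and treating @torzdf's 5e-5 as a batch-size-8 reference is my reading of the formula, not anything built into Faceswap):

Code:

import math

def base_learning_rate(batch_size: int, reference_lr: float = 5e-5) -> float:
    """Rule of thumb: sqrt(batch size) / 8, scaled by the 5e-5 reference LR."""
    return math.sqrt(batch_size) / 8 * reference_lr

print(base_learning_rate(1))   # 6.25e-06 -> matches the alt LR above
print(base_learning_rate(16))  # 2.5e-05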

I don't know if I'd have this peer reviewed :lol: but it worked for me. For further reading see viewtopic.php?t=2083&start=20.

After finding your base learning rate, you then adjust it by the percentage difference in your EGs/sec when adding different losses.

For example, when I added FFL and LPIPS_VGG16 (on top of my MS_SSIM & LogCosh learning rate) there was roughly a 26% drop in EGs/sec, so I subtracted 26% from 6.25e-6, which is how I arrived at 4.625e-6. I am not saying this is ideal, just that it's stable on my RTX 3090. It's possible you can raise this learning rate. (Please report back if you've found better learning rates so everyone can benefit. 🙂)
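Continuing the sketch above (again, just my own helper, not a Faceswap feature), the EG-based adjustment works out like this:

Code:

def adjusted_learning_rate(base_lr: float, eg_drop_percent: float) -> float:
    """Scale the base LR down by the observed percentage drop in EGs/sec."""
    return base_lr * (1 - eg_drop_percent / 100)

print(adjusted_learning_rate(6.25e-6, 26))  # 4.625e-06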

Again, I'm not sure this formula is something I'd bring to a PhD in computer science, but it worked for this mathematically challenged person. Maybe it will help you.

I had the G-Block split originally, but turned it off due to "googly eyes". Let me know if you have the same problem and/or how you fixed it.

As a comparison, it took me around 2.1 million iterations (over 9 million EGs) with a slightly modified DNY512 w/ fs_original, at a (struggling) LR of 1e-5, to reach losses of face_a: 0.05342 / face_b: 0.03503 (the last 100K with no warp).

This setting, with EfficientNetV2-L @100, took roughly 700K iterations (1.75 million EGs) to reach the same losses with, IMHO, better visual fidelity (last 100K with no warp), and the LR produced no warnings, OOMs, or problems.

If anyone has any suggestions on the settings or learning rate, please post a reply. Nothing is set in stone; we are all learning, and we're all building off each other's suggestions. 🙂 What may seem obvious and silly to you will save hours for newbies and others.

Last edited by MaxHunter on Thu Feb 16, 2023 12:39 am, edited 30 times in total.

torzdf
Posts: 2636
Joined: Fri Jul 12, 2019 12:53 am
Answers: 156
Has thanked: 128 times
Been thanked: 614 times

Re: [Guide] Introducing - Phaze-A

Post by torzdf »

You can save a preset in the Phaze-A config settings and upload it here, if you want.

My word is final

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 215 times

Re: [Guide] Introducing - Phaze-A

Post by bryanlyon »

Also, a note: you can "save draft" at the bottom of a post, which lets you edit it before you decide it's ready to post ;).

MaxHunter
Posts: 193
Joined: Thu May 26, 2022 6:02 am
Has thanked: 176 times
Been thanked: 13 times

Re: [Guide] Introducing - Phaze-A

Post by MaxHunter »

I couldn't upload the JSON, so I just re-edited the above post with final thoughts, and deleted my last post as it was duplicated in the new edit. :)

MaxHunter
Posts: 193
Joined: Thu May 26, 2022 6:02 am
Has thanked: 176 times
Been thanked: 13 times

Re: [Guide] Introducing - Phaze-A

Post by MaxHunter »

@bryanlyon
Yeah, I know, but I was writing the post on my phone and was going to insert the JSON from my computer in another room. I thought it would be just a quick edit, but it turned into a SNAFU of sorts. LOL faceslap Sorry. Story of my life. LOL

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 215 times

Re: [Guide] Introducing - Phaze-A

Post by bryanlyon »

Not a problem, just trying to help you avoid edits on your posts.

torzdf
Posts: 2636
Joined: Fri Jul 12, 2019 12:53 am
Answers: 156
Has thanked: 128 times
Been thanked: 614 times

Re: [Guide] Introducing - Phaze-A

Post by torzdf »

Pro tip: Use a code block and mark it as JSON....

I have to put this in a code block so it doesn't render, but you do it like this:

Code:

```json
{"test": "json"}
```

This will render as

Code:

{"test": "json"}

People can then just press the copy button on the code block.

Last edited by torzdf on Tue Feb 14, 2023 11:40 pm, edited 1 time in total.

My word is final

MaxHunter
Posts: 193
Joined: Thu May 26, 2022 6:02 am
Has thanked: 176 times
Been thanked: 13 times

Re: [Guide] Introducing - Phaze-A

Post by MaxHunter »

@torzdf
Thanks. It took me a few tries to do it and to find the "code box" edit feature, but it looks like we got it now. 😁

Hotel85
Posts: 1
Joined: Sun Sep 17, 2023 3:31 pm

Re: [Guide] Introducing - Phaze-A

Post by Hotel85 »

Hi folks,

Today, I tried out the DFL-SAEHD-DF preset from Phaze A. It's really impressive with an input size of 192 and a batch size of 16 (110 EGs/sec). I was expecting it to be much slower.

And here comes the newbie question:
Why is it that the original DFL-SAE-DF is much slower (80 EGs/sec)?

Thank you

torzdf
Posts: 2636
Joined: Fri Jul 12, 2019 12:53 am
Answers: 156
Has thanked: 128 times
Been thanked: 614 times

Re: [Guide] Introducing - Phaze-A

Post by torzdf »

Without having looked at the actual layouts of the models (I don't use those presets myself), I would guess that the latter is a 'deeper' model. That is, it has more parameters to train.

You can check this yourself by initiating each with the "summary" option checked and looking at the model structures.
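To get a feel for what those summaries show, here is a self-contained toy in Python (these are not the actual preset architectures; the toy_model helper is purely illustrative) demonstrating how depth drives the parameter count, and with it the per-iteration training cost:

Code:

from tensorflow.keras import layers, models

def toy_model(num_blocks: int) -> models.Model:
    # Stack `num_blocks` conv blocks on a 64x64x3 input.
    inp = layers.Input(shape=(64, 64, 3))
    x = inp
    for _ in range(num_blocks):
        x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    out = layers.Conv2D(3, 3, padding="same")(x)
    return models.Model(inp, out)

print(f"shallow params: {toy_model(2).count_params():,}")
print(f"deep params:    {toy_model(8).count_params():,}")  # more parameters -> slower training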

My word is final
