[Guide] Introducing - Phaze-A



Post by Scrapemist »

Wow, very interesting!
Thanks for showcasing the results.

Ryzen1988 wrote: Wed Nov 23, 2022 2:50 am

So my default was a simple network with just split decoders, Adabelief, Layer norm and subpixel.
...
A version with separate fully connected & shared FC with a single decoder.

It's surprising how little difference there is between these two (the first and last in the second row).



Post by ianstephens »

I'm just about to play with the SYM-384 model.

The DNY 1024 is a great model but a very slow burner: the low batch size means a very high iteration count and a lot of time are needed (over a month to get clarity with our tested datasets). I want to play with something that can run on efficientnetv2, as I've had fantastic results in terms of learning speed when testing with the StoJo model.

What would you guys go with in terms of the learning rate for the SYM-384 model?

We're currently set at 2e-5, carried over from the DNY 1024 model, but I can't imagine SYM-384 needing it this low.

Thank you for your suggestions in advance.


Post by torzdf »

If you can get away with not using mixed precision, do, as I had NaN issues with SYM-384 using MP. It resolves nice and quickly but NaNs earlier than I would have liked. Otherwise, I would say start with a fairly low lr. Maybe 3.5e-5, maybe 2e-5. Can't give more guidance beyond that, sadly.
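
For anyone wondering why mixed precision can produce NaNs at all, here is a minimal sketch of one common mechanism: half precision (float16) tops out at 65504, so an intermediate value that is harmless in full precision can overflow to infinity, and arithmetic on infinities then yields NaN. This is plain Python, not Faceswap code; the `to_fp16` helper and the activation value of 300 are made up for illustration, and it is not a claim about what actually triggers the NaNs in SYM-384.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(value: float) -> float:
    """Hypothetical helper: round a float to half precision.

    Values too large for float16 become +/-inf, as they would on hardware.
    """
    try:
        # The "e" struct format is IEEE 754 binary16 (half precision).
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        return math.copysign(math.inf, value)


activation = to_fp16(300.0)                 # illustrative value, fits in float16
squared = to_fp16(activation * activation)  # 90000 exceeds FP16_MAX, becomes inf
difference = squared - squared              # inf - inf is NaN

print(activation, squared, difference)
```

In full precision the same three steps give 300, 90000 and 0, which is why dropping MP (or lowering the learning rate so values stay smaller) can sidestep the problem.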

My word is final


Post by ianstephens »

torzdf wrote: Sat Nov 26, 2022 11:23 am

If you can get away with not using mixed precision, do, as I had NaN issues with SYM-384 using MP. It resolves nice and quickly but NaNs earlier than I would have liked. Otherwise, I would say start with a fairly low lr. Maybe 3.5e-5, maybe 2e-5. Can't give more guidance beyond that, sadly.

Thanks for the reply @torzdf. To be honest, I've currently shelved MP as I got sick of the NaNs. So for me, it will always be FP provided I am able with the current hardware.

I'll settle in the middle at 3e-5 and report back :D


Post by ianstephens »

I, for the life of me, cannot get a SYM-384 model started.

After the first model save (500its) the preview always turns out like this:

[Attachment: Screenshot 2022-11-28 at 10.27.49.png]

I've tried lowering the learning rate as low as 1e-5 and still the same after the first model save (500its).

I've also tried starting the model with ICNR Init and Conv Aware Init disabled but still the same.

Any tips/advice?

Thank you in advance.


Post by MaxHunter »

ianstephens wrote:

I've currently shelved MP as I got sick of the NaNs. So for me, it will always be FP provided I am able with the current hardware

@ianstephens I know you have a big rig, but out of curiosity what is your batch size and how long has it taken you to train a 512 on FP?


Post by ianstephens »

MaxHunter wrote: Mon Nov 28, 2022 5:25 pm

I've currently shelved MP as I got sick of the NaNs. So for me, it will always be FP provided I am able with the current hardware

@ianstephens I know you have a big rig, but out of curiosity what is your batch size and how long has it taken you to train a 512 on FP?

@MaxHunter:

I've not played with the DNY 512 model yet - I have been working with the DNY 1024 model.

For the DNY 1024 model, we managed to get BS 2-3 at FP. However, it was more stable at BS 2, as it sometimes hit OOM at the higher setting.

We ran the model with ~6000 faces on both sides A/B.

It took over four weeks of non-stop training (minus a few hiccups and pauses) and several million iterations to start seeing clarity in the previews. However, on a test convert using a full 1024+ px swap (face taking up most of the frame), it was clear it still had a long way to go. I imagine switching off warp will really bring it together, but we never got that far and it's not quite ready for that yet. I have the model saved and may resume training, and finally train with no warp, once I get some time.

But now I'm playing with the SYM-384 model (finally managed to get it started after some tweaking) and when I woke up this morning after running it through the night I am amazed at how the previews look already. I really do love efficientnetv2. We run this at BS 12.

Hope this helps!


Post by torzdf »

ianstephens wrote: Mon Nov 28, 2022 10:32 am

I, for the life of me, cannot get a SYM-384 model started.

After the first model save (500its) the preview always turns out like this:

I've tried lowering the learning rate as low as 1e-5 and still the same after the first model save (500its).

I've also tried starting the model with ICNR Init and Conv Aware Init disabled but still the same.

Any tips/advice?

Thank you in advance.

Sadly not. I have heard of one other person this has happened to, and I have not been able to replicate it, so I wonder whether it is in some way hardware or dataset related.
