Training Graph Error since Latest Update

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here


Locked
ianstephens
Posts: 117
Joined: Sun Feb 14, 2021 7:20 pm
Has thanked: 12 times
Been thanked: 15 times

Post by ianstephens »

We are unable to use training graph data (I believe since the latest update).

When attempting to load up a training graph to see progress, we now get:

05/14/2021 12:58:56 WARNING  No handles with labels found to put in legend.

We have logs enabled.

We are running a DFL-SAE training session.

It's worth noting that we launched the training session before the update. Perhaps something has changed internally with session logging since the update.

Any help, advice, or fixes are greatly appreciated.

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 622 times

Re: Training Graph Error since Latest Update

Post by torzdf »

As far as I'm aware, graphing should now be relatively bug free.

The "No handles with labels found" warning basically means there is no data to show. If you start a new session and let it run for a few iterations on the preview window before switching to the graph, does it still fail?

My word is final

Post by ianstephens »

Just updated to the latest version, closed, and reloaded the model state file.

Still getting this (attached).

[Attachment: example-error-logging.jpg]

The model was initialized before the latest updates - perhaps something changed in the state files and the new code is not backwards compatible?

Post by torzdf »

Nothing should have changed there.

Which version of Tensorflow are you currently running?

Are you able to zip up your model folder and provide it to me for analysis?

Post by ianstephens »

Thanks for the reply, torzdf!

Everything is stock/using default Faceswap bundles. Running on a 2080Ti.

The only difference I can think of with this particular training project is that we are saving every 1000 iterations as opposed to our usual 500. That's the only difference.

Could that be the issue?

I will see what I can do re. the model folder.

Post by torzdf »

ianstephens wrote: Sun May 16, 2021 6:28 pm

The only difference I can think of with this particular training project is that we are saving every 1000 iterations as opposed to our usual 500. That's the only difference.

No. It's more likely due to enabling the pointless "multi-scale output" feature in DFL-SAE.

Post by ianstephens »

Ah yes, we do have that enabled - "multiscale decoder".

We did with previous projects too though - and graphing seemed to work fine.

Is it possible to issue a patch/update so we are able to view graphs even with multiscale on?

Just about to donate to the project once again :)

Post by torzdf »

If you cannot provide the model, then please at least provide the state.json file from your model folder. Graphing works fine for me.

Post by ianstephens »

I have just PM'd you - thank you again :)

Post by ianstephens »

I've dug a bit deeper and yes, it seems the multiscale decoder writes multiple face entries per side in the "loss_names" section of the state file.

      "loss_names": [
        "total",
        "face_a_0",
        "face_a_1",
        "face_a_2",
        "face_b_0",
        "face_b_1",
        "face_b_2"
      ],

I believe the grapher only accepts a single a/b pair, like this:

      "loss_names": [
        "total",
        "face_a",
        "face_b"
      ],

As there are only loss entries for a/b:

  "lowest_avg_loss": {
    "a": 0.018949756398797035,
    "b": 0.022925493773072958
  },

What I could try is a search/replace on the state JSON file of the multiscale model I have the issue with (removing the multiple loss_names and leaving only face_a and face_b), just so we can see the data. Of course, once a session finishes, the next session entry will be written with the multiple face names again.
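
For anyone who would rather script that search/replace than do it by hand, here is a minimal sketch in Python. It is an illustration under assumptions, not Faceswap code: the helper names (collapse_loss_names, patch_loss_names, patch_state_file) are made up for this example, and the "sessions" layout in the demo is hypothetical. The only things taken from the state file shown above are the "loss_names" key and the "face_a_0"-style entries. The function walks the whole JSON document so it does not depend on where "loss_names" actually sits.

```python
import json
import re
import shutil
from pathlib import Path

# Matches names such as "face_a_0" and captures the base name "face_a".
_INDEXED = re.compile(r"^(?P<base>.+)_\d+$")


def collapse_loss_names(names):
    """Strip trailing numeric indices and drop the duplicates that result."""
    collapsed = []
    for name in names:
        match = _INDEXED.match(name)
        base = match.group("base") if match else name
        if base not in collapsed:
            collapsed.append(base)
    return collapsed


def patch_loss_names(node):
    """Recursively rewrite every "loss_names" list in parsed JSON, in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "loss_names" and isinstance(value, list):
                node[key] = collapse_loss_names(value)
            else:
                patch_loss_names(value)
    elif isinstance(node, list):
        for item in node:
            patch_loss_names(item)
    return node


def patch_state_file(path):
    """Copy the file to <name>.bak, then rewrite it with collapsed loss names."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    state = json.loads(path.read_text(encoding="utf-8"))
    path.write_text(json.dumps(patch_loss_names(state), indent=2), encoding="utf-8")


# Hypothetical layout, for demonstration only; the real file may differ.
example = {"sessions": {"1": {"loss_names": ["total", "face_a_0", "face_a_1",
                                             "face_b_0", "face_b_1"]}}}
print(patch_loss_names(example))
```

To apply it to a real file, call patch_state_file() with the path to the state JSON in the model folder. It writes a .bak copy first, since this edits a file the trainer also writes to. Whether the collapsed names line up correctly with the logged loss data is not something this sketch can guarantee; it only automates the manual edit described above.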

Post by ianstephens »

Using the idea I mentioned above, I modified the JSON data and was finally able to load the graphs:

[Attachment: example-graphs.jpg]

However, of course, this is only a temporary fix to view historic stats when using the DFL-SAE multiscale decoder option. Going forwards, the JSON file would need to be fixed again to view newer sessions.

In the meantime, I can confirm we started another project with multiscale decoder disabled. Logging works with no issues. It was indeed the multiscale decoder option.

I'm not sure if you have plans to patch/fix this but in the meantime, we will run new projects without multiscale enabled.

Side question: there isn't a lot online about the benefits of the multiscale decoder. torzdf mentioned it was "pointless", so it'd be interesting to know more about this option and what its benefits are supposed to be.

We donated to the project again via PayPal a few days ago - thank you for continuing to maintain it - it's a fantastic project.

Post by torzdf »

This should be fixed in the latest update.

I have not dug into it in any detail, but there is no evidence to suggest that multi-scale output helps in any way.
