
Training Graph Error since Latest Update

Posted: Fri May 14, 2021 12:04 pm
by ianstephens

We are unable to use training graph data (I believe since the latest update).

When attempting to load up a training graph to see progress, we now get:

Code:

05/14/2021 12:58:56 WARNING  No handles with labels found to put in legend.

We have logs enabled.

We are running a DFL-SAE training session.

It's worth noting that we launched the training session before the update. Perhaps something has changed internally with session logging since the update.

Any help, advice, or fixes are greatly appreciated.


Re: Training Graph Error since Latest Update

Posted: Sat May 15, 2021 10:25 am
by torzdf

As far as I'm aware, graphing should now be relatively bug-free.

The "No handles with labels found" warning basically means there is no data to show. If you start a new session and let it run for a few iterations on the preview window before switching to the graph, does it still fail?


Re: Training Graph Error since Latest Update

Posted: Sat May 15, 2021 11:47 am
by ianstephens

Just updated to the latest version, closed, and reloaded the model state file.

Still getting this (attached).

[Attachment: example-error-logging.jpg]

The model was initialized before the latest updates - perhaps something changed in the state files and the new code is not backwards compatible?


Re: Training Graph Error since Latest Update

Posted: Sat May 15, 2021 12:20 pm
by torzdf

Nothing should have changed there.

Which version of TensorFlow are you currently running?

Are you able to zip up your model folder and provide it to me for analysis?


Re: Training Graph Error since Latest Update

Posted: Sun May 16, 2021 6:28 pm
by ianstephens

Thanks for the reply, @torzdf!

Everything is stock/using default Faceswap bundles. Running on a 2080Ti.

The only difference I can think of with this particular training project is that we are saving every 1000 iterations as opposed to our usual 500. That's the only difference.

Could that be the issue?

I will see what I can do re. the model folder.


Re: Training Graph Error since Latest Update

Posted: Mon May 17, 2021 10:19 am
by torzdf

ianstephens wrote: Sun May 16, 2021 6:28 pm

The only difference I can think of with this particular training project is that we are saving every 1000 iterations as opposed to our usual 500. That's the only difference.

No, it's more likely due to enabling the pointless "multi-scale output" feature in DFL-SAE.


Re: Training Graph Error since Latest Update

Posted: Tue May 18, 2021 11:00 am
by ianstephens

Ah yes, we do have that enabled - "multiscale decoder".

We did with previous projects too though - and graphing seemed to work fine.

Is it possible to issue a patch/update so we are able to view graphs even with multiscale on?

Just about to donate to the project once again :)


Re: Training Graph Error since Latest Update

Posted: Wed May 19, 2021 9:35 am
by torzdf

If you cannot provide the model, then please at least provide the state.json file from your model folder. Graphing with this option enabled works fine for me.


Re: Training Graph Error since Latest Update

Posted: Wed May 19, 2021 11:44 am
by ianstephens

I have just PM'd you - thank you again :)


Re: Training Graph Error since Latest Update

Posted: Sat May 22, 2021 12:48 pm
by ianstephens

I've dug a bit deeper and yes, it seems that with the multiscale decoder enabled, the state file contains multiple face entries per side in the "loss_names" section.

Code:

      "loss_names": [
        "total",
        "face_a_0",
        "face_a_1",
        "face_a_2",
        "face_b_0",
        "face_b_1",
        "face_b_2"
      ],

I believe the grapher only accepts a/b, like:

Code:

      "loss_names": [
        "total",
        "face_a",
        "face_b"
      ],

As there are only loss entries for a/b:

Code:

  "lowest_avg_loss": {
    "a": 0.018949756398797035,
    "b": 0.022925493773072958
  },

What I could try is a search/replace on the state JSON file of the affected multiscale model, removing the numbered loss_names entries and leaving only face_a and face_b, just so we can see the data. Of course, once a session finishes, the next session entry will be written with the multiple face names again.
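For anyone wanting to try the same workaround without hand-editing, here is a minimal Python sketch of that search/replace. It is not part of Faceswap: the function names (collapse_loss_names, rewrite_loss_names, patch_state_file) and the ".bak" backup suffix are made up for illustration, and it assumes the multiscale labels always look like face_a_<number> / face_b_<number> as in the excerpt above. The rest of the state file layout is not assumed, so it simply walks the whole JSON and rewrites every "loss_names" list it finds. It only changes the labels, not any recorded loss values.

```python
import json
import re
import shutil
from pathlib import Path

# Matches multiscale labels such as "face_a_0" and captures the "face_a" part.
# Assumption: labels follow the face_<a|b>_<number> pattern seen in the thread.
_MULTISCALE = re.compile(r"^(face_[ab])_\d+$")


def collapse_loss_names(names):
    """Collapse face_a_0, face_a_1, ... into a single face_a, keeping order."""
    collapsed = []
    for name in names:
        match = _MULTISCALE.match(name)
        base = match.group(1) if match else name
        if base not in collapsed:
            collapsed.append(base)
    return collapsed


def rewrite_loss_names(node):
    """Recursively rewrite every "loss_names" list in the parsed JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "loss_names" and isinstance(value, list):
                node[key] = collapse_loss_names(value)
            else:
                rewrite_loss_names(value)
    elif isinstance(node, list):
        for item in node:
            rewrite_loss_names(item)


def patch_state_file(path):
    """Back up the state file next to itself, then rewrite it in place."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    state = json.loads(path.read_text(encoding="utf-8"))
    rewrite_loss_names(state)
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
```

Usage would be something like patch_state_file("path/to/your_state.json") with training stopped, where the path is a placeholder for the state file in your own model folder. The original is kept alongside it with a .bak suffix, so the change can be undone by copying the backup over the patched file.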


Re: Training Graph Error since Latest Update

Posted: Sat May 22, 2021 6:35 pm
by ianstephens

Using the idea I mentioned above, I modified the JSON data and was finally able to load the graphs:

[Attachment: example-graphs.jpg]

However, of course, this is only a temporary fix to view historic stats when using the DFL-SAE multiscale decoder option. Going forwards, the JSON file would need to be fixed again to view newer sessions.

In the meantime, I can confirm we started another project with multiscale decoder disabled. Logging works with no issues. It was indeed the multiscale decoder option.

I'm not sure if you have plans to patch/fix this but in the meantime, we will run new projects without multiscale enabled.

Side question - there isn't a lot online about the benefits of the multiscale decoder. @torzdf mentioned it was "pointless" - it'd be interesting to know more about this option and what the supposed benefits are meant to be.

We donated to the project again via PayPal a few days ago - thank you for continuing to maintain it - it's a fantastic project.


Re: Training Graph Error since Latest Update

Posted: Sun May 23, 2021 3:11 pm
by torzdf

This should be fixed in the latest update.

I have not dug into it in any detail, but there is no evidence to suggest that multi-scale output helps in any way.