
Crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Sun Jan 24, 2021 1:09 am
by dedude

I did everything as described (except that I entered the alignments directory manually), but then I got a crash report.

Here is the crash report I received.

Code:

01/24/2021 01:59:03 MainProcess     _run_0                         training_data   _expand_partials               DEBUG    Generating mask. side: 'b', filename: 'C:\Users\hanse\Desktop\df1\faceB\generated(3)(1)_000005_0.png'
01/24/2021 01:59:03 MainProcess     _run_0                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000411_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_0                         training_data   _expand_partials               DEBUG    Generating mask. side: 'b', filename: 'C:\Users\hanse\Desktop\df1\faceB\generated(3)(1)_000085_0.png'
01/24/2021 01:59:03 MainProcess     _run_0                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000399_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_0                         training_data   _expand_partials               DEBUG    Generating mask. side: 'b', filename: 'C:\Users\hanse\Desktop\df1\faceB\generated(3)(1)_000305_0.png'
01/24/2021 01:59:03 MainProcess     _run_0                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000128_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_0                         training_data   _expand_partials               DEBUG    Generating mask. side: 'b', filename: 'C:\Users\hanse\Desktop\df1\faceB\generated(3)(1)_000026_0.png'
01/24/2021 01:59:03 MainProcess     _run_0                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000173_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_0                         training_data   _expand_partials               DEBUG    Generating mask. side: 'b', filename: 'C:\Users\hanse\Desktop\df1\faceB\generated(3)(1)_000032_0.png'
01/24/2021 01:59:03 MainProcess     _run_0                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000215_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_0                         training_data   _expand_partials               DEBUG    Generating mask. side: 'b', filename: 'C:\Users\hanse\Desktop\df1\faceB\generated(3)(1)_000015_0.png'
01/24/2021 01:59:03 MainProcess     _run_0                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000404_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_0                         training_data   _expand_partials               DEBUG    Generating mask. side: 'b', filename: 'C:\Users\hanse\Desktop\df1\faceB\generated(3)(1)_000088_0.png'
01/24/2021 01:59:03 MainProcess     _run_0                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000137_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_0                         training_data   _expand_partials               DEBUG    Generating mask. side: 'b', filename: 'C:\Users\hanse\Desktop\df1\faceB\generated(3)(1)_000035_0.png'
01/24/2021 01:59:03 MainProcess     _run_0                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Mask already generated. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000189_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000023_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000246_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Generating mask. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000089_0.png'
01/24/2021 01:59:03 MainProcess     _run_1                         aligned_face    extract_face                   DEBUG    _extract_face called without a loaded image. Returning empty face.
01/24/2021 01:59:03 MainProcess     _run_1                         training_data   _expand_partials               DEBUG    Mask already generated. side: 'a', filename: 'C:\Users\hanse\Desktop\df1\faceA\jp tiktok_000071_0.png'
01/24/2021 01:59:03 MainProcess     _training_0                    ag_logging      warn                           DEBUG    AutoGraph could not transform <bound method Logger.isEnabledFor of <FaceswapLogger lib.model.losses_tf (DEBUG)>> and will run it as-is.\nPlease report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.\nCause: module 'gast' has no attribute 'Index'\nTo silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
01/24/2021 01:59:03 MainProcess     _training_0                    ag_logging      warn                           DEBUG    AutoGraph could not transform <bound method Logger.findCaller of <FaceswapLogger lib.model.losses_tf (DEBUG)>> and will run it as-is.\nPlease report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.\nCause: module 'gast' has no attribute 'Index'\nTo silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
01/24/2021 01:59:03 MainProcess     _training_0                    ag_logging      warn                           DEBUG    AutoGraph could not transform <bound method Logger.makeRecord of <FaceswapLogger lib.model.losses_tf (DEBUG)>> and will run it as-is.\nPlease report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.\nCause: module 'gast' has no attribute 'Index'\nTo silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
01/24/2021 01:59:03 MainProcess     _training_0                    ag_logging      warn                           DEBUG    AutoGraph could not transform <bound method FaceswapFormatter.format of <lib.logger.FaceswapFormatter object at 0x000001856AC57F70>> and will run it as-is.\nPlease report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.\nCause: module 'gast' has no attribute 'Index'\nTo silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
01/24/2021 01:59:03 MainProcess     _training_0                    api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000018512090D30>, weight: 1.0, mask_channel: 3)
01/24/2021 01:59:04 MainProcess     _training_0                    ag_logging      warn                           DEBUG    AutoGraph could not transform <bound method LossWrapper._apply_mask of <class 'lib.model.losses_tf.LossWrapper'>> and will run it as-is.\nPlease report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.\nCause: module 'gast' has no attribute 'Index'\nTo silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
01/24/2021 01:59:04 MainProcess     _training_0                    ag_logging      warn                           DEBUG    AutoGraph could not transform <bound method DSSIMObjective.call of <lib.model.losses_tf.DSSIMObjective object at 0x0000018575AAA910>> and will run it as-is.\nPlease report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.\nCause: module 'gast' has no attribute 'Index'\nTo silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001851208F820>, weight: 1.0, mask_channel: 3)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185120A21C0>, weight: 3.0, mask_channel: 4)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 4
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185120A2AF0>, weight: 1.0, mask_channel: 1)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 1
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101C3070>, weight: 2.0, mask_channel: 5)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 5
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101C3AC0>, weight: 1.0, mask_channel: 2)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 2
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101CF070>, weight: 1.0, mask_channel: 3)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101CFAC0>, weight: 1.0, mask_channel: 3)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101B0040>, weight: 3.0, mask_channel: 4)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 4
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101B0B20>, weight: 1.0, mask_channel: 1)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 1
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101920A0>, weight: 2.0, mask_channel: 5)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 5
01/24/2021 01:59:04 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000018510192AF0>, weight: 1.0, mask_channel: 2)
01/24/2021 01:59:04 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 2
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000018512090D30>, weight: 1.0, mask_channel: 3)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001851208F820>, weight: 1.0, mask_channel: 3)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185120A21C0>, weight: 3.0, mask_channel: 4)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 4
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185120A2AF0>, weight: 1.0, mask_channel: 1)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 1
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101C3070>, weight: 2.0, mask_channel: 5)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 5
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101C3AC0>, weight: 1.0, mask_channel: 2)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 2
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101CF070>, weight: 1.0, mask_channel: 3)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101CFAC0>, weight: 1.0, mask_channel: 3)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101B0040>, weight: 3.0, mask_channel: 4)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 4
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101B0B20>, weight: 1.0, mask_channel: 1)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 1
01/24/2021 01:59:06 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000185101920A0>, weight: 2.0, mask_channel: 5)
01/24/2021 01:59:06 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 5
01/24/2021 01:59:07 MainProcess     _training_0                    tmp4mdlxfxe     if_body                        DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000018510192AF0>, weight: 1.0, mask_channel: 2)
01/24/2021 01:59:07 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 2
01/24/2021 01:59:10 MainProcess     _training_0                    multithreading  run                            DEBUG    Error in thread (_training_0): 2 root error(s) found.\n  (0) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.\n	 [[node original/encoder/conv_128_0_conv2d/Conv2D (defined at Software\faceswap\plugins\train\trainer\_base.py:238) ]]\n	 [[Func/cond/then/_0/input/_32/_46]]\n  (1) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.\n	 [[node original/encoder/conv_128_0_conv2d/Conv2D (defined at Software\faceswap\plugins\train\trainer\_base.py:238) ]]\n0 successful operations.\n0 derived errors ignored. [Op:__inference_train_function_8926]\n\nFunction call stack:\ntrain_function -> train_function\n
01/24/2021 01:59:11 MainProcess     MainThread                     train           _monitor                       DEBUG    Thread error detected
01/24/2021 01:59:11 MainProcess     MainThread                     train           _monitor                       DEBUG    Closed Monitor
01/24/2021 01:59:11 MainProcess     MainThread                     train           _end_thread                    DEBUG    Ending Training thread
01/24/2021 01:59:11 MainProcess     MainThread                     train           _end_thread                    CRITICAL Error caught! Exiting...
01/24/2021 01:59:11 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Threads: '_training'
01/24/2021 01:59:11 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Thread: '_training_0'
01/24/2021 01:59:11 MainProcess     MainThread                     multithreading  join                           ERROR    Caught exception in thread: '_training_0'
Traceback (most recent call last):
  File "C:\Software\faceswap\lib\cli\launcher.py", line 182, in execute_script
    process.process()
  File "C:\Software\faceswap\scripts\train.py", line 170, in process
    self._end_thread(thread, err)
  File "C:\Software\faceswap\scripts\train.py", line 210, in _end_thread
    thread.join()
  File "C:\Software\faceswap\lib\multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "C:\Software\faceswap\lib\multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Software\faceswap\scripts\train.py", line 232, in _training
    raise err
  File "C:\Software\faceswap\scripts\train.py", line 222, in _training
    self._run_training_cycle(model, trainer)
  File "C:\Software\faceswap\scripts\train.py", line 302, in _run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "C:\Software\faceswap\plugins\train\trainer\_base.py", line 238, in train_one_step
    loss = self._model.model.train_on_batch(model_inputs, y=model_targets)
  File "C:\Users\hanse\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1695, in train_on_batch
    logs = train_function(iterator)
  File "C:\Users\hanse\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\hanse\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\eager\def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "C:\Users\hanse\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\eager\function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "C:\Users\hanse\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\eager\function.py", line 1843, in _filtered_call
    return self._call_flat(
  File "C:\Users\hanse\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\eager\function.py", line 1923, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "C:\Users\hanse\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\eager\function.py", line 545, in call
    outputs = execute.execute(
  File "C:\Users\hanse\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node original/encoder/conv_128_0_conv2d/Conv2D (defined at Software\faceswap\plugins\train\trainer\_base.py:238) ]]
	 [[Func/cond/then/_0/input/_32/_46]]
  (1) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node original/encoder/conv_128_0_conv2d/Conv2D (defined at Software\faceswap\plugins\train\trainer\_base.py:238) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_8926]

Function call stack:
train_function -> train_function


============ System Information ============
encoding:            cp1252
git_branch:          Not Found
git_commits:         Not Found
gpu_cuda:            No global version found. Check Conda packages for Conda Cuda
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: GeForce RTX 2070
gpu_devices_active:  GPU_0
gpu_driver:          461.09
gpu_vram:            GPU_0: 8192MB
os_machine:          AMD64
os_platform:         Windows-10-10.0.18362-SP0
os_release:          10
py_command:          C:\Software\faceswap\faceswap.py train -A C:/Users/hanse/Desktop/df1/faceA -ala C:/Users/hanse/Desktop/df1/jp tiktok_alignments.fsa -B C:/Users/hanse/Desktop/df1/faceB -alb C:/Users/hanse/Desktop/df1/generated(3)(1)_alignments.fsa -m C:/Users/hanse/Desktop/df1/faceAB -t original -bs 12 -it 1000000 -s 250 -ss 25000 -tia C:/Users/hanse/Desktop/df1/faceA -tib C:/Users/hanse/Desktop/df1/faceB -to C:/Users/hanse/Desktop/df1/tl -ps 50 -L INFO -gui
py_conda_version:    conda 4.9.2
py_implementation:   CPython
py_version:          3.8.5
py_virtual_env:      True
sys_cores:           6
sys_processor:       Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
sys_ram:             Total: 16319MB, Available: 7017MB, Used: 9301MB, Free: 7017MB

=============== Pip Packages ===============
absl-py @ file:///tmp/build/80754af9/absl-py_1607439979954/work
aiohttp @ file:///C:/ci/aiohttp_1607109697839/work
astunparse==1.6.3
async-timeout==3.0.1
attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work
blinker==1.4
brotlipy==0.7.0
cachetools @ file:///tmp/build/80754af9/cachetools_1607706694405/work
certifi==2020.12.5
cffi @ file:///C:/ci/cffi_1606255208697/work
chardet @ file:///C:/ci/chardet_1605303225733/work
click @ file:///home/linux1/recipes/ci/click_1610990599742/work
cryptography==2.9.2
cycler==0.10.0
fastcluster==1.1.26
ffmpy==0.2.3
gast @ file:///tmp/build/80754af9/gast_1597433534803/work
google-auth @ file:///tmp/build/80754af9/google-auth_1607969906642/work
google-auth-oauthlib @ file:///tmp/build/80754af9/google-auth-oauthlib_1603929124518/work
google-pasta==0.2.0
grpcio @ file:///C:/ci/grpcio_1597406462198/work
h5py==2.10.0
idna @ file:///home/linux1/recipes/ci/idna_1610986105248/work
imageio @ file:///tmp/build/80754af9/imageio_1594161405741/work
imageio-ffmpeg @ file:///home/conda/feedstock_root/build_artifacts/imageio-ffmpeg_1609799311556/work
importlib-metadata @ file:///tmp/build/80754af9/importlib-metadata_1602276842396/work
joblib @ file:///tmp/build/80754af9/joblib_1607970656719/work
Keras-Applications @ file:///tmp/build/80754af9/keras-applications_1594366238411/work
Keras-Preprocessing==1.1.0
kiwisolver @ file:///C:/ci/kiwisolver_1604014703538/work
Markdown @ file:///C:/ci/markdown_1605111189761/work
matplotlib @ file:///C:/ci/matplotlib-base_1592837548929/work
mkl-fft==1.2.0
mkl-random==1.1.1
mkl-service==2.3.0
multidict @ file:///C:/ci/multidict_1600456481656/work
numpy @ file:///C:/ci/numpy_and_numpy_base_1603466732592/work
nvidia-ml-py3 @ git+https://github.com/deepfakes/nvidia-ml-py3.git@6fc29ac84b32bad877f078cb4a777c1548a00bf6
oauthlib==3.1.0
olefile==0.46
opencv-python==4.5.1.48
opt-einsum==3.1.0
pathlib==1.0.1
Pillow @ file:///C:/ci/pillow_1609786840597/work
protobuf==3.13.0
psutil @ file:///C:/ci/psutil_1598370330503/work
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
PyJWT @ file:///C:/ci/pyjwt_1610893382614/work
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1608057966937/work
pyparsing @ file:///home/linux1/recipes/ci/pyparsing_1610983426697/work
pyreadline==2.1
PySocks @ file:///C:/ci/pysocks_1605287845585/work
python-dateutil==2.8.1
pywin32==227
requests @ file:///tmp/build/80754af9/requests_1608241421344/work
requests-oauthlib==1.3.0
rsa @ file:///tmp/build/80754af9/rsa_1610483308194/work
scikit-learn @ file:///C:/ci/scikit-learn_1598377018496/work
scipy @ file:///C:/ci/scipy_1604596260408/work
sip==4.19.13
six @ file:///C:/ci/six_1605187374963/work
tensorboard @ file:///home/builder/ktietz/conda/conda-bld/tensorboard_1604313476433/work/tmp_pip_dir
tensorboard-plugin-wit==1.6.0
tensorflow==2.3.0
tensorflow-estimator @ file:///tmp/build/80754af9/tensorflow-estimator_1599136169057/work/whl_temp/tensorflow_estimator-2.3.0-py2.py3-none-any.whl
termcolor==1.1.0
threadpoolctl @ file:///tmp/tmp9twdgx9k/threadpoolctl-2.1.0-py3-none-any.whl
tornado @ file:///C:/ci/tornado_1606942392901/work
tqdm @ file:///tmp/build/80754af9/tqdm_1609788246169/work
typing-extensions @ file:///tmp/build/80754af9/typing_extensions_1598376058250/work
urllib3 @ file:///tmp/build/80754af9/urllib3_1606938623459/work
Werkzeug==1.0.1
win-inet-pton @ file:///C:/ci/win_inet_pton_1605306167264/work
wincertstore==0.2
wrapt==1.12.1
yarl @ file:///C:/ci/yarl_1598045274898/work
zipp @ file:///tmp/build/80754af9/zipp_1604001098328/work

============== Conda Packages ==============
# packages in environment at C:\Users\hanse\MiniConda3\envs\faceswap:
#
# Name                    Version                   Build  Channel
_tflow_select             2.3.0                       gpu  
absl-py                   0.11.0             pyhd3eb1b0_1  
aiohttp                   3.7.3            py38h2bbff1b_1  
astunparse                1.6.3                      py_0  
async-timeout             3.0.1                    py38_0  
attrs                     20.3.0             pyhd3eb1b0_0  
blas                      1.0                         mkl  
blinker                   1.4                      py38_0  
brotlipy                  0.7.0           py38h2bbff1b_1003  
ca-certificates           2021.1.19            haa95532_0  
cachetools                4.2.0              pyhd3eb1b0_0  
certifi                   2020.12.5        py38haa95532_0  
cffi                      1.14.4           py38hcd4344a_0  
chardet                   3.0.4           py38haa95532_1003  
click                     7.1.2              pyhd3eb1b0_0  
cryptography              2.9.2            py38h7a1dbc1_0  
cudatoolkit               10.1.243             h74a9793_0  
cudnn                     7.6.5                cuda10.1_0  
cycler                    0.10.0                   py38_0  
fastcluster               1.1.26           py38h251f6bf_2    conda-forge
ffmpeg                    4.3.1                ha925a31_0    conda-forge
ffmpy                     0.2.3                    pypi_0    pypi
freetype                  2.10.4               hd328e21_0  
gast                      0.4.0                      py_0  
git                       2.23.0               h6bb4b03_0  
google-auth               1.24.0             pyhd3eb1b0_0  
google-auth-oauthlib      0.4.2              pyhd3eb1b0_2  
google-pasta              0.2.0                      py_0  
grpcio                    1.31.0           py38he7da953_0  
h5py                      2.10.0           py38h5e291fa_0  
hdf5                      1.10.4               h7ebc959_0  
icc_rt                    2019.0.0             h0cc432a_1  
icu                       58.2                 ha925a31_3  
idna                      2.10               pyhd3eb1b0_0  
imageio                   2.9.0                      py_0  
imageio-ffmpeg            0.4.3              pyhd8ed1ab_0    conda-forge
importlib-metadata        2.0.0                      py_1  
intel-openmp              2020.2                      254  
joblib                    1.0.0              pyhd3eb1b0_0  
jpeg                      9b                   hb83a4c4_2  
keras-applications        1.0.8                      py_1  
keras-preprocessing       1.1.0                      py_1  
kiwisolver                1.3.0            py38hd77b12b_0  
libpng                    1.6.37               h2a8f88b_0  
libprotobuf               3.13.0.1             h200bbdf_0  
libtiff                   4.1.0                h56a325e_1  
lz4-c                     1.9.3                h2bbff1b_0  
markdown                  3.3.3            py38haa95532_0  
matplotlib                3.2.2                         0  
matplotlib-base           3.2.2            py38h64f37c6_0  
mkl                       2020.2                      256  
mkl-service               2.3.0            py38h196d8e1_0  
mkl_fft                   1.2.0            py38h45dec08_0  
mkl_random                1.1.1            py38h47e9c7a_0  
multidict                 4.7.6            py38he774522_1  
numpy                     1.19.2           py38hadc3359_0  
numpy-base                1.19.2           py38ha3acd2a_0  
nvidia-ml-py3             7.352.1                  pypi_0    pypi
oauthlib                  3.1.0                      py_0  
olefile                   0.46                       py_0  
opencv-python             4.5.1.48                 pypi_0    pypi
openssl                   1.1.1i               h2bbff1b_0  
opt_einsum                3.1.0                      py_0  
pathlib                   1.0.1                      py_1  
pillow                    8.1.0            py38h4fa10fc_0  
pip                       20.3.3           py38haa95532_0  
protobuf                  3.13.0.1         py38ha925a31_1  
psutil                    5.7.2            py38he774522_0  
pyasn1                    0.4.8                      py_0  
pyasn1-modules            0.2.8                      py_0  
pycparser                 2.20                       py_2  
pyjwt                     2.0.1            py38haa95532_0  
pyopenssl                 20.0.1             pyhd3eb1b0_1  
pyparsing                 2.4.7              pyhd3eb1b0_0  
pyqt                      5.9.2            py38ha925a31_4  
pyreadline                2.1                      py38_1  
pysocks                   1.7.1            py38haa95532_0  
python                    3.8.5                h5fd99cc_1  
python-dateutil           2.8.1                      py_0  
python_abi                3.8                      1_cp38    conda-forge
pywin32                   227              py38he774522_1  
qt                        5.9.7            vc14h73c81de_0  
requests                  2.25.1             pyhd3eb1b0_0  
requests-oauthlib         1.3.0                      py_0  
rsa                       4.7                pyhd3eb1b0_1  
scikit-learn              0.23.2           py38h47e9c7a_0  
scipy                     1.5.2            py38h14eb087_0  
setuptools                51.3.3           py38haa95532_4  
sip                       4.19.13          py38ha925a31_0  
six                       1.15.0           py38haa95532_0  
sqlite                    3.33.0               h2a8f88b_0  
tensorboard               2.3.0              pyh4dce500_0  
tensorboard-plugin-wit    1.6.0                      py_0  
tensorflow                2.3.0           mkl_py38h1fcfbd6_0  
tensorflow-base           2.3.0           gpu_py38h7339f5a_0  
tensorflow-estimator      2.3.0              pyheb71bc4_0  
tensorflow-gpu            2.3.0                he13fc11_0  
termcolor                 1.1.0                    py38_1  
threadpoolctl             2.1.0              pyh5ca1d4c_0  
tk                        8.6.10               he774522_0  
tornado                   6.1              py38h2bbff1b_0  
tqdm                      4.55.1             pyhd3eb1b0_0  
typing-extensions         3.7.4.3                       0  
typing_extensions         3.7.4.3                    py_0  
urllib3                   1.26.2             pyhd3eb1b0_0  
vc                        14.2                 h21ff451_1  
vs2015_runtime            14.27.29016          h5e58377_2  
werkzeug                  1.0.1                      py_0  
wheel                     0.36.2             pyhd3eb1b0_0  
win_inet_pton             1.1.0            py38haa95532_0  
wincertstore              0.2                      py38_0  
wrapt                     1.12.1           py38he774522_1  
xz                        5.2.5                h62dcd97_0  
yarl                      1.5.1            py38he774522_0  
zipp                      3.4.0              pyhd3eb1b0_0  
zlib                      1.2.11               h62dcd97_4  
zstd                      1.4.5                h04227a9_0  

================= Configs ==================
--------- .faceswap ---------
backend:                  nvidia

--------- convert.ini ---------

[color.color_transfer]
clip:                     True
preserve_paper:           True

[color.manual_balance]
colorspace:               HSV
balance_1:                0.0
balance_2:                0.0
balance_3:                0.0
contrast:                 0.0
brightness:               0.0

[color.match_hist]
threshold:                99.0

[mask.box_blend]
type:                     gaussian
distance:                 11.0
radius:                   5.0
passes:                   1

[mask.mask_blend]
type:                     normalized
kernel_size:              3
passes:                   4
threshold:                4
erosion:                  0.0

[scaling.sharpen]
method:                   none
amount:                   150
radius:                   0.3
threshold:                5.0

[writer.ffmpeg]
container:                mp4
codec:                    libx264
crf:                      23
preset:                   medium
tune:                     none
profile:                  auto
level:                    auto
skip_mux:                 False

[writer.gif]
fps:                      25
loop:                     0
palettesize:              256
subrectangles:            False

[writer.opencv]
format:                   png
draw_transparent:         False
jpg_quality:              75
png_compress_level:       3

[writer.pillow]
format:                   png
draw_transparent:         False
optimize:                 False
gif_interlace:            True
jpg_quality:              75
png_compress_level:       3
tif_compression:          tiff_deflate

--------- extract.ini ---------

[global]
allow_growth:             True

[align.fan]
batch-size:               12

[detect.cv2_dnn]
confidence:               50

[detect.mtcnn]
minsize:                  20
threshold_1:              0.6
threshold_2:              0.7
threshold_3:              0.7
scalefactor:              0.709
batch-size:               8

[detect.s3fd]
confidence:               70
batch-size:               1

[mask.unet_dfl]
batch-size:               8

[mask.vgg_clear]
batch-size:               6

[mask.vgg_obstructed]
batch-size:               2

--------- gui.ini ---------

[global]
fullscreen:               False
tab:                      extract
options_panel_width:      30
console_panel_height:     20
icon_size:                14
font:                     default
font_size:                9
autosave_last_session:    prompt
timeout:                  120
auto_load_model_stats:    True

--------- train.ini ---------

[global]
centering:                face
coverage:                 68.75
icnr_init:                False
conv_aware_init:          False
optimizer:                adam
learning_rate:            5e-05
reflect_padding:          False
allow_growth:             False
mixed_precision:          False
convert_batchsize:        16

[global.loss]
loss_function:            ssim
mask_loss_function:       mse
l2_reg_term:              100
eye_multiplier:           3
mouth_multiplier:         2
penalized_mask_loss:      True
mask_type:                extended
mask_blur_kernel:         3
mask_threshold:           4
learn_mask:               False

[model.dfaker]
output_size:              128

[model.dfl_h128]
lowmem:                   False

[model.dfl_sae]
input_size:               128
clipnorm:                 True
architecture:             df
autoencoder_dims:         0
encoder_dims:             42
decoder_dims:             21
multiscale_decoder:       False

[model.dlight]
features:                 best
details:                  good
output_size:              256

[model.original]
lowmem:                   False

[model.realface]
input_size:               64
output_size:              128
dense_nodes:              1536
complexity_encoder:       128
complexity_decoder:       512

[model.unbalanced]
input_size:               128
lowmem:                   False
clipnorm:                 True
nodes:                    1024
complexity_encoder:       128
complexity_decoder_a:     384
complexity_decoder_b:     512

[model.villain]
lowmem:                   False

[trainer.original]
preview_images:           14
zoom_amount:              5
rotation_range:           10
shift_range:              5
flip_chance:              50
disable_warp:             False
color_lightness:          30
color_ab:                 8
color_clahe_chance:       50
color_clahe_max_size:     4

Re: crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Mon Jan 25, 2021 11:13 am
by torzdf

Go to Tools > Settings > Train and enable "Allow Growth".
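For reference, this toggles the allow_growth option in the [global] section of train.ini (shown as False in the config dump above). With it enabled, the section would read:

Code: Select all

```
[global]
centering:                face
coverage:                 68.75
icnr_init:                False
conv_aware_init:          False
optimizer:                adam
learning_rate:            5e-05
reflect_padding:          False
allow_growth:             True
mixed_precision:          False
convert_batchsize:        16
```

This makes TensorFlow allocate GPU memory incrementally instead of reserving it all up front, which works around the "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR" failure on some cards/driver combinations.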


Caught exception in thread: '_training_0'

Posted: Sun Jul 25, 2021 7:30 am
by lamakaha

Hi folks, I'm taking my first steps with faceswap and hitting a wall here.

When starting the training process, I get the following error:

Code: Select all

Loading...
Setting Faceswap backend to NVIDIA
07/25/2021 09:14:05 INFO     Log level set to: INFO
07/25/2021 09:14:07 INFO     Model A Directory: '/media/lamakaha/work/projects/deepfake/06_Renders/SanneFace' (156 images)
07/25/2021 09:14:07 INFO     Model B Directory: '/media/lamakaha/work/projects/deepfake/06_Renders/AlexeyFace' (758 images)
07/25/2021 09:14:07 WARNING  At least one of your input folders contains fewer than 250 images. Results are likely to be poor.
07/25/2021 09:14:07 WARNING  You need to provide a significant number of images to successfully train a Neural Network. Aim for between 500 - 5000 images per side.
07/25/2021 09:14:07 INFO     Training data directory: /media/lamakaha/work/projects/deepfake/03_Data/model_v001
07/25/2021 09:14:07 INFO     ===================================================
07/25/2021 09:14:07 INFO       Starting
07/25/2021 09:14:07 INFO       Press 'Stop' to save and quit
07/25/2021 09:14:07 INFO     ===================================================
07/25/2021 09:14:08 INFO     Loading data, this may take a while...
07/25/2021 09:14:08 INFO     Loading Model from Original plugin...
07/25/2021 09:14:08 INFO     No existing state file found. Generating.
07/25/2021 09:14:10 INFO     Loading Trainer from Original plugin...
2021-07-25 09:14:17.532879: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 09:14:17.534480: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 09:14:17.535888: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 09:14:17.537152: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
07/25/2021 09:14:18 CRITICAL Error caught! Exiting...
07/25/2021 09:14:18 ERROR    Caught exception in thread: '_training_0'
07/25/2021 09:14:20 ERROR    Got Exception on main handler:
Traceback (most recent call last):
  File "/home/lamakaha/faceswap/lib/cli/launcher.py", line 182, in execute_script
    process.process()
  File "/home/lamakaha/faceswap/scripts/train.py", line 190, in process
    self._end_thread(thread, err)
  File "/home/lamakaha/faceswap/scripts/train.py", line 230, in _end_thread
    thread.join()
  File "/home/lamakaha/faceswap/lib/multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "/home/lamakaha/faceswap/lib/multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lamakaha/faceswap/scripts/train.py", line 252, in _training
    raise err
  File "/home/lamakaha/faceswap/scripts/train.py", line 242, in _training
    self._run_training_cycle(model, trainer)
  File "/home/lamakaha/faceswap/scripts/train.py", line 327, in _run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "/home/lamakaha/faceswap/plugins/train/trainer/_base.py", line 193, in train_one_step
    loss = self._model.model.train_on_batch(model_inputs, y=model_targets)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1348, in train_on_batch
    logs = train_function(iterator)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 644, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1661, in _filtered_call
    return self._call_flat(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1745, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 593, in call
    outputs = execute.execute(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node original/encoder_1/conv_128_0_conv2d/Conv2D (defined at /faceswap/plugins/train/trainer/_base.py:193) ]] [Op:__inference_train_function_8730]

Function call stack:
train_function

07/25/2021 09:14:20 CRITICAL An unexpected crash has occurred. Crash report written to '/home/lamakaha/faceswap/crash_report.2021.07.25.091418406632.log'. You MUST provide this file if seeking assistance. Please verify you are running the latest version of faceswap before reporting
Process exited.

full log is here

Code: Select all

07/25/2021 09:14:10 MainProcess     _training_0                    _base           _set_preview_feed              DEBUG    Setting preview feed: (side: 'b')
07/25/2021 09:14:10 MainProcess     _training_0                    _base           _load_generator                DEBUG    Loading generator
07/25/2021 09:14:10 MainProcess     _training_0                    _base           _load_generator                DEBUG    input_size: 64, output_shapes: [(64, 64, 3)]
07/25/2021 09:14:10 MainProcess     _training_0                    generator       __init__                       DEBUG    Initializing TrainingDataGenerator: (model_input_size: 64, model_output_shapes: [(64, 64, 3)], coverage_ratio: 0.6875, color_order: bgr, augment_color: True, no_flip: False, no_warp: False, warp_to_landmarks: False, config: {'centering': 'face', 'coverage': 68.75, 'icnr_init': False, 'conv_aware_init': False, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'reflect_padding': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'mask_loss_function': 'mse', 'l2_reg_term': 100, 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'preview_images': 14, 'zoom_amount': 5, 'rotation_range': 10, 'shift_range': 5, 'flip_chance': 50, 'color_lightness': 30, 'color_ab': 8, 'color_clahe_chance': 50, 'color_clahe_max_size': 4})
07/25/2021 09:14:10 MainProcess     _training_0                    generator       __init__                       DEBUG    Initialized TrainingDataGenerator
07/25/2021 09:14:10 MainProcess     _training_0                    generator       minibatch_ab                   DEBUG    Queue batches: (image_count: 758, batchsize: 14, side: 'b', do_shuffle: True, is_preview, True, is_timelapse: False)
07/25/2021 09:14:10 MainProcess     _training_0                    augmentation    __init__                       DEBUG    Initializing ImageAugmentation: (batchsize: 14, is_display: True, input_size: 64, output_shapes: [(64, 64, 3)], coverage_ratio: 0.6875, config: {'centering': 'face', 'coverage': 68.75, 'icnr_init': False, 'conv_aware_init': False, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'reflect_padding': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'mask_loss_function': 'mse', 'l2_reg_term': 100, 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'preview_images': 14, 'zoom_amount': 5, 'rotation_range': 10, 'shift_range': 5, 'flip_chance': 50, 'color_lightness': 30, 'color_ab': 8, 'color_clahe_chance': 50, 'color_clahe_max_size': 4})
07/25/2021 09:14:10 MainProcess     _training_0                    augmentation    __init__                       DEBUG    Output sizes: [64]
07/25/2021 09:14:10 MainProcess     _training_0                    augmentation    __init__                       DEBUG    Initialized ImageAugmentation
07/25/2021 09:14:10 MainProcess     _training_0                    multithreading  __init__                       DEBUG    Initializing BackgroundGenerator: (target: '_run', thread_count: 2)
07/25/2021 09:14:10 MainProcess     _training_0                    multithreading  __init__                       DEBUG    Initialized BackgroundGenerator: '_run'
07/25/2021 09:14:10 MainProcess     _training_0                    multithreading  start                          DEBUG    Starting thread(s): '_run'
07/25/2021 09:14:10 MainProcess     _training_0                    multithreading  start                          DEBUG    Starting thread 1 of 2: '_run_0'
07/25/2021 09:14:10 MainProcess     _run_0                         generator       _minibatch                     DEBUG    Loading minibatch generator: (image_count: 758, side: 'b', do_shuffle: True)
07/25/2021 09:14:10 MainProcess     _training_0                    multithreading  start                          DEBUG    Starting thread 2 of 2: '_run_1'
07/25/2021 09:14:10 MainProcess     _run_1                         generator       _minibatch                     DEBUG    Loading minibatch generator: (image_count: 758, side: 'b', do_shuffle: True)
07/25/2021 09:14:10 MainProcess     _training_0                    multithreading  start                          DEBUG    Started all threads '_run': 2
07/25/2021 09:14:10 MainProcess     _training_0                    _base           _set_preview_feed              DEBUG    Set preview feed. Batchsize: 14
07/25/2021 09:14:10 MainProcess     _training_0                    _base           __init__                       DEBUG    Initialized _Feeder:
07/25/2021 09:14:10 MainProcess     _training_0                    _base           _set_tensorboard               DEBUG    Enabling TensorBoard Logging
07/25/2021 09:14:10 MainProcess     _training_0                    _base           _set_tensorboard               DEBUG    Setting up TensorBoard Logging
07/25/2021 09:14:10 MainProcess     _run_0                         generator       _validate_version              DEBUG    Setting initial extract version: 2.2
07/25/2021 09:14:10 MainProcess     _training_0                    _base           _set_tensorboard               VERBOSE  Enabled TensorBoard Logging
07/25/2021 09:14:10 MainProcess     _training_0                    _base           __init__                       DEBUG    Initializing _Samples: model: '<plugins.train.model.original.Model object at 0x7f5157a24df0>', coverage_ratio: 0.6875)
07/25/2021 09:14:10 MainProcess     _training_0                    _base           __init__                       DEBUG    Initialized _Samples
07/25/2021 09:14:10 MainProcess     _training_0                    _base           __init__                       DEBUG    Initializing _Timelapse: model: <plugins.train.model.original.Model object at 0x7f5157a24df0>, coverage_ratio: 0.6875, image_count: 14, feeder: '<plugins.train.trainer._base._Feeder object at 0x7f51571a49d0>', image_paths: 2)
07/25/2021 09:14:10 MainProcess     _training_0                    _base           __init__                       DEBUG    Initializing _Samples: model: '<plugins.train.model.original.Model object at 0x7f5157a24df0>', coverage_ratio: 0.6875)
07/25/2021 09:14:10 MainProcess     _training_0                    _base           __init__                       DEBUG    Initialized _Samples
07/25/2021 09:14:10 MainProcess     _training_0                    _base           __init__                       DEBUG    Initialized _Timelapse
07/25/2021 09:14:10 MainProcess     _training_0                    _base           __init__                       DEBUG    Initialized Trainer
07/25/2021 09:14:10 MainProcess     _training_0                    train           _load_trainer                  DEBUG    Loaded Trainer
07/25/2021 09:14:10 MainProcess     _training_0                    train           _run_training_cycle            DEBUG    Running Training Cycle
07/25/2021 09:14:10 MainProcess     _run_0                         augmentation    initialize                     DEBUG    Initializing constants. training_size: 384
07/25/2021 09:14:10 MainProcess     _run_0                         augmentation    initialize                     DEBUG    Initialized constants: {'clahe_base_contrast': 3, 'tgt_slices': slice(60, 324, None), 'warp_mapx': '[[[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]]', 'warp_mapy': '[[[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]]', 'warp_pad': 80, 'warp_slices': slice(8, -8, None), 'warp_lm_edge_anchors': '[[[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]]', 'warp_lm_grids': '[[[  0.   0.   0. ...   0.   0.   0.]\n  [  1.   1.   1. ...   1.   1.   1.]\n  [  2.   2.   2. ...   2.   2.   2.]\n  ...\n  [381. 381. 381. ... 381. 381. 381.]\n  [382. 382. 382. ... 382. 382. 382.]\n  [383. 383. 383. ... 383. 383. 383.]]\n\n [[  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  ...\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]]]'}
07/25/2021 09:14:10 MainProcess     _run_0                         generator       cache_metadata                 DEBUG    All metadata already cached for: ['Sanne_000104_0.png', 'Sanne_000152_0.png']
07/25/2021 09:14:10 MainProcess     _run_0                         generator       _validate_version              DEBUG    Setting initial extract version: 2.2
07/25/2021 09:14:10 MainProcess     _run_1                         augmentation    initialize                     DEBUG    Initializing constants. training_size: 384
07/25/2021 09:14:10 MainProcess     _run_1                         augmentation    initialize                     DEBUG    Initialized constants: {'clahe_base_contrast': 3, 'tgt_slices': slice(60, 324, None), 'warp_mapx': '[[[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 
324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]]', 'warp_mapy': '[[[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 
192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]]', 'warp_pad': 80, 'warp_slices': slice(8, -8, None), 'warp_lm_edge_anchors': '[[[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 
191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]]', 'warp_lm_grids': '[[[  0.   0.   0. ...   0.   0.   0.]\n  [  1.   1.   1. ...   1.   1.   1.]\n  [  2.   2.   2. ...   2.   2.   2.]\n  ...\n  [381. 381. 381. ... 381. 381. 381.]\n  [382. 382. 382. ... 382. 382. 382.]\n  [383. 383. 383. ... 383. 383. 383.]]\n\n [[  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  ...\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]]]'}
07/25/2021 09:14:10 MainProcess     _run_0                         augmentation    initialize                     DEBUG    Initializing constants. training_size: 384
07/25/2021 09:14:10 MainProcess     _run_0                         augmentation    initialize                     DEBUG    Initialized constants: {'clahe_base_contrast': 3, 'tgt_slices': slice(60, 324, None), 'warp_mapx': '[[[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]]', 'warp_mapy': '[[[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]]', 'warp_pad': 80, 'warp_slices': slice(8, -8, None), 'warp_lm_edge_anchors': '[[[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]]', 'warp_lm_grids': '[[[  0.   0.   0. ...   0.   0.   0.]\n  [  1.   1.   1. ...   1.   1.   1.]\n  [  2.   2.   2. ...   2.   2.   2.]\n  ...\n  [381. 381. 381. ... 381. 381. 381.]\n  [382. 382. 382. ... 382. 382. 382.]\n  [383. 383. 383. ... 383. 383. 383.]]\n\n [[  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  ...\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]]]'}
07/25/2021 09:14:10 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f51541a33a0>, weight: 1.0, mask_channel: 3)
07/25/2021 09:14:10 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
07/25/2021 09:14:11 MainProcess     _run_1                         generator       cache_metadata                 DEBUG    All metadata already cached for: ['Sanne_000104_0.png', 'Sanne_000152_0.png']
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f51541a34c0>, weight: 1.0, mask_channel: 3)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f5154202760>, weight: 3.0, mask_channel: 4)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 4
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f5154202e80>, weight: 1.0, mask_channel: 1)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 1
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515421aa60>, weight: 2.0, mask_channel: 5)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 5
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515421a970>, weight: 1.0, mask_channel: 2)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 2
07/25/2021 09:14:11 MainProcess     _run_0                         augmentation    initialize                     DEBUG    Initializing constants. training_size: 384
07/25/2021 09:14:11 MainProcess     _run_0                         augmentation    initialize                     DEBUG    Initialized constants: {'clahe_base_contrast': 3, 'tgt_slices': slice(60, 324, None), 'warp_mapx': '[[[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 
324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]\n\n [[ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]\n  [ 60. 126. 192. 258. 324.]]]', 'warp_mapy': '[[[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 
192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]\n\n [[ 60.  60.  60.  60.  60.]\n  [126. 126. 126. 126. 126.]\n  [192. 192. 192. 192. 192.]\n  [258. 258. 258. 258. 258.]\n  [324. 324. 324. 324. 324.]]]', 'warp_pad': 80, 'warp_slices': slice(8, -8, None), 'warp_lm_edge_anchors': '[[[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 
191]]\n\n [[  0   0]\n  [  0 383]\n  [383 383]\n  [383   0]\n  [191   0]\n  [191 383]\n  [383 191]\n  [  0 191]]]', 'warp_lm_grids': '[[[  0.   0.   0. ...   0.   0.   0.]\n  [  1.   1.   1. ...   1.   1.   1.]\n  [  2.   2.   2. ...   2.   2.   2.]\n  ...\n  [381. 381. 381. ... 381. 381. 381.]\n  [382. 382. 382. ... 382. 382. 382.]\n  [383. 383. 383. ... 383. 383. 383.]]\n\n [[  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  ...\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]\n  [  0.   1.   2. ... 381. 382. 383.]]]'}
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f51541c1520>, weight: 1.0, mask_channel: 3)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f51541c12e0>, weight: 1.0, mask_channel: 3)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
07/25/2021 09:14:11 MainProcess     _run_1                         generator       cache_metadata                 DEBUG    All metadata already cached for: ['Alexey_000731_0.png', 'Alexey_000708_0.png']
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515414e5e0>, weight: 3.0, mask_channel: 4)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 4
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515414ea90>, weight: 1.0, mask_channel: 1)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 1
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515415a0a0>, weight: 2.0, mask_channel: 5)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 5
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515415a5b0>, weight: 1.0, mask_channel: 2)
07/25/2021 09:14:11 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 2
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f51541a33a0>, weight: 1.0, mask_channel: 3)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f51541a34c0>, weight: 1.0, mask_channel: 3)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f5154202760>, weight: 3.0, mask_channel: 4)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 4
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f5154202e80>, weight: 1.0, mask_channel: 1)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 1
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515421aa60>, weight: 2.0, mask_channel: 5)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 5
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515421a970>, weight: 1.0, mask_channel: 2)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 2
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f51541c1520>, weight: 1.0, mask_channel: 3)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f51541c12e0>, weight: 1.0, mask_channel: 3)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 3
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515414e5e0>, weight: 3.0, mask_channel: 4)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 4
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515414ea90>, weight: 1.0, mask_channel: 1)
07/25/2021 09:14:13 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 1
07/25/2021 09:14:14 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515415a0a0>, weight: 2.0, mask_channel: 5)
07/25/2021 09:14:14 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 5
07/25/2021 09:14:14 MainProcess     _training_0                    losses_tf       call                           DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x7f515415a5b0>, weight: 1.0, mask_channel: 2)
07/25/2021 09:14:14 MainProcess     _training_0                    losses_tf       _apply_mask                    DEBUG    Applying mask from channel 2
07/25/2021 09:14:17 MainProcess     _training_0                    multithreading  run                            DEBUG    Error in thread (_training_0):  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.\n	 [[node original/encoder_1/conv_128_0_conv2d/Conv2D (defined at /faceswap/plugins/train/trainer/_base.py:193) ]] [Op:__inference_train_function_8730]\n\nFunction call stack:\ntrain_function\n
07/25/2021 09:14:18 MainProcess     MainThread                     train           _monitor                       DEBUG    Thread error detected
07/25/2021 09:14:18 MainProcess     MainThread                     train           _monitor                       DEBUG    Closed Monitor
07/25/2021 09:14:18 MainProcess     MainThread                     train           _end_thread                    DEBUG    Ending Training thread
07/25/2021 09:14:18 MainProcess     MainThread                     train           _end_thread                    CRITICAL Error caught! Exiting...
07/25/2021 09:14:18 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Threads: '_training'
07/25/2021 09:14:18 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Thread: '_training_0'
07/25/2021 09:14:18 MainProcess     MainThread                     multithreading  join                           ERROR    Caught exception in thread: '_training_0'
Traceback (most recent call last):
  File "/home/lamakaha/faceswap/lib/cli/launcher.py", line 182, in execute_script
    process.process()
  File "/home/lamakaha/faceswap/scripts/train.py", line 190, in process
    self._end_thread(thread, err)
  File "/home/lamakaha/faceswap/scripts/train.py", line 230, in _end_thread
    thread.join()
  File "/home/lamakaha/faceswap/lib/multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "/home/lamakaha/faceswap/lib/multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lamakaha/faceswap/scripts/train.py", line 252, in _training
    raise err
  File "/home/lamakaha/faceswap/scripts/train.py", line 242, in _training
    self._run_training_cycle(model, trainer)
  File "/home/lamakaha/faceswap/scripts/train.py", line 327, in _run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "/home/lamakaha/faceswap/plugins/train/trainer/_base.py", line 193, in train_one_step
    loss = self._model.model.train_on_batch(model_inputs, y=model_targets)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1348, in train_on_batch
    logs = train_function(iterator)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 644, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1661, in _filtered_call
    return self._call_flat(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1745, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 593, in call
    outputs = execute.execute(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node original/encoder_1/conv_128_0_conv2d/Conv2D (defined at /faceswap/plugins/train/trainer/_base.py:193) ]] [Op:__inference_train_function_8730]

Function call stack:
train_function
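This "Failed to get convolution algorithm / cuDNN failed to initialize" error is frequently a symptom of TensorFlow pre-allocating the entire GPU at startup on consumer cards such as the RTX 2060, rather than a broken cuDNN install. A minimal sketch of one common workaround (this is a general TensorFlow setting, not part of the faceswap codebase): ask TensorFlow to grow its GPU memory allocation on demand. The variable must be in the environment before TensorFlow is imported anywhere in the process.

```python
import os

# TF_FORCE_GPU_ALLOW_GROWTH makes TensorFlow allocate GPU memory
# incrementally instead of claiming the whole card up front.
# It must be set before `import tensorflow` runs.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
```

Equivalently, `export TF_FORCE_GPU_ALLOW_GROWTH=true` in the shell before launching `faceswap.py` should have the same effect; if the error persists, a genuine CUDA/cuDNN version mismatch is the next thing to rule out.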


============ System Information ============
encoding:            UTF-8
git_branch:          Not Found
git_commits:         Not Found
gpu_cuda:            9.1
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: GeForce RTX 2060
gpu_devices_active:  GPU_0
gpu_driver:          440.82
gpu_vram:            GPU_0: 5933MB
os_machine:          x86_64
os_platform:         Linux-5.3.0-42-generic-x86_64-with-glibc2.17
os_release:          5.3.0-42-generic
py_command:          /home/lamakaha/faceswap/faceswap.py train -A /media/lamakaha/work/projects/deepfake/06_Renders/SanneFace -B /media/lamakaha/work/projects/deepfake/06_Renders/AlexeyFace -m /media/lamakaha/work/projects/deepfake/03_Data/model_v001 -t original -bs 2 -it 1000000 -s 250 -ss 25000 -ps 100 -L INFO -gui
py_conda_version:    conda 4.10.3
py_implementation:   CPython
py_version:          3.8.10
py_virtual_env:      True
sys_cores:           12
sys_processor:       x86_64
sys_ram:             Total: 64263MB, Available: 58550MB, Used: 4752MB, Free: 54832MB

=============== Pip Packages ===============
absl-py @ file:///tmp/build/80754af9/absl-py_1623867230185/work
aiohttp @ file:///tmp/build/80754af9/aiohttp_1614360992924/work
astor==0.8.1
astunparse==1.6.3
async-timeout==3.0.1
attrs @ file:///tmp/build/80754af9/attrs_1620827162558/work
blinker==1.4
brotlipy==0.7.0
cachetools @ file:///tmp/build/80754af9/cachetools_1619597386817/work
certifi==2021.5.30
cffi @ file:///tmp/build/80754af9/cffi_1625807838443/work
chardet @ file:///tmp/build/80754af9/chardet_1605303185383/work
click @ file:///tmp/build/80754af9/click_1621604852318/work
coverage @ file:///tmp/build/80754af9/coverage_1614613670853/work
cryptography @ file:///tmp/build/80754af9/cryptography_1616769286105/work
cycler==0.10.0
Cython @ file:///tmp/build/80754af9/cython_1626256955500/work
fastcluster==1.1.26
ffmpy==0.2.3
gast==0.3.3
google-auth @ file:///tmp/build/80754af9/google-auth_1626320605116/work
google-auth-oauthlib @ file:///tmp/build/80754af9/google-auth-oauthlib_1617120569401/work
google-pasta==0.2.0
grpcio @ file:///tmp/build/80754af9/grpcio_1614884175859/work
h5py @ file:///tmp/build/80754af9/h5py_1593454122442/work
idna @ file:///home/linux1/recipes/ci/idna_1610986105248/work
imageio @ file:///tmp/build/80754af9/imageio_1617700267927/work
imageio-ffmpeg @ file:///home/conda/feedstock_root/build_artifacts/imageio-ffmpeg_1621542018480/work
importlib-metadata @ file:///tmp/build/80754af9/importlib-metadata_1617874469820/work
joblib @ file:///tmp/build/80754af9/joblib_1613502643832/work
Keras-Preprocessing @ file:///tmp/build/80754af9/keras-preprocessing_1612283640596/work
kiwisolver @ file:///tmp/build/80754af9/kiwisolver_1612282420641/work
Markdown @ file:///tmp/build/80754af9/markdown_1614363528767/work
matplotlib @ file:///tmp/build/80754af9/matplotlib-base_1592846008246/work
mkl-fft==1.3.0
mkl-random==1.1.1
mkl-service==2.3.0
multidict @ file:///tmp/build/80754af9/multidict_1607367757617/work
numpy @ file:///tmp/build/80754af9/numpy_and_numpy_base_1603570489231/work
nvidia-ml-py3 @ git+https://github.com/deepfakes/nvidia-ml-py3.git@6fc29ac84b32bad877f078cb4a777c1548a00bf6
oauthlib @ file:///tmp/build/80754af9/oauthlib_1623060228408/work
olefile==0.46
opencv-python==4.5.3.56
opt-einsum @ file:///tmp/build/80754af9/opt_einsum_1621500238896/work
pathlib==1.0.1
Pillow @ file:///tmp/build/80754af9/pillow_1625655817137/work
protobuf==3.17.2
psutil @ file:///tmp/build/80754af9/psutil_1612298023621/work
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
PyJWT @ file:///tmp/build/80754af9/pyjwt_1619651636675/work
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1608057966937/work
pyparsing @ file:///home/linux1/recipes/ci/pyparsing_1610983426697/work
PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work
python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
requests @ file:///tmp/build/80754af9/requests_1608241421344/work
requests-oauthlib==1.3.0
rsa @ file:///tmp/build/80754af9/rsa_1614366226499/work
scikit-learn @ file:///tmp/build/80754af9/scikit-learn_1621370412049/work
scipy @ file:///tmp/build/80754af9/scipy_1616703172749/work
sip==4.19.13
six @ file:///tmp/build/80754af9/six_1623709665295/work
tensorboard @ file:///home/builder/ktietz/aggregate/tensorflow_recipes/ci_te/tensorboard_1614593728657/work/tmp_pip_dir
tensorboard-plugin-wit==1.6.0
tensorflow==2.2.0
tensorflow-estimator @ file:///home/builder/ktietz/aggregate/tensorflow_recipes/ci_baze37/tensorflow-estimator_1622026529081/work/tensorflow_estimator-2.5.0-py2.py3-none-any.whl
termcolor==1.1.0
threadpoolctl @ file:///tmp/build/80754af9/threadpoolctl_1626115094421/work
tornado @ file:///tmp/build/80754af9/tornado_1606942300299/work
tqdm @ file:///tmp/build/80754af9/tqdm_1625563689033/work
typing-extensions @ file:///tmp/build/80754af9/typing_extensions_1624965014186/work
urllib3 @ file:///tmp/build/80754af9/urllib3_1625084269274/work
Werkzeug @ file:///home/ktietz/src/ci/werkzeug_1611932622770/work
wrapt==1.12.1
yarl @ file:///tmp/build/80754af9/yarl_1606939922162/work
zipp @ file:///tmp/build/80754af9/zipp_1625570634446/work

============== Conda Packages ==============
# packages in environment at /home/lamakaha/miniconda3/envs/faceswap:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex 4.5 1_gnu
_tflow_select 2.1.0 gpu
absl-py 0.13.0 py38h06a4308_0
aiohttp 3.7.4 py38h27cfd23_1
astor 0.8.1 py38h06a4308_0
astunparse 1.6.3 py_0
async-timeout 3.0.1 py38h06a4308_0
attrs 21.2.0 pyhd3eb1b0_0
blas 1.0 mkl
blinker 1.4 py38h06a4308_0
brotlipy 0.7.0 py38h27cfd23_1003
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h27cfd23_0
ca-certificates 2021.7.5 h06a4308_1
cachetools 4.2.2 pyhd3eb1b0_0
certifi 2021.5.30 py38h06a4308_0
cffi 1.14.6 py38h400218f_0
chardet 3.0.4 py38h06a4308_1003
click 8.0.1 pyhd3eb1b0_0
coverage 5.5 py38h27cfd23_2
cryptography 3.4.7 py38hd23ed53_0
cudatoolkit 10.1.243 h6bb024c_0
cudnn 7.6.5 cuda10.1_0
cupti 10.1.168 0
cycler 0.10.0 py38_0
cython 0.29.24 py38h295c915_0
dbus 1.13.18 hb2f20db_0
expat 2.4.1 h2531618_2
fastcluster 1.1.26 py38hc5bc63f_2 conda-forge
ffmpeg 4.3.1 hca11adc_2 conda-forge
ffmpy 0.2.3 pypi_0 pypi
fontconfig 2.13.1 h6c09931_0
freetype 2.10.4 h5ab3b9f_0
gast 0.3.3 py_0
git 2.23.0 pl526hacde149_0
glib 2.69.0 h5202010_0
gmp 6.2.1 h58526e2_0 conda-forge
gnutls 3.6.13 h85f3911_1 conda-forge
google-auth 1.33.0 pyhd3eb1b0_0
google-auth-oauthlib 0.4.4 pyhd3eb1b0_0
google-pasta 0.2.0 py_0
grpcio 1.36.1 py38h2157cd5_1
gst-plugins-base 1.14.0 h8213a91_2
gstreamer 1.14.0 h28cd5cc_2
h5py 2.10.0 py38hd6299e0_1
hdf5 1.10.6 hb1b8bf9_0
icu 58.2 he6710b0_3
idna 2.10 pyhd3eb1b0_0
imageio 2.9.0 pyhd3eb1b0_0
imageio-ffmpeg 0.4.4 pyhd8ed1ab_0 conda-forge
importlib-metadata 3.10.0 py38h06a4308_0
intel-openmp 2021.3.0 h06a4308_3350
joblib 1.0.1 pyhd3eb1b0_0
jpeg 9b h024ee3a_2
keras-preprocessing 1.1.2 pyhd3eb1b0_0
kiwisolver 1.3.1 py38h2531618_0
krb5 1.19.1 h3535a68_0
lame 3.100 h7f98852_1001 conda-forge
lcms2 2.12 h3be6417_0
ld_impl_linux-64 2.35.1 h7274673_9
libcurl 7.71.1 h303737a_2
libedit 3.1.20210216 h27cfd23_1
libffi 3.3 he6710b0_2
libgcc-ng 9.3.0 h5101ec6_17
libgfortran-ng 7.5.0 ha8ba4b0_17
libgfortran4 7.5.0 ha8ba4b0_17
libgomp 9.3.0 h5101ec6_17
libpng 1.6.37 hbc83047_0
libprotobuf 3.17.2 h4ff587b_1
libssh2 1.9.0 h1ba5d50_1
libstdcxx-ng 9.3.0 hd4cf53a_17
libtiff 4.2.0 h85742a9_0
libuuid 1.0.3 h1bed415_2
libwebp-base 1.2.0 h27cfd23_0
libxcb 1.14 h7b6447c_0
libxml2 2.9.12 h03d6c58_0
lz4-c 1.9.3 h2531618_0
markdown 3.3.4 py38h06a4308_0
matplotlib 3.2.2 0
matplotlib-base 3.2.2 py38hef1b27d_0
mkl 2020.2 256
mkl-service 2.3.0 py38he904b0f_0
mkl_fft 1.3.0 py38h54f3939_0
mkl_random 1.1.1 py38h0573a6f_0
multidict 5.1.0 py38h27cfd23_2
ncurses 6.2 he6710b0_1
nettle 3.6 he412f7d_0 conda-forge
numpy 1.19.2 py38h54aff64_0
numpy-base 1.19.2 py38hfa32c7d_0
nvidia-ml-py3 7.352.1 pypi_0 pypi
oauthlib 3.1.1 pyhd3eb1b0_0
olefile 0.46 py_0
opencv-python 4.5.3.56 pypi_0 pypi
openh264 2.1.1 h780b84a_0 conda-forge
openjpeg 2.3.0 h05c96fa_1
openssl 1.1.1k h27cfd23_0
opt_einsum 3.3.0 pyhd3eb1b0_1
pathlib 1.0.1 py_1
pcre 8.45 h295c915_0
perl 5.26.2 h14c3975_0
pillow 8.3.1 py38h2c7a002_0
pip 21.1.3 py38h06a4308_0
protobuf 3.17.2 py38h295c915_0
psutil 5.8.0 py38h27cfd23_1
pyasn1 0.4.8 py_0
pyasn1-modules 0.2.8 py_0
pycparser 2.20 py_2
pyjwt 2.1.0 py38h06a4308_0
pyopenssl 20.0.1 pyhd3eb1b0_1
pyparsing 2.4.7 pyhd3eb1b0_0
pyqt 5.9.2 py38h05f1152_4
pysocks 1.7.1 py38h06a4308_0
python 3.8.10 h12debd9_8
python-dateutil 2.8.2 pyhd3eb1b0_0
python_abi 3.8 2_cp38 conda-forge
qt 5.9.7 h5867ecd_1
readline 8.1 h27cfd23_0
requests 2.25.1 pyhd3eb1b0_0
requests-oauthlib 1.3.0 py_0
rsa 4.7.2 pyhd3eb1b0_1
scikit-learn 0.24.2 py38ha9443f7_0
scipy 1.6.2 py38h91f5cce_0
setuptools 52.0.0 py38h06a4308_0
sip 4.19.13 py38he6710b0_0
six 1.16.0 pyhd3eb1b0_0
sqlite 3.36.0 hc218d9a_0
tensorboard 2.4.0 pyhc547734_0
tensorboard-plugin-wit 1.6.0 py_0
tensorflow 2.2.0 gpu_py38hb782248_0
tensorflow-base 2.2.0 gpu_py38h83e3d50_0
tensorflow-estimator 2.5.0 pyh7b7c402_0
tensorflow-gpu 2.2.0 h0d30ee6_0
termcolor 1.1.0 py38h06a4308_1
threadpoolctl 2.2.0 pyhb85f177_0
tk 8.6.10 hbc83047_0
tornado 6.1 py38h27cfd23_0
tqdm 4.61.2 pyhd3eb1b0_1
typing-extensions 3.10.0.0 hd3eb1b0_0
typing_extensions 3.10.0.0 pyh06a4308_0
urllib3 1.26.6 pyhd3eb1b0_1
werkzeug 1.0.1 pyhd3eb1b0_0
wheel 0.36.2 pyhd3eb1b0_0
wrapt 1.12.1 py38h7b6447c_1
x264 1!161.3030 h7f98852_1 conda-forge
xz 5.2.5 h7b6447c_0
yarl 1.6.3 py38h27cfd23_0
zipp 3.5.0 pyhd3eb1b0_0
zlib 1.2.11 h7b6447c_3
zstd 1.4.9 haebb681_0

================= Configs ==================

--------- convert.ini ---------

[scaling.sharpen]
method: none
amount: 150
radius: 0.3
threshold: 5.0

[mask.mask_blend]
type: normalized
kernel_size: 3
passes: 4
threshold: 4
erosion: 0.0

[mask.box_blend]
type: gaussian
distance: 11.0
radius: 5.0
passes: 1

[writer.ffmpeg]
container: mp4
codec: libx264
crf: 23
preset: medium
tune: none
profile: auto
level: auto
skip_mux: False

[writer.pillow]
format: png
draw_transparent: False
optimize: False
gif_interlace: True
jpg_quality: 75
png_compress_level: 3
tif_compression: tiff_deflate

[writer.opencv]
format: png
draw_transparent: False
jpg_quality: 75
png_compress_level: 3

[writer.gif]
fps: 25
loop: 0
palettesize: 256
subrectangles: False

[color.manual_balance]
colorspace: HSV
balance_1: 0.0
balance_2: 0.0
balance_3: 0.0
contrast: 0.0
brightness: 0.0

[color.color_transfer]
clip: True
preserve_paper: True

[color.match_hist]
threshold: 99.0

--------- gui.ini ---------

[global]
fullscreen: False
tab: extract
options_panel_width: 30
console_panel_height: 20
icon_size: 14
font: default
font_size: 9
autosave_last_session: prompt
timeout: 120
auto_load_model_stats: True

--------- train.ini ---------

[global]
centering: face
coverage: 68.75
icnr_init: False
conv_aware_init: False
optimizer: adam
learning_rate: 5e-05
epsilon_exponent: -7
reflect_padding: False
allow_growth: False
mixed_precision: False
nan_protection: True
convert_batchsize: 16

[global.loss]
loss_function: ssim
mask_loss_function: mse
l2_reg_term: 100
eye_multiplier: 3
mouth_multiplier: 2
penalized_mask_loss: True
mask_type: extended
mask_blur_kernel: 3
mask_threshold: 4
learn_mask: False

[model.realface]
input_size: 64
output_size: 128
dense_nodes: 1536
complexity_encoder: 128
complexity_decoder: 512

[model.phaze_a]
output_size: 128
shared_fc: none
enable_gblock: True
split_fc: True
split_gblock: False
split_decoders: False
enc_architecture: fs_original
enc_scaling: 40
enc_load_weights: True
bottleneck_type: dense
bottleneck_norm: none
bottleneck_size: 1024
bottleneck_in_encoder: True
fc_depth: 1
fc_min_filters: 1024
fc_max_filters: 1024
fc_dimensions: 4
fc_filter_slope: -0.5
fc_dropout: 0.0
fc_upsampler: upsample2d
fc_upsamples: 1
fc_upsample_filters: 512
fc_gblock_depth: 3
fc_gblock_min_nodes: 512
fc_gblock_max_nodes: 512
fc_gblock_filter_slope: -0.5
fc_gblock_dropout: 0.0
dec_upscale_method: subpixel
dec_norm: none
dec_min_filters: 64
dec_max_filters: 512
dec_filter_slope: -0.45
dec_res_blocks: 1
dec_output_kernel: 5
dec_gaussian: True
dec_skip_last_residual: True
freeze_layers: keras_encoder
load_layers: encoder
fs_original_depth: 4
fs_original_min_filters: 128
fs_original_max_filters: 1024
mobilenet_width: 1.0
mobilenet_depth: 1
mobilenet_dropout: 0.001

[model.dlight]
features: best
details: good
output_size: 256

[model.dfaker]
output_size: 128

[model.dfl_sae]
input_size: 128
clipnorm: True
architecture: df
autoencoder_dims: 0
encoder_dims: 42
decoder_dims: 21
multiscale_decoder: False

[model.villain]
lowmem: False

[model.original]
lowmem: False

[model.dfl_h128]
lowmem: False

[model.unbalanced]
input_size: 128
lowmem: False
clipnorm: True
nodes: 1024
complexity_encoder: 128
complexity_decoder_a: 384
complexity_decoder_b: 512

[trainer.original]
preview_images: 14
zoom_amount: 5
rotation_range: 10
shift_range: 5
flip_chance: 50
color_lightness: 30
color_ab: 8
color_clahe_chance: 50
color_clahe_max_size: 4

--------- .faceswap ---------

backend: nvidia

--------- extract.ini ---------

[global]
allow_growth: False

[detect.mtcnn]
minsize: 20
scalefactor: 0.709
batch-size: 8
threshold_1: 0.6
threshold_2: 0.7
threshold_3: 0.7

[detect.cv2_dnn]
confidence: 50

[detect.s3fd]
confidence: 70
batch-size: 4

[align.fan]
batch-size: 12

[mask.vgg_clear]
batch-size: 6

[mask.unet_dfl]
batch-size: 8

[mask.vgg_obstructed]
batch-size: 2

[mask.bisenet_fp]
batch-size: 8
include_ears: False
include_hair: False
include_glasses: True

I am out of my depth here; please advise.
Thank you.


Re: crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Sun Jul 25, 2021 10:43 am
by torzdf

See above


Re: crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Sun Jul 25, 2021 11:21 am
by lamakaha

Hi, thank you for the prompt reply.
I enabled Allow Growth, but it did not help; I get the same issue and the same error.

Code: Select all

Loading...
Setting Faceswap backend to NVIDIA
07/25/2021 13:07:26 INFO     Log level set to: INFO
07/25/2021 13:07:27 INFO     Model A Directory: '/media/lamakaha/work/projects/deepfake/06_Renders/SanneFace' (156 images)
07/25/2021 13:07:27 INFO     Model B Directory: '/media/lamakaha/work/projects/deepfake/06_Renders/AlexeyFace' (758 images)
07/25/2021 13:07:27 WARNING  At least one of your input folders contains fewer than 250 images. Results are likely to be poor.
07/25/2021 13:07:27 WARNING  You need to provide a significant number of images to successfully train a Neural Network. Aim for between 500 - 5000 images per side.
07/25/2021 13:07:27 INFO     Training data directory: /media/lamakaha/work/projects/deepfake/03_Data/model_v001
07/25/2021 13:07:27 INFO     ===================================================
07/25/2021 13:07:27 INFO       Starting
07/25/2021 13:07:27 INFO       Press 'Stop' to save and quit
07/25/2021 13:07:27 INFO     ===================================================
07/25/2021 13:07:28 INFO     Loading data, this may take a while...
07/25/2021 13:07:28 INFO     Loading Model from Original plugin...
07/25/2021 13:07:28 INFO     No existing state file found. Generating.
07/25/2021 13:07:30 INFO     Loading Trainer from Original plugin...
2021-07-25 13:07:38.632840: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 13:07:38.634598: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 13:07:38.636056: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 13:07:38.637628: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
07/25/2021 13:07:39 CRITICAL Error caught! Exiting...
07/25/2021 13:07:39 ERROR    Caught exception in thread: '_training_0'
07/25/2021 13:07:42 ERROR    Got Exception on main handler:
Traceback (most recent call last):
  File "/home/lamakaha/faceswap/lib/cli/launcher.py", line 182, in execute_script
    process.process()
  File "/home/lamakaha/faceswap/scripts/train.py", line 190, in process
    self._end_thread(thread, err)
  File "/home/lamakaha/faceswap/scripts/train.py", line 230, in _end_thread
    thread.join()
  File "/home/lamakaha/faceswap/lib/multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "/home/lamakaha/faceswap/lib/multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lamakaha/faceswap/scripts/train.py", line 252, in _training
    raise err
  File "/home/lamakaha/faceswap/scripts/train.py", line 242, in _training
    self._run_training_cycle(model, trainer)
  File "/home/lamakaha/faceswap/scripts/train.py", line 327, in _run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "/home/lamakaha/faceswap/plugins/train/trainer/_base.py", line 193, in train_one_step
    loss = self._model.model.train_on_batch(model_inputs, y=model_targets)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1348, in train_on_batch
    logs = train_function(iterator)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 644, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1661, in _filtered_call
    return self._call_flat(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1745, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 593, in call
    outputs = execute.execute(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node original/encoder_1/conv_128_0_conv2d/Conv2D (defined at /faceswap/plugins/train/trainer/_base.py:193) ]] [Op:__inference_train_function_8730]
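
For reference, the diagnostic pattern here is the run of `CUDNN_STATUS_INTERNAL_ERROR` lines immediately before the `UnknownError`: cuDNN could not even create its handle, which typically points at a GPU memory or CUDA-version problem rather than a bug in the model. A minimal, illustrative log-triage sketch (the function name and signature list are my own, not part of faceswap) that picks these signatures out of a crash report:

```python
# Illustrative triage: scan a faceswap crash report for the cuDNN failure
# signatures seen in the traceback above. Pure Python, no TensorFlow needed.
CUDNN_SIGNATURES = (
    "CUDNN_STATUS_INTERNAL_ERROR",
    "Failed to get convolution algorithm",
    "Could not create cudnn handle",
)

def find_cudnn_errors(log_text):
    """Return the log lines that match a known cuDNN failure signature."""
    return [line for line in log_text.splitlines()
            if any(sig in line for sig in CUDNN_SIGNATURES)]

sample = (
    "2021-07-25 13:07:38: E cuda_dnn.cc:329] Could not create cudnn handle: "
    "CUDNN_STATUS_INTERNAL_ERROR\n"
    "07/25/2021 13:07:39 CRITICAL Error caught! Exiting...\n"
)
hits = find_cudnn_errors(sample)
print(len(hits))  # 1 matching line
```

If a scan like this finds handle-creation errors before any training step has run, the usual suspects are a mismatched system CUDA install or the GPU's memory already being committed.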


Re: crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Sun Jul 25, 2021 11:25 am
by torzdf

The next thing to do is to uninstall the global CUDA 9.1 you have installed. It is incompatible.

Reboot and try again.
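
As a side note (not a replacement for fixing the CUDA install), TensorFlow also honors the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable, which forces incremental GPU memory allocation regardless of faceswap's own Allow Growth option. Setting it before launching is a quick way to rule out up-front memory pre-allocation as the cause:

```shell
# Ask TensorFlow to allocate GPU memory incrementally instead of grabbing
# (almost) all of it up front, then launch training as usual.
export TF_FORCE_GPU_ALLOW_GROWTH=true
python faceswap.py train ...   # your usual training arguments
```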


Re: crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Sun Jul 25, 2021 11:49 am
by lamakaha

Thanks, I did the following:

Uninstalled CUDA using:

Code: Select all

sudo apt-get remove nvidia-cuda-toolkit

Checked for any leftovers:

Code: Select all

lamakaha@lamakaha-HOME:~$ locate cuda | grep /cuda$
/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/include/external/local_config_cuda/cuda
/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/include/external/local_config_cuda/cuda/cuda
/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/include/tensorflow/stream_executor/cuda
/home/lamakaha/miniconda3/pkgs/tensorflow-base-2.2.0-gpu_py38h83e3d50_0/lib/python3.8/site-packages/tensorflow/include/external/local_config_cuda/cuda
/home/lamakaha/miniconda3/pkgs/tensorflow-base-2.2.0-gpu_py38h83e3d50_0/lib/python3.8/site-packages/tensorflow/include/external/local_config_cuda/cuda/cuda
/home/lamakaha/miniconda3/pkgs/tensorflow-base-2.2.0-gpu_py38h83e3d50_0/lib/python3.8/site-packages/tensorflow/include/tensorflow/stream_executor/cuda
/usr/share/blender/scripts/addons/cycles/source/kernel/kernels/cuda

If I am not mistaken, I have CUDA 10.2:

Code: Select all

lamakaha@lamakaha-HOME:~$ nvidia-smi
Sun Jul 25 13:43:40 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    Off  | 00000000:01:00.0  On |                  N/A |
| 32%   54C    P0    41W / 160W |   5910MiB /  5933MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1266      G   /usr/lib/xorg/Xorg                           331MiB |
|    0      2123      G   cinnamon                                      86MiB |
|    0     29814      G   ...AAgAAAAAAAAACAAAAAAAAAA= --shared-files   114MiB |
|    0     30310      C   -                                           5363MiB |
+-----------------------------------------------------------------------------+
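
One observation worth making here (my reading of the numbers, not something stated in the thread): the nvidia-smi snapshot shows the 6 GB card almost entirely committed, with the training process itself (PID 30310) holding most of it. When the GPU has essentially no free memory left, cuDNN's handle creation can fail with exactly the `CUDNN_STATUS_INTERNAL_ERROR` seen below. The arithmetic, as a trivial sketch:

```python
# Back-of-envelope check from the nvidia-smi numbers above (values in MiB).
total = 5933           # GPU memory on the RTX 2060 (6 GB card)
used = 5910            # reported usage while faceswap is running
training_proc = 5363   # memory held by the training process (PID 30310)

free = total - used
other = used - training_proc
print(free)   # only 23 MiB of headroom left for cuDNN workspaces
print(other)  # ~547 MiB is held by the desktop (Xorg, cinnamon, etc.)
```

With so little headroom, lowering the batch size or freeing the desktop's share of the GPU is often what actually resolves this class of crash.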

Console output:

Code: Select all

Loading...
Setting Faceswap backend to NVIDIA
07/25/2021 13:37:51 INFO     Log level set to: VERBOSE
07/25/2021 13:37:52 INFO     Model A Directory: '/media/lamakaha/work/projects/deepfake/06_Renders/SanneFace' (156 images)
07/25/2021 13:37:52 INFO     Model B Directory: '/media/lamakaha/work/projects/deepfake/06_Renders/AlexeyFace' (758 images)
07/25/2021 13:37:52 WARNING  At least one of your input folders contains fewer than 250 images. Results are likely to be poor.
07/25/2021 13:37:52 WARNING  You need to provide a significant number of images to successfully train a Neural Network. Aim for between 500 - 5000 images per side.
07/25/2021 13:37:52 INFO     Training data directory: /media/lamakaha/work/projects/deepfake/03_Data/model_v001
07/25/2021 13:37:52 INFO     ===================================================
07/25/2021 13:37:52 INFO       Starting
07/25/2021 13:37:52 INFO       Press 'Stop' to save and quit
07/25/2021 13:37:52 INFO     ===================================================
07/25/2021 13:37:53 INFO     Loading data, this may take a while...
07/25/2021 13:37:53 INFO     Loading Model from Original plugin...
07/25/2021 13:37:53 VERBOSE  Loading config: '/home/lamakaha/faceswap/config/train.ini'
07/25/2021 13:37:53 VERBOSE  Loading config: '/home/lamakaha/faceswap/config/train.ini'
07/25/2021 13:37:53 INFO     No existing state file found. Generating.
2021-07-25 13:37:53.997644: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-07-25 13:37:54.024059: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-25 13:37:54.024653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.68GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
2021-07-25 13:37:54.024882: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-07-25 13:37:54.026989: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-07-25 13:37:54.028399: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-07-25 13:37:54.028666: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-07-25 13:37:54.030886: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-07-25 13:37:54.032780: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-07-25 13:37:54.038215: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-07-25 13:37:54.038373: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-25 13:37:54.038783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-07-25 13:37:54.039071: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2021-07-25 13:37:54.069915: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3201885000 Hz
2021-07-25 13:37:54.070756: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fef0476e0e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-07-25 13:37:54.070807: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-07-25 13:37:54.071093: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-25 13:37:54.071764: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.68GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
2021-07-25 13:37:54.071838: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-07-25 13:37:54.071874: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-07-25 13:37:54.071908: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-07-25 13:37:54.071942: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-07-25 13:37:54.071975: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-07-25 13:37:54.072010: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-07-25 13:37:54.072044: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-07-25 13:37:54.072136: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-25 13:37:54.072638: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-07-25 13:37:54.072694: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-07-25 13:37:54.187843: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-25 13:37:54.187876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0
2021-07-25 13:37:54.187885: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N
2021-07-25 13:37:54.188065: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-25 13:37:54.188515: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-25 13:37:54.188978: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-25 13:37:54.189960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5023 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
2021-07-25 13:37:54.192277: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fef04f952c0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-07-25 13:37:54.192312: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2060, Compute Capability 7.5
07/25/2021 13:37:55 VERBOSE  Using Adam optimizer
07/25/2021 13:37:55 VERBOSE  Model: "original"
07/25/2021 13:37:55 VERBOSE  __________________________________________________________________________________________________
07/25/2021 13:37:55 VERBOSE  Layer (type)                    Output Shape         Param #     Connected to
07/25/2021 13:37:55 VERBOSE  ==================================================================================================
07/25/2021 13:37:55 VERBOSE  face_in_a (InputLayer)          [(None, 64, 64, 3)]  0
07/25/2021 13:37:55 VERBOSE  __________________________________________________________________________________________________
07/25/2021 13:37:55 VERBOSE  face_in_b (InputLayer)          [(None, 64, 64, 3)]  0
07/25/2021 13:37:55 VERBOSE  __________________________________________________________________________________________________
07/25/2021 13:37:55 VERBOSE  encoder (Model)                 (None, 8, 8, 512)    69662976    face_in_a[0][0]
07/25/2021 13:37:55 VERBOSE                                                                   face_in_b[0][0]
07/25/2021 13:37:55 VERBOSE  __________________________________________________________________________________________________
07/25/2021 13:37:55 VERBOSE  decoder_a (Model)               (None, 64, 64, 3)    6199747     encoder[1][0]
07/25/2021 13:37:55 VERBOSE  __________________________________________________________________________________________________
07/25/2021 13:37:55 VERBOSE  decoder_b (Model)               (None, 64, 64, 3)    6199747     encoder[2][0]
07/25/2021 13:37:55 VERBOSE  ==================================================================================================
07/25/2021 13:37:55 VERBOSE  Total params: 82,062,470
07/25/2021 13:37:55 VERBOSE  Trainable params: 82,062,470
07/25/2021 13:37:55 VERBOSE  Non-trainable params: 0
07/25/2021 13:37:55 VERBOSE  __________________________________________________________________________________________________
07/25/2021 13:37:55 VERBOSE  Model: "encoder"
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  Layer (type)                 Output Shape              Param #
07/25/2021 13:37:55 VERBOSE  =================================================================
07/25/2021 13:37:55 VERBOSE  input_1 (InputLayer)         [(None, 64, 64, 3)]       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  conv_128_0_conv2d (Conv2D)   (None, 32, 32, 128)       9728
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  conv_128_0_leakyrelu (LeakyR (None, 32, 32, 128)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  conv_256_0_conv2d (Conv2D)   (None, 16, 16, 256)       819456
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  conv_256_0_leakyrelu (LeakyR (None, 16, 16, 256)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  conv_512_0_conv2d (Conv2D)   (None, 8, 8, 512)         3277312
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  conv_512_0_leakyrelu (LeakyR (None, 8, 8, 512)         0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  conv_1024_0_conv2d (Conv2D)  (None, 4, 4, 1024)        13108224
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  conv_1024_0_leakyrelu (Leaky (None, 4, 4, 1024)        0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  flatten (Flatten)            (None, 16384)             0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  dense (Dense)                (None, 1024)              16778240
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  dense_1 (Dense)              (None, 16384)             16793600
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  reshape (Reshape)            (None, 4, 4, 1024)        0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_512_0_conv2d_conv2d  (None, 4, 4, 2048)        18876416
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_512_0_conv2d_leakyre (None, 4, 4, 2048)        0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_512_0_pixelshuffler  (None, 8, 8, 512)         0
07/25/2021 13:37:55 VERBOSE  =================================================================
07/25/2021 13:37:55 VERBOSE  Total params: 69,662,976
07/25/2021 13:37:55 VERBOSE  Trainable params: 69,662,976
07/25/2021 13:37:55 VERBOSE  Non-trainable params: 0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  Model: "decoder_a"
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  Layer (type)                 Output Shape              Param #
07/25/2021 13:37:55 VERBOSE  =================================================================
07/25/2021 13:37:55 VERBOSE  input_2 (InputLayer)         [(None, 8, 8, 512)]       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_256_0_conv2d_conv2d  (None, 8, 8, 1024)        4719616
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_256_0_conv2d_leakyre (None, 8, 8, 1024)        0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_256_0_pixelshuffler  (None, 16, 16, 256)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_128_0_conv2d_conv2d  (None, 16, 16, 512)       1180160
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_128_0_conv2d_leakyre (None, 16, 16, 512)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_128_0_pixelshuffler  (None, 32, 32, 128)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_64_0_conv2d_conv2d ( (None, 32, 32, 256)       295168
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_64_0_conv2d_leakyrel (None, 32, 32, 256)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_64_0_pixelshuffler ( (None, 64, 64, 64)        0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  face_out_a_conv2d (Conv2D)   (None, 64, 64, 3)         4803
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  face_out_a (Activation)      (None, 64, 64, 3)         0
07/25/2021 13:37:55 VERBOSE  =================================================================
07/25/2021 13:37:55 VERBOSE  Total params: 6,199,747
07/25/2021 13:37:55 VERBOSE  Trainable params: 6,199,747
07/25/2021 13:37:55 VERBOSE  Non-trainable params: 0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  Model: "decoder_b"
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  Layer (type)                 Output Shape              Param #
07/25/2021 13:37:55 VERBOSE  =================================================================
07/25/2021 13:37:55 VERBOSE  input_3 (InputLayer)         [(None, 8, 8, 512)]       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_256_1_conv2d_conv2d  (None, 8, 8, 1024)        4719616
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_256_1_conv2d_leakyre (None, 8, 8, 1024)        0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_256_1_pixelshuffler  (None, 16, 16, 256)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_128_1_conv2d_conv2d  (None, 16, 16, 512)       1180160
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_128_1_conv2d_leakyre (None, 16, 16, 512)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_128_1_pixelshuffler  (None, 32, 32, 128)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_64_1_conv2d_conv2d ( (None, 32, 32, 256)       295168
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_64_1_conv2d_leakyrel (None, 32, 32, 256)       0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  upscale_64_1_pixelshuffler ( (None, 64, 64, 64)        0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  face_out_b_conv2d (Conv2D)   (None, 64, 64, 3)         4803
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 VERBOSE  face_out_b (Activation)      (None, 64, 64, 3)         0
07/25/2021 13:37:55 VERBOSE  =================================================================
07/25/2021 13:37:55 VERBOSE  Total params: 6,199,747
07/25/2021 13:37:55 VERBOSE  Trainable params: 6,199,747
07/25/2021 13:37:55 VERBOSE  Non-trainable params: 0
07/25/2021 13:37:55 VERBOSE  _________________________________________________________________
07/25/2021 13:37:55 INFO     Loading Trainer from Original plugin...
07/25/2021 13:37:55 VERBOSE  Loading config: '/home/lamakaha/faceswap/config/train.ini'
07/25/2021 13:37:55 VERBOSE  Enabled TensorBoard Logging
2021-07-25 13:38:03.000349: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-07-25 13:38:03.286694: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-07-25 13:38:03.740044: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 13:38:03.767359: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 13:38:03.789992: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-07-25 13:38:03.813004: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
07/25/2021 13:38:03 CRITICAL Error caught! Exiting...
07/25/2021 13:38:03 ERROR    Caught exception in thread: '_training_0'
07/25/2021 13:38:06 ERROR    Got Exception on main handler:
Traceback (most recent call last):
  File "/home/lamakaha/faceswap/lib/cli/launcher.py", line 182, in execute_script
    process.process()
  File "/home/lamakaha/faceswap/scripts/train.py", line 190, in process
    self._end_thread(thread, err)
  File "/home/lamakaha/faceswap/scripts/train.py", line 230, in _end_thread
    thread.join()
  File "/home/lamakaha/faceswap/lib/multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "/home/lamakaha/faceswap/lib/multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lamakaha/faceswap/scripts/train.py", line 252, in _training
    raise err
  File "/home/lamakaha/faceswap/scripts/train.py", line 242, in _training
    self._run_training_cycle(model, trainer)
  File "/home/lamakaha/faceswap/scripts/train.py", line 327, in _run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "/home/lamakaha/faceswap/plugins/train/trainer/_base.py", line 193, in train_one_step
    loss = self._model.model.train_on_batch(model_inputs, y=model_targets)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1348, in train_on_batch
    logs = train_function(iterator)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 644, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1661, in _filtered_call
    return self._call_flat(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1745, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 593, in call
    outputs = execute.execute(
  File "/home/lamakaha/miniconda3/envs/faceswap/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node original/encoder_1/conv_128_0_conv2d/Conv2D (defined at /faceswap/plugins/train/trainer/_base.py:193) ]] [Op:__inference_train_function_8730]

Function call stack:
train_function

07/25/2021 13:38:06 CRITICAL An unexpected crash has occurred. Crash report written to '/home/lamakaha/faceswap/crash_report.2021.07.25.133803951583.log'. You MUST provide this file if seeking assistance. Please verify you are running the latest version of faceswap before reporting
Process exited.

The new crash report is attached.


Re: crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Sun Jul 25, 2021 11:55 am
by torzdf

You still have Cuda installed:

gpu_cuda: 9.1

The version listed by nvidia-smi is the maximum version supported by your driver.

The version that should be in use is the conda package cudatoolkit 10.1.243 h6bb024c_0.

Having both a system install and a conda install can cause conflicts.

Work on getting 9.1 fully removed from your system.
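A quick way to check whether a system-wide CUDA is still sitting alongside the conda-provided one is something like the sketch below (the package name and commands are what a typical faceswap conda setup uses; adjust for your install):

```shell
# Toolkit inside the conda environment -- the one faceswap should be using
conda list cudatoolkit 2>/dev/null || echo "conda not on PATH"

# A global nvcc usually points at a system-wide CUDA install that can conflict
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "no system-wide nvcc found"
fi

# Remember: nvidia-smi reports the *maximum* CUDA version the driver
# supports, not the toolkit actually in use
nvidia-smi 2>/dev/null || echo "nvidia-smi not available"
```

If `nvcc --version` reports 9.1 here, that is the global install that needs removing.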


Re: crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Sun Jul 25, 2021 1:59 pm
by lamakaha

I completely crashed my computer by uninstalling Nvidia, then ran in circles reinstalling Nvidia/CUDA.

I managed to get 10.1 installed using this guide:
https://malukas.lt/blog/cuda-10-1-anaco ... mint-19-3/

I was still getting the same error...

I then opened train.ini, located in /home/lamakaha/faceswap/config,
and set:
allow_growth = True

Doing this in the GUI did not help; maybe I did it in the wrong place.
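For reference, the relevant block of my train.ini ended up looking something like this (the section header is from my install and may differ in yours):

```ini
[global]
# Allocate VRAM on demand instead of grabbing it all up front; this is
# what worked around CUDNN_STATUS_INTERNAL_ERROR for me
allow_growth = True
```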

Anyway, it looks like the system is up and running. Thank you very much for your help!!!
Now the fun begins!!!


Re: crash report while training: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Posted: Sun Jul 25, 2021 2:00 pm
by torzdf

Glad to hear it.

You shouldn't have any global Cuda install, though. Whilst it may work now, it will probably break again in the future if/when we upgrade.

We install Cuda locally in the Faceswap environment. Any global install may conflict.