Help! Unknown: CUDNN_STATUS_EXECUTION_FAILED Error out of the blue after PC crashed. Reinstall won't Fix

hassamc
Posts: 1
Joined: Wed Feb 08, 2023 11:57 am

Post by hassamc

Everything was working until my PC crashed. After restarting, I got the error below, and I can't seem to get rid of it, even after reinstalling.

03/19/2024 10:13:07 MainProcess     _training                      generator       _load_generator                DEBUG    Loading generator, side: b, is_display: True,  batch_size: 14
03/19/2024 10:13:07 MainProcess     _training                      generator       __init__                       DEBUG    Initializing PreviewDataGenerator: (model: villain, side: b, images: 11756 , batch_size: 14, config: {'centering': 'face', 'coverage': 87.5, 'icnr_init': False, 'conv_aware_init': False, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'save_optimizer': 'exit', 'lr_finder_iterations': 1000, 'lr_finder_mode': 'set', 'lr_finder_strength': 'default', 'autoclip': False, 'reflect_padding': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'preview_images': 14, 'mask_opacity': 30, 'mask_color': '#ff0000', 'zoom_amount': 5, 'rotation_range': 10, 'shift_range': 5, 'flip_chance': 50, 'color_lightness': 30, 'color_ab': 8, 'color_clahe_chance': 50, 'color_clahe_max_size': 4})
03/19/2024 10:13:07 MainProcess     _training                      generator       _get_output_sizes              DEBUG    side: b, model output shapes: [(None, 128, 128, 3), (None, 128, 128, 3)], output sizes: [128]
03/19/2024 10:13:07 MainProcess     _training                      cache           __init__                       DEBUG    Initializing: RingBuffer (batch_size: 14, image_shape: (128, 128, 6), buffer_size: 2, dtype: uint8
03/19/2024 10:13:07 MainProcess     _training                      cache           __init__                       DEBUG    Initialized: RingBuffer
03/19/2024 10:13:07 MainProcess     _training                      generator       __init__                       DEBUG    Initialized PreviewDataGenerator
03/19/2024 10:13:07 MainProcess     _training                      generator       minibatch_ab                   DEBUG    do_shuffle: True
03/19/2024 10:13:07 MainProcess     _training                      multithreading  __init__                       DEBUG    Initializing BackgroundGenerator: (target: '_run_2', thread_count: 1)
03/19/2024 10:13:07 MainProcess     _training                      multithreading  __init__                       DEBUG    Initialized BackgroundGenerator: '_run_2'
03/19/2024 10:13:07 MainProcess     _training                      multithreading  start                          DEBUG    Starting thread(s): '_run_2'
03/19/2024 10:13:07 MainProcess     _training                      multithreading  start                          DEBUG    Starting thread 1 of 1: '_run_2'
03/19/2024 10:13:07 MainProcess     _run_2                         generator       _minibatch                     DEBUG    Loading minibatch generator: (image_count: 11756, do_shuffle: True)
03/19/2024 10:13:07 MainProcess     _training                      multithreading  start                          DEBUG    Started all threads '_run_2': 1
03/19/2024 10:13:07 MainProcess     _training                      generator       __init__                       DEBUG    Initialized Feeder:
03/19/2024 10:13:07 MainProcess     _training                      lr_finder       __init__                       DEBUG    Initializing LearningRateFinder: (model: <plugins.train.model.villain.Model object at 0x00000112679F1A20>, config: {'centering': 'face', 'coverage': 87.5, 'icnr_init': False, 'conv_aware_init': False, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'save_optimizer': 'exit', 'lr_finder_iterations': 1000, 'lr_finder_mode': 'set', 'lr_finder_strength': 'default', 'autoclip': False, 'reflect_padding': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'preview_images': 14, 'mask_opacity': 30, 'mask_color': '#ff0000', 'zoom_amount': 5, 'rotation_range': 10, 'shift_range': 5, 'flip_chance': 50, 'color_lightness': 30, 'color_ab': 8, 'color_clahe_chance': 50, 'color_clahe_max_size': 4}, feeder: <lib.training.generator.Feeder object at 0x000001126A3F87C0>, stop_factor: 4, beta: 0.98)
03/19/2024 10:13:07 MainProcess     _training                      lr_finder       __init__                       DEBUG    Initialized LearningRateFinder
03/19/2024 10:13:07 MainProcess     _training                      io              save                           DEBUG    Backing up and saving models
03/19/2024 10:13:07 MainProcess     _training                      io              _get_save_averages             DEBUG    Getting save averages
03/19/2024 10:13:07 MainProcess     _training                      io              _get_save_averages             DEBUG    No loss in history
03/19/2024 10:13:07 MainProcess     _training                      io              _get_save_averages             DEBUG    Average losses since last save: []
03/19/2024 10:13:07 MainProcess     _run                           cache           _validate_version              DEBUG    Setting initial extract version: 2.3
03/19/2024 10:13:08 MainProcess     _training                      attrs           create                         DEBUG    Creating converter from 5 to 3
03/19/2024 10:13:08 MainProcess     _run_0                         cache           _validate_version              DEBUG    Setting initial extract version: 2.3
03/19/2024 10:13:08 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:09 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:09 MainProcess     _training                      model           save                           DEBUG    Saving State
03/19/2024 10:13:09 MainProcess     _training                      serializer      save                           DEBUG    filename: E:\Users\Hassam\Documents\EXP\Models\CLIPV3\villain_state.json, data type: <class 'dict'>
03/19/2024 10:13:09 MainProcess     _training                      serializer      _check_extension               DEBUG    Original filename: 'E:\Users\Hassam\Documents\EXP\Models\CLIPV3\villain_state.json', final filename: 'E:\Users\Hassam\Documents\EXP\Models\CLIPV3\villain_state.json'
03/19/2024 10:13:09 MainProcess     _training                      serializer      marshal                        DEBUG    data type: <class 'dict'>
03/19/2024 10:13:09 MainProcess     _training                      serializer      marshal                        DEBUG    returned data type: <class 'bytes'>
03/19/2024 10:13:09 MainProcess     _training                      model           save                           DEBUG    Saved State
03/19/2024 10:13:09 MainProcess     _training                      io              save                           INFO     [Saved model]
03/19/2024 10:13:09 MainProcess     _training                      lr_finder       _train                         INFO     Finding optimal learning rate...
03/19/2024 10:13:10 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:10 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011265E58FD0>, weight: 1.0, mask_channel: 3)
03/19/2024 10:13:11 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 3
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831A9E0>, weight: 3.0, mask_channel: 4)
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 4
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831B7C0>, weight: 2.0, mask_channel: 5)
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 5
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000112264EA8F0>, weight: 1.0, mask_channel: 3)
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 3
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831B550>, weight: 3.0, mask_channel: 4)
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 4
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831A830>, weight: 2.0, mask_channel: 5)
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 5
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000112283199F0>, weight: 1.0, mask_channel: 3)
03/19/2024 10:13:11 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 3
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011228319CC0>, weight: 3.0, mask_channel: 4)
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 4
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011228319B70>, weight: 2.0, mask_channel: 5)
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 5
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011228319E40>, weight: 1.0, mask_channel: 3)
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 3
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831ACB0>, weight: 3.0, mask_channel: 4)
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 4
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011228318C40>, weight: 2.0, mask_channel: 5)
03/19/2024 10:13:12 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 5
03/19/2024 10:13:12 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011265E58FD0>, weight: 1.0, mask_channel: 3)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 3
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831A9E0>, weight: 3.0, mask_channel: 4)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 4
03/19/2024 10:13:13 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831B7C0>, weight: 2.0, mask_channel: 5)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 5
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000112264EA8F0>, weight: 1.0, mask_channel: 3)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 3
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831B550>, weight: 3.0, mask_channel: 4)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 4
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831A830>, weight: 2.0, mask_channel: 5)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 5
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x00000112283199F0>, weight: 1.0, mask_channel: 3)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 3
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011228319CC0>, weight: 3.0, mask_channel: 4)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 4
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011228319B70>, weight: 2.0, mask_channel: 5)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 5
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011228319E40>, weight: 1.0, mask_channel: 3)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 3
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x000001122831ACB0>, weight: 3.0, mask_channel: 4)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 4
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Processing loss function: (func: <tensorflow.python.keras.engine.compile_utils.LossesContainer object at 0x0000011228318C40>, weight: 2.0, mask_channel: 5)
03/19/2024 10:13:13 MainProcess     _training                      api             converted_call                 DEBUG    Applying mask from channel 5
03/19/2024 10:13:14 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:15 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:16 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:17 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:18 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:19 MainProcess     preview                        preview_cv      _launch                        DEBUG    Waiting for preview image
03/19/2024 10:13:19 MainProcess     _training                      multithreading  run                            DEBUG    Error in thread (_training): Graph execution error:\n\nDetected at node 'villain/encoder/conv_128_0_conv2d/Conv2D_1' defined at (most recent call last):\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\threading.py", line 973, in _bootstrap\n      self._bootstrap_inner()\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\threading.py", line 1016, in _bootstrap_inner\n      self.run()\n    File "C:\Users\Hassam\faceswap\lib\multithreading.py", line 100, in run\n      self._target(*self._args, **self._kwargs)\n    File "C:\Users\Hassam\faceswap\scripts\train.py", line 260, in _training\n      trainer = self._load_trainer(model)\n    File "C:\Users\Hassam\faceswap\scripts\train.py", line 309, in _load_trainer\n      trainer: TrainerBase = base(model,\n    File "C:\Users\Hassam\faceswap\plugins\train\trainer\original.py", line 10, in __init__\n      super().__init__(*args, **kwargs)\n    File "C:\Users\Hassam\faceswap\plugins\train\trainer\_base.py", line 87, in __init__\n      self._exit_early = self._handle_lr_finder()\n    File "C:\Users\Hassam\faceswap\plugins\train\trainer\_base.py", line 162, in _handle_lr_finder\n      success = lrf.find()\n    File "C:\Users\Hassam\faceswap\lib\training\lr_finder.py", line 182, in find\n      self._train()\n    File "C:\Users\Hassam\faceswap\lib\training\lr_finder.py", line 134, in _train\n      loss: list[float] = self._model.model.train_on_batch(model_inputs, y=model_targets)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 2381, in train_on_batch\n      logs = self.train_function(iterator)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1160, in train_function\n      return step_function(self, iterator)\n    File 
"C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1146, in step_function\n      outputs = model.distribute_strategy.run(run_step, args=(data,))\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1135, in run_step\n      outputs = model.train_step(data)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 993, in train_step\n      y_pred = self(x, training=True)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler\n      return fn(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 557, in __call__\n      return super().__call__(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler\n      return fn(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__\n      outputs = call_fn(inputs, *args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler\n      return fn(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\functional.py", line 510, in call\n      return self._run_internal_graph(inputs, training=training, mask=mask)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\functional.py", line 667, in _run_internal_graph\n      outputs = node.layer(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler\n      return fn(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 557, in __call__\n      
return super().__call__(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler\n      return fn(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__\n      outputs = call_fn(inputs, *args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler\n      return fn(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\functional.py", line 510, in call\n      return self._run_internal_graph(inputs, training=training, mask=mask)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\functional.py", line 667, in _run_internal_graph\n      outputs = node.layer(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler\n      return fn(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__\n      outputs = call_fn(inputs, *args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler\n      return fn(*args, **kwargs)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\layers\convolutional\base_conv.py", line 283, in call\n      outputs = self.convolution_op(inputs, self.kernel)\n    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\layers\convolutional\base_conv.py", line 255, in convolution_op\n      return tf.nn.convolution(\nNode: 'villain/encoder/conv_128_0_conv2d/Conv2D_1'\nNo algorithm worked!  
Error messages:\n  Profiling failure on CUDNN engine 1#TC: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED\nin tensorflow/stream_executor/cuda/cuda_dnn.cc(4031): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), filter_.handle(), filter_data.opaque(), conv_.handle(), ToConvForwardAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd_.handle(), output_data.opaque())'\n  Profiling failure on CUDNN engine 1: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED\nin tensorflow/stream_executor/cuda/cuda_dnn.cc(4031): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), filter_.handle(), filter_data.opaque(), conv_.handle(), ToConvForwardAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd_.handle(), output_data.opaque())'\n  Profiling failure on CUDNN engine 0#TC: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED\nin tensorflow/stream_executor/cuda/cuda_dnn.cc(4031): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), filter_.handle(), filter_data.opaque(), conv_.handle(), ToConvForwardAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd_.handle(), output_data.opaque())'\n  Profiling failure on CUDNN engine 0: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED\nin tensorflow/stream_executor/cuda/cuda_dnn.cc(4031): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), filter_.handle(), filter_data.opaque(), conv_.handle(), ToConvForwardAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd_.handle(), output_data.opaque())'\n	 [[{{node villain/encoder/conv_128_0_conv2d/Conv2D_1}}]] [Op:__inference_train_function_17140]
03/19/2024 10:13:20 MainProcess     MainThread                     train           _monitor                       DEBUG    Thread error detected
03/19/2024 10:13:20 MainProcess     MainThread                     train           shutdown                       DEBUG    Sending shutdown to preview viewer
03/19/2024 10:13:20 MainProcess     MainThread                     train           _monitor                       DEBUG    Closed Monitor
03/19/2024 10:13:20 MainProcess     MainThread                     train           _end_thread                    DEBUG    Ending Training thread
03/19/2024 10:13:20 MainProcess     MainThread                     train           _end_thread                    CRITICAL Error caught! Exiting...
03/19/2024 10:13:20 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Threads: '_training'
03/19/2024 10:13:20 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Thread: '_training'
03/19/2024 10:13:20 MainProcess     MainThread                     multithreading  join                           ERROR    Caught exception in thread: '_training'
Traceback (most recent call last):
  File "C:\Users\Hassam\faceswap\lib\cli\launcher.py", line 225, in execute_script
    process.process()
  File "C:\Users\Hassam\faceswap\scripts\train.py", line 209, in process
    self._end_thread(thread, err)
  File "C:\Users\Hassam\faceswap\scripts\train.py", line 249, in _end_thread
    thread.join()
  File "C:\Users\Hassam\faceswap\lib\multithreading.py", line 224, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "C:\Users\Hassam\faceswap\lib\multithreading.py", line 100, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Hassam\faceswap\scripts\train.py", line 274, in _training
    raise err
  File "C:\Users\Hassam\faceswap\scripts\train.py", line 260, in _training
    trainer = self._load_trainer(model)
  File "C:\Users\Hassam\faceswap\scripts\train.py", line 309, in _load_trainer
    trainer: TrainerBase = base(model,
  File "C:\Users\Hassam\faceswap\plugins\train\trainer\original.py", line 10, in __init__
    super().__init__(*args, **kwargs)
  File "C:\Users\Hassam\faceswap\plugins\train\trainer\_base.py", line 87, in __init__
    self._exit_early = self._handle_lr_finder()
  File "C:\Users\Hassam\faceswap\plugins\train\trainer\_base.py", line 162, in _handle_lr_finder
    success = lrf.find()
  File "C:\Users\Hassam\faceswap\lib\training\lr_finder.py", line 182, in find
    self._train()
  File "C:\Users\Hassam\faceswap\lib\training\lr_finder.py", line 134, in _train
    loss: list[float] = self._model.model.train_on_batch(model_inputs, y=model_targets)
  File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 2381, in train_on_batch
    logs = self.train_function(iterator)
  File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\tensorflow\python\eager\execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.NotFoundError: Graph execution error:

Detected at node 'villain/encoder/conv_128_0_conv2d/Conv2D_1' defined at (most recent call last):
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\threading.py", line 973, in _bootstrap
      self._bootstrap_inner()
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\threading.py", line 1016, in _bootstrap_inner
      self.run()
    File "C:\Users\Hassam\faceswap\lib\multithreading.py", line 100, in run
      self._target(*self._args, **self._kwargs)
    File "C:\Users\Hassam\faceswap\scripts\train.py", line 260, in _training
      trainer = self._load_trainer(model)
    File "C:\Users\Hassam\faceswap\scripts\train.py", line 309, in _load_trainer
      trainer: TrainerBase = base(model,
    File "C:\Users\Hassam\faceswap\plugins\train\trainer\original.py", line 10, in __init__
      super().__init__(*args, **kwargs)
    File "C:\Users\Hassam\faceswap\plugins\train\trainer\_base.py", line 87, in __init__
      self._exit_early = self._handle_lr_finder()
    File "C:\Users\Hassam\faceswap\plugins\train\trainer\_base.py", line 162, in _handle_lr_finder
      success = lrf.find()
    File "C:\Users\Hassam\faceswap\lib\training\lr_finder.py", line 182, in find
      self._train()
    File "C:\Users\Hassam\faceswap\lib\training\lr_finder.py", line 134, in _train
      loss: list[float] = self._model.model.train_on_batch(model_inputs, y=model_targets)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 2381, in train_on_batch
      logs = self.train_function(iterator)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1160, in train_function
      return step_function(self, iterator)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1146, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1135, in run_step
      outputs = model.train_step(data)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 993, in train_step
      y_pred = self(x, training=True)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 557, in __call__
      return super().__call__(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\functional.py", line 510, in call
      return self._run_internal_graph(inputs, training=training, mask=mask)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\functional.py", line 667, in _run_internal_graph
      outputs = node.layer(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 557, in __call__
      return super().__call__(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\functional.py", line 510, in call
      return self._run_internal_graph(inputs, training=training, mask=mask)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\functional.py", line 667, in _run_internal_graph
      outputs = node.layer(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\layers\convolutional\base_conv.py", line 283, in call
      outputs = self.convolution_op(inputs, self.kernel)
    File "C:\Users\Hassam\miniconda3\envs\faceswap\lib\site-packages\keras\layers\convolutional\base_conv.py", line 255, in convolution_op
      return tf.nn.convolution(
Node: 'villain/encoder/conv_128_0_conv2d/Conv2D_1'
No algorithm worked!  Error messages:
  Profiling failure on CUDNN engine 1#TC: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn.cc(4031): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), filter_.handle(), filter_data.opaque(), conv_.handle(), ToConvForwardAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd_.handle(), output_data.opaque())'
  Profiling failure on CUDNN engine 1: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn.cc(4031): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), filter_.handle(), filter_data.opaque(), conv_.handle(), ToConvForwardAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd_.handle(), output_data.opaque())'
  Profiling failure on CUDNN engine 0#TC: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn.cc(4031): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), filter_.handle(), filter_data.opaque(), conv_.handle(), ToConvForwardAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd_.handle(), output_data.opaque())'
  Profiling failure on CUDNN engine 0: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn.cc(4031): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), filter_.handle(), filter_data.opaque(), conv_.handle(), ToConvForwardAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd_.handle(), output_data.opaque())'
	 [[{{node villain/encoder/conv_128_0_conv2d/Conv2D_1}}]] [Op:__inference_train_function_17140]

============ System Information ============
backend:             nvidia
encoding:            cp1252
git_branch:          master
git_commits:         63b4d91 Bugfix: Mask tool - correctly name imported mask
gpu_cuda:            No global version found. Check Conda packages for Conda Cuda
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: NVIDIA GeForce RTX 3080
gpu_devices_active:  GPU_0
gpu_driver:          555.41
gpu_vram:            GPU_0: 10240MB (150MB free)
os_machine:          AMD64
os_platform:         Windows-10-10.0.26080-SP0
os_release:          10
py_command:          C:\Users\Hassam\faceswap\faceswap.py train -A C:/Users/Hassam/Downloads/Pdown/capture -B E:/Users/Hassam/Documents/EXP/TCat/BIS1024ext1 -m E:/Users/Hassam/Documents/EXP/Models/CLIPV3 -t villain -bs 1 -it 200000 -D default -r -s 250 -ss 10000 -p -L INFO -gui
py_conda_version:    conda 24.3.0
py_implementation:   CPython
py_version:          3.10.13
py_virtual_env:      True
sys_cores:           16
sys_processor:       AMD64 Family 25 Model 97 Stepping 2, AuthenticAMD
sys_ram:             Total: 31893MB, Available: 9442MB, Used: 22451MB, Free: 9442MB

=============== Pip Packages ===============
absl-py==2.1.0
astunparse==1.6.3
cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
colorama @ file:///C:/b/abs_a9ozq0l032/croot/colorama_1672387194846/work
contourpy @ file:///C:/b/abs_853rfy8zse/croot/contourpy_1700583617587/work
cycler @ file:///tmp/build/80754af9/cycler_1637851556182/work
fastcluster @ file:///D:/bld/fastcluster_1695650232190/work
ffmpy @ file:///home/conda/feedstock_root/build_artifacts/ffmpy_1659474992694/work
flatbuffers==24.3.7
fonttools==4.25.0
gast==0.4.0
google-auth==2.28.2
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.62.1
h5py==3.10.0
idna==3.6
imageio @ file:///C:/b/abs_aeqerw_nps/croot/imageio_1707247365204/work
imageio-ffmpeg==0.4.9
joblib @ file:///C:/b/abs_1anqjntpan/croot/joblib_1685113317150/work
keras==2.10.0
Keras-Preprocessing==1.1.2
kiwisolver @ file:///C:/b/abs_88mdhvtahm/croot/kiwisolver_1672387921783/work
libclang==18.1.1
Markdown==3.6
MarkupSafe==2.1.5
matplotlib @ file:///C:/b/abs_e26vnvd5s1/croot/matplotlib-suite_1698692153288/work
mkl-fft @ file:///C:/b/abs_19i1y8ykas/croot/mkl_fft_1695058226480/work
mkl-random @ file:///C:/b/abs_edwkj1_o69/croot/mkl_random_1695059866750/work
mkl-service==2.4.0
munkres==1.1.4
numexpr @ file:///C:/b/abs_5fucrty5dc/croot/numexpr_1696515448831/work
numpy @ file:///C:/b/abs_c1ywpu18ar/croot/numpy_and_numpy_base_1708638681471/work/dist/numpy-1.26.4-cp310-cp310-win_amd64.whl#sha256=ebb5aa2b36d8afa5ec3231c19e5a1fc75b6d85e7db483f0fb9e77dad58469977
nvidia-ml-py @ file:///home/conda/feedstock_root/build_artifacts/nvidia-ml-py_1698947663801/work
oauthlib==3.2.2
opencv-python==4.9.0.80
opt-einsum==3.3.0
packaging @ file:///C:/b/abs_cc1h2xfosn/croot/packaging_1710807447479/work
Pillow @ file:///C:/b/abs_153xikw91n/croot/pillow_1695134603563/work
ply==3.11
protobuf==3.19.6
psutil @ file:///C:/Windows/Temp/abs_b2c2fd7f-9fd5-4756-95ea-8aed74d0039flsd9qufz/croots/recipe/psutil_1656431277748/work
pyasn1==0.5.1
pyasn1-modules==0.3.0
pyparsing @ file:///C:/Users/BUILDE~1/AppData/Local/Temp/abs_7f_7lba6rl/croots/recipe/pyparsing_1661452540662/work
PyQt5==5.15.10
PyQt5-sip @ file:///C:/b/abs_c0pi2mimq3/croot/pyqt-split_1698769125270/work/pyqt_sip
python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
pywin32==305.1
pywinpty @ file:///C:/ci_310/pywinpty_1644230983541/work/target/wheels/pywinpty-2.0.2-cp310-none-win_amd64.whl
requests==2.31.0
requests-oauthlib==1.4.0
rsa==4.9
scikit-learn @ file:///C:/b/abs_daon7wm2p4/croot/scikit-learn_1694788586973/work
scipy==1.11.4
sip @ file:///C:/b/abs_edevan3fce/croot/sip_1698675983372/work
six @ file:///tmp/build/80754af9/six_1644875935023/work
tensorboard==2.10.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.10.1
tensorflow-estimator==2.10.0
tensorflow-io-gcs-filesystem==0.31.0
termcolor==2.4.0
threadpoolctl @ file:///Users/ktietz/demo/mc3/conda-bld/threadpoolctl_1629802263681/work
tomli @ file:///C:/Windows/TEMP/abs_ac109f85-a7b3-4b4d-bcfd-52622eceddf0hy332ojo/croots/recipe/tomli_1657175513137/work
tornado @ file:///C:/b/abs_0cbrstidzg/croot/tornado_1696937003724/work
tqdm @ file:///C:/b/abs_f76j9hg7pv/croot/tqdm_1679561871187/work
typing_extensions==4.10.0
urllib3==2.2.1
Werkzeug==3.0.1
wrapt==1.16.0

============== Conda Packages ==============
# packages in environment at C:\Users\Hassam\miniconda3\envs\faceswap:
#
# Name                    Version                   Build  Channel
absl-py                   2.1.0                    pypi_0    pypi
aom                       3.7.1                h63175ca_0    conda-forge
astunparse                1.6.3                    pypi_0    pypi
blas                      1.0                         mkl  
brotli                    1.0.9                h2bbff1b_7  
brotli-bin                1.0.9                h2bbff1b_7  
bzip2                     1.0.8                h2bbff1b_5  
ca-certificates           2024.3.11            haa95532_0  
cachetools                5.3.3                    pypi_0    pypi
certifi                   2024.2.2                 pypi_0    pypi
charset-normalizer        3.3.2                    pypi_0    pypi
colorama                  0.4.6           py310haa95532_0  
contourpy                 1.2.0           py310h59b6b97_0  
cudatoolkit               11.2.2              h7d7167e_13    conda-forge
cudnn                     8.1.0.77             h3e0f4f4_0    conda-forge
cycler                    0.11.0             pyhd3eb1b0_0  
dav1d                     1.2.1                hcfcfb64_0    conda-forge
expat                     2.6.2                h63175ca_0    conda-forge
fastcluster               1.2.6           py310hecd3228_3    conda-forge
ffmpeg                    6.1.0           gpl_h0859920_103    conda-forge
ffmpy                     0.3.0              pyhb6f538c_0    conda-forge
flatbuffers               24.3.7                   pypi_0    pypi
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_1    conda-forge
fontconfig                2.14.2               hbde0cde_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.25.0             pyhd3eb1b0_0  
freetype                  2.12.1               ha860e81_0  
gast                      0.4.0                    pypi_0    pypi
giflib                    5.2.1                h8cc25b3_3  
git                       2.40.1               haa95532_1  
google-auth               2.28.2                   pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
google-pasta              0.2.0                    pypi_0    pypi
grpcio                    1.62.1                   pypi_0    pypi
h5py                      3.10.0                   pypi_0    pypi
icc_rt                    2022.1.0             h6049295_2  
icu                       73.1                 h6c2663c_0  
idna                      3.6                      pypi_0    pypi
imageio                   2.33.1          py310haa95532_0  
imageio-ffmpeg            0.4.9                    pypi_0    pypi
intel-openmp              2023.1.0         h59b6b97_46320  
joblib                    1.2.0           py310haa95532_0  
jpeg                      9e                   h2bbff1b_1  
keras                     2.10.0                   pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
kiwisolver                1.4.4           py310hd77b12b_0  
krb5                      1.20.1               h5b6d351_0  
lerc                      3.0                  hd77b12b_0  
libbrotlicommon           1.0.9                h2bbff1b_7  
libbrotlidec              1.0.9                h2bbff1b_7  
libbrotlienc              1.0.9                h2bbff1b_7  
libclang                  18.1.1                   pypi_0    pypi
libclang13                14.0.6          default_h8e68704_1  
libdeflate                1.17                 h2bbff1b_1  
libexpat                  2.6.2                h63175ca_0    conda-forge
libffi                    3.4.4                hd77b12b_0  
libiconv                  1.17                 hcfcfb64_2    conda-forge
libopus                   1.3.1                h8ffe710_1    conda-forge
libpng                    1.6.39               h8cc25b3_0  
libpq                     12.17                h906ac69_0  
libtiff                   4.5.1                hd77b12b_0  
libwebp                   1.3.2                hbc33d0d_0  
libwebp-base              1.3.2                h2bbff1b_0  
libxml2                   2.12.6               hc3477c8_0    conda-forge
libzlib                   1.2.13               hcfcfb64_5    conda-forge
libzlib-wapi              1.2.13               hcfcfb64_5    conda-forge
lz4-c                     1.9.4                h2bbff1b_0  
markdown                  3.6                      pypi_0    pypi
markupsafe                2.1.5                    pypi_0    pypi
matplotlib                3.8.0           py310haa95532_0  
matplotlib-base           3.8.0           py310h4ed8f06_0  
mkl                       2023.1.0         h6b88ed4_46358  
mkl-service               2.4.0           py310h2bbff1b_1  
mkl_fft                   1.3.8           py310h2bbff1b_0  
mkl_random                1.2.4           py310h59b6b97_0  
munkres                   1.1.4                      py_0  
numexpr                   2.8.7           py310h2cd9be0_0  
numpy                     1.26.4          py310h055cbcc_0  
numpy-base                1.26.4          py310h65a83cf_0  
nvidia-ml-py              12.535.133         pyhd8ed1ab_0    conda-forge
oauthlib                  3.2.2                    pypi_0    pypi
opencv-python             4.9.0.80                 pypi_0    pypi
openh264                  2.4.0                h63175ca_0    conda-forge
openssl                   3.2.1                hcfcfb64_1    conda-forge
opt-einsum                3.3.0                    pypi_0    pypi
packaging                 23.2            py310haa95532_0  
pillow                    9.4.0           py310hd77b12b_1  
pip                       23.3.1          py310haa95532_0  
ply                       3.11            py310haa95532_0  
protobuf                  3.19.6                   pypi_0    pypi
psutil                    5.9.0           py310h2bbff1b_0  
pyasn1                    0.5.1                    pypi_0    pypi
pyasn1-modules            0.3.0                    pypi_0    pypi
pyparsing                 3.0.9           py310haa95532_0  
pyqt                      5.15.10         py310hd77b12b_0  
pyqt5-sip                 12.13.0         py310h2bbff1b_0  
python                    3.10.13              he1021f5_0  
python-dateutil           2.8.2              pyhd3eb1b0_0  
python_abi                3.10                    2_cp310    conda-forge
pywin32                   305             py310h2bbff1b_0  
pywinpty                  2.0.2           py310h5da7b33_0  
qt-main                   5.15.2              h19c9488_10  
requests                  2.31.0                   pypi_0    pypi
requests-oauthlib         1.4.0                    pypi_0    pypi
rsa                       4.9                      pypi_0    pypi
scikit-learn              1.3.0           py310h4ed8f06_1  
scipy                     1.11.4          py310h309d312_0  
setuptools                68.2.2          py310haa95532_0  
sip                       6.7.12          py310hd77b12b_0  
six                       1.16.0             pyhd3eb1b0_1  
sqlite                    3.41.2               h2bbff1b_0  
svt-av1                   1.7.0                h63175ca_0    conda-forge
tbb                       2021.8.0             h59b6b97_0  
tensorboard               2.10.1                   pypi_0    pypi
tensorboard-data-server   0.6.1                    pypi_0    pypi
tensorboard-plugin-wit    1.8.1                    pypi_0    pypi
tensorflow                2.10.1                   pypi_0    pypi
tensorflow-estimator      2.10.0                   pypi_0    pypi
tensorflow-io-gcs-filesystem 0.31.0                   pypi_0    pypi
termcolor                 2.4.0                    pypi_0    pypi
threadpoolctl             2.2.0              pyh0d69192_0  
tk                        8.6.12               h2bbff1b_0  
tomli                     2.0.1           py310haa95532_0  
tornado                   6.3.3           py310h2bbff1b_0  
tqdm                      4.65.0          py310h9909e9c_0  
typing-extensions         4.10.0                   pypi_0    pypi
tzdata                    2024a                h04d1e81_0  
ucrt                      10.0.22621.0         h57928b3_0    conda-forge
urllib3                   2.2.1                    pypi_0    pypi
vc                        14.2                 h21ff451_1  
vc14_runtime              14.38.33130         h82b7239_18    conda-forge
vs2015_runtime            14.38.33130         hcb4865c_18    conda-forge
werkzeug                  3.0.1                    pypi_0    pypi
wheel                     0.41.2          py310haa95532_0  
winpty                    0.4.3                         4  
wrapt                     1.16.0                   pypi_0    pypi
x264                      1!164.3095           h8ffe710_2    conda-forge
x265                      3.5                  h2d74725_3    conda-forge
xz                        5.4.6                h8cc25b3_0  
zlib                      1.2.13               hcfcfb64_5    conda-forge
zlib-wapi                 1.2.13               hcfcfb64_5    conda-forge
zstd                      1.5.5                hd43e919_0  

=============== State File =================
{
  "name": "villain",
  "sessions": {
    "1": {
      "timestamp": 1710857584.2884378,
      "no_logs": false,
      "loss_names": [
        "total",
        "face_a",
        "face_b"
      ],
      "batchsize": 0,
      "iterations": 0,
      "config": {
        "learning_rate": 5e-05,
        "epsilon_exponent": -7,
        "save_optimizer": "exit",
        "autoclip": false,
        "allow_growth": false,
        "mixed_precision": false,
        "nan_protection": true,
        "convert_batchsize": 16,
        "loss_function": "ssim",
        "loss_function_2": "mse",
        "loss_weight_2": 100,
        "loss_function_3": null,
        "loss_weight_3": 0,
        "loss_function_4": null,
        "loss_weight_4": 0,
        "mask_loss_function": "mse",
        "eye_multiplier": 3,
        "mouth_multiplier": 2
      }
    }
  },
  "lowest_avg_loss": {},
  "iterations": 0,
  "mixed_precision_layers": [
    "conv_128_0_conv2d",
    "leaky_re_lu",
    "residual_128_0_conv2d_0",
    "residual_128_0_leakyrelu_1",
    "residual_128_0_conv2d_1",
    "add",
    "residual_128_0_leakyrelu_3",
    "residual_128_1_conv2d_0",
    "residual_128_1_leakyrelu_1",
    "residual_128_1_conv2d_1",
    "add_1",
    "residual_128_1_leakyrelu_3",
    "residual_128_2_conv2d_0",
    "residual_128_2_leakyrelu_1",
    "residual_128_2_conv2d_1",
    "add_2",
    "residual_128_2_leakyrelu_3",
    "residual_128_3_conv2d_0",
    "residual_128_3_leakyrelu_1",
    "residual_128_3_conv2d_1",
    "add_3",
    "residual_128_3_leakyrelu_3",
    "residual_128_4_conv2d_0",
    "residual_128_4_leakyrelu_1",
    "residual_128_4_conv2d_1",
    "add_4",
    "residual_128_4_leakyrelu_3",
    "residual_128_5_conv2d_0",
    "residual_128_5_leakyrelu_1",
    "residual_128_5_conv2d_1",
    "add_5",
    "residual_128_5_leakyrelu_3",
    "residual_128_6_conv2d_0",
    "residual_128_6_leakyrelu_1",
    "residual_128_6_conv2d_1",
    "add_6",
    "residual_128_6_leakyrelu_3",
    "residual_128_7_conv2d_0",
    "residual_128_7_leakyrelu_1",
    "residual_128_7_conv2d_1",
    "add_7",
    "residual_128_7_leakyrelu_3",
    "residual_128_8_conv2d_0",
    "residual_128_8_leakyrelu_1",
    "residual_128_8_conv2d_1",
    "add_8",
    "residual_128_8_leakyrelu_3",
    "residual_128_9_conv2d_0",
    "residual_128_9_leakyrelu_1",
    "residual_128_9_conv2d_1",
    "add_9",
    "residual_128_9_leakyrelu_3",
    "residual_128_10_conv2d_0",
    "residual_128_10_leakyrelu_1",
    "residual_128_10_conv2d_1",
    "add_10",
    "residual_128_10_leakyrelu_3",
    "residual_128_11_conv2d_0",
    "residual_128_11_leakyrelu_1",
    "residual_128_11_conv2d_1",
    "add_11",
    "residual_128_11_leakyrelu_3",
    "residual_128_12_conv2d_0",
    "residual_128_12_leakyrelu_1",
    "residual_128_12_conv2d_1",
    "add_12",
    "residual_128_12_leakyrelu_3",
    "residual_128_13_conv2d_0",
    "residual_128_13_leakyrelu_1",
    "residual_128_13_conv2d_1",
    "add_13",
    "residual_128_13_leakyrelu_3",
    "residual_128_14_conv2d_0",
    "residual_128_14_leakyrelu_1",
    "residual_128_14_conv2d_1",
    "add_14",
    "residual_128_14_leakyrelu_3",
    "residual_128_15_conv2d_0",
    "residual_128_15_leakyrelu_1",
    "residual_128_15_conv2d_1",
    "add_15",
    "residual_128_15_leakyrelu_3",
    "leaky_re_lu_1",
    "add_16",
    "conv_128_1_conv2d",
    "conv_128_1_leakyrelu",
    "pixel_shuffler",
    "conv_128_2_conv2d",
    "conv_128_2_leakyrelu",
    "pixel_shuffler_1",
    "conv_128_3_conv2d",
    "conv_128_3_leakyrelu",
    "separableconv2d_256_0_seperableconv2d",
    "separableconv2d_256_0_relu",
    "conv_512_0_conv2d",
    "conv_512_0_leakyrelu",
    "separableconv2d_1024_0_seperableconv2d",
    "separableconv2d_1024_0_relu",
    "flatten",
    "dense",
    "dense_1",
    "reshape",
    "upscale_512_0_conv2d_conv2d",
    "upscale_512_0_conv2d_leakyrelu",
    "upscale_512_0_pixelshuffler",
    "upscale_512_1_conv2d_conv2d",
    "upscale_512_1_pixelshuffler",
    "leaky_re_lu_2",
    "residual_512_0_conv2d_0",
    "residual_512_0_leakyrelu_1",
    "residual_512_0_conv2d_1",
    "add_17",
    "residual_512_0_leakyrelu_3",
    "upscale_256_0_conv2d_conv2d",
    "upscale_256_0_pixelshuffler",
    "leaky_re_lu_3",
    "residual_256_0_conv2d_0",
    "residual_256_0_leakyrelu_1",
    "residual_256_0_conv2d_1",
    "add_18",
    "residual_256_0_leakyrelu_3",
    "upscale_128_0_conv2d_conv2d",
    "upscale_128_0_pixelshuffler",
    "leaky_re_lu_4",
    "residual_128_16_conv2d_0",
    "residual_128_16_leakyrelu_1",
    "residual_128_16_conv2d_1",
    "add_19",
    "residual_128_16_leakyrelu_3",
    "face_out_a_0_conv2d",
    "upscale_512_2_conv2d_conv2d",
    "upscale_512_2_pixelshuffler",
    "leaky_re_lu_5",
    "residual_512_1_conv2d_0",
    "residual_512_1_leakyrelu_1",
    "residual_512_1_conv2d_1",
    "add_20",
    "residual_512_1_leakyrelu_3",
    "upscale_256_1_conv2d_conv2d",
    "upscale_256_1_pixelshuffler",
    "leaky_re_lu_6",
    "residual_256_1_conv2d_0",
    "residual_256_1_leakyrelu_1",
    "residual_256_1_conv2d_1",
    "add_21",
    "residual_256_1_leakyrelu_3",
    "upscale_128_1_conv2d_conv2d",
    "upscale_128_1_pixelshuffler",
    "leaky_re_lu_7",
    "residual_128_17_conv2d_0",
    "residual_128_17_leakyrelu_1",
    "residual_128_17_conv2d_1",
    "add_22",
    "residual_128_17_leakyrelu_3",
    "face_out_b_0_conv2d"
  ],
  "config": {
    "centering": "face",
    "coverage": 87.5,
    "optimizer": "adam",
    "learning_rate": 5e-05,
    "epsilon_exponent": -7,
    "save_optimizer": "exit",
    "lr_finder_iterations": 1000,
    "lr_finder_mode": "set",
    "lr_finder_strength": "default",
    "autoclip": false,
    "allow_growth": false,
    "mixed_precision": false,
    "nan_protection": true,
    "convert_batchsize": 16,
    "loss_function": "ssim",
    "loss_function_2": "mse",
    "loss_weight_2": 100,
    "loss_function_3": null,
    "loss_weight_3": 0,
    "loss_function_4": null,
    "loss_weight_4": 0,
    "mask_loss_function": "mse",
    "eye_multiplier": 3,
    "mouth_multiplier": 2,
    "penalized_mask_loss": true,
    "mask_type": "extended",
    "mask_blur_kernel": 3,
    "mask_threshold": 4,
    "learn_mask": false,
    "lowmem": false
  }
}

================= Configs ==================
--------- .faceswap ---------
backend:                  nvidia

--------- convert.ini ---------

[color.color_transfer]
clip:                     True
preserve_paper:           True

[color.manual_balance]
colorspace:               HSV
balance_1:                0.0
balance_2:                0.0
balance_3:                0.0
contrast:                 0.0
brightness:               0.0

[color.match_hist]
threshold:                99.0

[mask.mask_blend]
type:                     normalized
kernel_size:              3
passes:                   4
threshold:                4
erosion:                  0.0
erosion_top:              0.0
erosion_bottom:           0.0
erosion_left:             0.0
erosion_right:            0.0

[scaling.sharpen]
method:                   none
amount:                   150
radius:                   0.3
threshold:                5.0

[writer.ffmpeg]
container:                mp4
codec:                    libx264
crf:                      23
preset:                   medium
tune:                     none
profile:                  auto
level:                    auto
skip_mux:                 False

[writer.gif]
fps:                      25
loop:                     0
palettesize:              256
subrectangles:            False

[writer.opencv]
format:                   png
draw_transparent:         False
separate_mask:            False
jpg_quality:              75
png_compress_level:       3

[writer.patch]
start_index:              0
index_offset:             0
number_padding:           6
include_filename:         True
face_index_location:      before
origin:                   bottom-left
empty_frames:             blank
json_output:              False
separate_mask:            False
bit_depth:                16
format:                   png
png_compress_level:       3
tiff_compression_method:  lzw

[writer.pillow]
format:                   png
draw_transparent:         False
separate_mask:            False
optimize:                 False
gif_interlace:            True
jpg_quality:              75
png_compress_level:       3
tif_compression:          tiff_deflate

--------- extract.ini ---------

[global]
allow_growth:             False
aligner_min_scale:        0.07
aligner_max_scale:        2.0
aligner_distance:         22.5
aligner_roll:             45.0
aligner_features:         True
filter_refeed:            True
save_filtered:            False
realign_refeeds:          True
filter_realign:           True

[align.fan]
batch-size:               12

[detect.cv2_dnn]
confidence:               50

[detect.mtcnn]
minsize:                  20
scalefactor:              0.709
batch-size:               8
cpu:                      True
threshold_1:              0.6
threshold_2:              0.7
threshold_3:              0.7

[detect.s3fd]
confidence:               70
batch-size:               4

[mask.bisenet_fp]
batch-size:               8
cpu:                      False
weights:                  faceswap
include_ears:             False
include_hair:             False
include_glasses:          True

[mask.custom]
batch-size:               8
centering:                face
fill:                     False

[mask.unet_dfl]
batch-size:               8

[mask.vgg_clear]
batch-size:               6

[mask.vgg_obstructed]
batch-size:               2

[recognition.vgg_face2]
batch-size:               16
cpu:                      False

--------- gui.ini ---------

[global]
fullscreen:               False
tab:                      extract
options_panel_width:      30
console_panel_height:     20
icon_size:                14
font:                     default
font_size:                9
autosave_last_session:    prompt
timeout:                  120
auto_load_model_stats:    True

--------- train.ini ---------

[global]
centering:                face
coverage:                 87.5
icnr_init:                False
conv_aware_init:          False
optimizer:                adam
learning_rate:            5e-05
epsilon_exponent:         -7
save_optimizer:           exit
lr_finder_iterations:     1000
lr_finder_mode:           set
lr_finder_strength:       default
autoclip:                 False
reflect_padding:          False
allow_growth:             False
mixed_precision:          False
nan_protection:           True
convert_batchsize:        16

[global.loss]
loss_function:            ssim
loss_function_2:          mse
loss_weight_2:            100
loss_function_3:          none
loss_weight_3:            0
loss_function_4:          none
loss_weight_4:            0
mask_loss_function:       mse
eye_multiplier:           3
mouth_multiplier:         2
penalized_mask_loss:      True
mask_type:                extended
mask_blur_kernel:         3
mask_threshold:           4
learn_mask:               False

[model.dfaker]
output_size:              128

[model.dfl_h128]
lowmem:                   False

[model.dfl_sae]
input_size:               128
architecture:             df
autoencoder_dims:         0
encoder_dims:             42
decoder_dims:             21
multiscale_decoder:       False

[model.dlight]
features:                 best
details:                  good
output_size:              256

[model.original]
lowmem:                   False

[model.phaze_a]
output_size:              128
shared_fc:                none
enable_gblock:            True
split_fc:                 True
split_gblock:             False
split_decoders:           False
enc_architecture:         fs_original
enc_scaling:              7
enc_load_weights:         True
bottleneck_type:          dense
bottleneck_norm:          none
bottleneck_size:          1024
bottleneck_in_encoder:    True
fc_depth:                 1
fc_min_filters:           1024
fc_max_filters:           1024
fc_dimensions:            4
fc_filter_slope:          -0.5
fc_dropout:               0.0
fc_upsampler:             upsample2d
fc_upsamples:             1
fc_upsample_filters:      512
fc_gblock_depth:          3
fc_gblock_min_nodes:      512
fc_gblock_max_nodes:      512
fc_gblock_filter_slope:   -0.5
fc_gblock_dropout:        0.0
dec_upscale_method:       subpixel
dec_upscales_in_fc:       0
dec_norm:                 none
dec_min_filters:          64
dec_max_filters:          512
dec_slope_mode:           full
dec_filter_slope:         -0.45
dec_res_blocks:           1
dec_output_kernel:        5
dec_gaussian:             True
dec_skip_last_residual:   True
freeze_layers:            keras_encoder
load_layers:              encoder
fs_original_depth:        4
fs_original_min_filters:  128
fs_original_max_filters:  1024
fs_original_use_alt:      False
mobilenet_width:          1.0
mobilenet_depth:          1
mobilenet_dropout:        0.001
mobilenet_minimalistic:   False

[model.realface]
input_size:               64
output_size:              128
dense_nodes:              1536
complexity_encoder:       128
complexity_decoder:       512

[model.unbalanced]
input_size:               128
lowmem:                   False
nodes:                    1024
complexity_encoder:       128
complexity_decoder_a:     384
complexity_decoder_b:     512

[model.villain]
lowmem:                   False

[trainer.original]
preview_images:           14
mask_opacity:             30
mask_color:               #ff0000
zoom_amount:              5
rotation_range:           10
shift_range:              5
flip_chance:              50
color_lightness:          30
color_ab:                 8
color_clahe_chance:       50
color_clahe_max_size:     4
Last edited by torzdf on Tue Mar 19, 2024 4:04 pm, edited 1 time in total.
torzdf
Posts: 2687
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 135 times
Been thanked: 628 times

Re: Help! Unknown: CUDNN_STATUS_EXECUTION_FAILED Error out of the blue after PC crashed. Reinstall won't Fix

Post by torzdf »

OK, it is odd that this worked before but does not work now.

I would normally suspect that the model was corrupted when your PC crashed, but this is not a model corruption error.

See if you can start a new model with these same settings. If you can, then your existing model file is most likely corrupted.

If you cannot, then this will probably fix your issue: app.php/faqpage#f1r1
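
If you want a quick look at the existing model file before setting up a new one, something like the sketch below may help. It is only a rough sanity check, and it assumes the model is saved as an HDF5 (.h5) file with the standard 8-byte HDF5 signature at the very start. The function name and the path are made up for illustration, so point it at whatever is actually in your model folder. Passing this check does not prove the model is healthy. Starting a new model with the same settings is still the real test.

```python
from pathlib import Path

# Standard 8-byte HDF5 signature. Assumption: the model file has no user
# block, so the signature sits at offset 0.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def quick_model_check(model_path):
    """Return a short verdict for a saved model file (hypothetical helper).

    This only catches gross damage: a missing file, an empty file, or an
    overwritten header. A file can pass and still be corrupted further in.
    """
    path = Path(model_path)
    if not path.is_file():
        return "missing"
    if path.stat().st_size == 0:
        return "empty"
    with path.open("rb") as handle:
        header = handle.read(len(HDF5_SIGNATURE))
    if header != HDF5_SIGNATURE:
        return "bad header"
    return "header ok"


if __name__ == "__main__":
    # Hypothetical path: replace with the .h5 file in your own model folder.
    print(quick_model_check("model_folder/villain.h5"))
```

A result of "empty" or "bad header" would point towards a file that was damaged when the PC crashed. A result of "header ok" tells you nothing either way, so carry on with the new model test.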

My word is final
