Crash when starting training

996537
Posts: 1
Joined: Fri Aug 18, 2023 1:35 pm

Post by 996537

I can't start training anymore after reinstalling the environment.

Many thanks. This is the crash report:

08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'mask_color', datatype: '<class 'str'>', default: '#ff0000', info: 'The RGB hex color to use for the mask overlay in the training preview.', rounding: 'None', min_max: None, choices: colorchooser, gui_radio: False, fixed: True, group: evaluation)
08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'zoom_amount', datatype: '<class 'int'>', default: '5', info: 'Percentage amount to randomly zoom each training image in and out.', rounding: '1', min_max: (0, 25), choices: None, gui_radio: False, fixed: True, group: image augmentation)
08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'rotation_range', datatype: '<class 'int'>', default: '10', info: 'Percentage amount to randomly rotate each training image.', rounding: '1', min_max: (0, 25), choices: None, gui_radio: False, fixed: True, group: image augmentation)
08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'shift_range', datatype: '<class 'int'>', default: '5', info: 'Percentage amount to randomly shift each training image horizontally and vertically.', rounding: '1', min_max: (0, 25), choices: None, gui_radio: False, fixed: True, group: image augmentation)
08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'flip_chance', datatype: '<class 'int'>', default: '50', info: 'Percentage chance to randomly flip each training image horizontally.\nNB: This is ignored if the 'no-flip' option is enabled', rounding: '1', min_max: (0, 75), choices: None, gui_radio: False, fixed: True, group: image augmentation)
08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'color_lightness', datatype: '<class 'int'>', default: '30', info: 'Percentage amount to randomly alter the lightness of each training image.\nNB: This is ignored if the 'no-augment-color' option is enabled', rounding: '1', min_max: (0, 75), choices: None, gui_radio: False, fixed: True, group: color augmentation)
08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'color_ab', datatype: '<class 'int'>', default: '8', info: 'Percentage amount to randomly alter the 'a' and 'b' colors of the L*a*b* color space of each training image.\nNB: This is ignored if the 'no-augment-color' optionis enabled', rounding: '1', min_max: (0, 50), choices: None, gui_radio: False, fixed: True, group: color augmentation)
08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'color_clahe_chance', datatype: '<class 'int'>', default: '50', info: 'Percentage chance to perform Contrast Limited Adaptive Histogram Equalization on each training image.\nNB: This is ignored if the 'no-augment-color' option is enabled', rounding: '1', min_max: (0, 75), choices: None, gui_radio: False, fixed: False, group: color augmentation)
08/18/2023 06:28:46 MainProcess     _training                      config          add_item                       DEBUG    Add item: (section: 'trainer.original', title: 'color_clahe_max_size', datatype: '<class 'int'>', default: '4', info: 'The grid size dictates how much Contrast Limited Adaptive Histogram Equalization is performed on any training image selected for clahe. Contrast will be applied randomly with a gridsize of 0 up to the maximum. This value is a multiplier calculated from the training image size.\nNB: This is ignored if the 'no-augment-color' option is enabled', rounding: '1', min_max: (1, 8), choices: None, gui_radio: False, fixed: True, group: color augmentation)
08/18/2023 06:28:46 MainProcess     _training                      config          _load_defaults_from_module     DEBUG    Added defaults: trainer.original
08/18/2023 06:28:46 MainProcess     _training                      config          _handle_config                 DEBUG    Handling config: (section: model.original, configfile: 'C:\Users\jiyi\faceswap\config\train.ini')
08/18/2023 06:28:46 MainProcess     _training                      config          _check_exists                  DEBUG    Config file exists: 'C:\Users\jiyi\faceswap\config\train.ini'
08/18/2023 06:28:46 MainProcess     _training                      config          _load_config                   VERBOSE  Loading config: 'C:\Users\jiyi\faceswap\config\train.ini'
08/18/2023 06:28:46 MainProcess     _training                      config          _validate_config               DEBUG    Validating config
08/18/2023 06:28:46 MainProcess     _training                      config          _check_config_change           DEBUG    Default config has not changed
08/18/2023 06:28:46 MainProcess     _training                      config          _check_config_choices          DEBUG    Checking config choices
08/18/2023 06:28:46 MainProcess     _training                      config          _parse_list                    DEBUG    Processed raw option 'keras_encoder' to list ['keras_encoder'] for section 'model.phaze_a', option 'freeze_layers'
08/18/2023 06:28:46 MainProcess     _training                      config          _parse_list                    DEBUG    Processed raw option 'encoder' to list ['encoder'] for section 'model.phaze_a', option 'load_layers'
08/18/2023 06:28:46 MainProcess     _training                      config          _check_config_choices          DEBUG    Checked config choices
08/18/2023 06:28:46 MainProcess     _training                      config          _validate_config               DEBUG    Validated config
08/18/2023 06:28:46 MainProcess     _training                      config          _handle_config                 DEBUG    Handled config
08/18/2023 06:28:46 MainProcess     _training                      config          __init__                       DEBUG    Initialized: Config
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global', option: 'learning_rate')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'float'>, value: 5e-05)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global', option: 'epsilon_exponent')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'int'>, value: -7)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global', option: 'save_optimizer')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'str'>, value: exit)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global', option: 'autoclip')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'bool'>, value: False)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global', option: 'allow_growth')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'bool'>, value: False)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global', option: 'mixed_precision')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'bool'>, value: False)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global', option: 'nan_protection')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'bool'>, value: True)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global', option: 'convert_batchsize')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'int'>, value: 16)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'loss_function')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'str'>, value: ssim)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'loss_function_2')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'str'>, value: mse)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'loss_weight_2')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'int'>, value: 100)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'loss_function_3')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'str'>, value: None)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'loss_weight_3')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'int'>, value: 0)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'loss_function_4')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'str'>, value: None)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'loss_weight_4')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'int'>, value: 0)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'mask_loss_function')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'str'>, value: mse)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'eye_multiplier')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'int'>, value: 3)
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Getting config item: (section: 'global.loss', option: 'mouth_multiplier')
08/18/2023 06:28:46 MainProcess     _training                      config          get                            DEBUG    Returning item: (type: <class 'int'>, value: 2)
08/18/2023 06:28:46 MainProcess     _training                      config          changeable_items               DEBUG    Alterable for existing models: {'learning_rate': 5e-05, 'epsilon_exponent': -7, 'save_optimizer': 'exit', 'autoclip': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2}
08/18/2023 06:28:46 MainProcess     _training                      model           __init__                       DEBUG    Initializing State: (model_dir: 'G:\face\model', model_name: 'original', config_changeable_items: '{'learning_rate': 5e-05, 'epsilon_exponent': -7, 'save_optimizer': 'exit', 'autoclip': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2}', no_logs: False
08/18/2023 06:28:46 MainProcess     _training                      serializer      get_serializer                 DEBUG    <lib.serializer._JSONSerializer object at 0x00000241DFC8C130>
08/18/2023 06:28:46 MainProcess     _training                      model           _load                          DEBUG    Loading State
08/18/2023 06:28:46 MainProcess     _training                      serializer      load                           DEBUG    filename: G:\face\model\original_state.json
08/18/2023 06:28:46 MainProcess     _training                      serializer      load                           DEBUG    stored data type: <class 'bytes'>
08/18/2023 06:28:46 MainProcess     _training                      serializer      unmarshal                      DEBUG    data type: <class 'bytes'>
08/18/2023 06:28:46 MainProcess     _training                      serializer      unmarshal                      DEBUG    returned data type: <class 'dict'>
08/18/2023 06:28:46 MainProcess     _training                      serializer      load                           DEBUG    data type: <class 'dict'>
08/18/2023 06:28:46 MainProcess     _training                      model           _load                          DEBUG    Loaded state: {'name': 'original', 'sessions': {'1': {'timestamp': 1692332226.981266, 'no_logs': False, 'loss_names': ['total', 'face_a', 'face_b'], 'batchsize': 16, 'iterations': 192000, 'config': {'learning_rate': 5e-05, 'epsilon_exponent': -7, 'save_optimizer': 'exit', 'autoclip': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2}}}, 'lowest_avg_loss': {'a': 0.00958854053914547, 'b': 0.01177385875210166}, 'iterations': 192000, 'mixed_precision_layers': ['conv_128_0_conv2d', 'conv_128_0_leakyrelu', 'conv_256_0_conv2d', 'conv_256_0_leakyrelu', 'conv_512_0_conv2d', 'conv_512_0_leakyrelu', 'conv_1024_0_conv2d', 'conv_1024_0_leakyrelu', 'flatten', 'dense', 'dense_1', 'reshape', 'upscale_512_0_conv2d_conv2d', 'upscale_512_0_conv2d_leakyrelu', 'upscale_512_0_pixelshuffler', 'upscale_256_0_conv2d_conv2d', 'upscale_256_0_conv2d_leakyrelu', 'upscale_256_0_pixelshuffler', 'upscale_128_0_conv2d_conv2d', 'upscale_128_0_conv2d_leakyrelu', 'upscale_128_0_pixelshuffler', 'upscale_64_0_conv2d_conv2d', 'upscale_64_0_conv2d_leakyrelu', 'upscale_64_0_pixelshuffler', 'face_out_a_0_conv2d', 'upscale_256_1_conv2d_conv2d', 'upscale_256_1_conv2d_leakyrelu', 'upscale_256_1_pixelshuffler', 'upscale_128_1_conv2d_conv2d', 'upscale_128_1_conv2d_leakyrelu', 'upscale_128_1_pixelshuffler', 'upscale_64_1_conv2d_conv2d', 'upscale_64_1_conv2d_leakyrelu', 'upscale_64_1_pixelshuffler', 'face_out_b_0_conv2d'], 'config': {'centering': 'face', 'coverage': 87.5, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'save_optimizer': 'exit', 'autoclip': False, 'allow_growth': False, 
'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'lowmem': False}}
08/18/2023 06:28:46 MainProcess     _training                      model           _update_legacy_config          DEBUG    Checking for legacy state file update
08/18/2023 06:28:46 MainProcess     _training                      model           _update_legacy_config          DEBUG    Legacy item 'dssim_loss' not in config. Skipping update
08/18/2023 06:28:46 MainProcess     _training                      model           _update_legacy_config          DEBUG    Legacy item 'l2_reg_term' not in config. Skipping update
08/18/2023 06:28:46 MainProcess     _training                      model           _update_legacy_config          DEBUG    Legacy item 'clipnorm' not in config. Skipping update
08/18/2023 06:28:46 MainProcess     _training                      model           _update_legacy_config          DEBUG    State file updated for legacy config: False
08/18/2023 06:28:46 MainProcess     _training                      model           _replace_config                DEBUG    Replacing config. Old config: {'centering': 'face', 'coverage': 87.5, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'save_optimizer': 'exit', 'autoclip': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'lowmem': False}
08/18/2023 06:28:46 MainProcess     _training                      model           _replace_config                DEBUG    Replaced config. New config: {'centering': 'face', 'coverage': 87.5, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'save_optimizer': 'exit', 'autoclip': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'lowmem': False}
08/18/2023 06:28:46 MainProcess     _training                      model           _replace_config                INFO     Using configuration saved in state file
08/18/2023 06:28:46 MainProcess     _training                      model           _new_session_id                DEBUG    2
08/18/2023 06:28:46 MainProcess     _training                      model           _create_new_session            DEBUG    Creating new session. id: 2
08/18/2023 06:28:46 MainProcess     _training                      model           __init__                       DEBUG    Initialized State:
08/18/2023 06:28:46 MainProcess     _training                      settings        __init__                       DEBUG    Initializing Settings: (arguments: Namespace(func=<bound method ScriptExecutor.execute_script of <lib.cli.launcher.ScriptExecutor object at 0x00000241D485CAC0>>, exclude_gpus=None, configfile=None, loglevel='INFO', logfile=None, redirect_gui=True, colab=False, input_a='G:\\face\\a', input_b='G:\\face\\b', model_dir='G:\\face\\model', load_weights=None, trainer='original', summary=False, freeze_weights=False, batch_size=16, iterations=1000000, distribution_strategy='default', save_interval=250, snapshot_interval=25000, timelapse_input_a='G:\\face\\a', timelapse_input_b='G:\\face\\b', timelapse_output='G:\\face\\output', preview=False, write_image=False, no_logs=False, warp_to_landmarks=False, no_flip=False, no_augment_color=False, no_warp=False), mixed_precision: False, allow_growth: False, is_predict: False)
08/18/2023 06:28:46 MainProcess     _training                      settings        _set_tf_settings               DEBUG    Not setting any specific Tensorflow settings
08/18/2023 06:28:46 MainProcess     _training                      settings        _set_keras_mixed_precision     DEBUG    use_mixed_precision: False
08/18/2023 06:28:46 MainProcess     _training                      settings        _set_keras_mixed_precision     DEBUG    Disabling mixed precision. (Compute dtype: float32, variable_dtype: float32)
08/18/2023 06:28:46 MainProcess     _training                      settings        _get_strategy                  DEBUG    Using strategy: <tensorflow.python.distribute.distribute_lib._DefaultDistributionStrategy object at 0x00000241DFD6EC80>
08/18/2023 06:28:46 MainProcess     _training                      settings        __init__                       DEBUG    Initialized Settings
08/18/2023 06:28:46 MainProcess     _training                      settings        __init__                       DEBUG    Initializing Loss: (color_order: bgr)
08/18/2023 06:28:46 MainProcess     _training                      settings        _get_mask_channels             DEBUG    uses_masks: (True, True, True), mask_channels: [3, 4, 5]
08/18/2023 06:28:46 MainProcess     _training                      settings        __init__                       DEBUG    Initialized: Loss
08/18/2023 06:28:46 MainProcess     _training                      model           __init__                       DEBUG    Initialized ModelBase (Model)
08/18/2023 06:28:46 MainProcess     _training                      settings        strategy_scope                 DEBUG    Using strategy scope: <tensorflow.python.distribute.distribute_lib._DefaultDistributionContext object at 0x00000241DFAB32C0>
08/18/2023 06:28:46 MainProcess     _training                      io              _load                          DEBUG    Loading model: G:\face\model\original.h5
08/18/2023 06:28:46 MainProcess     _training                      attrs           __getitem__                    DEBUG    Creating converter from 3 to 5
08/18/2023 06:28:47 MainProcess     _training                      multithreading  run                            DEBUG    Error in thread (_training): Multiple OpKernel registrations match NodeDef at the same priority '{{node StatelessRandomGetKeyCounter}}': 'op: "StatelessRandomGetKeyCounter" device_type: "GPU" host_memory_arg: "seed" host_memory_arg: "key" host_memory_arg: "counter"' and 'op: "StatelessRandomGetKeyCounter" device_type: "GPU" host_memory_arg: "seed" host_memory_arg: "key" host_memory_arg: "counter"' [Op:StatelessRandomGetKeyCounter]
08/18/2023 06:28:47 MainProcess     MainThread                     train           _monitor                       DEBUG    Thread error detected
08/18/2023 06:28:47 MainProcess     MainThread                     train           _monitor                       DEBUG    Closed Monitor
08/18/2023 06:28:47 MainProcess     MainThread                     train           _end_thread                    DEBUG    Ending Training thread
08/18/2023 06:28:47 MainProcess     MainThread                     train           _end_thread                    CRITICAL Error caught! Exiting...
08/18/2023 06:28:47 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Threads: '_training'
08/18/2023 06:28:47 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Thread: '_training'
08/18/2023 06:28:47 MainProcess     MainThread                     multithreading  join                           ERROR    Caught exception in thread: '_training'
Traceback (most recent call last):
  File "C:\Users\jiyi\faceswap\lib\cli\launcher.py", line 225, in execute_script
    process.process()
  File "C:\Users\jiyi\faceswap\scripts\train.py", line 209, in process
    self._end_thread(thread, err)
  File "C:\Users\jiyi\faceswap\scripts\train.py", line 249, in _end_thread
    thread.join()
  File "C:\Users\jiyi\faceswap\lib\multithreading.py", line 224, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "C:\Users\jiyi\faceswap\lib\multithreading.py", line 100, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\jiyi\faceswap\scripts\train.py", line 271, in _training
    raise err
  File "C:\Users\jiyi\faceswap\scripts\train.py", line 259, in _training
    model = self._load_model()
  File "C:\Users\jiyi\faceswap\scripts\train.py", line 287, in _load_model
    model.build()
  File "C:\Users\jiyi\faceswap\plugins\train\model\_base\model.py", line 256, in build
    model = self._io._load()  # pylint:disable=protected-access
  File "C:\Users\jiyi\faceswap\plugins\train\model\_base\io.py", line 142, in _load
    model = kmodels.load_model(self._filename, compile=False)
  File "C:\Users\jiyi\Anaconda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\jiyi\Anaconda3\envs\faceswap\lib\site-packages\keras\backend.py", line 2100, in random_uniform
    return tf.random.stateless_uniform(
tensorflow.python.framework.errors_impl.InvalidArgumentError: Multiple OpKernel registrations match NodeDef at the same priority '{{node StatelessRandomGetKeyCounter}}': 'op: "StatelessRandomGetKeyCounter" device_type: "GPU" host_memory_arg: "seed" host_memory_arg: "key" host_memory_arg: "counter"' and 'op: "StatelessRandomGetKeyCounter" device_type: "GPU" host_memory_arg: "seed" host_memory_arg: "key" host_memory_arg: "counter"' [Op:StatelessRandomGetKeyCounter]

============ System Information ============
backend:             directml
encoding:            cp936
git_branch:          master
git_commits:         68a3322 bugfix: train - fix learn_mask training
gpu_cuda:            12.2
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: NVIDIA GeForce RTX 3080 Ti, GPU_1: Intel(R) UHD Graphics 770
gpu_devices_active:  GPU_0, GPU_1
gpu_driver:          31.0.15.3699|31.0.101.4338
gpu_vram:            GPU_0: 12100MB (11332MB free), GPU_1: 128MB (15511MB free)
os_machine:          AMD64
os_platform:         Windows-10-10.0.19045-SP0
os_release:          10
py_command:          C:\Users\jiyi\faceswap\faceswap.py train -A G:/face/a -B G:/face/b -m G:/face/model -t original -bs 16 -it 1000000 -D default -s 250 -ss 25000 -tia G:/face/a -tib G:/face/b -to G:/face/output -L INFO -gui
py_conda_version:    Conda is used, but version not found
py_implementation:   CPython
py_version:          3.10.12
py_virtual_env:      True
sys_cores:           20
sys_processor:       Intel64 Family 6 Model 151 Stepping 2, GenuineIntel
sys_ram:             Total: 32559MB, Available: 19441MB, Used: 13118MB, Free: 19441MB

=============== Pip Packages ===============
absl-py==1.4.0
astunparse==1.6.3
cachetools==5.3.1
certifi==2023.7.22
charset-normalizer==3.2.0
colorama @ file:///C:/b/abs_a9ozq0l032/croot/colorama_1672387194846/work
comtypes @ file:///C:/b/abs_1coomkk_86/croot/comtypes_1678829787230/work
contourpy @ file:///C:/b/abs_d5rpy288vc/croots/recipe/contourpy_1663827418189/work
cycler @ file:///tmp/build/80754af9/cycler_1637851556182/work
fastcluster @ file:///D:/bld/fastcluster_1649783460966/work
ffmpy @ file:///home/conda/feedstock_root/build_artifacts/ffmpy_1659474992694/work
flatbuffers==23.5.26
fonttools==4.25.0
gast==0.4.0
google-auth==2.22.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.57.0
h5py==3.9.0
idna==3.4
imageio @ file:///C:/b/abs_27kq2gy1us/croot/imageio_1677879918708/work
imageio-ffmpeg @ file:///home/conda/feedstock_root/build_artifacts/imageio-ffmpeg_1673483481485/work
joblib @ file:///C:/b/abs_1anqjntpan/croot/joblib_1685113317150/work
keras==2.10.0
Keras-Preprocessing==1.1.2
kiwisolver @ file:///C:/b/abs_88mdhvtahm/croot/kiwisolver_1672387921783/work
libclang==16.0.6
Markdown==3.4.4
MarkupSafe==2.1.3
matplotlib @ file:///C:/b/abs_49b2acwxd4/croot/matplotlib-suite_1679593486357/work
mkl-fft==1.3.6
mkl-random @ file:///C:/Users/dev-admin/mkl/mkl_random_1682977971003/work
mkl-service==2.4.0
munkres==1.1.4
numexpr @ file:///C:/b/abs_afm0oewmmt/croot/numexpr_1683221839116/work
numpy @ file:///C:/b/abs_f6napi3n6e/croot/numpy_and_numpy_base_1691091651337/work
nvidia-ml-py==12.535.77
oauthlib==3.2.2
opencv-python==4.8.0.76
opt-einsum==3.3.0
packaging @ file:///C:/b/abs_ed_kb9w6g4/croot/packaging_1678965418855/work
Pillow==9.4.0
ply==3.11
protobuf==3.19.6
psutil @ file:///C:/Windows/Temp/abs_b2c2fd7f-9fd5-4756-95ea-8aed74d0039flsd9qufz/croots/recipe/psutil_1656431277748/work
pyasn1==0.5.0
pyasn1-modules==0.3.0
pyparsing @ file:///C:/Users/BUILDE~1/AppData/Local/Temp/abs_7f_7lba6rl/croots/recipe/pyparsing_1661452540662/work
PyQt5==5.15.7
PyQt5-sip @ file:///C:/Windows/Temp/abs_d7gmd2jg8i/croots/recipe/pyqt-split_1659273064801/work/pyqt_sip
python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
pywin32==305.1
pywinpty @ file:///C:/ci_310/pywinpty_1644230983541/work/target/wheels/pywinpty-2.0.2-cp310-none-win_amd64.whl
requests==2.31.0
requests-oauthlib==1.3.1
rsa==4.9
scikit-learn @ file:///C:/b/abs_55olq_4gzc/croot/scikit-learn_1690978955123/work
scipy==1.11.1
sip @ file:///C:/Windows/Temp/abs_b8fxd17m2u/croots/recipe/sip_1659012372737/work
six @ file:///tmp/build/80754af9/six_1644875935023/work
tensorboard==2.10.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.10.1
tensorflow-cpu==2.10.0
tensorflow-directml-plugin==0.4.0.dev230202
tensorflow-estimator==2.10.0
tensorflow-io-gcs-filesystem==0.31.0
tensorflow_intel==2.10.0
termcolor==2.3.0
threadpoolctl @ file:///Users/ktietz/demo/mc3/conda-bld/threadpoolctl_1629802263681/work
toml @ file:///tmp/build/80754af9/toml_1616166611790/work
tornado @ file:///C:/b/abs_61jhmrrua1/croot/tornado_1690848767317/work
tqdm @ file:///C:/b/abs_f76j9hg7pv/croot/tqdm_1679561871187/work
typing_extensions==4.7.1
urllib3==1.26.16
Werkzeug==2.3.7
wrapt==1.15.0

============== Conda Packages ==============
Could not get package list

=============== State File =================
{
  "name": "original",
  "sessions": {
    "1": {
      "timestamp": 1692332226.981266,
      "no_logs": false,
      "loss_names": [
        "total",
        "face_a",
        "face_b"
      ],
      "batchsize": 16,
      "iterations": 192000,
      "config": {
        "learning_rate": 5e-05,
        "epsilon_exponent": -7,
        "save_optimizer": "exit",
        "autoclip": false,
        "allow_growth": false,
        "mixed_precision": false,
        "nan_protection": true,
        "convert_batchsize": 16,
        "loss_function": "ssim",
        "loss_function_2": "mse",
        "loss_weight_2": 100,
        "loss_function_3": null,
        "loss_weight_3": 0,
        "loss_function_4": null,
        "loss_weight_4": 0,
        "mask_loss_function": "mse",
        "eye_multiplier": 3,
        "mouth_multiplier": 2
      }
    }
  },
  "lowest_avg_loss": {
    "a": 0.00958854053914547,
    "b": 0.01177385875210166
  },
  "iterations": 192000,
  "mixed_precision_layers": [
    "conv_128_0_conv2d",
    "conv_128_0_leakyrelu",
    "conv_256_0_conv2d",
    "conv_256_0_leakyrelu",
    "conv_512_0_conv2d",
    "conv_512_0_leakyrelu",
    "conv_1024_0_conv2d",
    "conv_1024_0_leakyrelu",
    "flatten",
    "dense",
    "dense_1",
    "reshape",
    "upscale_512_0_conv2d_conv2d",
    "upscale_512_0_conv2d_leakyrelu",
    "upscale_512_0_pixelshuffler",
    "upscale_256_0_conv2d_conv2d",
    "upscale_256_0_conv2d_leakyrelu",
    "upscale_256_0_pixelshuffler",
    "upscale_128_0_conv2d_conv2d",
    "upscale_128_0_conv2d_leakyrelu",
    "upscale_128_0_pixelshuffler",
    "upscale_64_0_conv2d_conv2d",
    "upscale_64_0_conv2d_leakyrelu",
    "upscale_64_0_pixelshuffler",
    "face_out_a_0_conv2d",
    "upscale_256_1_conv2d_conv2d",
    "upscale_256_1_conv2d_leakyrelu",
    "upscale_256_1_pixelshuffler",
    "upscale_128_1_conv2d_conv2d",
    "upscale_128_1_conv2d_leakyrelu",
    "upscale_128_1_pixelshuffler",
    "upscale_64_1_conv2d_conv2d",
    "upscale_64_1_conv2d_leakyrelu",
    "upscale_64_1_pixelshuffler",
    "face_out_b_0_conv2d"
  ],
  "config": {
    "centering": "face",
    "coverage": 87.5,
    "optimizer": "adam",
    "learning_rate": 5e-05,
    "epsilon_exponent": -7,
    "save_optimizer": "exit",
    "autoclip": false,
    "allow_growth": false,
    "mixed_precision": false,
    "nan_protection": true,
    "convert_batchsize": 16,
    "loss_function": "ssim",
    "loss_function_2": "mse",
    "loss_weight_2": 100,
    "loss_function_3": null,
    "loss_weight_3": 0,
    "loss_function_4": null,
    "loss_weight_4": 0,
    "mask_loss_function": "mse",
    "eye_multiplier": 3,
    "mouth_multiplier": 2,
    "penalized_mask_loss": true,
    "mask_type": "extended",
    "mask_blur_kernel": 3,
    "mask_threshold": 4,
    "learn_mask": false,
    "lowmem": false
  }
}

================= Configs ==================
--------- .faceswap ---------
backend:                  directml

--------- convert.ini ---------

[color.color_transfer]
clip:                     True
preserve_paper:           True

[color.manual_balance]
colorspace:               HSV
balance_1:                0.0
balance_2:                0.0
balance_3:                0.0
contrast:                 0.0
brightness:               0.0

[color.match_hist]
threshold:                99.0

[mask.mask_blend]
type:                     normalized
kernel_size:              3
passes:                   4
threshold:                4
erosion:                  0.0
erosion_top:              0.0
erosion_bottom:           0.0
erosion_left:             0.0
erosion_right:            0.0

[scaling.sharpen]
method:                   none
amount:                   150
radius:                   0.3
threshold:                5.0

[writer.ffmpeg]
container:                mp4
codec:                    libx264
crf:                      23
preset:                   medium
tune:                     none
profile:                  auto
level:                    auto
skip_mux:                 False

[writer.gif]
fps:                      25
loop:                     0
palettesize:              256
subrectangles:            False

[writer.opencv]
format:                   png
draw_transparent:         False
separate_mask:            False
jpg_quality:              75
png_compress_level:       3

[writer.pillow]
format:                   png
draw_transparent:         False
separate_mask:            False
optimize:                 False
gif_interlace:            True
jpg_quality:              75
png_compress_level:       3
tif_compression:          tiff_deflate

--------- extract.ini ---------

[global]
allow_growth:             False
aligner_min_scale:        0.07
aligner_max_scale:        2.0
aligner_distance:         22.5
aligner_roll:             45.0
aligner_features:         True
filter_refeed:            True
save_filtered:            False
realign_refeeds:          True
filter_realign:           True

[align.fan]
batch-size:               12

[detect.cv2_dnn]
confidence:               50

[detect.mtcnn]
minsize:                  20
scalefactor:              0.709
batch-size:               8
cpu:                      True
threshold_1:              0.6
threshold_2:              0.7
threshold_3:              0.7

[detect.s3fd]
confidence:               70
batch-size:               4

[mask.bisenet_fp]
batch-size:               8
cpu:                      False
weights:                  faceswap
include_ears:             False
include_hair:             False
include_glasses:          True

[mask.custom]
batch-size:               8
centering:                face
fill:                     False

[mask.unet_dfl]
batch-size:               8

[mask.vgg_clear]
batch-size:               6

[mask.vgg_obstructed]
batch-size:               2

[recognition.vgg_face2]
batch-size:               16
cpu:                      False

--------- gui.ini ---------

[global]
fullscreen:               False
tab:                      extract
options_panel_width:      30
console_panel_height:     20
icon_size:                14
font:                     default
font_size:                9
autosave_last_session:    prompt
timeout:                  120
auto_load_model_stats:    True

--------- train.ini ---------

[global]
centering:                face
coverage:                 87.5
icnr_init:                False
conv_aware_init:          False
optimizer:                adam
learning_rate:            5e-05
epsilon_exponent:         -7
save_optimizer:           exit
autoclip:                 False
reflect_padding:          False
allow_growth:             False
mixed_precision:          False
nan_protection:           True
convert_batchsize:        16

[global.loss]
loss_function:            ssim
loss_function_2:          mse
loss_weight_2:            100
loss_function_3:          none
loss_weight_3:            0
loss_function_4:          none
loss_weight_4:            0
mask_loss_function:       mse
eye_multiplier:           3
mouth_multiplier:         2
penalized_mask_loss:      True
mask_type:                extended
mask_blur_kernel:         3
mask_threshold:           4
learn_mask:               False

[model.dfaker]
output_size:              128

[model.dfl_h128]
lowmem:                   False

[model.dfl_sae]
input_size:               128
architecture:             df
autoencoder_dims:         0
encoder_dims:             42
decoder_dims:             21
multiscale_decoder:       False

[model.dlight]
features:                 best
details:                  good
output_size:              256

[model.original]
lowmem:                   False

[model.phaze_a]
output_size:              128
shared_fc:                none
enable_gblock:            True
split_fc:                 True
split_gblock:             False
split_decoders:           False
enc_architecture:         fs_original
enc_scaling:              7
enc_load_weights:         True
bottleneck_type:          dense
bottleneck_norm:          none
bottleneck_size:          1024
bottleneck_in_encoder:    True
fc_depth:                 1
fc_min_filters:           1024
fc_max_filters:           1024
fc_dimensions:            4
fc_filter_slope:          -0.5
fc_dropout:               0.0
fc_upsampler:             upsample2d
fc_upsamples:             1
fc_upsample_filters:      512
fc_gblock_depth:          3
fc_gblock_min_nodes:      512
fc_gblock_max_nodes:      512
fc_gblock_filter_slope:   -0.5
fc_gblock_dropout:        0.0
dec_upscale_method:       subpixel
dec_upscales_in_fc:       0
dec_norm:                 none
dec_min_filters:          64
dec_max_filters:          512
dec_slope_mode:           full
dec_filter_slope:         -0.45
dec_res_blocks:           1
dec_output_kernel:        5
dec_gaussian:             True
dec_skip_last_residual:   True
freeze_layers:            keras_encoder
load_layers:              encoder
fs_original_depth:        4
fs_original_min_filters:  128
fs_original_max_filters:  1024
fs_original_use_alt:      False
mobilenet_width:          1.0
mobilenet_depth:          1
mobilenet_dropout:        0.001
mobilenet_minimalistic:   False

[model.realface]
input_size:               64
output_size:              128
dense_nodes:              1536
complexity_encoder:       128
complexity_decoder:       512

[model.unbalanced]
input_size:               128
lowmem:                   False
nodes:                    1024
complexity_encoder:       128
complexity_decoder_a:     384
complexity_decoder_b:     512

[model.villain]
lowmem:                   False

[trainer.original]
preview_images:           14
mask_opacity:             30
mask_color:               #ff0000
zoom_amount:              5
rotation_range:           10
shift_range:              5
flip_chance:              50
color_lightness:          30
color_ab:                 8
color_clahe_chance:       50
color_clahe_max_size:     4
Last edited by torzdf on Fri Aug 18, 2023 10:17 pm, edited 1 time in total.
torzdf
Posts: 2687
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 135 times
Been thanked: 628 times

Re: crash when start training

Post by torzdf »

In the first instance, you've installed the DirectML version of Faceswap. You have an Nvidia GPU, so install the Nvidia version.

Also, your Conda packages can't be listed, which suggests a problem with the install itself. I would do this first:
app.php/faqpage#f1r1
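
For anyone who wants to check their own crash report for the same mismatch, here is a rough sketch in Python. It scans the pip package list for DirectML-related entries, such as the `tensorflow-directml-plugin` line in the report above, which goes with the `backend: directml` setting under `.faceswap`. The function name `find_directml_packages` and the sample text are made up for illustration and are not part of Faceswap.

```python
import re


def find_directml_packages(freeze_text: str) -> list[str]:
    """Return normalized names of DirectML-related packages in pip freeze output.

    Hypothetical helper for illustration only; it is not a Faceswap function.
    """
    found = []
    for line in freeze_text.splitlines():
        # A freeze line looks like "name==version" or "name @ location".
        name = re.split(r"==|\s@\s", line.strip(), maxsplit=1)[0]
        # Treat "_" and "-" as equivalent and ignore case when comparing names.
        name = name.lower().replace("_", "-")
        if "directml" in name:
            found.append(name)
    return found


# Sample lines copied from the pip list in the crash report above.
sample = """tensorflow==2.10.1
tensorflow-cpu==2.10.0
tensorflow-directml-plugin==0.4.0.dev230202
tensorflow_intel==2.10.0
nvidia-ml-py==12.535.77"""

print(find_directml_packages(sample))
```

If this prints a non-empty list on a machine with an Nvidia GPU, the DirectML build was most likely the one selected during setup, and reinstalling with the Nvidia option is the thing to try.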

My word is final
