
CRITICAL An unexpected crash has occurred while training

Posted: Tue Mar 15, 2022 4:26 am
by shinjikhang

When I was training, I got the error below. I'm sure my GPU memory is still available and my input does not contain any Chinese characters. Any answer is appreciated. Thanks.

This is my crash report log:

Code: Select all

04/27/2020 15:23:14 MainProcess     _training_0     _base           get_save_averages         DEBUG    Getting save averages
04/27/2020 15:23:14 MainProcess     _training_0     _base           get_save_averages         DEBUG    Average losses since last save: {'a': 0.028464767010882498, 'b': 0.019361245878972113}
04/27/2020 15:23:14 MainProcess     _training_0     _base           check_loss_drop           DEBUG    Loss for 'a' has not dropped
04/27/2020 15:23:14 MainProcess     _training_0     _base           should_backup             DEBUG    Lowest historical save iteration loss average: {'a': 0.024596761912107468, 'b': 0.01553005538880825}
04/27/2020 15:23:14 MainProcess     _training_0     _base           should_backup             DEBUG    Backing up: False
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_0 _base           save                      DEBUG    Saving model: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_decoder_A.h5'
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_1 _base           save                      DEBUG    Saving model: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_decoder_B.h5'
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_2 _base           save                      DEBUG    Saving model: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_encoder.h5'
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_3 _base           save                      DEBUG    Saving State
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_3 serializer      save                      DEBUG    filename: C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_state.json, data type: <class 'dict'>
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_3 serializer      _check_extension          DEBUG    Original filename: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_state.json', final filename: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_state.json'
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_3 serializer      marshal                   DEBUG    data type: <class 'dict'>
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_3 serializer      marshal                   DEBUG    returned data type: <class 'bytes'>
04/27/2020 15:23:14 MainProcess     ThreadPoolExecutor-3645_3 _base           save                      DEBUG    Saved State
04/27/2020 15:23:19 MainProcess     _training_0     _base           save_models               INFO     [Saved models] - Average since last save: face_loss_A: 0.02846, face_loss_B: 0.01936
04/27/2020 15:26:06 MainProcess     _training_0     _base           generate_preview          DEBUG    Generating preview
04/27/2020 15:26:06 MainProcess     _training_0     _base           largest_face_index        DEBUG    0
04/27/2020 15:26:06 MainProcess     _training_0     _base           compile_sample            DEBUG    Compiling samples: (side: 'a', samples: 16)
04/27/2020 15:26:06 MainProcess     _training_0     _base           generate_preview          DEBUG    Generating preview
04/27/2020 15:26:06 MainProcess     _training_0     _base           largest_face_index        DEBUG    0
04/27/2020 15:26:06 MainProcess     _training_0     _base           compile_sample            DEBUG    Compiling samples: (side: 'b', samples: 16)
04/27/2020 15:26:06 MainProcess     _training_0     _base           show_sample               DEBUG    Showing sample
04/27/2020 15:26:06 MainProcess     _training_0     _base           _get_predictions          DEBUG    Getting Predictions
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_predictions          DEBUG    Returning predictions: {'a_a': (16, 128, 128, 3), 'b_a': (16, 128, 128, 3), 'a_b': (16, 128, 128, 3), 'b_b': (16, 128, 128, 3)}
04/27/2020 15:26:07 MainProcess     _training_0     _base           _to_full_frame            DEBUG    side: 'a', number of sample arrays: 3, prediction.shapes: [(16, 128, 128, 3), (16, 128, 128, 3)])
04/27/2020 15:26:07 MainProcess     _training_0     _base           _frame_overlay            DEBUG    full_size: 256, target_size: 176, color: (0, 0, 255)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _frame_overlay            DEBUG    Overlayed background. Shape: (16, 256, 256, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _compile_masked           DEBUG    masked shapes: [(16, 128, 128, 3), (16, 128, 128, 3), (16, 128, 128, 3)]
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'a', sample.shape: (16, 128, 128, 3), target_size: 176, scale: 1.375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'a' shape: (16, 176, 176, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'a', sample.shape: (16, 128, 128, 3), target_size: 176, scale: 1.375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'a' shape: (16, 176, 176, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'a', sample.shape: (16, 128, 128, 3), target_size: 176, scale: 1.375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'a' shape: (16, 176, 176, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _overlay_foreground       DEBUG    Overlayed foreground. Shape: (16, 256, 256, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _overlay_foreground       DEBUG    Overlayed foreground. Shape: (16, 256, 256, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _overlay_foreground       DEBUG    Overlayed foreground. Shape: (16, 256, 256, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'a', sample.shape: (16, 256, 256, 3), target_size: 102, scale: 0.3984375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'a' shape: (16, 102, 102, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'a', sample.shape: (16, 256, 256, 3), target_size: 102, scale: 0.3984375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'a' shape: (16, 102, 102, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'a', sample.shape: (16, 256, 256, 3), target_size: 102, scale: 0.3984375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'a' shape: (16, 102, 102, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_headers              DEBUG    side: 'a', width: 102
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_headers              DEBUG    height: 25, total_width: 306
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_headers              DEBUG    texts: ['Original (A)', 'Original > Original', 'Original > Swap'], text_sizes: [(58, 8), (93, 8), (82, 8)], text_x: [22, 106, 214], text_y: 16
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_headers              DEBUG    header_box.shape: (25, 306, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _to_full_frame            DEBUG    side: 'b', number of sample arrays: 3, prediction.shapes: [(16, 128, 128, 3), (16, 128, 128, 3)])
04/27/2020 15:26:07 MainProcess     _training_0     _base           _frame_overlay            DEBUG    full_size: 256, target_size: 176, color: (0, 0, 255)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _frame_overlay            DEBUG    Overlayed background. Shape: (16, 256, 256, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _compile_masked           DEBUG    masked shapes: [(16, 128, 128, 3), (16, 128, 128, 3), (16, 128, 128, 3)]
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'b', sample.shape: (16, 128, 128, 3), target_size: 176, scale: 1.375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'b' shape: (16, 176, 176, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'b', sample.shape: (16, 128, 128, 3), target_size: 176, scale: 1.375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'b' shape: (16, 176, 176, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'b', sample.shape: (16, 128, 128, 3), target_size: 176, scale: 1.375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'b' shape: (16, 176, 176, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _overlay_foreground       DEBUG    Overlayed foreground. Shape: (16, 256, 256, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _overlay_foreground       DEBUG    Overlayed foreground. Shape: (16, 256, 256, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _overlay_foreground       DEBUG    Overlayed foreground. Shape: (16, 256, 256, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'b', sample.shape: (16, 256, 256, 3), target_size: 102, scale: 0.3984375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'b' shape: (16, 102, 102, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'b', sample.shape: (16, 256, 256, 3), target_size: 102, scale: 0.3984375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'b' shape: (16, 102, 102, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resizing sample: (side: 'b', sample.shape: (16, 256, 256, 3), target_size: 102, scale: 0.3984375)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _resize_sample            DEBUG    Resized sample: (side: 'b' shape: (16, 102, 102, 3))
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_headers              DEBUG    side: 'b', width: 102
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_headers              DEBUG    height: 25, total_width: 306
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_headers              DEBUG    texts: ['Swap (B)', 'Swap > Swap', 'Swap > Original'], text_sizes: [(47, 8), (70, 8), (82, 8)], text_x: [27, 118, 214], text_y: 16
04/27/2020 15:26:07 MainProcess     _training_0     _base           _get_headers              DEBUG    header_box.shape: (25, 306, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _duplicate_headers        DEBUG    side: a header.shape: (25, 306, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _duplicate_headers        DEBUG    side: b header.shape: (25, 306, 3)
04/27/2020 15:26:07 MainProcess     _training_0     _base           _stack_images             DEBUG    Stack images
04/27/2020 15:26:07 MainProcess     _training_0     _base           get_transpose_axes        DEBUG    Even number of images to stack
04/27/2020 15:26:07 MainProcess     _training_0     _base           _stack_images             DEBUG    Stacked images
04/27/2020 15:26:07 MainProcess     _training_0     _base           show_sample               DEBUG    Compiled sample
04/27/2020 15:26:07 MainProcess     _training_0     _base           save_models               DEBUG    Backing up and saving models
04/27/2020 15:26:07 MainProcess     _training_0     _base           get_save_averages         DEBUG    Getting save averages
04/27/2020 15:26:07 MainProcess     _training_0     _base           get_save_averages         DEBUG    Average losses since last save: {'a': 0.02829898800700903, 'b': 0.018589067393913864}
04/27/2020 15:26:07 MainProcess     _training_0     _base           check_loss_drop           DEBUG    Loss for 'a' has not dropped
04/27/2020 15:26:07 MainProcess     _training_0     _base           should_backup             DEBUG    Lowest historical save iteration loss average: {'a': 0.024596761912107468, 'b': 0.01553005538880825}
04/27/2020 15:26:07 MainProcess     _training_0     _base           should_backup             DEBUG    Backing up: False
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_0 _base           save                      DEBUG    Saving model: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_decoder_A.h5'
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_1 _base           save                      DEBUG    Saving model: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_decoder_B.h5'
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_2 _base           save                      DEBUG    Saving model: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_encoder.h5'
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_3 _base           save                      DEBUG    Saving State
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_3 serializer      save                      DEBUG    filename: C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_state.json, data type: <class 'dict'>
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_3 serializer      _check_extension          DEBUG    Original filename: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_state.json', final filename: 'C:\Users\fulvi\Videos\Ritorno a Casablanca\model dlight\dlight_state.json'
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_3 serializer      marshal                   DEBUG    data type: <class 'dict'>
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_3 serializer      marshal                   DEBUG    returned data type: <class 'bytes'>
04/27/2020 15:26:07 MainProcess     ThreadPoolExecutor-4048_3 _base           save                      DEBUG    Saved State
04/27/2020 15:26:11 MainProcess     _training_0     _base           save_models               INFO     [Saved models] - Average since last save: face_loss_A: 0.02830, face_loss_B: 0.01859
04/27/2020 15:26:29 MainProcess     _training_0     multithreading  run                       DEBUG    Error in thread (_training_0): 2 root error(s) found.\n  (0) Internal: Dst tensor is not initialized.\n	 [[{{node training_1/Adam/Variable_24/read}}]]\n	 [[training_1/Adam/add_179/_1129]]\n  (1) Internal: Dst tensor is not initialized.\n	 [[{{node training_1/Adam/Variable_24/read}}]]\n0 successful operations.\n0 derived errors ignored.
04/27/2020 15:26:30 MainProcess     MainThread      train           _monitor                  DEBUG    Thread error detected
04/27/2020 15:26:30 MainProcess     MainThread      train           _monitor                  DEBUG    Closed Monitor
04/27/2020 15:26:30 MainProcess     MainThread      train           _end_thread               DEBUG    Ending Training thread
04/27/2020 15:26:30 MainProcess     MainThread      train           _end_thread               CRITICAL Error caught! Exiting...
04/27/2020 15:26:30 MainProcess     MainThread      multithreading  join                      DEBUG    Joining Threads: '_training'
04/27/2020 15:26:30 MainProcess     MainThread      multithreading  join                      DEBUG    Joining Thread: '_training_0'
04/27/2020 15:26:30 MainProcess     MainThread      multithreading  join                      ERROR    Caught exception in thread: '_training_0'
Traceback (most recent call last):
  File "C:\Users\fulvi\faceswap\lib\cli\launcher.py", line 155, in execute_script
    process.process()
  File "C:\Users\fulvi\faceswap\scripts\train.py", line 161, in process
    self._end_thread(thread, err)
  File "C:\Users\fulvi\faceswap\scripts\train.py", line 201, in _end_thread
    thread.join()
  File "C:\Users\fulvi\faceswap\lib\multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "C:\Users\fulvi\faceswap\lib\multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\fulvi\faceswap\scripts\train.py", line 226, in _training
    raise err
  File "C:\Users\fulvi\faceswap\scripts\train.py", line 216, in _training
    self._run_training_cycle(model, trainer)
  File "C:\Users\fulvi\faceswap\scripts\train.py", line 305, in _run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "C:\Users\fulvi\faceswap\plugins\train\trainer\_base.py", line 316, in train_one_step
    raise err
  File "C:\Users\fulvi\faceswap\plugins\train\trainer\_base.py", line 283, in train_one_step
    loss[side] = batcher.train_one_batch()
  File "C:\Users\fulvi\faceswap\plugins\train\trainer\_base.py", line 424, in train_one_batch
    loss = self._model.predictors[self._side].train_on_batch(model_inputs, model_targets)
  File "C:\Users\fulvi\MiniConda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "C:\Users\fulvi\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "C:\Users\fulvi\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "C:\Users\fulvi\MiniConda3\envs\faceswap\lib\site-packages\tensorflow_core\python\client\session.py", line 1472, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Dst tensor is not initialized.
	 [[{{node training_1/Adam/Variable_24/read}}]]
	 [[training_1/Adam/add_179/_1129]]
  (1) Internal: Dst tensor is not initialized.
	 [[{{node training_1/Adam/Variable_24/read}}]]
0 successful operations.
0 derived errors ignored.

============ System Information ============
encoding:            cp1252
git_branch:          master
git_commits:         d5f42b6 Bugfix: lib.gui.project - Reset invalid choices to default if an invalid choice is discovered when loading a .fsw file. cbba53e Core Updates (#1015)
gpu_cuda:            No global version found. Check Conda packages for Conda Cuda
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: GeForce GTX 1050
gpu_devices_active:  GPU_0
gpu_driver:          445.87
gpu_vram:            GPU_0: 2048MB
os_machine:          AMD64
os_platform:         Windows-10-10.0.18362-SP0
os_release:          10
py_command:          C:\Users\fulvi\faceswap\faceswap.py train -A C:/Users/fulvi/Videos/Ritorno a Casablanca/FG_emasked -ala C:/Users/fulvi/Videos/Ritorno a Casablanca/FG_emasked/alignments_merged_20200426_193602.fsa -B C:/Users/fulvi/Videos/Ritorno a Casablanca/HB_emasked -alb C:/Users/fulvi/Videos/Ritorno a Casablanca/HB_emasked/alignments_merged_20200425_182140.fsa -m C:/Users/fulvi/Videos/Ritorno a Casablanca/model dlight -t dlight -bs 4 -it 1250000 -g 1 -o -s 200 -ss 25000 -ps 40 -p -L INFO -gui
py_conda_version:    conda 4.8.3
py_implementation:   CPython
py_version:          3.7.7
py_virtual_env:      True
sys_cores:           8
sys_processor:       Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
sys_ram:             Total: 16303MB, Available: 7685MB, Used: 8618MB, Free: 7685MB

=============== Pip Packages ===============
absl-py==0.9.0
asn1crypto==1.3.0
astor==0.8.0
blinker==1.4
cachetools==3.1.1
certifi==2020.4.5.1
cffi==1.14.0
chardet==3.0.4
click==7.1.1
cloudpickle==1.3.0
cryptography==2.8
cycler==0.10.0
cytoolz==0.10.1
dask==2.14.0
decorator==4.4.2
fastcluster==1.1.26
ffmpy==0.2.2
gast==0.2.2
google-auth==1.13.1
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.27.2
h5py==2.9.0
idna==2.9
imageio==2.6.1
imageio-ffmpeg==0.4.1
joblib==0.14.1
Keras==2.2.4
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
Markdown==3.1.1
matplotlib==3.1.3
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
networkx==2.4
numpy==1.17.4
nvidia-ml-py3==7.352.1
oauthlib==3.1.0
olefile==0.46
opencv-python==4.1.2.30
opt-einsum==3.1.0
pathlib==1.0.1
Pillow==6.2.1
protobuf==3.11.4
psutil==5.7.0
pyasn1==0.4.8
pyasn1-modules==0.2.7
pycparser==2.20
PyJWT==1.7.1
pyOpenSSL==19.1.0
pyparsing==2.4.6
pyreadline==2.1
PySocks==1.7.1
python-dateutil==2.8.1
pytz==2019.3
PyWavelets==1.1.1
pywin32==227
PyYAML==5.3.1
requests==2.23.0
requests-oauthlib==1.3.0
rsa==4.0
scikit-image==0.16.2
scikit-learn==0.22.1
scipy==1.4.1
six==1.14.0
tensorboard==2.1.0
tensorflow==1.15.0
tensorflow-estimator==1.15.1
termcolor==1.1.0
toolz==0.10.0
toposort==1.5
tornado==6.0.4
tqdm==4.45.0
urllib3==1.25.8
Werkzeug==0.16.1
win-inet-pton==1.1.0
wincertstore==0.2
wrapt==1.12.1

============== Conda Packages ==============
# packages in environment at C:\Users\fulvi\MiniConda3\envs\faceswap:
#
# Name                    Version                   Build  Channel
_tflow_select             2.1.0                       gpu  
absl-py 0.9.0 py37_0
asn1crypto 1.3.0 py37_0
astor 0.8.0 py37_0
blas 1.0 mkl
blinker 1.4 py37_0
ca-certificates 2020.1.1 0
cachetools 3.1.1 py_0
certifi 2020.4.5.1 py37_0
cffi 1.14.0 py37h7a1dbc1_0
chardet 3.0.4 py37_1003
click 7.1.1 py_0
cloudpickle 1.3.0 py_0
cryptography 2.8 py37h7a1dbc1_0
cudatoolkit 10.0.130 0
cudnn 7.6.5 cuda10.0_0
cycler 0.10.0 py37_0
cytoolz 0.10.1 py37he774522_0
dask-core 2.14.0 py_0
decorator 4.4.2 py_0
fastcluster 1.1.26 py37he350917_0 conda-forge
ffmpeg 4.2 h6538335_0 conda-forge
ffmpy 0.2.2 pypi_0 pypi
freetype 2.9.1 ha9979f8_1
gast 0.2.2 py37_0
git 2.23.0 h6bb4b03_0
google-auth 1.13.1 py_0
google-auth-oauthlib 0.4.1 py_2
google-pasta 0.2.0 py_0
grpcio 1.27.2 py37h351948d_0
h5py 2.9.0 py37h5e291fa_0
hdf5 1.10.4 h7ebc959_0
icc_rt 2019.0.0 h0cc432a_1
icu 58.2 ha66f8fd_1
idna 2.9 py_1
imageio 2.6.1 py37_0
imageio-ffmpeg 0.4.1 py_0 conda-forge
intel-openmp 2020.0 166
joblib 0.14.1 py_0
jpeg 9b hb83a4c4_2
keras 2.2.4 0
keras-applications 1.0.8 py_0
keras-base 2.2.4 py37_0
keras-preprocessing 1.1.0 py_1
kiwisolver 1.1.0 py37ha925a31_0
libpng 1.6.37 h2a8f88b_0
libprotobuf 3.11.4 h7bd577a_0
libtiff 4.1.0 h56a325e_0
markdown 3.1.1 py37_0
matplotlib 3.1.1 py37hc8f65d3_0
matplotlib-base 3.1.3 py37h64f37c6_0
mkl 2020.0 166
mkl-service 2.3.0 py37hb782905_0
mkl_fft 1.0.15 py37h14836fe_0
mkl_random 1.1.0 py37h675688f_0
networkx 2.4 py_0
numpy 1.17.4 py37h4320e6b_0
numpy-base 1.17.4 py37hc3f5095_0
nvidia-ml-py3 7.352.1 pypi_0 pypi
oauthlib 3.1.0 py_0
olefile 0.46 py37_0
opencv-python 4.1.2.30 pypi_0 pypi
openssl 1.1.1g he774522_0
opt_einsum 3.1.0 py_0
pathlib 1.0.1 py37_1
pillow 6.2.1 py37hdc69c19_0
pip 20.0.2 py37_1
protobuf 3.11.4 py37h33f27b4_0
psutil 5.7.0 py37he774522_0
pyasn1 0.4.8 py_0
pyasn1-modules 0.2.7 py_0
pycparser 2.20 py_0
pyjwt 1.7.1 py37_0
pyopenssl 19.1.0 py37_0
pyparsing 2.4.6 py_0
pyqt 5.9.2 py37h6538335_2
pyreadline 2.1 py37_1
pysocks 1.7.1 py37_0
python 3.7.7 h60c2a47_2
python-dateutil 2.8.1 py_0
python_abi 3.7 1_cp37m conda-forge
pytz 2019.3 py_0
pywavelets 1.1.1 py37he774522_0
pywin32 227 py37he774522_1
pyyaml 5.3.1 py37he774522_0
qt 5.9.7 vc14h73c81de_0
requests 2.23.0 py37_0
requests-oauthlib 1.3.0 py_0
rsa 4.0 py_0
scikit-image 0.16.2 py37h47e9c7a_0
scikit-learn 0.22.1 py37h6288b17_0
scipy 1.4.1 py37h9439919_0
setuptools 46.1.3 py37_0
sip 4.19.8 py37h6538335_0
six 1.14.0 py37_0
sqlite 3.31.1 h2a8f88b_1
tensorboard 2.1.0 py3_0
tensorflow 1.15.0 gpu_py37hc3743a6_0
tensorflow-base 1.15.0 gpu_py37h1afeea4_0
tensorflow-estimator 1.15.1 pyh2649769_0
tensorflow-gpu 1.15.0 h0d30ee6_0
termcolor 1.1.0 py37_1
tk 8.6.8 hfa6e2cd_0
toolz 0.10.0 py_0
toposort 1.5 py_3 conda-forge
tornado 6.0.4 py37he774522_1
tqdm 4.45.0 py_0
urllib3 1.25.8 py37_0
vc 14.1 h0510ff6_4
vs2015_runtime 14.16.27012 hf0eaf9b_1
werkzeug 0.16.1 py_0
wheel 0.34.2 py37_0
win_inet_pton 1.1.0 py37_0
wincertstore 0.2 py37_0
wrapt 1.12.1 py37he774522_1
xz 5.2.5 h62dcd97_0
yaml 0.1.7 hc54c509_2
zlib 1.2.11 h62dcd97_4
zstd 1.3.7 h508b16e_0

=============== State File =================

{
  "name": "dlight",
  "sessions": {
    "1": { "timestamp": 1587824855.7343981, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 3, "iterations": 29, "config": { "learning_rate": 5e-05 } },
    "2": { "timestamp": 1587824918.5497837, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 136, "config": { "learning_rate": 5e-05 } },
    "3": { "timestamp": 1587825136.2278714, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 156, "config": { "learning_rate": 5e-05 } },
    "4": { "timestamp": 1587825309.0783265, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 192, "config": { "learning_rate": 5e-05 } },
    "5": { "timestamp": 1587825525.800963, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 3, "iterations": 685, "config": { "learning_rate": 5e-05 } },
    "6": { "timestamp": 1587826134.5144174, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 3115, "config": { "learning_rate": 5e-05 } },
    "7": { "timestamp": 1587829306.1943862, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 683, "config": { "learning_rate": 5e-05 } },
    "8": { "timestamp": 1587832143.3495698, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 5105, "config": { "learning_rate": 5e-05 } },
    "9": { "timestamp": 1587836596.5843508, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 2146, "config": { "learning_rate": 5e-05 } },
    "10": { "timestamp": 1587838529.4010422, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 46526, "config": { "learning_rate": 5e-05 } },
    "11": { "timestamp": 1587877883.7384393, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 11601, "config": { "learning_rate": 5e-05 } },
    "12": { "timestamp": 1587888227.159942, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 1405, "config": { "learning_rate": 5e-05 } },
    "13": { "timestamp": 1587891218.5755107, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 11333, "config": { "learning_rate": 5e-05 } },
    "14": { "timestamp": 1587902024.8967724, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 3023, "config": { "learning_rate": 5e-05 } },
    "15": { "timestamp": 1587905931.3894255, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 17026, "config": { "learning_rate": 5e-05 } },
    "16": { "timestamp": 1587922606.9400318, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 1203, "config": { "learning_rate": 5e-05 } },
    "17": { "timestamp": 1587926633.24557, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 52824, "config": { "learning_rate": 5e-05 } },
    "18": { "timestamp": 1587970934.8403707, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 2801, "config": { "learning_rate": 5e-05 } },
    "19": { "timestamp": 1587974323.8864837, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 20636, "config": { "learning_rate": 5e-05 } },
    "20": { "timestamp": 1587992211.8501766, "no_logs": false, "pingpong": false, "loss_names": { "a": [ "face_loss" ], "b": [ "face_loss" ] }, "batchsize": 4, "iterations": 2001, "config": { "learning_rate": 5e-05 } }
  },
  "lowest_avg_loss": { "a": 0.024596761912107468, "b": 0.01553005538880825 },
  "iterations": 182626,
  "inputs": { "face_in:0": [ 128, 128, 3 ], "mask_in:0": [ 128, 128, 1 ] },
  "training_size": 256,
  "config": { "coverage": 68.75, "mask_type": "components", "mask_blur_kernel": 3, "mask_threshold": 4, "learn_mask": false, "icnr_init": false, "conv_aware_init": false, "subpixel_upscaling": false, "reflect_padding": false, "penalized_mask_loss": true, "loss_function": "mae", "learning_rate": 5e-05, "features": "fair", "details": "good", "output_size": 128 }
}

================= Configs ==================

--------- .faceswap ---------
backend: nvidia

--------- convert.ini ---------
[color.color_transfer]
clip: True
preserve_paper: True
[color.manual_balance]
colorspace: HSV
balance_1: 0.0
balance_2: 0.0
balance_3: 0.0
contrast: 0.0
brightness: 0.0
[color.match_hist]
threshold: 99.0
[mask.box_blend]
type: gaussian
distance: 11.0
radius: 5.0
passes: 1
[mask.mask_blend]
type: normalized
kernel_size: 3
passes: 4
threshold: 4
erosion: 0.0
[scaling.sharpen]
method: unsharp_mask
amount: 150
radius: 0.3
threshold: 5.0
[writer.ffmpeg]
container: mp4
codec: libx264
crf: 23
preset: medium
tune: none
profile: auto
level: auto
[writer.gif]
fps: 25
loop: 0
palettesize: 256
subrectangles: False
[writer.opencv]
format: png
draw_transparent: False
jpg_quality: 75
png_compress_level: 3
[writer.pillow]
format: png
draw_transparent: False
optimize: False
gif_interlace: True
jpg_quality: 75
png_compress_level: 3
tif_compression: tiff_deflate

--------- extract.ini ---------
[global]
allow_growth: False
[align.fan]
batch-size: 12
[detect.cv2_dnn]
confidence: 50
[detect.mtcnn]
minsize: 20
threshold_1: 0.6
threshold_2: 0.7
threshold_3: 0.7
scalefactor: 0.709
batch-size: 8
[detect.s3fd]
confidence: 70
batch-size: 4
[mask.unet_dfl]
batch-size: 8
[mask.vgg_clear]
batch-size: 6
[mask.vgg_obstructed]
batch-size: 2

--------- gui.ini ---------
[global]
fullscreen: False
tab: extract
options_panel_width: 30
console_panel_height: 20
icon_size: 14
font: default
font_size: 8
autosave_last_session: prompt
timeout: 120
auto_load_model_stats: True

--------- train.ini ---------
[global]
coverage: 68.75
mask_type: components
mask_blur_kernel: 3
mask_threshold: 4
learn_mask: False
icnr_init: False
conv_aware_init: False
subpixel_upscaling: False
reflect_padding: False
penalized_mask_loss: True
loss_function: mae
learning_rate: 5e-05
[model.dfl_h128]
lowmem: True
[model.dfl_sae]
input_size: 96
clipnorm: True
architecture: liae
autoencoder_dims: 0
encoder_dims: 42
decoder_dims: 21
multiscale_decoder: False
[model.dlight]
features: best
details: good
output_size: 256
[model.original]
lowmem: False
[model.realface]
input_size: 64
output_size: 96
dense_nodes: 1536
complexity_encoder: 128
complexity_decoder: 512
[model.unbalanced]
input_size: 128
lowmem: False
clipnorm: True
nodes: 1024
complexity_encoder: 128
complexity_decoder_a: 384
complexity_decoder_b: 512
[model.villain]
lowmem: False
[trainer.original]
preview_images: 16
zoom_amount: 5
rotation_range: 10
shift_range: 5
flip_chance: 50
color_lightness: 30
color_ab: 8
color_clahe_chance: 50
color_clahe_max_size: 4

Re: CRITICAL An unexpected crash has occurred while training

Posted: Tue Mar 15, 2022 2:57 pm
by torzdf

This is an Out Of Memory error. You only have a 2GB GPU, and under Windows there is not much you can do about that, as Windows itself reserves a portion of your already limited VRAM. At best you may be able to use the lightweight model at a low batch size.
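
If you want to try that, a possible starting point is to re-use the training command from the py_command line in your crash report, swapping the model for lightweight and dropping the batch size. To be clear, this is only a sketch: the batch size of 2 and the "model lightweight" folder name are illustrative, not tested on your card, and you cannot continue your existing dlight model as a lightweight one, so -m must point at a fresh folder.

Code: Select all

python faceswap.py train -A "C:/Users/fulvi/Videos/Ritorno a Casablanca/FG_emasked" -ala "C:/Users/fulvi/Videos/Ritorno a Casablanca/FG_emasked/alignments_merged_20200426_193602.fsa" -B "C:/Users/fulvi/Videos/Ritorno a Casablanca/HB_emasked" -alb "C:/Users/fulvi/Videos/Ritorno a Casablanca/HB_emasked/alignments_merged_20200425_182140.fsa" -m "C:/Users/fulvi/Videos/Ritorno a Casablanca/model lightweight" -t lightweight -bs 2

If it still dies with the same "Dst tensor is not initialized" error, try a batch size of 1 before giving up. The same settings can be selected from the GUI if you prefer to launch it that way.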