I use an AMD RX 580 GPU and was getting regular crashes under Windows (I suppose because of DirectML), so I installed Ubuntu to switch from DirectML to ROCm.
Changing the OS solved the training crashes, but now I get a strange error when I try to continue training an existing model.
04/04/2023 18:57:06 INFO ===================================================
04/04/2023 18:57:06 INFO Starting
04/04/2023 18:57:06 INFO ===================================================
04/04/2023 18:57:07 INFO Loading data, this may take a while...
04/04/2023 18:57:07 INFO Loading Model from Dfaker plugin...
04/04/2023 18:57:07 INFO Config item: 'epsilon_exponent' has been updated from '-6' to '-7'
04/04/2023 18:57:07 INFO Config item: 'convert_batchsize' has been updated from '8' to '16'
04/04/2023 18:57:07 INFO Config item: 'eye_multiplier' has been updated from '3' to '1'
04/04/2023 18:57:07 INFO Config item: 'mouth_multiplier' has been updated from '2' to '1'
04/04/2023 18:57:07 INFO Using configuration saved in state file
04/04/2023 18:57:07 CRITICAL Error caught! Exiting...
04/04/2023 18:57:07 ERROR Caught exception in thread: '_training'
ls: unable to access to '/home/lighting/.local/lib/python3.10/site-packages/cv2/../../lib64': no such file or directory
ls: unable to access to '/home/lighting/.local/lib/python3.10/site-packages/cv2/../../lib64': no such file or directory
04/04/2023 18:57:08 ERROR Got Exception on main handler:
Traceback (most recent call last):
File "/home/lighting/faceswap/lib/cli/launcher.py", line 230, in execute_script
process.process()
File "/home/lighting/faceswap/scripts/train.py", line 213, in process
self._end_thread(thread, err)
File "/home/lighting/faceswap/scripts/train.py", line 253, in _end_thread
thread.join()
File "/home/lighting/faceswap/lib/multithreading.py", line 220, in join
raise thread.err[1].with_traceback(thread.err[2])
File "/home/lighting/faceswap/lib/multithreading.py", line 96, in run
self._target(*self._args, **self._kwargs)
File "/home/lighting/faceswap/scripts/train.py", line 275, in _training
raise err
File "/home/lighting/faceswap/scripts/train.py", line 263, in _training
model = self._load_model()
File "/home/lighting/faceswap/scripts/train.py", line 291, in _load_model
model.build()
File "/home/lighting/faceswap/plugins/train/model/_base/model.py", line 304, in build
model = self._io._load() # pylint:disable=protected-access
File "/home/lighting/faceswap/plugins/train/model/_base/io.py", line 152, in _load
model = load_model(self._filename, compile=False)
File "/home/lighting/.local/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/lighting/.local/lib/python3.10/site-packages/h5py/_hl/files.py", line 567, in __init__
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
File "/home/lighting/.local/lib/python3.10/site-packages/h5py/_hl/files.py", line 231, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 106, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
04/04/2023 18:57:08 CRITICAL An unexpected crash has occurred. Crash report written to '/home/lighting/faceswap/crash_report.2023.04.04.185707983528.log'. You MUST provide this file if seeking assistance. Please verify you are running the latest version of faceswap before reporting
Process exited.
I don't understand why it tries to go two levels up (../../) in search of lib64. I can't see anything suspicious in the log file.
A newly created model starts training successfully. Any suggestions?
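For what it's worth, the final OSError in the traceback comes from h5py, and "file signature not found" suggests the saved model file may not be a readable HDF5 file any more (for example truncated or overwritten), rather than anything related to the lib64 message. A minimal sketch to check this, assuming the model is stored as an .h5 file in the model folder: the helper name has_hdf5_signature is made up for illustration, and the file paths are passed on the command line because the actual file name is not shown in the log.

```python
import sys
from pathlib import Path

# 8-byte signature that a valid HDF5 file carries at the start of its superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path):
    """Return True if the file contains the HDF5 signature at a valid offset.

    Hypothetical helper for illustration; it only inspects the signature and
    does not validate the rest of the file.
    """
    size = Path(path).stat().st_size
    with open(path, "rb") as handle:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            handle.seek(offset)
            if handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # The signature may sit at byte 0, 512, 1024, 2048, ...
            # (it can be preceded by a user block of those sizes).
            offset = 512 if offset == 0 else offset * 2
    return False


if __name__ == "__main__":
    # Usage: python check_h5.py <path to model .h5 file> [more files...]
    for name in sys.argv[1:]:
        size = Path(name).stat().st_size
        verdict = "looks like HDF5" if has_hdf5_signature(name) else "no HDF5 signature"
        print(f"{name}: {size} bytes, {verdict}")
```

If the script reports no signature (or a size of 0 bytes) for the model file, the file itself is probably damaged, which would also explain why a newly created model trains fine.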