Training exception after 14 hours


jrsy
Posts: 1
Joined: Sun Apr 23, 2023 6:39 pm

Training exception after 14 hours

Post by jrsy »

'C:\Users\Jim' is not recognized as an internal or external command

This has happened several times, each time after many hours of training, and I can't seem to get past it. There are still tens of GB available on the drive.

Sorry for the big output paste. I'm trying to figure out how to attach a file here so I can include the crash report files.

Maybe this will do, but the link will expire: https://www.dropbox.com/t/DJ8tsLSM70PPevUg

End of output:

04/23/2023 14:04:28 INFO     [Saved models] - Average loss since last save: face_a: 0.02790, face_b: 0.02637
04/23/2023 14:04:30 INFO     [Preview Updated]
04/23/2023 14:07:35 INFO     [Saved models] - Average loss since last save: face_a: 0.02825, face_b: 0.02564
04/23/2023 14:07:37 INFO     [Preview Updated]
04/23/2023 14:10:26 CRITICAL Error caught! Exiting...
04/23/2023 14:10:26 ERROR    Caught exception in thread: '_training'
'C:\Users\Jim' is not recognized as an internal or external command,
operable program or batch file.
04/23/2023 14:10:33 ERROR    Got Exception on main handler:
Traceback (most recent call last):
  File "C:\Users\Jim Symon\faceswap\lib\cli\launcher.py", line 230, in execute_script
    process.process()
  File "C:\Users\Jim Symon\faceswap\scripts\train.py", line 213, in process
    self._end_thread(thread, err)
  File "C:\Users\Jim Symon\faceswap\scripts\train.py", line 253, in _end_thread
    thread.join()
  File "C:\Users\Jim Symon\faceswap\lib\multithreading.py", line 220, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "C:\Users\Jim Symon\faceswap\lib\multithreading.py", line 96, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Jim Symon\faceswap\scripts\train.py", line 275, in _training
    raise err
  File "C:\Users\Jim Symon\faceswap\scripts\train.py", line 265, in _training
    self._run_training_cycle(model, trainer)
  File "C:\Users\Jim Symon\faceswap\scripts\train.py", line 367, in _run_training_cycle
    model.save(is_exit=False)
  File "C:\Users\Jim Symon\faceswap\plugins\train\model\_base\model.py", line 437, in save
    self._io.save(is_exit=is_exit)
  File "C:\Users\Jim Symon\faceswap\plugins\train\model\_base\io.py", line 207, in save
    self._plugin.model.save(self._filename, include_optimizer=include_optimizer)
  File "C:\Users\Jim Symon\MiniConda3\envs\faceswap\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\Jim Symon\MiniConda3\envs\faceswap\lib\site-packages\keras\backend.py", line 4240, in <listcomp>
    return [x.numpy() for x in tensors]
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 64.0 MiB for an array with shape (1024, 16384) and data type float32
04/23/2023 14:10:33 CRITICAL An unexpected crash has occurred. Crash report written to 'C:\Users\Jim Symon\faceswap\crash_report.2023.04.23.141026315936.log'. You MUST provide this file if seeking assistance. Please verify you are running the latest version of faceswap before reporting
Process exited.
Last edited by torzdf on Thu Apr 27, 2023 12:27 pm, edited 1 time in total.
torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 622 times

Re: Training exception after 14 hours

Post by torzdf »

There are two issues here.

You haven't shown where this occurs:

'C:\Users\Jim' is not recognized as an internal or external command

So I can't really help with that. However, the message stops at the space in 'C:\Users\Jim Symon', so you should try installing Faceswap in a location with no spaces in the path. This is a Conda limitation, not a Faceswap limitation, so there is little I can do about it.
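To illustrate why the message names 'C:\Users\Jim' rather than the full folder, here is a minimal Python sketch. It assumes a simplified model in which an unquoted command line is cut at the first space; it is not how Conda or the Windows shell is actually implemented, and the helper names are made up for this example. The install path is the one shown in the traceback, and `C:\faceswap` is only a hypothetical space-free location.

```python
from pathlib import PureWindowsPath


def first_unquoted_token(command_line: str) -> str:
    """Simplified model: an unquoted command line is cut at the first space."""
    return command_line.split(" ", 1)[0]


def has_space_in_path(path: str) -> bool:
    """Return True if any component of a Windows-style path contains a space."""
    return any(" " in part for part in PureWindowsPath(path).parts)


# Install location taken from the traceback in the first post.
install_dir = r"C:\Users\Jim Symon\faceswap"

# What an unquoted call would try to run as the "command".
print(first_unquoted_token(install_dir + r"\scripts\train.py"))  # C:\Users\Jim

print(has_space_in_path(install_dir))     # True
print(has_space_in_path(r"C:\faceswap"))  # False (hypothetical location)
```

A check like `has_space_in_path` could be run against a candidate install folder before reinstalling.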

As for your crash report, you are running out of RAM. I don't fully understand why, though: the error comes from Numpy, which means you have run out of system RAM, not GPU RAM. All I can really suggest is to close any other applications. Also, watch your process monitor whilst training to see how much system memory Faceswap is using.
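As a rough check on the numbers in the error, the short Python sketch below works out how many bytes an array of shape (1024, 16384) with data type float32 needs. The helper name is made up for this example; the only assumption is that float32 takes 4 bytes per element.

```python
import math


def array_nbytes(shape: tuple, itemsize: int) -> int:
    """Bytes needed for a dense array: number of elements times bytes per element."""
    return math.prod(shape) * itemsize


# Shape and dtype taken from the _ArrayMemoryError message; float32 is 4 bytes.
nbytes = array_nbytes((1024, 16384), 4)

print(nbytes)          # 67108864
print(nbytes / 2**20)  # 64.0 (MiB)
```

This matches the 64.0 MiB in the error. It is a fairly small request, which would be consistent with system memory being almost completely used up by the time the model was saved.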

