
m1 Max Train exit code -11

Posted: Wed May 03, 2023 2:19 pm
by Banannzza

Hello,
I tried several Phaze-A presets: dny256, dny512, sym384. Every run ends with train.py exit code -11 at some iteration N (usually at a save interval).

Batch size: 1-32
Hardware: M1 Max, 64 GB


Re: m1 Max Train exit code -11

Posted: Thu May 04, 2023 10:19 am
by torzdf

Exit Code 11 is an error code generated by macOS, not by Faceswap. I don't own an M1 Mac, so unfortunately I can't help you further. You will probably need to do some Googling to find out what is causing this issue.
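
For anyone searching on this code: a minimal sketch of how a negative exit code can be decoded, assuming the -11 was reported through Python's `subprocess` convention (on POSIX systems a negative return code means the child was killed by the signal with that number). The helper name `describe_exit_code` is made up for illustration and is not part of Faceswap.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Turn a subprocess-style return code into a readable description."""
    if code == 0:
        return "exited normally"
    if code > 0:
        return f"exited with status {code}"
    # On POSIX, subprocess reports death-by-signal as the negated signal number.
    try:
        name = signal.Signals(-code).name
    except ValueError:
        name = "unknown signal"
    return f"killed by signal {-code} ({name})"


if __name__ == "__main__":
    # Signal 11 is SIGSEGV on both macOS and Linux.
    print(describe_exit_code(-11))
```

Signal numbers other than the common ones can differ between platforms, so it is safest to run this on the machine that produced the code.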


Re: m1 Max Train exit code -11

Posted: Sat May 06, 2023 5:53 pm
by Banannzza

I identified where it crashes:

Code:

[20:46:55] [#00254] Loss A: 0.05793, Loss B: 0.03490
Fatal Python error: Segmentation fault

Thread 0x00000005b76a3000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x00000004db007000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000000376037000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x000000036702b000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000000348007000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000000304007000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Current thread 0x00000002ce8c7000 (most recent call first):
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 894 in _compile_masked
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 822 in _to_full_frame
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 768 in _compile_preview
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 672 in show_sample
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 1104 in output_timelapse
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 365 in _update_viewers
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 257 in train_one_step
  File "/Users/user/Desktop/faceswap/scripts/train.py", line 358 in _run_training_cycle
  File "/Users/user/Desktop/faceswap/scripts/train.py", line 270 in _training
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000000100cb4580 (most recent call first):
  File "/Users/user/Desktop/faceswap/scripts/train.py", line 478 in _monitor
  File "/Users/user/Desktop/faceswap/scripts/train.py", line 217 in process
  File "/Users/user/Desktop/faceswap/lib/cli/launcher.py", line 230 in execute_script
  File "/Users/user/Desktop/faceswap/faceswap.py", line 52 in _main
  File "/Users/user/Desktop/faceswap/faceswap.py", line 56 in <module>
zsh: segmentation fault  python faceswap.py train -A /Users/user/Desktop/deepfake/training
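
The dump above has the format produced by Python's standard `faulthandler` module, which prints every thread's stack when the process receives a fatal signal. Below is a minimal sketch of turning it on in your own script and sending the dump to a file. The helper name `enable_crash_dumps` and the default log path are assumptions for illustration, and the thread does not say how Faceswap itself enables it.

```python
import faulthandler
import os
import tempfile


def enable_crash_dumps(log_path=None):
    """Write per-thread tracebacks to a file if the process crashes."""
    if log_path is None:
        # Hypothetical default location, chosen only for this example.
        log_path = os.path.join(tempfile.gettempdir(), "crash_dump.log")
    # The file must stay open for as long as faulthandler is enabled.
    log_file = open(log_path, "a")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    handle = enable_crash_dumps()
    print("faulthandler enabled:", faulthandler.is_enabled())
```

The same dumps can be requested without code changes by starting the interpreter with `python -X faulthandler`.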

Re: m1 Max Train exit code -11

Posted: Sun May 07, 2023 5:50 am
by Banannzza

Sometimes I also get the following error:

Code:

[09:08:25] [#01402] Loss A: 0.05251, Loss B: 0.03235
zsh: bus error  python faceswap.py train -A /Users/user/Desktop/deepfake/training
(faceswap) user@user-MacBook-Pro facusereswap %
/Users/user/miniforge3/envs/faceswap/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Re: m1 Max Train exit code -11

Posted: Tue May 09, 2023 10:58 am
by torzdf

A segmentation fault is generally a hardware (memory) issue.

Don't worry about the leaked semaphore message.


Re: m1 Max Train exit code -11

Posted: Wed May 31, 2023 9:31 pm
by Banannzza

After 10+ days of training, I have concluded that a training run without a preview (including writing the preview to an image) is more stable, possibly completely stable.

My second suspect is a combination of loss functions and model training state: start/mid/end.


Re: m1 Max Train exit code -11

Posted: Wed May 31, 2023 9:39 pm
by torzdf

Entirely possible. The problem with software that runs across many OSes (and many chipsets!) is that there may be bugs specific to one or another. Unfortunately, M1 support was community-added, so I have no way of testing it.