m1 Max Train exit code -11

Locked
Banannzza
Posts: 5
Joined: Wed May 03, 2023 1:27 pm
Has thanked: 1 time

Post by Banannzza »

Hello,
I tried several Phaze-A presets (dny256, dny512, sym384). Every run ends with train.py exit code -11 at some iteration N, usually at a save interval.

Batch size: 1-32
Hardware: M1 Max, 64 GB

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 622 times

Post by torzdf »

Exit code -11 is generated by macOS, not by Faceswap: a negative exit code means the process was killed by a signal, and signal 11 is SIGSEGV (a segmentation fault). I don't own an M1 Mac, so unfortunately I can't help you further. You will probably need to search around to find what is causing this issue.
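
For readers unfamiliar with negative exit codes: when Python launches a child process, a return code of -N conventionally means the child was terminated by signal N. The snippet below is a minimal sketch of how such a code can be decoded. It is not part of Faceswap, and `describe_exit_code` is a hypothetical helper name chosen for illustration.

```python
import signal


def describe_exit_code(returncode: int) -> str:
    """Turn a subprocess return code into a human-readable description."""
    if returncode < 0:
        # A negative value -N means "terminated by signal N".
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The number does not map to a signal known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with status {returncode}"


# The code reported in this thread.
print(describe_exit_code(-11))
```

On macOS and Linux, signal 11 is SIGSEGV, so the reported -11 corresponds to a segmentation fault in the training process.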

My word is final

Post by Banannzza »

I identified where the crash happens:

[20:46:55] [#00254] Loss A: 0.05793, Loss B: 0.03490
Fatal Python error: Segmentation fault

Thread 0x00000005b76a3000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x00000004db007000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000000376037000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x000000036702b000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000000348007000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000000304007000 (most recent call first):
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 312 in wait
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/queue.py", line 140 in put
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 276 in _run
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Current thread 0x00000002ce8c7000 (most recent call first):
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 894 in _compile_masked
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 822 in _to_full_frame
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 768 in _compile_preview
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 672 in show_sample
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 1104 in output_timelapse
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 365 in _update_viewers
  File "/Users/user/Desktop/faceswap/plugins/train/trainer/_base.py", line 257 in train_one_step
  File "/Users/user/Desktop/faceswap/scripts/train.py", line 358 in _run_training_cycle
  File "/Users/user/Desktop/faceswap/scripts/train.py", line 270 in _training
  File "/Users/user/Desktop/faceswap/lib/multithreading.py", line 96 in run
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/Users/user/miniforge3/envs/faceswap/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000000100cb4580 (most recent call first):
  File "/Users/user/Desktop/faceswap/scripts/train.py", line 478 in _monitor
  File "/Users/user/Desktop/faceswap/scripts/train.py", line 217 in process
  File "/Users/user/Desktop/faceswap/lib/cli/launcher.py", line 230 in execute_script
  File "/Users/user/Desktop/faceswap/faceswap.py", line 52 in _main
  File "/Users/user/Desktop/faceswap/faceswap.py", line 56 in <module>
zsh: segmentation fault  python faceswap.py train -A /Users/user/Desktop/deepfake/training
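
The per-thread dump above appears to be in the format written by Python's standard `faulthandler` module, which prints the Python stack of each thread when the interpreter receives a fatal signal. As a minimal sketch (not Faceswap's own code), the same kind of dump can be requested in any script like this:

```python
import faulthandler

# Ask the interpreter to dump the Python traceback of every thread
# to stderr if it receives SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL.
faulthandler.enable(all_threads=True)

# Confirm the handler is installed.
print("faulthandler enabled:", faulthandler.is_enabled())
```

The handler can also be switched on without editing any code, either by starting the interpreter with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable. The dump only shows which Python frame was active at the time of the crash; the fault itself normally occurs in native code called from that frame.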

Post by Banannzza »

Sometimes I get the following error instead:

[09:08:25] [#01402] Loss A: 0.05251, Loss B: 0.03235
zsh: bus error  python faceswap.py train -A /Users/user/Desktop/deepfake/training
(faceswap) user@user-MacBook-Pro faceswap %
/Users/user/miniforge3/envs/faceswap/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
Last edited by Banannzza on Sun May 07, 2023 6:10 am, edited 2 times in total.

Post by torzdf »

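A segmentation fault is generally a memory issue: the process touched memory it was not allowed to access. That can point to faulty hardware, or to a bug in native (non-Python) code.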

Don't worry about the leaked semaphore message.

Post by Banannzza »

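After 10+ days of training, I have concluded that runs with the preview disabled (including writing the preview to an image) are more stable, possibly completely stable. That fits the traceback I posted earlier, where the crashing thread was inside the preview code (_compile_masked, reached from show_sample).

My second suspect is the combination of loss functions and how far along the model is in training (start, middle, or end).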

Post by torzdf »

Entirely possible. The problem with software that runs across many OSes (and many chipsets!) is that there may be bugs specific to one or another. Unfortunately, M1 support was added by the community, so I have no way of testing it.
