EfficientNetV2 really is something. As advertised for sure.
Bug: Updated to Latest faceswap version now crashing when starting training
- ianstephens
- Posts: 117
- Joined: Sun Feb 14, 2021 7:20 pm
- Has thanked: 12 times
- Been thanked: 15 times
Re: Bug: Updated to Latest faceswap version now crashing when starting training
Yep, getting positive feedback for sure. It's good to know it was worth putting the time in to get the code updated.
My word is final
- Hanrahahanrahan
- Posts: 8
- Joined: Sat Aug 22, 2020 1:44 pm
- Has thanked: 4 times
- Been thanked: 1 time
Re: Bug: Updated to Latest faceswap version now crashing when starting training
Eh, all this sarcasm. I do get pretty great results in a shorter time with it, so thanks @torzdf for adding it. Thanks also for still trying to improve everything here and there, even just a little bit. I will keep donating via Patreon for a while, even if you stopped immediately, because what you have done so far was worth it.
- ianstephens
- Posts: 117
- Joined: Sun Feb 14, 2021 7:20 pm
- Has thanked: 12 times
- Been thanked: 15 times
Re: Bug: Updated to Latest faceswap version now crashing when starting training
Where was the sarcasm?
EfficientNetV2 really is something and the results are fantastic. I would agree with the quoted 5x-11x training speed improvements.
- ianstephens
- Posts: 117
- Joined: Sun Feb 14, 2021 7:20 pm
- Has thanked: 12 times
- Been thanked: 15 times
Re: Bug: Updated to Latest faceswap version now crashing when starting training
So the white box preview continues when starting brand new models.
However, I have found the strangest workaround.
Before starting a brand new model, I simply update Faceswap (Help-->Update Faceswap). There doesn't even need to be an update available, I just simply need to perform this action.
Once done, I then start a training session, and boom - immediate previews.
It's consistent too - works every time.
I have no idea what's going on or why this works but perhaps it's loading something from the codebase/files that weren't present/loaded before.
Just thought I'd mention it - I'm sure you'll make more sense of it than me.
Re: Bug: Updated to Latest faceswap version now crashing when starting training
ianstephens wrote: ↑Wed May 18, 2022 10:58 am
Just thought I'd mention it - I'm sure you'll make more sense of it than me
Guess again! That makes absolutely no sense to me whatsoever. But, if it works, it works.
My word is final
Re: Bug: Updated to Latest faceswap version now crashing when starting training
I noticed that the program now closes when I switch to the graph. I removed and reloaded the program, but there was no change.
Re: Bug: Updated to Latest faceswap version now crashing when starting training
File "C:\Users\camer\faceswap\lib\cli\launcher.py", line 182, in execute_script
process.process()
File "C:\Users\camer\faceswap\scripts\train.py", line 190, in process
self._end_thread(thread, err)
File "C:\Users\camer\faceswap\scripts\train.py", line 230, in _end_thread
thread.join()
File "C:\Users\camer\faceswap\lib\multithreading.py", line 121, in join
raise thread.err[1].with_traceback(thread.err[2])
File "C:\Users\camer\faceswap\lib\multithreading.py", line 37, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\camer\faceswap\scripts\train.py", line 252, in _training
raise err
File "C:\Users\camer\faceswap\scripts\train.py", line 242, in _training
self._run_training_cycle(model, trainer)
File "C:\Users\camer\faceswap\scripts\train.py", line 327, in _run_training_cycle
trainer.train_one_step(viewer, timelapse)
File "C:\Users\camer\faceswap\plugins\train\trainer\_base.py", line 225, in train_one_step
self._print_loss(loss)
File "C:\Users\camer\faceswap\plugins\train\trainer\_base.py", line 314, in _print_loss
print(f"\r{output}", end="")
OSError: [Errno 22] Invalid argument
============ System Information ============
encoding: cp1252
git_branch: master
git_commits: c2595c4 bugfix - add missing mask key to alignments on legacy update
gpu_cuda: 11.5
gpu_cudnn: No global version found. Check Conda packages for Conda cuDNN
gpu_devices: GPU_0: NVIDIA GeForce GTX 1080
gpu_devices_active: GPU_0
gpu_driver: 472.39
gpu_vram: GPU_0: 8192MB
os_machine: AMD64
os_platform: Windows-10-10.0.19044-SP0
os_release: 10
py_command: C:\Users\camer\faceswap\faceswap.py train -A C:/Users/camer/Documents/Desktop/Worx 2/Sorted A -B C:/Users/camer/Documents/Desktop/Worx 2/Sorted (B) -m C:/Users/camer/Documents/Desktop/Worx 2/Models -t original -bs 16 -it 1000000 -s 250 -ss 25000 -ps 100 -L INFO -gui
py_conda_version: conda 4.12.0
py_implementation: CPython
py_version: 3.9.12
py_virtual_env: True
sys_cores: 8
sys_processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
sys_ram: Total: 32712MB, Available: 19316MB, Used: 13395MB, Free: 19316MB
- Attachments
- crash_report.2022.05.21.224906269220.log
Re: Bug: Updated to Latest faceswap version now crashing when starting training
Yeah, this is the same bug as in an earlier post (viewtopic.php?p=6807#p6807), and as before, I don't understand it, nor how to solve it....
basically this:
print(f"\r{output}", end="")
OSError: [Errno 22] Invalid argument
is an I/O error as far as I can ascertain. It is raised directly from Windows. However, a simple print statement should not raise this kind of error, and it is not an error I can replicate.
I know this isn't helpful, but I really don't know how to solve this one, given that it makes no sense. It doubly doesn't make sense, as this is code generated from the core faceswap code, so the GUI shouldn't be impacting it in any way.
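For anyone hitting the same `OSError: [Errno 22]` on the loss printout, one possible stopgap is to guard the console write so that a rejected write does not kill the training thread. This is only a sketch, not a change that exists in faceswap: `safe_print` and `BrokenStream` are hypothetical names invented for illustration, and swallowing the error hides the symptom rather than explaining it.

```python
import io
import sys


def safe_print(text, stream=None):
    """Write a carriage-return status line; return False if the stream rejects it."""
    stream = sys.stdout if stream is None else stream
    try:
        print(f"\r{text}", end="", file=stream)
        stream.flush()
        return True
    except OSError:
        # e.g. [Errno 22] Invalid argument raised by the console handle.
        # The status line is cosmetic, so skip it instead of re-raising.
        return False


class BrokenStream(io.StringIO):
    """Hypothetical stand-in for a console handle that rejects every write."""

    def write(self, _text):
        raise OSError(22, "Invalid argument")


if __name__ == "__main__":
    # A healthy stream accepts the write, the broken one is tolerated.
    print(safe_print("loss_a: 0.05468", io.StringIO()))
    print(safe_print("loss_a: 0.05468", BrokenStream()))
```

The trade-off is that a lost status line goes unnoticed, so logging the exception somewhere other than the console would probably be wise.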
Would be interested to know if/how @ianstephens solved it. My guess would be that it just went away for him (which also doesn't help us :/)
My word is final
- ianstephens
- Posts: 117
- Joined: Sun Feb 14, 2021 7:20 pm
- Has thanked: 12 times
- Been thanked: 15 times
Re: Bug: Updated to Latest faceswap version now crashing when starting training
@torzdf - I didn't manage to solve it - it persists.
We worked around the issue by running the graph full time in the FaceSwap window and enabling the second optional (separate) preview window. That way there is no need for switching in the FS GUI. We simply leave the graph running and monitor the preview window separately.
For what it's worth, we're running Windows 11. Didn't have this issue on Windows 10. @cedenburn - what are you running?
Re: Bug: Updated to Latest faceswap version now crashing when starting training
It is such a weird one. Googling around for it nearly always comes up with results of people getting that error when writing data to disk, which we definitely aren't doing here. I wish I could replicate it, as that way I may be able to find a workaround, even if I couldn't find the actual cause.
My word is final
Re: Bug: Updated to Latest faceswap version now crashing when starting training
Thank you torzdf. I get the exact same message for completely different files, so it appears to be related to my PC specifically. I have another message that appears during training. It doesn't crash the training session, but it appears after every "Average loss since last save" line. I was wondering if you could assist with this.
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\camer\anaconda3\envs\faceswap\lib\tkinter\__init__.py", line 1892, in __call__
return self.func(*args)
File "C:\Users\camer\faceswap\lib\gui\display_graph.py", line 364, in refresh
self._calcs = self._thread.get_result() # Terminate the LongRunningTask object
File "C:\Users\camer\faceswap\lib\gui\utils.py", line 1263, in get_result
raise self.err[1].with_traceback(self.err[2])
File "C:\Users\camer\faceswap\lib\gui\utils.py", line 1234, in run
retval = self._target(*self._args, **self._kwargs)
File "C:\Users\camer\faceswap\lib\gui\analysis\stats.py", line 565, in refresh
self._get_raw()
File "C:\Users\camer\faceswap\lib\gui\analysis\stats.py", line 628, in _get_raw
loss_dict = _SESSION.get_loss(self._session_id)
File "C:\Users\camer\faceswap\lib\gui\analysis\stats.py", line 174, in get_loss
loss_dict = self._tb_logs.get_loss(session_id=session_id)
File "C:\Users\camer\faceswap\lib\gui\analysis\event_reader.py", line 489, in get_loss
self._check_cache(idx)
File "C:\Users\camer\faceswap\lib\gui\analysis\event_reader.py", line 465, in _check_cache
self._cache_data(session_id)
File "C:\Users\camer\faceswap\lib\gui\analysis\event_reader.py", line 451, in _cache_data
parser.cache_events(session_id)
File "C:\Users\camer\faceswap\lib\gui\analysis\event_reader.py", line 610, in cache_events
self._cache.cache_data(session_id, data, self._loss_labels, is_live=self._live_data)
File "C:\Users\camer\faceswap\lib\gui\analysis\event_reader.py", line 181, in cache_data
self._add_latest_live(session_id, loss, timestamps)
File "C:\Users\camer\faceswap\lib\gui\analysis\event_reader.py", line 326, in _add_latest_live
old = np.frombuffer(zlib.decompress(cache[metric]), dtype=dtype).reshape(old_shape)
ValueError: cannot reshape array of size 39468 into shape (19746,2)
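The numbers in that final line are at least consistent with a shape mismatch: 39468 values is 19734 rows of 2, while the requested shape is 19746 rows of 2 (39492 values), so the decompressed buffer is 12 rows short of what the stored shape expects. The sketch below reproduces the same class of error with NumPy and zlib. The dtype and the variable names are assumptions made for illustration, and it does not show why the two fall out of sync during live training.

```python
import zlib

import numpy as np

dtype = np.float32       # assumed dtype, the real cache may use another
rows_in_buffer = 19734   # rows actually present in the compressed data
rows_recorded = 19746    # rows the separately stored shape claims

# Compress a (rows, 2) array the way a cache might hold it.
cache = zlib.compress(np.zeros((rows_in_buffer, 2), dtype=dtype).tobytes())

# Decompress and try to restore the recorded shape.
flat = np.frombuffer(zlib.decompress(cache), dtype=dtype)
try:
    flat.reshape((rows_recorded, 2))
    error = None
except ValueError as err:
    error = str(err)

print(flat.size, rows_recorded * 2)
print(error)
```

If this is what happens in the GUI, the shape metadata and the compressed data are being updated at different times, but that is a guess from the arithmetic alone.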
Re: Bug: Updated to Latest faceswap version now crashing when starting training
ianstephens wrote: ↑Sun May 22, 2022 2:26 pm
@torzdf - I didn't manage to solve it - it persists.
We worked around the issue by running the graph full time in the FaceSwap window and enabling the second optional (separate) preview window. That way there is no need for switching in the FS GUI. We simply leave the graph running and monitor the preview window separately.
For what it's worth, we're running Windows 11. Didn't have this issue on Windows 10. @cedenburn - what are you running?
Windows 10.
Re: Bug: Updated to Latest faceswap version now crashing when starting training
Ok, that's the ever-present graphing error. I have been playing whack-a-mole with this for the best part of 2 years :/
Please could you zip up your log files (inside your training folder) and provide me with a link? I may or may not be able to recreate the issue from the data that currently exists in those files on your HD.
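If it helps, zipping a folder can be done from Python with the standard library. This is a minimal sketch: `zip_logs` is a hypothetical helper, and both paths in the usage example are placeholders, so point them at wherever your log folder really is.

```python
import shutil
from pathlib import Path


def zip_logs(log_dir, out_base):
    """Zip the contents of log_dir into <out_base>.zip and return the archive path."""
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        raise FileNotFoundError(f"No such folder: {log_dir}")
    # make_archive appends the .zip extension to out_base itself.
    # Keep out_base outside log_dir so the archive does not include itself.
    return shutil.make_archive(str(out_base), "zip", root_dir=str(log_dir))


if __name__ == "__main__":
    # Placeholder paths, replace with your own before running.
    print(zip_logs("path/to/your/log/folder", "faceswap_logs"))
```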
My word is final
Re: Bug: Updated to Latest faceswap version now crashing when starting training
@cedenburn Thanks for the files. Unfortunately they opened just fine at my end, which makes me think it is a bug which only occurs during live training sessions... these are the worst kind of bugs to track down, sadly, so it's unlikely I'll have a solution any time soon.
My word is final
Re: Bug: Updated to Latest faceswap version now crashing when starting training
ianstephens wrote: ↑Sat May 07, 2022 9:41 pm
No problem.
We just switched from preview back to session graph on an active session and reproduced a crash. It seemed to log a report so here it is:
05/07/2022 22:22:09 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['03579.png', '06427.png', '01761.png', '06144.png', '01268.png'] 05/07/2022 22:22:11 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['09530.png', '04870.png', '03438.png', '07545.png', '01785.png'] 05/07/2022 22:22:14 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['02406.png', '03829.png', '09482.png', '05399.png', '01876.png'] 05/07/2022 22:22:16 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['02428.png', '10602.png', '00239.png', '08793.png', '08451.png'] 05/07/2022 22:22:19 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['00478.png', '08664.png', '04416.png', '09345.png', '00448.png'] 05/07/2022 22:22:22 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['09560.png', '03496.png', '09380.png', '05842.png', '03877.png'] 05/07/2022 22:22:24 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['05337.png', '08500.png', '04145.png', '05222.png', '03419.png'] 05/07/2022 22:22:27 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['09503.png', '08846.png', '06926.png', '03326.png', '05017.png'] 05/07/2022 22:22:30 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['00942.png', '03173.png', '09885.png', '10417.png', '10565.png'] 05/07/2022 22:22:32 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['02884.png', '03842.png', '09246.png', '04563.png', '04737.png'] 05/07/2022 22:22:35 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['07989.png', '03885.png', '10616.png', '07268.png', '00270.png'] 05/07/2022 22:22:38 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['07308.png', '05281.png', 
'08401.png', '09281.png', '08685.png'] 05/07/2022 22:22:40 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['10378.png', '05292.png', '07052.png', '00539.png', '07737.png'] 05/07/2022 22:22:43 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['07573.png', '08968.png', '00856.png', '00640.png', '01667.png'] 05/07/2022 22:22:46 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['08268.png', '00400.png', '08811.png', '01895.png', '00550.png'] 05/07/2022 22:22:48 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['02878.png', '09182.png', '08688.png', '01811.png', '10277.png'] 05/07/2022 22:22:51 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['06369.png', '04020.png', '10585.png', '02178.png', '09142.png'] 05/07/2022 22:22:54 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['04492.png', '01282.png', '06344.png', '03188.png', '02644.png'] 05/07/2022 22:22:57 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['05381.png', '04707.png', '10261.png', '04729.png', '09365.png'] 05/07/2022 22:22:59 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['03918.png', '05473.png', '09662.png', '05705.png', '02001.png'] 05/07/2022 22:23:01 MainProcess _run_0 generator cache_metadata VERBOSE Cache filled: 'C:\Convert AI\LVR2\Training Set' 05/07/2022 22:29:14 MainProcess _training_0 _base generate_preview DEBUG Generating preview 05/07/2022 22:29:14 MainProcess _training_0 _base compile_sample DEBUG Compiling samples: (side: 'a', samples: 14) 05/07/2022 22:29:14 MainProcess _training_0 _base compile_sample DEBUG Compiling samples: (side: 'b', samples: 14) 05/07/2022 22:29:14 MainProcess _training_0 _base show_sample DEBUG Showing sample 05/07/2022 22:29:14 MainProcess _training_0 _base _get_predictions DEBUG Getting 
Predictions 05/07/2022 22:29:16 MainProcess _training_0 _base _get_predictions DEBUG Returning predictions: {'a_a': (14, 384, 384, 3), 'b_b': (14, 384, 384, 3), 'a_b': (14, 384, 384, 3), 'b_a': (14, 384, 384, 3)} 05/07/2022 22:29:16 MainProcess _training_0 _base _to_full_frame DEBUG side: 'a', number of sample arrays: 3, prediction.shapes: [(14, 384, 384, 3), (14, 384, 384, 3)]) 05/07/2022 22:29:16 MainProcess _training_0 _base _process_full DEBUG full_size: 384, prediction_size: 384, color: (0, 0, 255) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'a', sample.shape: (14, 384, 384, 3), target_size: 438, scale: 1.140625) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'a' shape: (14, 438, 438, 3)) 05/07/2022 22:29:16 MainProcess _training_0 _base _process_full DEBUG Overlayed background. Shape: (14, 438, 438, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _compile_masked DEBUG masked shapes: [(14, 384, 384, 3), (14, 384, 384, 3), (14, 384, 384, 3)] 05/07/2022 22:29:16 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. 
Shape: (14, 438, 438, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'a', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'a' shape: (14, 328, 328, 3)) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'a', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'a' shape: (14, 328, 328, 3)) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'a', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'a' shape: (14, 328, 328, 3)) 05/07/2022 22:29:16 MainProcess _training_0 _base _get_headers DEBUG side: 'a', width: 328 05/07/2022 22:29:16 MainProcess _training_0 _base _get_headers DEBUG height: 72, total_width: 984 05/07/2022 22:29:16 MainProcess _training_0 _base _get_headers DEBUG texts: ['Original (A)', 'Original > Original', 'Original > Swap'], text_sizes: [(183, 23), (296, 23), (259, 23)], text_x: [72, 344, 690], text_y: 47 05/07/2022 22:29:16 MainProcess _training_0 _base _get_headers DEBUG header_box.shape: (72, 984, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _to_full_frame DEBUG side: 'b', number of sample arrays: 3, prediction.shapes: [(14, 384, 384, 3), (14, 384, 384, 3)]) 05/07/2022 22:29:16 MainProcess _training_0 _base _process_full DEBUG full_size: 384, prediction_size: 384, color: (0, 0, 255) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'b', sample.shape: (14, 384, 384, 3), target_size: 438, scale: 1.140625) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'b' 
shape: (14, 438, 438, 3)) 05/07/2022 22:29:17 MainProcess _training_0 _base _process_full DEBUG Overlayed background. Shape: (14, 438, 438, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _compile_masked DEBUG masked shapes: [(14, 384, 384, 3), (14, 384, 384, 3), (14, 384, 384, 3)] 05/07/2022 22:29:17 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'b', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'b' shape: (14, 328, 328, 3)) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'b', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'b' shape: (14, 328, 328, 3)) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'b', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'b' shape: (14, 328, 328, 3)) 05/07/2022 22:29:17 MainProcess _training_0 _base _get_headers DEBUG side: 'b', width: 328 05/07/2022 22:29:17 MainProcess _training_0 _base _get_headers DEBUG height: 72, total_width: 984 05/07/2022 22:29:17 MainProcess _training_0 _base _get_headers DEBUG texts: ['Swap (B)', 'Swap > Swap', 'Swap > Original'], text_sizes: [(150, 23), (222, 23), (259, 23)], text_x: [89, 381, 690], text_y: 47 05/07/2022 22:29:17 MainProcess 
_training_0 _base _get_headers DEBUG header_box.shape: (72, 984, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _duplicate_headers DEBUG side: a header.shape: (72, 984, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _duplicate_headers DEBUG side: b header.shape: (72, 984, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _stack_images DEBUG Stack images 05/07/2022 22:29:17 MainProcess _training_0 _base get_transpose_axes DEBUG Even number of images to stack 05/07/2022 22:29:17 MainProcess _training_0 _base _stack_images DEBUG Stacked images 05/07/2022 22:29:17 MainProcess _training_0 _base show_sample DEBUG Compiled sample 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Updating preview: (name: Training - 'S': Save Now. 'R': Refresh Preview. 'M': Toggle Mask. 'ENTER': Save and Quit) 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Generating preview for GUI 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Generated preview for GUI: '.gui_training_preview.jpg' 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Generating preview for display: 'Training - 'S': Save Now. 'R': Refresh Preview. 'M': Toggle Mask. 'ENTER': Save and Quit' 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Generated preview for display: 'Training - 'S': Save Now. 'R': Refresh Preview. 'M': Toggle Mask. 'ENTER': Save and Quit' 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Updated preview: (name: Training - 'S': Save Now. 'R': Refresh Preview. 'M': Toggle Mask. 
'ENTER': Save and Quit) 05/07/2022 22:29:18 MainProcess _training_0 train _run_training_cycle DEBUG Save Iteration: (iteration: 4500 05/07/2022 22:29:18 MainProcess _training_0 _base _save DEBUG Backing up and saving models 05/07/2022 22:29:18 MainProcess _training_0 _base _get_save_averages DEBUG Getting save averages 05/07/2022 22:29:18 MainProcess _training_0 _base _get_save_averages DEBUG Average losses since last save: [0.054676631107926366, 0.05488332705199719] 05/07/2022 22:29:18 MainProcess _training_0 _base _should_backup DEBUG Updated lowest historical save iteration averages from: {'a': 0.05644378334283829, 'b': 0.05529949029535055} to: {'a': 0.054676631107926366, 'b': 0.05488332705199719} 05/07/2022 22:29:18 MainProcess _training_0 _base _should_backup DEBUG Should backup: True 05/07/2022 22:29:18 MainProcess _training_0 backup_restore backup_model VERBOSE Backing up: 'C:\Convert AI\LVR2\Model\phaze_a.h5' to 'C:\Convert AI\LVR2\Model\phaze_a.h5.bk' 05/07/2022 22:29:18 MainProcess _training_0 backup_restore backup_model VERBOSE Backing up: 'C:\Convert AI\LVR2\Model\phaze_a_state.json' to 'C:\Convert AI\LVR2\Model\phaze_a_state.json.bk' 05/07/2022 22:29:22 MainProcess _training_0 _base save DEBUG Saving State 05/07/2022 22:29:22 MainProcess _training_0 serializer save DEBUG filename: C:\Convert AI\LVR2\Model\phaze_a_state.json, data type: <class 'dict'> 05/07/2022 22:29:22 MainProcess _training_0 serializer _check_extension DEBUG Original filename: 'C:\Convert AI\LVR2\Model\phaze_a_state.json', final filename: 'C:\Convert AI\LVR2\Model\phaze_a_state.json' 05/07/2022 22:29:22 MainProcess _training_0 serializer marshal DEBUG data type: <class 'dict'> 05/07/2022 22:29:22 MainProcess _training_0 serializer marshal DEBUG returned data type: <class 'bytes'> 05/07/2022 22:29:22 MainProcess _training_0 _base save DEBUG Saved State 05/07/2022 22:29:22 MainProcess _training_0 _base _save INFO [Saved models] - Average loss since last save: face_a: 0.05468, face_b: 
0.05488
05/07/2022 22:34:03 MainProcess _training_0 multithreading run DEBUG Error in thread (_training_0): [Errno 22] Invalid argument
05/07/2022 22:34:05 MainProcess MainThread train _monitor DEBUG Thread error detected
05/07/2022 22:34:05 MainProcess MainThread train _monitor DEBUG Closed Monitor
05/07/2022 22:34:05 MainProcess MainThread train _end_thread DEBUG Ending Training thread
05/07/2022 22:34:05 MainProcess MainThread train _end_thread CRITICAL Error caught! Exiting...
05/07/2022 22:34:05 MainProcess MainThread multithreading join DEBUG Joining Threads: '_training'
05/07/2022 22:34:05 MainProcess MainThread multithreading join DEBUG Joining Thread: '_training_0'
05/07/2022 22:34:05 MainProcess MainThread multithreading join ERROR Caught exception in thread: '_training_0'
Traceback (most recent call last):
File "C:\Convert\lib\cli\launcher.py", line 182, in execute_script
process.process()
File "C:\Convert\scripts\train.py", line 190, in process
self._end_thread(thread, err)
File "C:\Convert\scripts\train.py", line 230, in _end_thread
thread.join()
File "C:\Convert\lib\multithreading.py", line 121, in join
raise thread.err[1].with_traceback(thread.err[2])
File "C:\Convert\lib\multithreading.py", line 37, in run
self._target(*self._args, **self._kwargs)
File "C:\Convert\scripts\train.py", line 252, in _training
raise err
File "C:\Convert\scripts\train.py", line 242, in _training
self._run_training_cycle(model, trainer)
File "C:\Convert\scripts\train.py", line 327, in _run_training_cycle
trainer.train_one_step(viewer, timelapse)
File "C:\Convert\plugins\train\trainer\_base.py", line 225, in train_one_step
self._print_loss(loss)
File "C:\Convert\plugins\train\trainer\_base.py", line 314, in _print_loss
print(f"\r{output}", end="")
OSError: [Errno 22] Invalid argument
============ System Information ============
encoding: cp1252
git_branch: master
git_commits: a046248 BugFix - lib.keypress. 60f95bb fix: PhazeA - Use correct name for EffNetV2 freezing
gpu_cuda: No global version found. Check Conda packages for Conda Cuda
gpu_cudnn: No global version found. Check Conda packages for Conda cuDNN
gpu_devices: GPU_0: NVIDIA GeForce RTX 3090, GPU_1: NVIDIA GeForce RTX 2080 Ti
gpu_devices_active: GPU_0
gpu_driver: 512.15
gpu_vram: GPU_0: 24576MB, GPU_1: 11264MB
os_machine: AMD64
os_platform: Windows-10-10.0.22000-SP0
os_release: 10
py_command: C:\Convert\faceswap.py train -A C:/Convert AI/LVR2/Training Set -B C:/Convert AI/L Work Folder/Brand New Set 512 -m C:/Convert AI/LVR2/Model -t phaze-a -bs 5 -it 1000000 -s 500 -ss 25000 -ps 75 -p -wl -X 1 -L INFO -gui
py_conda_version: conda 4.12.0
py_implementation: CPython
py_version: 3.8.13
py_virtual_env: True
sys_cores: 48
sys_processor: Intel64 Family 6 Model 85 Stepping 4, GenuineIntel
sys_ram: Total: 130718MB, Available: 117856MB, Used: 12861MB, Free: 117856MB
=============== Pip Packages ===============
============== Conda Packages ==============
# packages in environment at C:\Users\ \MiniConda3\envs\faceswap:
#
# Name Version Build Channel
absl-py 1.0.0 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
blas 1.0 mkl
ca-certificates 2021.10.8 h5b45459_0 conda-forge
cachetools 5.0.0 pypi_0 pypi
certifi 2021.10.8 py38haa244fe_2 conda-forge
charset-normalizer 2.0.12 pypi_0 pypi
colorama 0.4.4 pyhd3eb1b0_0
cudatoolkit 11.2.2 h933977f_10 conda-forge
cudnn 8.1.0.77 h3e0f4f4_0 conda-forge
cycler 0.11.0 pyhd3eb1b0_0
fastcluster 1.2.6 py38hcc40339_1 conda-forge
ffmpeg 4.3.1 ha925a31_0 conda-forge
ffmpy 0.2.3 pypi_0 pypi
flatbuffers 2.0 pypi_0 pypi
freetype 2.10.4 hd328e21_0
gast 0.5.3 pypi_0 pypi
git 2.34.1 haa95532_0
google-auth 2.6.6 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.46.0 pypi_0 pypi
h5py 3.6.0 pypi_0 pypi
icc_rt 2019.0.0 h0cc432a_1
icu 58.2 ha925a31_3
idna 3.3 pypi_0 pypi
imageio 2.9.0 pyhd3eb1b0_0
imageio-ffmpeg 0.4.7 pyhd8ed1ab_0 conda-forge
importlib-metadata 4.11.3 pypi_0 pypi
intel-openmp 2021.4.0 haa95532_3556
joblib 1.1.0 pyhd3eb1b0_0
jpeg 9e h2bbff1b_0
keras 2.8.0 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.3.2 py38hd77b12b_0
libclang 14.0.1 pypi_0 pypi
libpng 1.6.37 h2a8f88b_0
libtiff 4.2.0 hd0e1b90_0
libwebp 1.2.2 h2bbff1b_0
lz4-c 1.9.3 h2bbff1b_1
markdown 3.3.7 pypi_0 pypi
matplotlib 3.2.2 0
matplotlib-base 3.2.2 py38h64f37c6_0
mkl 2021.4.0 haa95532_640
mkl-service 2.4.0 py38h2bbff1b_0
mkl_fft 1.3.1 py38h277e83a_0
mkl_random 1.2.2 py38hf11a4ad_0
numpy 1.21.5 py38h7a0a035_2
numpy-base 1.21.5 py38hca35cd5_2
nvidia-ml-py 11.510.69 pypi_0 pypi
oauthlib 3.2.0 pypi_0 pypi
opencv-python 4.5.5.64 pypi_0 pypi
openssl 1.1.1o h8ffe710_0 conda-forge
opt-einsum 3.3.0 pypi_0 pypi
pillow 9.0.1 py38hdc2b20a_0
pip 21.2.2 py38haa95532_0
protobuf 3.20.1 pypi_0 pypi
psutil 5.8.0 py38h2bbff1b_1
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pyparsing 3.0.4 pyhd3eb1b0_0
pyqt 5.9.2 py38hd77b12b_6
python 3.8.13 h6244533_0
python-dateutil 2.8.2 pyhd3eb1b0_0
python_abi 3.8 2_cp38 conda-forge
pywin32 302 py38h2bbff1b_2
qt 5.9.7 vc14h73c81de_0
requests 2.27.1 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.8 pypi_0 pypi
scikit-learn 1.0.2 py38hf11a4ad_1
scipy 1.7.3 py38h0a974cb_0
setuptools 61.2.0 py38haa95532_0
sip 4.19.13 py38hd77b12b_0
six 1.16.0 pyhd3eb1b0_1
sqlite 3.38.3 h2bbff1b_0
tensorboard 2.8.0 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow-gpu 2.8.0 pypi_0 pypi
tensorflow-io-gcs-filesystem 0.25.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
tf-estimator-nightly 2.8.0.dev2021122109 pypi_0 pypi
threadpoolctl 2.2.0 pyh0d69192_0
tk 8.6.11 h2bbff1b_0
tornado 6.1 py38h2bbff1b_0
tqdm 4.64.0 py38haa95532_0
typing-extensions 4.2.0 pypi_0 pypi
urllib3 1.26.9 pypi_0 pypi
vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
werkzeug 2.1.2 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
wincertstore 0.2 py38haa95532_2
wrapt 1.14.1 pypi_0 pypi
xz 5.2.5 h8cc25b3_1
zipp 3.8.0 pypi_0 pypi
zlib 1.2.12 h8cc25b3_2
zstd 1.4.9 h19a0ad4_0
================= Configs ==================
--------- .faceswap ---------
backend: nvidia
--------- convert.ini ---------
[color.color_transfer]
clip: True
preserve_paper: True
[color.manual_balance]
colorspace: HSV
balance_1: 0.0
balance_2: 0.0
balance_3: 0.0
contrast: 0.0
brightness: 0.0
[color.match_hist]
threshold: 99.0
[mask.box_blend]
type: gaussian
distance: 11.0
radius: 5.0
passes: 1
[mask.mask_blend]
type: normalized
kernel_size: 3
passes: 4
threshold: 4
erosion: 0.0
[scaling.sharpen]
method: none
amount: 150
radius: 0.3
threshold: 5.0
[writer.ffmpeg]
container: mp4
codec: libx264
crf: 23
preset: medium
tune: none
profile: auto
level: auto
skip_mux: False
[writer.gif]
fps: 25
loop: 0
palettesize: 256
subrectangles: False
[writer.opencv]
format: png
draw_transparent: False
jpg_quality: 75
png_compress_level: 3
[writer.pillow]
format: png
draw_transparent: False
optimize: False
gif_interlace: True
jpg_quality: 75
png_compress_level: 3
tif_compression: tiff_deflate
--------- extract.ini ---------
[global]
allow_growth: False
[align.fan]
batch-size: 12
[detect.cv2_dnn]
confidence: 50
[detect.mtcnn]
minsize: 20
scalefactor: 0.709
batch-size: 8
threshold_1: 0.6
threshold_2: 0.7
threshold_3: 0.7
[detect.s3fd]
confidence: 50
batch-size: 4
[mask.bisenet_fp]
batch-size: 8
weights: faceswap
include_ears: False
include_hair: False
include_glasses: True
[mask.unet_dfl]
batch-size: 8
[mask.vgg_clear]
batch-size: 6
[mask.vgg_obstructed]
batch-size: 2
--------- gui.ini ---------
[global]
fullscreen: False
tab: extract
options_panel_width: 30
console_panel_height: 20
icon_size: 14
font: default
font_size: 9
autosave_last_session: prompt
timeout: 120
auto_load_model_stats: False
--------- train.ini ---------
[global]
centering: face
coverage: 87.5
icnr_init: False
conv_aware_init: True
optimizer: adam
learning_rate: 4e-05
epsilon_exponent: -5
reflect_padding: False
allow_growth: False
mixed_precision: True
nan_protection: True
convert_batchsize: 16
[global.loss]
loss_function: ssim
mask_loss_function: mse
l2_reg_term: 100
eye_multiplier: 3
mouth_multiplier: 2
penalized_mask_loss: True
mask_type: bisenet-fp_face
mask_blur_kernel: 3
mask_threshold: 4
learn_mask: False
[model.dfaker]
output_size: 128
[model.dfl_h128]
lowmem: False
[model.dfl_sae]
input_size: 128
clipnorm: True
architecture: df
autoencoder_dims: 0
encoder_dims: 42
decoder_dims: 21
multiscale_decoder: False
[model.dlight]
features: best
details: good
output_size: 256
[model.original]
lowmem: False
[model.phaze_a]
output_size: 384
shared_fc: None
enable_gblock: True
split_fc: True
split_gblock: False
split_decoders: False
enc_architecture: efficientnet_v2_l
enc_scaling: 80
enc_load_weights: True
bottleneck_type: dense
bottleneck_norm: None
bottleneck_size: 512
bottleneck_in_encoder: True
fc_depth: 1
fc_min_filters: 1280
fc_max_filters: 1280
fc_dimensions: 8
fc_filter_slope: -0.5
fc_dropout: 0.0
fc_upsampler: upsample2d
fc_upsamples: 1
fc_upsample_filters: 1280
fc_gblock_depth: 3
fc_gblock_min_nodes: 512
fc_gblock_max_nodes: 512
fc_gblock_filter_slope: -0.5
fc_gblock_dropout: 0.0
dec_upscale_method: resize_images
dec_norm: None
dec_min_filters: 160
dec_max_filters: 640
dec_filter_slope: -0.33
dec_res_blocks: 1
dec_output_kernel: 3
dec_gaussian: True
dec_skip_last_residual: False
freeze_layers: keras_encoder
load_layers: encoder
fs_original_depth: 4
fs_original_min_filters: 128
fs_original_max_filters: 1024
mobilenet_width: 1.0
mobilenet_depth: 1
mobilenet_dropout: 0.001
mobilenet_minimalistic: False
[model.realface]
input_size: 64
output_size: 128
dense_nodes: 1536
complexity_encoder: 128
complexity_decoder: 512
[model.unbalanced]
input_size: 128
lowmem: False
clipnorm: True
nodes: 1024
complexity_encoder: 128
complexity_decoder_a: 384
complexity_decoder_b: 512
[model.villain]
lowmem: False
[trainer.original]
preview_images: 14
zoom_amount: 5
rotation_range: 10
shift_range: 5
flip_chance: 50
color_lightness: 30
color_ab: 8
color_clahe_chance: 50
color_clahe_max_size: 4
Ok, I think I know what causes this error, but not how to fix it. As with the other errors, it appears to be related to distributed training. Whilst fixing the other bug, I noticed I got a similar but different error on Linux relating to file descriptors. It appears that TensorFlow is doing something with multiprocessing when distributed is enabled.
I did not get this error when distributed training was disabled. Why switching to the graph tab would cause this, I do not know (my specific error came every time I stopped training on the CLI). As this appears to be happening upstream of us, I'm not sure what I can do about it. But I can, if nothing else, confirm that something weird is happening there.
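For anyone following along, the traceback in the crash report ends at the `print(f"\r{output}", end="")` call in `_print_loss`, which raises `OSError: [Errno 22] Invalid argument`. The snippet below is only a rough sketch of how a status-line write could be guarded so that a bad console handle does not take the training thread down with it. It does not address whatever is happening upstream, and `safe_status_print` and `BrokenStream` are hypothetical names made up for this illustration; they are not part of faceswap.

```python
import io
import sys


def safe_status_print(output, stream=None):
    """Write a carriage-return status line; return False if the stream rejects it.

    Hypothetical helper for illustration only, not faceswap code.
    """
    stream = sys.stdout if stream is None else stream
    try:
        print(f"\r{output}", end="", file=stream)
        stream.flush()
    except OSError:
        # For example [Errno 22] Invalid argument when the console handle
        # is no longer valid. The status line is dropped, training continues.
        return False
    return True


class BrokenStream(io.StringIO):
    """Stand-in for a stdout whose handle has gone bad (assumption for the demo)."""

    def write(self, text):
        raise OSError(22, "Invalid argument")


good = io.StringIO()
ok = safe_status_print("Loss A: 0.05468, Loss B: 0.05488", good)
failed = safe_status_print("Loss A: 0.05468, Loss B: 0.05488", BrokenStream())
```

Swallowing the exception trades a lost status line for a training session that keeps running, so it would at best be a stopgap while the real cause is tracked down.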
Re: Bug: Updated to Latest faceswap version now crashing when starting training
Thank you for your time on this.
Re: Bug: Updated to Latest faceswap version now crashing when starting training
ianstephens wrote: ↑Sat May 07, 2022 9:41 pm
No problem.
We just switched from preview back to session graph on an active session and reproduced a crash. It seemed to log a report so here it is:
Code: Select all
05/07/2022 22:22:09 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['03579.png', '06427.png', '01761.png', '06144.png', '01268.png'] 05/07/2022 22:22:11 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['09530.png', '04870.png', '03438.png', '07545.png', '01785.png'] 05/07/2022 22:22:14 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['02406.png', '03829.png', '09482.png', '05399.png', '01876.png'] 05/07/2022 22:22:16 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['02428.png', '10602.png', '00239.png', '08793.png', '08451.png'] 05/07/2022 22:22:19 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['00478.png', '08664.png', '04416.png', '09345.png', '00448.png'] 05/07/2022 22:22:22 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['09560.png', '03496.png', '09380.png', '05842.png', '03877.png'] 05/07/2022 22:22:24 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['05337.png', '08500.png', '04145.png', '05222.png', '03419.png'] 05/07/2022 22:22:27 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['09503.png', '08846.png', '06926.png', '03326.png', '05017.png'] 05/07/2022 22:22:30 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['00942.png', '03173.png', '09885.png', '10417.png', '10565.png'] 05/07/2022 22:22:32 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['02884.png', '03842.png', '09246.png', '04563.png', '04737.png'] 05/07/2022 22:22:35 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['07989.png', '03885.png', '10616.png', '07268.png', '00270.png'] 05/07/2022 22:22:38 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['07308.png', '05281.png', 
'08401.png', '09281.png', '08685.png'] 05/07/2022 22:22:40 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['10378.png', '05292.png', '07052.png', '00539.png', '07737.png'] 05/07/2022 22:22:43 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['07573.png', '08968.png', '00856.png', '00640.png', '01667.png'] 05/07/2022 22:22:46 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['08268.png', '00400.png', '08811.png', '01895.png', '00550.png'] 05/07/2022 22:22:48 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['02878.png', '09182.png', '08688.png', '01811.png', '10277.png'] 05/07/2022 22:22:51 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['06369.png', '04020.png', '10585.png', '02178.png', '09142.png'] 05/07/2022 22:22:54 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['04492.png', '01282.png', '06344.png', '03188.png', '02644.png'] 05/07/2022 22:22:57 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['05381.png', '04707.png', '10261.png', '04729.png', '09365.png'] 05/07/2022 22:22:59 MainProcess _run_1 generator cache_metadata DEBUG All metadata already cached for: ['03918.png', '05473.png', '09662.png', '05705.png', '02001.png'] 05/07/2022 22:23:01 MainProcess _run_0 generator cache_metadata VERBOSE Cache filled: 'C:\Convert AI\LVR2\Training Set' 05/07/2022 22:29:14 MainProcess _training_0 _base generate_preview DEBUG Generating preview 05/07/2022 22:29:14 MainProcess _training_0 _base compile_sample DEBUG Compiling samples: (side: 'a', samples: 14) 05/07/2022 22:29:14 MainProcess _training_0 _base compile_sample DEBUG Compiling samples: (side: 'b', samples: 14) 05/07/2022 22:29:14 MainProcess _training_0 _base show_sample DEBUG Showing sample 05/07/2022 22:29:14 MainProcess _training_0 _base _get_predictions DEBUG Getting 
Predictions 05/07/2022 22:29:16 MainProcess _training_0 _base _get_predictions DEBUG Returning predictions: {'a_a': (14, 384, 384, 3), 'b_b': (14, 384, 384, 3), 'a_b': (14, 384, 384, 3), 'b_a': (14, 384, 384, 3)} 05/07/2022 22:29:16 MainProcess _training_0 _base _to_full_frame DEBUG side: 'a', number of sample arrays: 3, prediction.shapes: [(14, 384, 384, 3), (14, 384, 384, 3)]) 05/07/2022 22:29:16 MainProcess _training_0 _base _process_full DEBUG full_size: 384, prediction_size: 384, color: (0, 0, 255) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'a', sample.shape: (14, 384, 384, 3), target_size: 438, scale: 1.140625) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'a' shape: (14, 438, 438, 3)) 05/07/2022 22:29:16 MainProcess _training_0 _base _process_full DEBUG Overlayed background. Shape: (14, 438, 438, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _compile_masked DEBUG masked shapes: [(14, 384, 384, 3), (14, 384, 384, 3), (14, 384, 384, 3)] 05/07/2022 22:29:16 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. 
Shape: (14, 438, 438, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'a', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'a' shape: (14, 328, 328, 3)) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'a', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'a' shape: (14, 328, 328, 3)) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'a', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'a' shape: (14, 328, 328, 3)) 05/07/2022 22:29:16 MainProcess _training_0 _base _get_headers DEBUG side: 'a', width: 328 05/07/2022 22:29:16 MainProcess _training_0 _base _get_headers DEBUG height: 72, total_width: 984 05/07/2022 22:29:16 MainProcess _training_0 _base _get_headers DEBUG texts: ['Original (A)', 'Original > Original', 'Original > Swap'], text_sizes: [(183, 23), (296, 23), (259, 23)], text_x: [72, 344, 690], text_y: 47 05/07/2022 22:29:16 MainProcess _training_0 _base _get_headers DEBUG header_box.shape: (72, 984, 3) 05/07/2022 22:29:16 MainProcess _training_0 _base _to_full_frame DEBUG side: 'b', number of sample arrays: 3, prediction.shapes: [(14, 384, 384, 3), (14, 384, 384, 3)]) 05/07/2022 22:29:16 MainProcess _training_0 _base _process_full DEBUG full_size: 384, prediction_size: 384, color: (0, 0, 255) 05/07/2022 22:29:16 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'b', sample.shape: (14, 384, 384, 3), target_size: 438, scale: 1.140625) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'b' 
shape: (14, 438, 438, 3)) 05/07/2022 22:29:17 MainProcess _training_0 _base _process_full DEBUG Overlayed background. Shape: (14, 438, 438, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _compile_masked DEBUG masked shapes: [(14, 384, 384, 3), (14, 384, 384, 3), (14, 384, 384, 3)] 05/07/2022 22:29:17 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _overlay_foreground DEBUG Overlayed foreground. Shape: (14, 438, 438, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'b', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'b' shape: (14, 328, 328, 3)) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'b', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'b' shape: (14, 328, 328, 3)) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resizing sample: (side: 'b', sample.shape: (14, 438, 438, 3), target_size: 328, scale: 0.7488584474885844) 05/07/2022 22:29:17 MainProcess _training_0 _base _resize_sample DEBUG Resized sample: (side: 'b' shape: (14, 328, 328, 3)) 05/07/2022 22:29:17 MainProcess _training_0 _base _get_headers DEBUG side: 'b', width: 328 05/07/2022 22:29:17 MainProcess _training_0 _base _get_headers DEBUG height: 72, total_width: 984 05/07/2022 22:29:17 MainProcess _training_0 _base _get_headers DEBUG texts: ['Swap (B)', 'Swap > Swap', 'Swap > Original'], text_sizes: [(150, 23), (222, 23), (259, 23)], text_x: [89, 381, 690], text_y: 47 05/07/2022 22:29:17 MainProcess 
_training_0 _base _get_headers DEBUG header_box.shape: (72, 984, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _duplicate_headers DEBUG side: a header.shape: (72, 984, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _duplicate_headers DEBUG side: b header.shape: (72, 984, 3) 05/07/2022 22:29:17 MainProcess _training_0 _base _stack_images DEBUG Stack images 05/07/2022 22:29:17 MainProcess _training_0 _base get_transpose_axes DEBUG Even number of images to stack 05/07/2022 22:29:17 MainProcess _training_0 _base _stack_images DEBUG Stacked images 05/07/2022 22:29:17 MainProcess _training_0 _base show_sample DEBUG Compiled sample 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Updating preview: (name: Training - 'S': Save Now. 'R': Refresh Preview. 'M': Toggle Mask. 'ENTER': Save and Quit) 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Generating preview for GUI 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Generated preview for GUI: '.gui_training_preview.jpg' 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Generating preview for display: 'Training - 'S': Save Now. 'R': Refresh Preview. 'M': Toggle Mask. 'ENTER': Save and Quit' 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Generated preview for display: 'Training - 'S': Save Now. 'R': Refresh Preview. 'M': Toggle Mask. 'ENTER': Save and Quit' 05/07/2022 22:29:18 MainProcess _training_0 train _show DEBUG Updated preview: (name: Training - 'S': Save Now. 'R': Refresh Preview. 'M': Toggle Mask. 
'ENTER': Save and Quit) 05/07/2022 22:29:18 MainProcess _training_0 train _run_training_cycle DEBUG Save Iteration: (iteration: 4500 05/07/2022 22:29:18 MainProcess _training_0 _base _save DEBUG Backing up and saving models 05/07/2022 22:29:18 MainProcess _training_0 _base _get_save_averages DEBUG Getting save averages 05/07/2022 22:29:18 MainProcess _training_0 _base _get_save_averages DEBUG Average losses since last save: [0.054676631107926366, 0.05488332705199719] 05/07/2022 22:29:18 MainProcess _training_0 _base _should_backup DEBUG Updated lowest historical save iteration averages from: {'a': 0.05644378334283829, 'b': 0.05529949029535055} to: {'a': 0.054676631107926366, 'b': 0.05488332705199719} 05/07/2022 22:29:18 MainProcess _training_0 _base _should_backup DEBUG Should backup: True 05/07/2022 22:29:18 MainProcess _training_0 backup_restore backup_model VERBOSE Backing up: 'C:\Convert AI\LVR2\Model\phaze_a.h5' to 'C:\Convert AI\LVR2\Model\phaze_a.h5.bk' 05/07/2022 22:29:18 MainProcess _training_0 backup_restore backup_model VERBOSE Backing up: 'C:\Convert AI\LVR2\Model\phaze_a_state.json' to 'C:\Convert AI\LVR2\Model\phaze_a_state.json.bk' 05/07/2022 22:29:22 MainProcess _training_0 _base save DEBUG Saving State 05/07/2022 22:29:22 MainProcess _training_0 serializer save DEBUG filename: C:\Convert AI\LVR2\Model\phaze_a_state.json, data type: <class 'dict'> 05/07/2022 22:29:22 MainProcess _training_0 serializer _check_extension DEBUG Original filename: 'C:\Convert AI\LVR2\Model\phaze_a_state.json', final filename: 'C:\Convert AI\LVR2\Model\phaze_a_state.json' 05/07/2022 22:29:22 MainProcess _training_0 serializer marshal DEBUG data type: <class 'dict'> 05/07/2022 22:29:22 MainProcess _training_0 serializer marshal DEBUG returned data type: <class 'bytes'> 05/07/2022 22:29:22 MainProcess _training_0 _base save DEBUG Saved State 05/07/2022 22:29:22 MainProcess _training_0 _base _save INFO [Saved models] - Average loss since last save: face_a: 0.05468, face_b: 
0.05488
05/07/2022 22:34:03 MainProcess _training_0 multithreading run DEBUG Error in thread (_training_0): [Errno 22] Invalid argument
05/07/2022 22:34:05 MainProcess MainThread train _monitor DEBUG Thread error detected
05/07/2022 22:34:05 MainProcess MainThread train _monitor DEBUG Closed Monitor
05/07/2022 22:34:05 MainProcess MainThread train _end_thread DEBUG Ending Training thread
05/07/2022 22:34:05 MainProcess MainThread train _end_thread CRITICAL Error caught! Exiting...
05/07/2022 22:34:05 MainProcess MainThread multithreading join DEBUG Joining Threads: '_training'
05/07/2022 22:34:05 MainProcess MainThread multithreading join DEBUG Joining Thread: '_training_0'
05/07/2022 22:34:05 MainProcess MainThread multithreading join ERROR Caught exception in thread: '_training_0'
Traceback (most recent call last):
  File "C:\Convert\lib\cli\launcher.py", line 182, in execute_script
    process.process()
  File "C:\Convert\scripts\train.py", line 190, in process
    self._end_thread(thread, err)
  File "C:\Convert\scripts\train.py", line 230, in _end_thread
    thread.join()
  File "C:\Convert\lib\multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "C:\Convert\lib\multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Convert\scripts\train.py", line 252, in _training
    raise err
  File "C:\Convert\scripts\train.py", line 242, in _training
    self._run_training_cycle(model, trainer)
  File "C:\Convert\scripts\train.py", line 327, in _run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "C:\Convert\plugins\train\trainer\_base.py", line 225, in train_one_step
    self._print_loss(loss)
  File "C:\Convert\plugins\train\trainer\_base.py", line 314, in _print_loss
    print(f"\r{output}", end="")
OSError: [Errno 22] Invalid argument

============ System Information ============
encoding: cp1252
git_branch: master
git_commits: a046248 BugFix - lib.keypress.
60f95bb fix: PhazeA - Use correct name for EffNetV2 freezing gpu_cuda: No global version found. Check Conda packages for Conda Cuda gpu_cudnn: No global version found. Check Conda packages for Conda cuDNN gpu_devices: GPU_0: NVIDIA GeForce RTX 3090, GPU_1: NVIDIA GeForce RTX 2080 Ti gpu_devices_active: GPU_0 gpu_driver: 512.15 gpu_vram: GPU_0: 24576MB, GPU_1: 11264MB os_machine: AMD64 os_platform: Windows-10-10.0.22000-SP0 os_release: 10 py_command: C:\Convert\faceswap.py train -A C:/Convert AI/LVR2/Training Set -B C:/Convert AI/L Work Folder/Brand New Set 512 -m C:/Convert AI/LVR2/Model -t phaze-a -bs 5 -it 1000000 -s 500 -ss 25000 -ps 75 -p -wl -X 1 -L INFO -gui py_conda_version: conda 4.12.0 py_implementation: CPython py_version: 3.8.13 py_virtual_env: True sys_cores: 48 sys_processor: Intel64 Family 6 Model 85 Stepping 4, GenuineIntel sys_ram: Total: 130718MB, Available: 117856MB, Used: 12861MB, Free: 117856MB =============== Pip Packages =============== ============== Conda Packages ============== # packages in environment at C:\Users\ \MiniConda3\envs\faceswap: # # Name Version Build Channel absl-py 1.0.0 pypi_0 pypi astunparse 1.6.3 pypi_0 pypi blas 1.0 mkl
ca-certificates 2021.10.8 h5b45459_0 conda-forge cachetools 5.0.0 pypi_0 pypi certifi 2021.10.8 py38haa244fe_2 conda-forge charset-normalizer 2.0.12 pypi_0 pypi colorama 0.4.4 pyhd3eb1b0_0
cudatoolkit 11.2.2 h933977f_10 conda-forge cudnn 8.1.0.77 h3e0f4f4_0 conda-forge cycler 0.11.0 pyhd3eb1b0_0
fastcluster 1.2.6 py38hcc40339_1 conda-forge ffmpeg 4.3.1 ha925a31_0 conda-forge ffmpy 0.2.3 pypi_0 pypi flatbuffers 2.0 pypi_0 pypi freetype 2.10.4 hd328e21_0
gast 0.5.3 pypi_0 pypi git 2.34.1 haa95532_0
google-auth 2.6.6 pypi_0 pypi google-auth-oauthlib 0.4.6 pypi_0 pypi google-pasta 0.2.0 pypi_0 pypi grpcio 1.46.0 pypi_0 pypi h5py 3.6.0 pypi_0 pypi icc_rt 2019.0.0 h0cc432a_1
icu 58.2 ha925a31_3
idna 3.3 pypi_0 pypi imageio 2.9.0 pyhd3eb1b0_0
imageio-ffmpeg 0.4.7 pyhd8ed1ab_0 conda-forge importlib-metadata 4.11.3 pypi_0 pypi intel-openmp 2021.4.0 haa95532_3556
joblib 1.1.0 pyhd3eb1b0_0
jpeg 9e h2bbff1b_0
keras 2.8.0 pypi_0 pypi keras-preprocessing 1.1.2 pypi_0 pypi kiwisolver 1.3.2 py38hd77b12b_0
libclang 14.0.1 pypi_0 pypi libpng 1.6.37 h2a8f88b_0
libtiff 4.2.0 hd0e1b90_0
libwebp 1.2.2 h2bbff1b_0
lz4-c 1.9.3 h2bbff1b_1
markdown 3.3.7 pypi_0 pypi matplotlib 3.2.2 0
matplotlib-base 3.2.2 py38h64f37c6_0
mkl 2021.4.0 haa95532_640
mkl-service 2.4.0 py38h2bbff1b_0
mkl_fft 1.3.1 py38h277e83a_0
mkl_random 1.2.2 py38hf11a4ad_0
numpy 1.21.5 py38h7a0a035_2
numpy-base 1.21.5 py38hca35cd5_2
nvidia-ml-py 11.510.69 pypi_0 pypi oauthlib 3.2.0 pypi_0 pypi opencv-python 4.5.5.64 pypi_0 pypi openssl 1.1.1o h8ffe710_0 conda-forge opt-einsum 3.3.0 pypi_0 pypi pillow 9.0.1 py38hdc2b20a_0
pip 21.2.2 py38haa95532_0
protobuf 3.20.1 pypi_0 pypi psutil 5.8.0 py38h2bbff1b_1
pyasn1 0.4.8 pypi_0 pypi pyasn1-modules 0.2.8 pypi_0 pypi pyparsing 3.0.4 pyhd3eb1b0_0
pyqt 5.9.2 py38hd77b12b_6
python 3.8.13 h6244533_0
python-dateutil 2.8.2 pyhd3eb1b0_0
python_abi 3.8 2_cp38 conda-forge pywin32 302 py38h2bbff1b_2
qt 5.9.7 vc14h73c81de_0
requests 2.27.1 pypi_0 pypi requests-oauthlib 1.3.1 pypi_0 pypi rsa 4.8 pypi_0 pypi scikit-learn 1.0.2 py38hf11a4ad_1
scipy 1.7.3 py38h0a974cb_0
setuptools 61.2.0 py38haa95532_0
sip 4.19.13 py38hd77b12b_0
six 1.16.0 pyhd3eb1b0_1
sqlite 3.38.3 h2bbff1b_0
tensorboard 2.8.0 pypi_0 pypi tensorboard-data-server 0.6.1 pypi_0 pypi tensorboard-plugin-wit 1.8.1 pypi_0 pypi tensorflow-gpu 2.8.0 pypi_0 pypi tensorflow-io-gcs-filesystem 0.25.0 pypi_0 pypi termcolor 1.1.0 pypi_0 pypi tf-estimator-nightly 2.8.0.dev2021122109 pypi_0 pypi threadpoolctl 2.2.0 pyh0d69192_0
tk 8.6.11 h2bbff1b_0
tornado 6.1 py38h2bbff1b_0
tqdm 4.64.0 py38haa95532_0
typing-extensions 4.2.0 pypi_0 pypi urllib3 1.26.9 pypi_0 pypi vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
werkzeug 2.1.2 pypi_0 pypi wheel 0.37.1 pyhd3eb1b0_0
wincertstore 0.2 py38haa95532_2
wrapt 1.14.1 pypi_0 pypi xz 5.2.5 h8cc25b3_1
zipp 3.8.0 pypi_0 pypi zlib 1.2.12 h8cc25b3_2
zstd 1.4.9 h19a0ad4_0 ================= Configs ================== --------- .faceswap --------- backend: nvidia --------- convert.ini --------- [color.color_transfer] clip: True preserve_paper: True [color.manual_balance] colorspace: HSV balance_1: 0.0 balance_2: 0.0 balance_3: 0.0 contrast: 0.0 brightness: 0.0 [color.match_hist] threshold: 99.0 [mask.box_blend] type: gaussian distance: 11.0 radius: 5.0 passes: 1 [mask.mask_blend] type: normalized kernel_size: 3 passes: 4 threshold: 4 erosion: 0.0 [scaling.sharpen] method: none amount: 150 radius: 0.3 threshold: 5.0 [writer.ffmpeg] container: mp4 codec: libx264 crf: 23 preset: medium tune: none profile: auto level: auto skip_mux: False [writer.gif] fps: 25 loop: 0 palettesize: 256 subrectangles: False [writer.opencv] format: png draw_transparent: False jpg_quality: 75 png_compress_level: 3 [writer.pillow] format: png draw_transparent: False optimize: False gif_interlace: True jpg_quality: 75 png_compress_level: 3 tif_compression: tiff_deflate --------- extract.ini --------- [global] allow_growth: False [align.fan] batch-size: 12 [detect.cv2_dnn] confidence: 50 [detect.mtcnn] minsize: 20 scalefactor: 0.709 batch-size: 8 threshold_1: 0.6 threshold_2: 0.7 threshold_3: 0.7 [detect.s3fd] confidence: 50 batch-size: 4 [mask.bisenet_fp] batch-size: 8 weights: faceswap include_ears: False include_hair: False include_glasses: True [mask.unet_dfl] batch-size: 8 [mask.vgg_clear] batch-size: 6 [mask.vgg_obstructed] batch-size: 2 --------- gui.ini --------- [global] fullscreen: False tab: extract options_panel_width: 30 console_panel_height: 20 icon_size: 14 font: default font_size: 9 autosave_last_session: prompt timeout: 120 auto_load_model_stats: False --------- train.ini --------- [global] centering: face coverage: 87.5 icnr_init: False conv_aware_init: True optimizer: adam learning_rate: 4e-05 epsilon_exponent: -5 reflect_padding: False allow_growth: False mixed_precision: True nan_protection: True convert_batchsize: 16 
[global.loss] loss_function: ssim mask_loss_function: mse l2_reg_term: 100 eye_multiplier: 3 mouth_multiplier: 2 penalized_mask_loss: True mask_type: bisenet-fp_face mask_blur_kernel: 3 mask_threshold: 4 learn_mask: False [model.dfaker] output_size: 128 [model.dfl_h128] lowmem: False [model.dfl_sae] input_size: 128 clipnorm: True architecture: df autoencoder_dims: 0 encoder_dims: 42 decoder_dims: 21 multiscale_decoder: False [model.dlight] features: best details: good output_size: 256 [model.original] lowmem: False [model.phaze_a] output_size: 384 shared_fc: None enable_gblock: True split_fc: True split_gblock: False split_decoders: False enc_architecture: efficientnet_v2_l enc_scaling: 80 enc_load_weights: True bottleneck_type: dense bottleneck_norm: None bottleneck_size: 512 bottleneck_in_encoder: True fc_depth: 1 fc_min_filters: 1280 fc_max_filters: 1280 fc_dimensions: 8 fc_filter_slope: -0.5 fc_dropout: 0.0 fc_upsampler: upsample2d fc_upsamples: 1 fc_upsample_filters: 1280 fc_gblock_depth: 3 fc_gblock_min_nodes: 512 fc_gblock_max_nodes: 512 fc_gblock_filter_slope: -0.5 fc_gblock_dropout: 0.0 dec_upscale_method: resize_images dec_norm: None dec_min_filters: 160 dec_max_filters: 640 dec_filter_slope: -0.33 dec_res_blocks: 1 dec_output_kernel: 3 dec_gaussian: True dec_skip_last_residual: False freeze_layers: keras_encoder load_layers: encoder fs_original_depth: 4 fs_original_min_filters: 128 fs_original_max_filters: 1024 mobilenet_width: 1.0 mobilenet_depth: 1 mobilenet_dropout: 0.001 mobilenet_minimalistic: False [model.realface] input_size: 64 output_size: 128 dense_nodes: 1536 complexity_encoder: 128 complexity_decoder: 512 [model.unbalanced] input_size: 128 lowmem: False clipnorm: True nodes: 1024 complexity_encoder: 128 complexity_decoder_a: 384 complexity_decoder_b: 512 [model.villain] lowmem: False [trainer.original] preview_images: 14 zoom_amount: 5 rotation_range: 10 shift_range: 5 flip_chance: 50 color_lightness: 30 color_ab: 8 color_clahe_chance: 50 
color_clahe_max_size: 4
If you get a second, could you try the latest update, please?
I have basically wrapped that line in the code to swallow the error and output a warning. I don't like it as a solution, but I like the process failing even less.
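For anyone following along, here is a rough sketch of what that kind of wrapper can look like. This is not the actual faceswap code: the function name `print_loss` and its `stream` parameter are made up for illustration, and the only thing taken from the traceback is the `print(f"\r{output}", end="")` call that raised the `OSError`.

```python
import logging
import sys

logger = logging.getLogger(__name__)


def print_loss(output, stream=None):
    """Write the loss line to the console without letting an OSError kill training.

    Hypothetical helper for illustration only. Returns True if the line was
    written, False if an OSError was swallowed and logged as a warning.
    """
    stream = sys.stdout if stream is None else stream
    try:
        # Same call as in the traceback: carriage return, no trailing newline
        print(f"\r{output}", end="", file=stream)
        return True
    except OSError as err:
        # Swallow the error and warn, rather than crashing the training thread
        logger.warning("Swallowed OSError when printing loss: %s", err)
        return False
```

The trade-off is the one already mentioned: the training thread survives, but if the underlying I/O problem is real it may simply surface somewhere else.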
I suspect it's a false positive, and the failure will just be pushed elsewhere, but we will see. What I think is happening is that there is an issue writing to or reading from the TensorBoard log files when Distributed Training is active in TensorFlow. Two things push me towards this: it only seems to happen when you click the graph tab (which triggers a read of the TensorBoard logs), and this specific OSError should relate to I/O actions, not print statements.
Also, Distributed Training seems to have had a few niggly issues in TensorFlow for quite a few releases now, so it would not surprise me if this is another one.
My word is final