OOM out of memory during convert but not training

tochan
Posts: 21
Joined: Sun Sep 22, 2019 8:17 am
Been thanked: 5 times

OOM out of memory during convert but not training

Post by tochan »

Hi,

I trained with the Dlight trainer without problems, but when I try to use the converter it crashes with the error below.

Can someone help here? Faceswap is up to date (2019.11.10), running on Windows 10.


11/10/2019 08:38:17 MainProcess     MainThread      config          check_exists              DEBUG    Config file exists: 'C:\Users\denni\faceswap\config\convert.ini'
11/10/2019 08:38:17 MainProcess     MainThread      config          load_config               VERBOSE  Loading config: 'C:\Users\denni\faceswap\config\convert.ini'
11/10/2019 08:38:17 MainProcess     MainThread      config          validate_config           DEBUG    Validating config
11/10/2019 08:38:17 MainProcess     MainThread      config          check_config_change       DEBUG    Default config has not changed
11/10/2019 08:38:17 MainProcess     MainThread      config          check_config_choices      DEBUG    Checking config choices
11/10/2019 08:38:17 MainProcess     MainThread      config          check_config_choices      DEBUG    Checked config choices
11/10/2019 08:38:17 MainProcess     MainThread      config          validate_config           DEBUG    Validated config
11/10/2019 08:38:17 MainProcess     MainThread      config          handle_config             DEBUG    Handled config
11/10/2019 08:38:17 MainProcess     MainThread      config          __init__                  DEBUG    Initialized: Config
11/10/2019 08:38:17 MainProcess     MainThread      config          get                       DEBUG    Getting config item: (section: 'scaling.sharpen', option: 'method')
11/10/2019 08:38:17 MainProcess     MainThread      config          get                       DEBUG    Returning item: (type: <class 'str'>, value: gaussian)
11/10/2019 08:38:17 MainProcess     MainThread      config          get                       DEBUG    Getting config item: (section: 'scaling.sharpen', option: 'amount')
11/10/2019 08:38:17 MainProcess     MainThread      config          get                       DEBUG    Returning item: (type: <class 'int'>, value: 150)
11/10/2019 08:38:17 MainProcess     MainThread      config          get                       DEBUG    Getting config item: (section: 'scaling.sharpen', option: 'radius')
11/10/2019 08:38:17 MainProcess     MainThread      config          get                       DEBUG    Returning item: (type: <class 'float'>, value: 0.3)
11/10/2019 08:38:17 MainProcess     MainThread      config          get                       DEBUG    Getting config item: (section: 'scaling.sharpen', option: 'threshold')
11/10/2019 08:38:17 MainProcess     MainThread      config          get                       DEBUG    Returning item: (type: <class 'float'>, value: 5.0)
11/10/2019 08:38:17 MainProcess     MainThread      _base           set_config                DEBUG    Config: {'method': 'gaussian', 'amount': 150, 'radius': 0.3, 'threshold': 5.0}
11/10/2019 08:38:17 MainProcess     MainThread      _base           __init__                  DEBUG    config: {'method': 'gaussian', 'amount': 150, 'radius': 0.3, 'threshold': 5.0}
11/10/2019 08:38:17 MainProcess     MainThread      _base           __init__                  DEBUG    Initialized Scaling
11/10/2019 08:38:17 MainProcess     MainThread      convert         load_plugins              DEBUG    Loaded plugins: {'box': <plugins.convert.mask.box_blend.Mask object at 0x0000028D78266358>, 'mask': <plugins.convert.mask.mask_blend.Mask object at 0x0000028D78266D68>, 'color': <plugins.convert.color.avg_color.Color object at 0x0000028D782D8F60>, 'seamless': None, 'scaling': <plugins.convert.scaling.sharpen.Scaling object at 0x0000028D78302978>}
11/10/2019 08:38:17 MainProcess     MainThread      convert         __init__                  DEBUG    Initialized Converter
11/10/2019 08:38:17 MainProcess     MainThread      convert         __init__                  DEBUG    Initialized Convert
11/10/2019 08:38:17 MainProcess     MainThread      convert         process                   DEBUG    Starting Conversion
11/10/2019 08:38:17 MainProcess     MainThread      convert         convert_images            DEBUG    Converting images
11/10/2019 08:38:17 MainProcess     MainThread      queue_manager   get_queue                 DEBUG    QueueManager getting: 'convert_out'
11/10/2019 08:38:17 MainProcess     MainThread      queue_manager   get_queue                 DEBUG    QueueManager got: 'convert_out'
11/10/2019 08:38:17 MainProcess     MainThread      queue_manager   get_queue                 DEBUG    QueueManager getting: 'patch'
11/10/2019 08:38:17 MainProcess     MainThread      queue_manager   get_queue                 DEBUG    QueueManager got: 'patch'
11/10/2019 08:38:17 MainProcess     MainThread      convert         pool_processes            DEBUG    16
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  __init__                  DEBUG    Initializing MultiThread: (target: 'patch', thread_count: 16)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  __init__                  DEBUG    Initialized MultiThread: 'patch'
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread(s): 'patch'
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 1 of 16: 'patch_0'
11/10/2019 08:38:17 MainProcess     patch_0         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 2 of 16: 'patch_1'
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     patch_1         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 3 of 16: 'patch_2'
11/10/2019 08:38:17 MainProcess     patch_2         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 4 of 16: 'patch_3'
11/10/2019 08:38:17 MainProcess     patch_3         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 5 of 16: 'patch_4'
11/10/2019 08:38:17 MainProcess     patch_4         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 6 of 16: 'patch_5'
11/10/2019 08:38:17 MainProcess     patch_5         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 7 of 16: 'patch_6'
11/10/2019 08:38:17 MainProcess     patch_6         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 8 of 16: 'patch_7'
11/10/2019 08:38:17 MainProcess     patch_7         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 9 of 16: 'patch_8'
11/10/2019 08:38:17 MainProcess     patch_8         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 10 of 16: 'patch_9'
11/10/2019 08:38:17 MainProcess     patch_9         convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 11 of 16: 'patch_10'
11/10/2019 08:38:17 MainProcess     patch_10        convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 12 of 16: 'patch_11'
11/10/2019 08:38:17 MainProcess     patch_11        convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 13 of 16: 'patch_12'
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     patch_12        convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 14 of 16: 'patch_13'
11/10/2019 08:38:17 MainProcess     patch_13        convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 15 of 16: 'patch_14'
11/10/2019 08:38:17 MainProcess     patch_14        convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Starting thread 16 of 16: 'patch_15'
11/10/2019 08:38:17 MainProcess     patch_15        convert         process                   DEBUG    Starting convert process. (in_queue: <queue.Queue object at 0x0000028BB2297320>, out_queue: <queue.Queue object at 0x0000028BB22971D0>, completion_queue: None)
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  start                     DEBUG    Started all threads 'patch': 16
11/10/2019 08:38:17 MainProcess     MainThread      multithreading  completed                 DEBUG    False
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:17 MainProcess     predict_faces_0 _base           largest_face_index        DEBUG    0
11/10/2019 08:38:18 MainProcess     MainThread      multithreading  completed                 DEBUG    False
11/10/2019 08:38:18 MainProcess     predict_faces_0 multithreading  run                       DEBUG    Error in thread (predict_faces_0): cudnn PoolForward launch failed\n     [[{{node encoder/average_pooling2d_1/AvgPool}} = AvgPool[T=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 2, 2], padding="VALID", strides=[1, 1, 2, 2], _device="/job:localhost/replica:0/task:0/device:GPU:0"](encoder/average_pooling2d_1/AvgPool-0-TransposeNHWCToNCHW-LayoutOptimizer)]]\n     [[{{node decoder_b/face_out/Sigmoid/_761}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_971_decoder_b/face_out/Sigmoid", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
11/10/2019 08:38:19 MainProcess     MainThread      multithreading  check_and_raise_error     DEBUG    Thread error caught: [(<class 'tensorflow.python.framework.errors_impl.InternalError'>, InternalError(), <traceback object at 0x0000028D7831DA08>)]
Traceback (most recent call last):
  File "C:\Users\denni\faceswap\lib\cli.py", line 128, in execute_script
    process.process()
  File "C:\Users\denni\faceswap\scripts\convert.py", line 105, in process
    self.convert_images()
  File "C:\Users\denni\faceswap\scripts\convert.py", line 131, in convert_images
    self.check_thread_error()
  File "C:\Users\denni\faceswap\scripts\convert.py", line 151, in check_thread_error
    thread.check_and_raise_error()
  File "C:\Users\denni\faceswap\lib\multithreading.py", line 84, in check_and_raise_error
    raise error[1].with_traceback(error[2])
  File "C:\Users\denni\faceswap\lib\multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\denni\faceswap\scripts\convert.py", line 572, in predict_faces
    predicted = self.predict(feed_faces, batch_size)
  File "C:\Users\denni\faceswap\scripts\convert.py", line 622, in predict
    predicted = self.predictor(feed, batch_size=batch_size)
  File "C:\Users\denni\MiniConda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1169, in predict
    steps=steps)
  File "C:\Users\denni\MiniConda3\envs\faceswap\lib\site-packages\keras\engine\training_arrays.py", line 294, in predict_loop
    batch_outs = f(ins_batch)
  File "C:\Users\denni\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "C:\Users\denni\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "C:\Users\denni\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\client\session.py", line 1439, in __call__
    run_metadata_ptr)
  File "C:\Users\denni\MiniConda3\envs\faceswap\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: cudnn PoolForward launch failed
     [[{{node encoder/average_pooling2d_1/AvgPool}} = AvgPool[T=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 2, 2], padding="VALID", strides=[1, 1, 2, 2], _device="/job:localhost/replica:0/task:0/device:GPU:0"](encoder/average_pooling2d_1/AvgPool-0-TransposeNHWCToNCHW-LayoutOptimizer)]]
     [[{{node decoder_b/face_out/Sigmoid/_761}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_971_decoder_b/face_out/Sigmoid", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

============ System Information ============
encoding:            cp1252
git_branch:          master
git_commits:         ffd3829 Added allow_growth argument for preview. 87ccdfa Merge pull request #929 from kilroythethird/fsa_conv_fixes. c2d9a27 Add json to the filter in the GUI alignment open dialog. bb90bcd Merge branch 'double_fsa_fix' into fsa_conv_fixes. 61497a9 Updart INSTALL.md
gpu_cuda:            9.0
gpu_cudnn:           7.0.5
gpu_devices:         GPU_0: GeForce RTX 2080, GPU_1: GeForce GTX 1080
gpu_devices_active:  GPU_0, GPU_1
gpu_driver:          441.12
gpu_vram:            GPU_0: 8192MB, GPU_1: 8192MB
os_machine:          AMD64
os_platform:         Windows-10-10.0.18362-SP0
os_release:          10
py_command:          C:\Users\denni\faceswap\faceswap.py convert -i D:/In_progress/Face/Images_full_sets/Images -o D:/In_progress/Face/Images_full_sets/swap/Dlight/1_Sharpen -al D:/In_progress/Face/Images_full_sets/Images/alignments.fsa -m D:/In_progress/Face/Images_full_sets/Model/Dlight -c avg-color -M predicted -sc sharpen -w opencv -osc 100 -a D:/In_progress/Face/Images_full_sets/Faces -l 0.4 -j 0 -g 1 -L INFO -gui
py_conda_version:    conda 4.5.12
py_implementation:   CPython
py_version:          3.6.8
py_virtual_env:      True
sys_cores:           16
sys_processor:       AMD64 Family 23 Model 113 Stepping 0, AuthenticAMD
sys_ram:             Total: 65467MB, Available: 55716MB, Used: 9751MB, Free: 55716MB

=============== Pip Packages ===============
absl-py==0.7.0
astor==0.7.1
certifi==2019.6.16
Click==7.0
cloudpickle==0.8.0
cmake==3.13.3
cycler==0.10.0
cytoolz==0.9.0.1
dask==1.1.4
decorator==4.3.2
dlib==19.16.99
face-recognition==1.2.3
face-recognition-models==0.3.0
fastcluster==1.1.25
ffmpy==0.2.2
gast==0.2.2
grpcio==1.16.1
h5py==2.9.0
imageio==2.5.0
imageio-ffmpeg==0.3.0
Keras==2.2.4
Keras-Applications==1.0.7
Keras-Preprocessing==1.0.9
kiwisolver==1.0.1
Markdown==3.0.1
matplotlib==2.2.2
mkl-fft==1.0.10
mkl-random==1.0.2
mock==2.0.0
networkx==2.2
numpy==1.16.2
nvidia-ml-py3==7.352.1
olefile==0.46
opencv-python==4.1.1.26
pathlib==1.0.1
pbr==5.1.3
Pillow==6.1.0
protobuf==3.6.1
psutil==5.6.1
pyparsing==2.3.1
pyreadline==2.1
python-dateutil==2.8.0
pytz==2018.9
PyWavelets==1.0.2
pywin32==224
PyYAML==3.13
scikit-image==0.14.2
scikit-learn==0.20.3
scipy==1.2.1
six==1.12.0
tensorboard==1.12.2
tensorflow==1.12.0
tensorflow-estimator==1.13.0
termcolor==1.1.0
toolz==0.9.0
toposort==1.5
tornado==6.0.1
tqdm==4.31.1
Werkzeug==0.14.1
wincertstore==0.2

============== Conda Packages ==============
# packages in environment at C:\Users\denni\MiniConda3\envs\faceswap:
#
# Name                    Version                   Build  Channel
_tflow_select             2.1.0                       gpu 
absl-py                   0.7.0                    py36_0 
astor                     0.7.1                    py36_0 
blas                      1.0                         mkl 
ca-certificates           2019.5.15                     0 
certifi                   2019.6.16                py36_1 
Click                     7.0                       <pip>
cloudpickle               0.8.0                    py36_0 
cmake                     3.13.3                    <pip>
cudatoolkit               9.0                           1 
cudnn                     7.3.1                 cuda9.0_0 
cycler                    0.10.0           py36h009560c_0 
cytoolz                   0.9.0.1          py36hfa6e2cd_1 
dask-core                 1.1.4                      py_0 
decorator                 4.3.2                    py36_0 
dlib                      19.16.99                  <pip>
face-recognition          1.2.3                     <pip>
face-recognition-models   0.3.0                     <pip>
fastcluster               1.1.25          py36h830ac7b_1000    conda-forge
ffmpeg                    4.1               h6538335_1002    conda-forge
ffmpy                     0.2.2                     <pip>
freetype                  2.9.1                ha9979f8_1 
gast                      0.2.2                    py36_0 
grpcio                    1.16.1           py36h351948d_1 
h5py                      2.9.0            py36h5e291fa_0 
hdf5                      1.10.4               h7ebc959_0 
icc_rt                    2019.0.0             h0cc432a_1 
icu                       58.2                 ha66f8fd_1 
imageio                   2.5.0                    py36_0 
imageio-ffmpeg            0.3.0                     <pip>
intel-openmp              2019.1                      144 
jpeg                      9c                hfa6e2cd_1001    conda-forge
keras                     2.2.4                         0 
keras-applications        1.0.7                      py_0 
keras-base                2.2.4                    py36_0 
keras-preprocessing       1.0.9                      py_0 
kiwisolver                1.0.1            py36h6538335_0 
libblas                   3.8.0                     8_mkl    conda-forge
libcblas                  3.8.0                     8_mkl    conda-forge
liblapack                 3.8.0                     8_mkl    conda-forge
liblapacke                3.8.0                     8_mkl    conda-forge
libmklml                  2019.0.3                      0 
libpng                    1.6.36               h2a8f88b_0 
libprotobuf               3.6.1                h7bd577a_0 
libtiff                   4.0.10               hb898794_2 
libwebp                   1.0.2                hfa6e2cd_2    conda-forge
markdown                  3.0.1                    py36_0 
matplotlib                2.2.2            py36had4c4a9_2 
mkl                       2019.1                      144 
mkl_fft                   1.0.10           py36h14836fe_0 
mkl_random                1.0.2            py36h343c172_0 
mock                      2.0.0            py36h9086845_0 
networkx                  2.2                      py36_1 
numpy                     1.16.2           py36h19fb1c0_0 
numpy-base                1.16.2           py36hc3f5095_0 
nvidia-ml-py3             7.352.1                   <pip>
olefile                   0.46                     py36_0 
opencv                    4.1.0            py36hb4945ee_5    conda-forge
opencv-python             4.0.0.21                  <pip>
opencv-python             4.1.1.26                  <pip>
openssl                   1.1.1c               he774522_1 
pathlib                   1.0.1                    py36_1 
pbr                       5.1.3                      py_0 
pillow                    6.1.0            py36hdc69c19_0 
pip                       19.0.3                   py36_0 
protobuf                  3.6.1            py36h33f27b4_0 
psutil                    5.6.1            py36he774522_0 
pyparsing                 2.3.1                    py36_0 
pyqt                      5.9.2            py36h6538335_2 
pyreadline                2.1                      py36_1 
python                    3.6.8                h9f7ef89_7 
python-dateutil           2.8.0                    py36_0 
pytz                      2018.9                   py36_0 
pywavelets                1.0.2            py36h8c2d366_0 
pywin32                   224                       <pip>
pyyaml                    3.13             py36hfa6e2cd_0 
qt                        5.9.7            vc14h73c81de_0 
scikit-image              0.14.2           py36ha925a31_0 
scikit-learn              0.20.3           py36h343c172_0 
scipy                     1.2.1            py36h29ff71c_0 
setuptools                40.8.0                   py36_0 
sip                       4.19.8           py36h6538335_0 
six                       1.12.0                   py36_0 
sqlite                    3.27.2               he774522_0 
tensorboard               1.12.2           py36h33f27b4_0 
tensorflow                1.12.0          gpu_py36ha5f9131_0 
tensorflow-base           1.12.0          gpu_py36h6e53903_0 
tensorflow-estimator      1.13.0                     py_0 
tensorflow-gpu            1.12.0               h0d30ee6_0 
termcolor                 1.1.0                    py36_1 
tk                        8.6.8                hfa6e2cd_0 
toolz                     0.9.0                    py36_0 
toposort                  1.5                       <pip>
tornado                   6.0.1            py36he774522_0 
tqdm                      4.31.1                     py_0 
vc                        14.1                 h0510ff6_4 
vs2015_runtime            14.15.26706          h3a45250_0 
werkzeug                  0.14.1                   py36_0 
wheel                     0.33.1                   py36_0 
wincertstore              0.2              py36h7fe50ca_0 
xz                        5.2.4                h2fa13f4_4 
yaml                      0.1.7                hc54c509_2 
zlib                      1.2.11               h62dcd97_3 
zstd                      1.3.7                h508b16e_0 

================= Configs ==================
--------- .faceswap ---------
backend:                  nvidia

--------- convert.ini ---------

[color.color_transfer]
clip:                     True
preserve_paper:           True

[color.manual_balance]
colorspace:               HSV
balance_1:                0.0
balance_2:                0.0
balance_3:                0.0
contrast:                 0.0
brightness:               0.0

[color.match_hist]
threshold:                99.0

[mask.box_blend]
type:                     gaussian
distance:                 11.0
radius:                   5.0
passes:                   1

[mask.mask_blend]
type:                     gaussian
radius:                   3.0
passes:                   4
erosion:                  0.0

[scaling.sharpen]
method:                   gaussian
amount:                   150
radius:                   0.3
threshold:                5.0

[writer.ffmpeg]
container:                mp4
codec:                    libx264
crf:                      23
preset:                   medium
tune:                     none
profile:                  auto
level:                    auto

[writer.gif]
fps:                      25
loop:                     0
palettesize:              256
subrectangles:            False

[writer.opencv]
format:                   png
draw_transparent:         False
jpg_quality:              75
png_compress_level:       3

[writer.pillow]
format:                   png
draw_transparent:         False
optimize:                 False
gif_interlace:            True
jpg_quality:              75
png_compress_level:       3
tif_compression:          tiff_deflate

--------- extract.ini ---------

[global]
allow_growth:             False

[align.fan]
batch-size:               8

[detect.cv2_dnn]
confidence:               50

[detect.mtcnn]
minsize:                  20
threshold_1:              0.6
threshold_2:              0.7
threshold_3:              0.7
scalefactor:              0.709
batch-size:               8

[detect.s3fd]
confidence:               50
batch-size:               8

[mask.unet_dfl]
batch-size:               8

[mask.vgg_clear]
batch-size:               6

[mask.vgg_obstructed]
batch-size:               2

--------- gui.ini ---------

[global]
fullscreen:               False
tab:                      train
options_panel_width:      30
console_panel_height:     20
font:                     default
font_size:                9

--------- train.ini ---------

[global]
coverage:                 87.5
mask_type:                components
mask_blur:                True
icnr_init:                True
conv_aware_init:          False
subpixel_upscaling:       True
reflect_padding:          True
penalized_mask_loss:      True
loss_function:            mae
learning_rate:            5e-05

[model.dfl_h128]
lowmem:                   False

[model.dfl_sae]
input_size:               128
clipnorm:                 True
architecture:             df
autoencoder_dims:         0
encoder_dims:             42
decoder_dims:             21
multiscale_decoder:       False

[model.dlight]
features:                 best
details:                  good
output_size:              256

[model.original]
lowmem:                   False

[model.realface]
input_size:               64
output_size:              128
dense_nodes:              1536
complexity_encoder:       128
complexity_decoder:       512

[model.unbalanced]
input_size:               128
lowmem:                   False
clipnorm:                 True
nodes:                    1024
complexity_encoder:       128
complexity_decoder_a:     384
complexity_decoder_b:     512

[model.villain]
lowmem:                   False

[trainer.original]
preview_images:           14
zoom_amount:              5
rotation_range:           10
shift_range:              5
flip_chance:              50
color_lightness:          30
color_ab:                 8
color_clahe_chance:       50
color_clahe_max_size:     4
by torzdf » Fri Jan 22, 2021 11:53 am

Yes, this is out of memory.

You should be able to resolve your issue by going into Settings > Train Settings and lowering convert batchsize

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: Dlight Convert Crash: report

Post by torzdf »

Can you try enabling the "allow_growth" option and see if that fixes the issue?

My word is final

tochan
Posts: 21
Joined: Sun Sep 22, 2019 8:17 am
Been thanked: 5 times

Re: Dlight Convert Crash: report

Post by tochan »

Nope, not working....

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: Dlight Convert Crash: report

Post by bryanlyon »

This is a very weird issue and doesn't make a whole lot of sense. Have you tried to run it after a reboot? Can you try convert using one of the snapshots to make sure it isn't a model that has gone corrupt? Does the Preview tool work?

If the problem persists after these steps, please set your logging to "Trace", try a convert, and send us a zipped copy of your faceswap.log (Not the crash report, but the faceswap.log in the same folder) and we'll look into it more.

tochan
Posts: 21
Joined: Sun Sep 22, 2019 8:17 am
Been thanked: 5 times

Re: Dlight Convert Crash: report

Post by tochan »

I tried a reboot: same crash.
I tried a snapshot: same crash.

Then I set the trainer plugin to default, created a new model, and it works!
Then I tried the settings step by step with new Dlight models, and the crash comes when "Reflect Padding" is active.

Not tested with other trainers so far, only with Dlight.

I hope this information helps.

And a lesson for me... Don't test a new trainer with days of training... test the convert after 100 iterations... ;)

Second info: when I start training with dual GPUs, it crashes (other report). When I start with one GPU, save, and then try 2 GPUs, training works.

superjj
Posts: 2
Joined: Sat Apr 04, 2020 9:59 pm

OOM out of memory during convert but not training

Post by superjj »

I've trained my model to about 91k iterations with no crashes. But when I try to convert, faceswap crashes with an OOM error in the logs. Conversion seems to work when I select a small range of frames to convert, maybe 15 frames at a time. But that's barely half a second of video.

Has anyone trained fine, but end up with OOM crashes during conversion?

I'm using a GTX 1650 Super 4gb, and training on the Dlight model with resource-saving options turned on.

Thanks!
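
The workaround described in this post (converting a small range of frames at a time) can be scripted rather than typed by hand. The following is only a sketch: `frame_chunks` is a hypothetical helper, and the `start-end` string format is an assumption about how the frame-range option expects its values, so check it against your faceswap version before relying on it.

```python
def frame_chunks(total_frames, chunk_size):
    """Yield 'start-end' strings (1-based, inclusive) that cover every frame."""
    for start in range(1, total_frames + 1, chunk_size):
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size - 1, total_frames)
        yield f"{start}-{end}"


# Example: a 40-frame clip converted 15 frames at a time.
print(list(frame_chunks(40, 15)))
```

Each string can then be passed to a separate convert run, which keeps the number of frames handled in one run small at the cost of reloading the model every time.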

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: OOM out of memory during convert but not training

Post by torzdf »

I have never seen this before, sadly.

My word is final

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: OOM out of memory during convert but not training

Post by bryanlyon »

Were you having to use Allow Growth during training? In which case, you might be running into a weird issue we've noticed on some people's setups.

PLAY-911
Posts: 6
Joined: Mon Apr 13, 2020 6:52 pm
Has thanked: 1 time

Re: OOM out of memory during convert but not training

Post by PLAY-911 »

superjj wrote: Tue Apr 14, 2020 10:27 pm

I've trained my model to about 91k iterations with no crashes. But when I try to convert, faceswap crashes with an OOM error in the logs. Conversion seems to work when I select a small range of frames to convert, maybe 15 frames at a time. But that's barely half a second of video.

Has anyone trained fine, but end up with OOM crashes during conversion?

I'm using a GTX 1650 Super 4gb, and training on the Dlight model with resource-saving options turned on.

Thanks!

Are you on Windows? I had problems with the virtual memory assigned by Windows.

superjj
Posts: 2
Joined: Sat Apr 04, 2020 9:59 pm

Re: OOM out of memory during convert but not training

Post by superjj »

bryanlyon wrote: Wed Apr 15, 2020 7:59 pm

Were you having to use Allow Growth during training? In which case, you might be running into a weird issue we've noticed on some people's setups.

Yes, I had Allow Growth turned on during training.

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: OOM out of memory during convert but not training

Post by torzdf »

Make sure you select "Allow Growth" for convert too.

My word is final

mgolvach
Posts: 3
Joined: Sun May 17, 2020 2:01 am
Has thanked: 1 time

Re: OOM out of memory during convert but not training

Post by mgolvach »

Just in case it helps, I had a similar situation. Training with DFL-SAE at 128px (max I could do) was working fine, but conversion gave me the error:

Resource exhausted: OOM when allocating tensor with shape[16,130,130,126] and type float...

I had turned on "allow growth" for conversion, but found I did not have "allow growth" checked for training. Though it seemed counterintuitive, I turned off (unchecked) "allow growth" for conversion, and that solved the problem.

I think, essentially, with regard to the "allow growth" option, you need to be consistent with training and conversion. If you train with it on (or off), you must do the same for conversion.

This may not be the case for everyone. I'm certain more GPU power would probably solve the problem as well ;)

Thanks for this board's wealth of information and help!

Mike
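
To put the error message in this post into numbers, the size of the tensor that could not be allocated follows directly from its shape. This is a minimal sketch: `tensor_bytes` and `DTYPE_BYTES` are illustrative names, and reading the leading dimension (16) as the batch size is an assumption.

```python
from math import prod

# Bytes per element; the "float" in TensorFlow's OOM message is 32-bit.
DTYPE_BYTES = {"half": 2, "float": 4, "double": 8}


def tensor_bytes(shape, dtype="float"):
    """Return the size in bytes of a dense tensor with the given shape."""
    return prod(shape) * DTYPE_BYTES[dtype]


shape = (16, 130, 130, 126)  # shape quoted in the error message
size = tensor_bytes(shape)
print(f"{size} bytes = {size / 2**20:.1f} MiB")

# If the first dimension is the batch size (assumed), each item costs:
print(f"{size // shape[0]} bytes per item")
```

Under that assumption this one tensor scales linearly with the batch size, so halving the batch halves it. It is only one of many tensors alive during a convert, so the figure is a lower bound on what the step needs, not a total.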

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: OOM out of memory during convert but not training

Post by bryanlyon »

Allow_growth does not affect your model in any way; it only changes how TensorFlow allocates memory. You are likely running into a different issue. But we recommend leaving allow_growth off unless it's absolutely necessary to get Faceswap running on your system.
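
For readers wondering what the option corresponds to at the TensorFlow level, here is a rough sketch of the underlying session setting. It is not faceswap's own code: `make_growth_config` is a hypothetical helper, and the `tf.compat.v1` path is assumed to be available in the installed TensorFlow.

```python
try:
    import tensorflow as tf
except ImportError:  # TensorFlow not installed; the sketch then does nothing
    tf = None


def make_growth_config():
    """Build a session config that grows GPU memory on demand.

    Returns None when TensorFlow is unavailable.
    """
    if tf is None:
        return None
    config = tf.compat.v1.ConfigProto()
    # False (default): TensorFlow reserves nearly all GPU memory at start-up.
    # True: start with a small allocation and extend it as needed.
    config.gpu_options.allow_growth = True
    return config


config = make_growth_config()
if config is not None:
    print("allow_growth:", config.gpu_options.allow_growth)
```

The setting only affects the allocator, which is consistent with the point above that the trained model itself is unchanged.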

RahulRookie
Posts: 4
Joined: Sun May 31, 2020 9:31 am

Error: OOM when allocating tensor

Post by RahulRookie »

Hi,

Need help to debug this error

I am getting the following error when I run the Convert function. The full video is not generated; it crashes midway.
What is the cause of this issue and how can I fix it? I appreciate any support. I am using a GPU.

Code: Select all

Traceback (most recent call last):
File "D:\FACESWAP_NEW\faceswap\lib\cli\launcher.py", line 155, in execute_script
process.process()
File "D:\FACESWAP_NEW\faceswap\scripts\convert.py", line 157, in process
self._convert_images()
File "D:\FACESWAP_NEW\faceswap\scripts\convert.py", line 184, in _convert_images
self._check_thread_error()
File "D:\FACESWAP_NEW\faceswap\scripts\convert.py", line 204, in _check_thread_error
thread.check_and_raise_error()
File "D:\FACESWAP_NEW\faceswap\lib\multithreading.py", line 84, in check_and_raise_error
raise error[1].with_traceback(error[2])
File "D:\FACESWAP_NEW\faceswap\lib\multithreading.py", line 37, in run
self._target(*self._args, **self._kwargs)
File "D:\FACESWAP_NEW\faceswap\scripts\convert.py", line 873, in _predict_faces
predicted = self._predict(feed_faces, batch_size)
File "D:\FACESWAP_NEW\faceswap\scripts\convert.py", line 958, in _predict
predicted = self._predictor(feed, batch_size=batch_size)
File "C:\Users\RR\MiniConda3\envs\faceswap\lib\site-packages\keras\engine\training.py", line 1169, in predict
steps=steps)
File "C:\Users\RR\MiniConda3\envs\faceswap\lib\site-packages\keras\engine\training_arrays.py", line 294, in predict_loop
batch_outs = f(ins_batch)
File "C:\Users\RR\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "C:\Users\RR\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "C:\Users\RR\MiniConda3\envs\faceswap\lib\site-packages\tensorflow_core\python\client\session.py", line 1472, in __call__
run_metadata_ptr)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[256,128,5,5] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node encoder/conv_32_0_conv2d/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[decoder_b/face_out/Sigmoid/_185]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

(1) Resource exhausted: OOM when allocating tensor with shape[256,128,5,5] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node encoder/conv_32_0_conv2d/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.
06/16/2020 12:28:31 CRITICAL An unexpected crash has occurred. Crash report written to 'D:\FACESWAP_NEW\faceswap\crash_report.2020.06.16.122825733693.log'. You MUST provide this file if seeking assistance. Please verify you are running the latest version of faceswap before reporting
Process exited.

Script from Generate function

C:\Users\RR\MiniConda3\envs\faceswap\python.exe D:\FACESWAP_NEW\faceswap\faceswap.py convert -i D:/FACESWAP_PROJECTS/PROJ10/DST-101.mp4 -o D:/FACESWAP_PROJECTS/PROJ10/Output -m D:/FACESWAP_PROJECTS/PROJ10/ModelAB -c avg-color -M none -sc none -w ffmpeg -osc 100 -l 0.4 -j 0 -g 1 -ag -otf -L INFO

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CRITICAL An unexpected crash has occurred.

Post by torzdf »

Yes, this is out of memory.

You should be able to resolve your issue by going into Settings > Train Settings and lowering convert batchsize

My word is final
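
The idea behind lowering the batch size can be illustrated with a small, framework-free sketch that halves the batch whenever a prediction runs out of memory. Everything here is hypothetical: `OutOfMemory` and `fake_predict` stand in for a real framework error and a real model, and this is not how faceswap itself handles the setting.

```python
class OutOfMemory(RuntimeError):
    """Stand-in for a framework OOM error (hypothetical)."""


def fake_predict(batch, limit=4):
    """Hypothetical predictor that fails when a batch exceeds `limit` items."""
    if len(batch) > limit:
        raise OutOfMemory(f"batch of {len(batch)} does not fit")
    return [item * 2 for item in batch]


def predict_all(items, batch_size, predict=fake_predict):
    """Run `predict` over `items`, halving the batch size after each OOM."""
    results = []
    start = 0
    while start < len(items):
        batch = items[start:start + batch_size]
        try:
            results.extend(predict(batch))
        except OutOfMemory:
            if batch_size == 1:
                raise  # even a single item does not fit; give up
            batch_size //= 2
            continue  # retry the same position with a smaller batch
        start += len(batch)
    return results, batch_size


outputs, final_size = predict_all(list(range(10)), batch_size=16)
print(final_size, outputs)
```

In practice a GPU session may not recover cleanly after a real OOM, so setting a smaller batch size up front, as suggested above, is the safer route.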

Locked