I tried to continue training my first model on multiple GPUs using an AWS p2.8xlarge instance. The same setup worked without trouble on a single-GPU p2.xlarge instance. I tried toggling the -o and -msg flags and changing the batch size, but neither helped. The command and its console output:
(faceswap) ubuntu@ip-172-31-47-38:~/faceswap$ python /home/ubuntu/faceswap/faceswap.py train \
-A /home/ubuntu/myfolder/faceswap-project/face1/output \
-ala /home/ubuntu/myfolder/faceswap-project/face1/face1.fsa \
-B /home/ubuntu/myfolder/faceswap-project/face2/output \
-alb /home/ubuntu/myfolder/faceswap-project/face2/face2.fsa \
-m /home/ubuntu/myfolder/faceswap-project/models/face1face2 \
-t villain -bs 100 -it 1000000 -g 1 -s 50 -ss 25000 -ps 50 -ag -wl -L INFO -w
Setting Faceswap backend to NVIDIA
05/21/2020 07:28:20 INFO Log level set to: INFO
Using TensorFlow backend.
05/21/2020 07:28:22 INFO Model A Directory: /home/ubuntu/myfolder/faceswap-project/face1/output
05/21/2020 07:28:22 INFO Model B Directory: /home/ubuntu/myfolder/faceswap-project/face2/output
05/21/2020 07:28:22 INFO Training data directory: /home/ubuntu/myfolder/faceswap-project/models/face1face2
05/21/2020 07:28:22 WARNING `-wl`, ``--warp-to-landmarks`` has been deprecated and will be removed from a future update. This option will be available within training config settings (/config/train.ini).
05/21/2020 07:28:22 INFO ===================================================
05/21/2020 07:28:22 INFO Starting
05/21/2020 07:28:22 INFO Press 'ENTER' to save and quit
05/21/2020 07:28:22 INFO Press 'S' to save model weights immediately
05/21/2020 07:28:22 INFO ===================================================
05/21/2020 07:28:23 INFO Loading data, this may take a while...
05/21/2020 07:28:23 INFO Loading Model from Villain plugin...
05/21/2020 07:28:23 INFO Using configuration saved in state file
05/21/2020 07:28:28 CRITICAL Error caught! Exiting...
05/21/2020 07:28:28 ERROR Caught exception in thread: '_training_0'
05/21/2020 07:28:30 ERROR Got Exception on main handler:
Traceback (most recent call last):
File "/home/ubuntu/faceswap/lib/cli/launcher.py", line 155, in execute_script
process.process()
File "/home/ubuntu/faceswap/scripts/train.py", line 161, in process
self._end_thread(thread, err)
File "/home/ubuntu/faceswap/scripts/train.py", line 201, in _end_thread
thread.join()
File "/home/ubuntu/faceswap/lib/multithreading.py", line 121, in join
raise thread.err[1].with_traceback(thread.err[2])
File "/home/ubuntu/faceswap/lib/multithreading.py", line 37, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/faceswap/scripts/train.py", line 226, in _training
raise err
File "/home/ubuntu/faceswap/scripts/train.py", line 214, in _training
model = self._load_model()
File "/home/ubuntu/faceswap/scripts/train.py", line 255, in _load_model
predict=False)
File "/home/ubuntu/faceswap/plugins/train/model/villain.py", line 25, in __init__
super().__init__(*args, **kwargs)
File "/home/ubuntu/faceswap/plugins/train/model/original.py", line 25, in __init__
super().__init__(*args, **kwargs)
File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 125, in __init__
self.build()
File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 244, in build
self.load_models(swapped=False)
File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 456, in load_models
is_loaded = network.load(fullpath=model_mapping[network.side][network.type])
File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 834, in load
network = load_model(self.filename, custom_objects=get_custom_objects())
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/engine/saving.py", line 419, in load_model
model = _deserialize_model(f, custom_objects, compile)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/engine/saving.py", line 287, in _deserialize_model
K.batch_set_value(weight_value_tuples)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2470, in batch_set_value
get_session().run(assign_ops, feed_dict=feed_dict)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 186, in get_session
_SESSION = tf.Session(config=config)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1585, in __init__
super(Session, self).__init__(target, graph, config=config)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 699, in __init__
self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid device ordinal value (1). Valid range is [0, 0].
while setting up XLA_GPU_JIT device number 1
05/21/2020 07:28:30 CRITICAL An unexpected crash has occurred. Crash report written to '/home/ubuntu/faceswap/crash_report.2020.05.21.072828229340.log'. You MUST provide this file if seeking assistance. Please verify you are running the latest version of faceswap before reporting
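For anyone reading the error, here is a minimal sketch of how the final message can be picked apart and compared with the GPU list in the System Information part of the crash report. Both input strings are copied from this post. The helper names (`parse_ordinal_error`, `count_active_gpus`) are made up for illustration and are not part of faceswap or TensorFlow. Treating the "valid range" as the number of devices the failing session accepts is my reading of the message, not a confirmed diagnosis.

```python
import re

# Both strings are copied verbatim from the output in this post.
ERROR_LINE = "Invalid device ordinal value (1). Valid range is [0, 0]."
ACTIVE_GPUS = "GPU_0, GPU_1, GPU_2, GPU_3, GPU_4, GPU_5, GPU_6, GPU_7"


def parse_ordinal_error(message):
    """Return (requested, low, high) from an 'Invalid device ordinal' message."""
    match = re.search(
        r"Invalid device ordinal value \((\d+)\)\. Valid range is \[(\d+), (\d+)\]",
        message,
    )
    if match is None:
        raise ValueError("not an 'Invalid device ordinal' message")
    requested, low, high = (int(group) for group in match.groups())
    return requested, low, high


def count_active_gpus(active):
    """Count the comma-separated entries of the gpu_devices_active field."""
    return len([name for name in active.split(",") if name.strip()])


requested, low, high = parse_ordinal_error(ERROR_LINE)
# Assumption: an inclusive range [low, high] means high - low + 1 usable devices.
accepted_by_session = high - low + 1
listed_in_report = count_active_gpus(ACTIVE_GPUS)

print(f"ordinal requested: {requested}")
print(f"devices the failing session accepts: {accepted_by_session}")
print(f"devices listed as active in the crash report: {listed_in_report}")
```

Read this way, the session that fails only accepts device 0 while something asks for device 1, even though the crash report lists eight active GPUs. That mismatch is probably the thing to look into, but the sketch only restates what the logs say and does not identify the cause.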
The crash log:
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b8474110>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_12_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b8474110>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv2d_64_12
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b8474110>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b8474110> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks res_block DEBUG input_tensor: Tensor("residual_64_12_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: residual_64_13
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_13_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_13_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b848d050>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_13_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b848d050>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv2d_64_13
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b848d050>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b848d050> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks res_block DEBUG input_tensor: Tensor("residual_64_13_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: residual_64_14
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_14_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_14_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b84a8050>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_14_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b84a8050>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv2d_64_14
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b84a8050>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b84a8050> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks res_block DEBUG input_tensor: Tensor("residual_64_14_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: residual_64_15
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_15_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_15_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b84420d0>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_15_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b84420d0>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv2d_64_15
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b84420d0>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b84420d0> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks res_block DEBUG input_tensor: Tensor("residual_64_15_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: residual_64_16
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_16_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_16_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b845b0d0>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_16_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b845b0d0>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv2d_64_16
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b845b0d0>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b845b0d0> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks res_block DEBUG input_tensor: Tensor("residual_64_16_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: residual_64_17
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_17_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_17_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b83f8050>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("residual_64_17_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b83f8050>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv2d_64_17
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b83f8050>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _switch_kernel_initializer DEBUG Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b83f8050> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv DEBUG input_tensor: Tensor("add_23/add:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv_64_0
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("add_23/add:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_64_0_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv DEBUG input_tensor: Tensor("pixel_shuffler_1/Reshape_1:0", shape=(?, 64, 64, 32), dtype=float32), filters: 128, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv_64_1
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("pixel_shuffler_1/Reshape_1:0", shape=(?, 64, 64, 32), dtype=float32), filters: 128, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_64_1_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv DEBUG input_tensor: Tensor("pixel_shuffler_2/Reshape_1:0", shape=(?, 64, 64, 32), dtype=float32), filters: 128, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv_64_2
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("pixel_shuffler_2/Reshape_1:0", shape=(?, 64, 64, 32), dtype=float32), filters: 128, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_64_2_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv_sep DEBUG input_tensor: Tensor("conv_64_2_leakyrelu/LeakyRelu:0", shape=(?, 32, 32, 128), dtype=float32), filters: 256, kernel_size: 5, strides: 2, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: separableconv2d_32_0
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv DEBUG input_tensor: Tensor("separableconv2d_32_0_relu/Relu:0", shape=(?, 16, 16, 256), dtype=float32), filters: 512, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: conv_16_0
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("separableconv2d_32_0_relu/Relu:0", shape=(?, 16, 16, 256), dtype=float32), filters: 512, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_16_0_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv_sep DEBUG input_tensor: Tensor("conv_16_0_leakyrelu/LeakyRelu:0", shape=(?, 8, 8, 512), dtype=float32), filters: 1024, kernel_size: 5, strides: 2, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: separableconv2d_8_0
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks upscale DEBUG input_tensor: Tensor("reshape_1/Reshape:0", shape=(?, 8, 8, 1024), dtype=float32), filters: 512, kernel_size: 3, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _get_name DEBUG Generating block name: upscale_8_0
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks conv2d DEBUG input_tensor: Tensor("reshape_1/Reshape:0", shape=(?, 8, 8, 1024), dtype=float32), filters: 2048, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_8_0_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess _training_0 nn_blocks _set_default_initializer DEBUG Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess _training_0 _base add_network DEBUG network_type: 'encoder', side: 'None', network: '<keras.engine.training.Model object at 0x7f40b8388d90>', is_output: False
05/21/2020 07:28:25 MainProcess _training_0 _base name DEBUG model name: 'villain'
05/21/2020 07:28:25 MainProcess _training_0 _base add_network DEBUG name: 'encoder', filename: 'villain_encoder.h5'
05/21/2020 07:28:25 MainProcess _training_0 _base __init__ DEBUG Initializing NNMeta: (filename: '/home/ubuntu/myfolder/faceswap-project/models/face1face2/villain_encoder.h5', network_type: 'encoder', side: 'None', network: <keras.engine.training.Model object at 0x7f40b8388d90>, is_output: False
05/21/2020 07:28:26 MainProcess _training_0 _base __init__ DEBUG Initialized NNMeta
05/21/2020 07:28:26 MainProcess _training_0 original add_networks DEBUG Added networks
05/21/2020 07:28:26 MainProcess _training_0 _base load_models DEBUG Load model: (swapped: False)
05/21/2020 07:28:26 MainProcess _training_0 _base models_exist DEBUG Pre-existing models exist: True
05/21/2020 07:28:26 MainProcess _training_0 _base models_exist DEBUG Pre-existing models exist: True
05/21/2020 07:28:26 MainProcess _training_0 module_wrapper _tfmw_add_deprecation_warning DEBUG From /home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.\n
05/21/2020 07:28:26 MainProcess _training_0 module_wrapper _tfmw_add_deprecation_warning DEBUG From /home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:98: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.\n
05/21/2020 07:28:26 MainProcess _training_0 _base map_models DEBUG Map models: (swapped: False)
05/21/2020 07:28:26 MainProcess _training_0 _base map_models DEBUG Mapped models: (models_map: {'a': {'decoder': '/home/ubuntu/myfolder/faceswap-project/models/face1face2/villain_decoder_A.h5'}, 'b': {'decoder': '/home/ubuntu/myfolder/faceswap-project/models/face1face2/villain_decoder_B.h5'}})
05/21/2020 07:28:26 MainProcess _training_0 _base load DEBUG Loading model: '/home/ubuntu/myfolder/faceswap-project/models/face1face2/villain_decoder_A.h5'
05/21/2020 07:28:27 MainProcess _training_0 multithreading run DEBUG Error in thread (_training_0): Invalid device ordinal value (1). Valid range is [0, 0].\n while setting up XLA_GPU_JIT device number 1
05/21/2020 07:28:28 MainProcess MainThread train _monitor DEBUG Thread error detected
05/21/2020 07:28:28 MainProcess MainThread train _monitor DEBUG Closed Monitor
05/21/2020 07:28:28 MainProcess MainThread train _end_thread DEBUG Ending Training thread
05/21/2020 07:28:28 MainProcess MainThread train _end_thread CRITICAL Error caught! Exiting...
05/21/2020 07:28:28 MainProcess MainThread multithreading join DEBUG Joining Threads: '_training'
05/21/2020 07:28:28 MainProcess MainThread multithreading join DEBUG Joining Thread: '_training_0'
05/21/2020 07:28:28 MainProcess MainThread multithreading join ERROR Caught exception in thread: '_training_0'
Traceback (most recent call last):
File "/home/ubuntu/faceswap/lib/cli/launcher.py", line 155, in execute_script
process.process()
File "/home/ubuntu/faceswap/scripts/train.py", line 161, in process
self._end_thread(thread, err)
File "/home/ubuntu/faceswap/scripts/train.py", line 201, in _end_thread
thread.join()
File "/home/ubuntu/faceswap/lib/multithreading.py", line 121, in join
raise thread.err[1].with_traceback(thread.err[2])
File "/home/ubuntu/faceswap/lib/multithreading.py", line 37, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/faceswap/scripts/train.py", line 226, in _training
raise err
File "/home/ubuntu/faceswap/scripts/train.py", line 214, in _training
model = self._load_model()
File "/home/ubuntu/faceswap/scripts/train.py", line 255, in _load_model
predict=False)
File "/home/ubuntu/faceswap/plugins/train/model/villain.py", line 25, in __init__
super().__init__(*args, **kwargs)
File "/home/ubuntu/faceswap/plugins/train/model/original.py", line 25, in __init__
super().__init__(*args, **kwargs)
File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 125, in __init__
self.build()
File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 244, in build
self.load_models(swapped=False)
File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 456, in load_models
is_loaded = network.load(fullpath=model_mapping[network.side][network.type])
File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 834, in load
network = load_model(self.filename, custom_objects=get_custom_objects())
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/engine/saving.py", line 419, in load_model
model = _deserialize_model(f, custom_objects, compile)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/engine/saving.py", line 287, in _deserialize_model
K.batch_set_value(weight_value_tuples)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2470, in batch_set_value
get_session().run(assign_ops, feed_dict=feed_dict)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 186, in get_session
_SESSION = tf.Session(config=config)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1585, in __init__
super(Session, self).__init__(target, graph, config=config)
File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 699, in __init__
self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid device ordinal value (1). Valid range is [0, 0].
while setting up XLA_GPU_JIT device number 1
============ System Information ============
encoding: UTF-8
git_branch: master
git_commits: ac40b0f Remove subpixel upscaling option (#1024)
gpu_cuda: 10.0
gpu_cudnn: 7.6.5
gpu_devices: GPU_0: Tesla K80, GPU_1: Tesla K80, GPU_2: Tesla K80, GPU_3: Tesla K80, GPU_4: Tesla K80, GPU_5: Tesla K80, GPU_6: Tesla K80, GPU_7: Tesla K80
gpu_devices_active: GPU_0, GPU_1, GPU_2, GPU_3, GPU_4, GPU_5, GPU_6, GPU_7
gpu_driver: 440.33.01
gpu_vram: GPU_0: 11441MB, GPU_1: 11441MB, GPU_2: 11441MB, GPU_3: 11441MB, GPU_4: 11441MB, GPU_5: 11441MB, GPU_6: 11441MB, GPU_7: 11441MB
os_machine: x86_64
os_platform: Linux-5.3.0-1017-aws-x86_64-with-debian-buster-sid
os_release: 5.3.0-1017-aws
py_command: /home/ubuntu/faceswap/faceswap.py train -A /home/ubuntu/myfolder/faceswap-project/face1/output -ala /home/ubuntu/myfolder/faceswap-project/face1/face1.fsa -B /home/ubuntu/myfolder/faceswap-project/face2/output -alb /home/ubuntu/myfolder/faceswap-project/face2/face2.fsa -m /home/ubuntu/myfolder/faceswap-project/models/face1face2 -t villain -bs 100 -it 1000000 -g 1 -s 50 -ss 25000 -ps 50 -ag -wl -L INFO -w
py_conda_version: conda 4.8.3
py_implementation: CPython
py_version: 3.7.7
py_virtual_env: True
sys_cores: 32
sys_processor: x86_64
sys_ram: Total: 491594MB, Available: 485096MB, Used: 1709MB, Free: 481857MB
=============== Pip Packages ===============
absl-py==0.9.0
astor==0.8.0
certifi==2020.4.5.1
cloudpickle==1.4.1
cycler==0.10.0
cytoolz==0.10.1
dask==2.16.0
decorator==4.4.2
fastcluster==1.1.26
ffmpy==0.2.2
gast==0.2.2
google-pasta==0.2.0
grpcio==1.27.2
h5py==2.9.0
imageio==2.6.1
imageio-ffmpeg==0.4.2
joblib==0.14.1
Keras==2.2.4
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.2.0
Markdown==3.1.1
matplotlib==3.1.3
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
networkx==2.4
numpy==1.17.4
nvidia-ml-py3==7.352.1
olefile==0.46
opencv-python==4.1.2.30
opt-einsum==3.1.0
pathlib==1.0.1
Pillow==6.2.1
protobuf==3.11.4
psutil==5.7.0
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
PyWavelets==1.1.1
PyYAML==5.3.1
scikit-image==0.16.2
scikit-learn==0.22.1
scipy==1.4.1
six==1.14.0
tensorboard==1.15.0
tensorflow==1.15.0
tensorflow-estimator==1.15.1
termcolor==1.1.0
toolz==0.10.0
toposort==1.5
tornado==6.0.4
tqdm==4.46.0
webencodings==0.5.1
Werkzeug==0.16.1
wrapt==1.12.1
============== Conda Packages ==============
# packages in environment at /home/ubuntu/anaconda3/envs/faceswap:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
_tflow_select 2.1.0 gpu
absl-py 0.9.0 py37_0
astor 0.8.0 py37_0
blas 1.0 mkl
bzip2 1.0.8 h516909a_2 conda-forge
c-ares 1.15.0 h7b6447c_1001
ca-certificates 2020.1.1 0
certifi 2020.4.5.1 py37_0
cloudpickle 1.4.1 py_0
cudatoolkit 10.0.130 0
cudnn 7.6.5 cuda10.0_0
cupti 10.0.130 0
cycler 0.10.0 py37_0
cytoolz 0.10.1 py37h7b6447c_0
dask-core 2.16.0 py_0
dbus 1.13.14 hb2f20db_0
decorator 4.4.2 py_0
expat 2.2.6 he6710b0_0
fastcluster 1.1.26 py37hb3f55d8_0 conda-forge
ffmpeg 4.2 h167e202_0 conda-forge
ffmpy 0.2.2 pypi_0 pypi
fontconfig 2.13.0 h9420a91_0
freetype 2.9.1 h8a8886c_1
gast 0.2.2 py37_0
git 2.23.0 pl526hacde149_0
glib 2.63.1 h3eb4bd4_1
gmp 6.2.0 he1b5a44_2 conda-forge
gnutls 3.6.5 hd3a4fd2_1002 conda-forge
google-pasta 0.2.0 py_0
grpcio 1.27.2 py37hf8bcb03_0
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb31296c_0
h5py 2.9.0 py37h7918eee_0
hdf5 1.10.4 hb1b8bf9_0
icu 58.2 he6710b0_3
imageio 2.6.1 py37_0
imageio-ffmpeg 0.4.2 py_0 conda-forge
intel-openmp 2020.1 217
joblib 0.14.1 py_0
jpeg 9b h024ee3a_2
keras 2.2.4 0
keras-applications 1.0.8 py_0
keras-base 2.2.4 py37_0
keras-preprocessing 1.1.0 py_1
kiwisolver 1.2.0 py37hfd86e86_0
krb5 1.17.1 h173b8e3_0
lame 3.100 h14c3975_1001 conda-forge
ld_impl_linux-64 2.33.1 h53a641e_7
libcurl 7.69.1 h20c2e04_0
libedit 3.1.20181209 hc058e9b_0
libffi 3.3 he6710b0_1
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libiconv 1.15 h516909a_1006 conda-forge
libpng 1.6.37 hbc83047_0
libprotobuf 3.11.4 hd408876_0
libssh2 1.9.0 h1ba5d50_1
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.1.0 h2733197_0
libuuid 1.0.3 h1bed415_2
libxcb 1.13 h1bed415_1
libxml2 2.9.9 hea5a465_1
markdown 3.1.1 py37_0
matplotlib 3.1.1 py37h5429711_0
matplotlib-base 3.1.3 py37hef1b27d_0
mkl 2020.1 217
mkl-service 2.3.0 py37he904b0f_0
mkl_fft 1.0.15 py37ha843d7b_0
mkl_random 1.1.0 py37hd6b4f25_0
ncurses 6.2 he6710b0_1
nettle 3.4.1 h1bed415_1002 conda-forge
networkx 2.4 py_0
numpy 1.17.4 py37hc1035e2_0
numpy-base 1.17.4 py37hde5b4d6_0
nvidia-ml-py3 7.352.1 pypi_0 pypi
olefile 0.46 py37_0
opencv-python 4.1.2.30 pypi_0 pypi
openh264 1.8.0 hdbcaa40_1000 conda-forge
openssl 1.1.1g h7b6447c_0
opt_einsum 3.1.0 py_0
pathlib 1.0.1 py37_1
pcre 8.43 he6710b0_0
perl 5.26.2 h14c3975_0
pillow 6.2.1 py37h34e0f95_0
pip 20.0.2 py37_3
protobuf 3.11.4 py37he6710b0_0
psutil 5.7.0 py37h7b6447c_0
pyparsing 2.4.7 py_0
pyqt 5.9.2 py37h05f1152_2
python 3.7.7 hcff3b4d_5
python-dateutil 2.8.1 py_0
python_abi 3.7 1_cp37m conda-forge
pytz 2020.1 py_0
pywavelets 1.1.1 py37h7b6447c_0
pyyaml 5.3.1 py37h7b6447c_0
qt 5.9.7 h5867ecd_1
readline 8.0 h7b6447c_0
scikit-image 0.16.2 py37h0573a6f_0
scikit-learn 0.22.1 py37hd81dba3_0
scipy 1.4.1 py37h0b6359f_0
setuptools 46.4.0 py37_0
sip 4.19.8 py37hf484d3e_0
six 1.14.0 py37_0
sqlite 3.31.1 h62c20be_1
tensorboard 1.15.0 pyhb230dea_0
tensorflow 1.15.0 gpu_py37h0f0df58_0
tensorflow-base 1.15.0 gpu_py37h9dcbed7_0
tensorflow-estimator 1.15.1 pyh2649769_0
tensorflow-gpu 1.15.0 h0d30ee6_0
termcolor 1.1.0 py37_1
tk 8.6.8 hbc83047_0
toolz 0.10.0 py_0
toposort 1.5 py_3 conda-forge
tornado 6.0.4 py37h7b6447c_1
tqdm 4.46.0 py_0
webencodings 0.5.1 py37_1
werkzeug 0.16.1 py_0
wheel 0.34.2 py37_0
wrapt 1.12.1 py37h7b6447c_1
x264 1!152.20180806 h14c3975_0 conda-forge
xz 5.2.5 h7b6447c_0
yaml 0.1.7 had09818_2
zlib 1.2.11 h7b6447c_3
zstd 1.3.7 h0b5b093_0
=============== State File =================
{
  "name": "villain",
  "sessions": {
    "1": {
      "timestamp": 1589747179.0678897,
      "no_logs": false,
      "pingpong": false,
      "loss_names": {
        "a": [
          "face_loss"
        ],
        "b": [
          "face_loss"
        ]
      },
      "batchsize": 32,
      "iterations": 617,
      "config": {
        "learning_rate": 5e-05
      }
    },
    "2": {
      "timestamp": 1589752564.8719282,
      "no_logs": false,
      "pingpong": false,
      "loss_names": {
        "a": [
          "face_loss"
        ],
        "b": [
          "face_loss"
        ]
      },
      "batchsize": 32,
      "iterations": 15701,
      "config": {
        "learning_rate": 5e-05
      }
    },
    "3": {
      "timestamp": 1589915467.228661,
      "no_logs": false,
      "pingpong": false,
      "loss_names": {
        "a": [
          "face_loss"
        ],
        "b": [
          "face_loss"
        ]
      },
      "batchsize": 32,
      "iterations": 8451,
      "config": {
        "learning_rate": 5e-05
      }
    }
  },
  "lowest_avg_loss": {
    "a": 0.011524430494755506,
    "b": 0.013505328968167305
  },
  "iterations": 24769,
  "inputs": {
    "face_in:0": [
      128,
      128,
      3
    ],
    "mask_in:0": [
      128,
      128,
      1
    ]
  },
  "training_size": 256,
  "config": {
    "coverage": 100.0,
    "mask_type": "vgg-clear",
    "mask_blur_kernel": 3,
    "mask_threshold": 4,
    "learn_mask": false,
    "icnr_init": false,
    "conv_aware_init": false,
    "reflect_padding": false,
    "penalized_mask_loss": true,
    "loss_function": "mae",
    "learning_rate": 5e-05,
    "lowmem": false
  }
}
================= Configs ==================
--------- convert.ini ---------
[mask.mask_blend]
type: normalized
kernel_size: 3
passes: 4
threshold: 4
erosion: 0.0
[mask.box_blend]
type: gaussian
distance: 11.0
radius: 5.0
passes: 1
[color.color_transfer]
clip: True
preserve_paper: True
[color.manual_balance]
colorspace: HSV
balance_1: 0.0
balance_2: 0.0
balance_3: 0.0
contrast: 0.0
brightness: 0.0
[color.match_hist]
threshold: 99.0
[scaling.sharpen]
method: unsharp_mask
amount: 150
radius: 0.3
threshold: 5.0
[writer.ffmpeg]
container: mp4
codec: libx264
crf: 23
preset: medium
tune: none
profile: auto
level: auto
[writer.gif]
fps: 25
loop: 0
palettesize: 256
subrectangles: False
[writer.opencv]
format: png
draw_transparent: False
jpg_quality: 75
png_compress_level: 3
[writer.pillow]
format: png
draw_transparent: False
optimize: False
gif_interlace: True
jpg_quality: 75
png_compress_level: 3
tif_compression: tiff_deflate
--------- .faceswap ---------
backend: nvidia
--------- extract.ini ---------
[global]
allow_growth: False
[mask.vgg_obstructed]
batch-size: 2
[mask.vgg_clear]
batch-size: 6
[mask.unet_dfl]
batch-size: 8
[align.fan]
batch-size: 12
[detect.mtcnn]
minsize: 20
threshold_1: 0.6
threshold_2: 0.7
threshold_3: 0.7
scalefactor: 0.709
batch-size: 8
[detect.s3fd]
confidence: 70
batch-size: 4
[detect.cv2_dnn]
confidence: 50
--------- train.ini ---------
[global]
coverage: 100
mask_type: vgg-clear
mask_blur_kernel: 3
mask_threshold: 4
learn_mask: True
icnr_init: False
conv_aware_init: False
reflect_padding: False
penalized_mask_loss: True
loss_function: mae
learning_rate: 5e-05
[trainer.original]
preview_images: 14
zoom_amount: 5
rotation_range: 10
shift_range: 5
flip_chance: 50
color_lightness: 30
color_ab: 8
color_clahe_chance: 50
color_clahe_max_size: 4
[model.dfl_sae]
input_size: 128
clipnorm: True
architecture: df
autoencoder_dims: 0
encoder_dims: 42
decoder_dims: 21
multiscale_decoder: False
[model.dfl_h128]
lowmem: False
[model.realface]
input_size: 64
output_size: 128
dense_nodes: 1536
complexity_encoder: 128
complexity_decoder: 512
[model.villain]
lowmem: False
[model.original]
lowmem: False
[model.unbalanced]
input_size: 128
lowmem: False
clipnorm: True
nodes: 1024
complexity_encoder: 128
complexity_decoder_a: 384
complexity_decoder_b: 512
[model.dlight]
features: best
details: good
output_size: 256