Invalid device ordinal value (1). Valid range is [0, 0]

Want to use Faceswap in The Cloud? This is not directly supported by the Devs, but you may find community support here


koroep
Posts: 3
Joined: Sun May 17, 2020 10:39 am

Post by koroep »

I tried to continue training my first model on multiple GPUs on an AWS p2.8xlarge instance. The same setup ran without trouble on a single-GPU p2.xlarge instance. Turning the -o and -msg flags on and off and changing the batch size made no difference.

(faceswap) ubuntu@ip-172-31-47-38:~/faceswap$ python /home/ubuntu/faceswap/faceswap.py train \
-A /home/ubuntu/myfolder/faceswap-project/face1/output \
-ala /home/ubuntu/myfolder/faceswap-project/face1/face1.fsa \
-B /home/ubuntu/myfolder/faceswap-project/face2/output \
-alb /home/ubuntu/myfolder/faceswap-project/face2/face2.fsa \
-m /home/ubuntu/myfolder/faceswap-project/models/face1face2 \
-t villain -bs 100 -it 1000000 -g 1 -s 50 -ss 25000 -ps 50 -ag -wl -L INFO -w
Setting Faceswap backend to NVIDIA
05/21/2020 07:28:20 INFO     Log level set to: INFO
Using TensorFlow backend.
05/21/2020 07:28:22 INFO     Model A Directory: /home/ubuntu/myfolder/faceswap-project/face1/output
05/21/2020 07:28:22 INFO     Model B Directory: /home/ubuntu/myfolder/faceswap-project/face2/output
05/21/2020 07:28:22 INFO     Training data directory: /home/ubuntu/myfolder/faceswap-project/models/face1face2
05/21/2020 07:28:22 WARNING  `-wl`, ``--warp-to-landmarks``  has been deprecated and will be removed from a future update. This option will be available within training config settings (/config/train.ini).
05/21/2020 07:28:22 INFO     ===================================================
05/21/2020 07:28:22 INFO       Starting
05/21/2020 07:28:22 INFO       Press 'ENTER' to save and quit
05/21/2020 07:28:22 INFO       Press 'S' to save model weights immediately
05/21/2020 07:28:22 INFO     ===================================================
05/21/2020 07:28:23 INFO     Loading data, this may take a while...
05/21/2020 07:28:23 INFO     Loading Model from Villain plugin...
05/21/2020 07:28:23 INFO     Using configuration saved in state file
05/21/2020 07:28:28 CRITICAL Error caught! Exiting...
05/21/2020 07:28:28 ERROR    Caught exception in thread: '_training_0'
05/21/2020 07:28:30 ERROR    Got Exception on main handler:
Traceback (most recent call last):
  File "/home/ubuntu/faceswap/lib/cli/launcher.py", line 155, in execute_script
    process.process()
  File "/home/ubuntu/faceswap/scripts/train.py", line 161, in process
    self._end_thread(thread, err)
  File "/home/ubuntu/faceswap/scripts/train.py", line 201, in _end_thread
    thread.join()
  File "/home/ubuntu/faceswap/lib/multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "/home/ubuntu/faceswap/lib/multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/faceswap/scripts/train.py", line 226, in _training
    raise err
  File "/home/ubuntu/faceswap/scripts/train.py", line 214, in _training
    model = self._load_model()
  File "/home/ubuntu/faceswap/scripts/train.py", line 255, in _load_model
    predict=False)
  File "/home/ubuntu/faceswap/plugins/train/model/villain.py", line 25, in __init__
    super().__init__(*args, **kwargs)
  File "/home/ubuntu/faceswap/plugins/train/model/original.py", line 25, in __init__
    super().__init__(*args, **kwargs)
  File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 125, in __init__
    self.build()
  File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 244, in build
    self.load_models(swapped=False)
  File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 456, in load_models
    is_loaded = network.load(fullpath=model_mapping[network.side][network.type])
  File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 834, in load
    network = load_model(self.filename, custom_objects=get_custom_objects())
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/engine/saving.py", line 287, in _deserialize_model
    K.batch_set_value(weight_value_tuples)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2470, in batch_set_value
    get_session().run(assign_ops, feed_dict=feed_dict)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 186, in get_session
    _SESSION = tf.Session(config=config)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1585, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 699, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid device ordinal value (1). Valid range is [0, 0].
        while setting up XLA_GPU_JIT device number 1
05/21/2020 07:28:30 CRITICAL An unexpected crash has occurred. Crash report written to '/home/ubuntu/faceswap/crash_report.2020.05.21.072828229340.log'. You MUST provide this file if seeking assistance. Please verify you are running the latest version of faceswap before reporting
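
For anyone comparing this output against another run, the two numbers that matter in the final error line are the requested device ordinal and the range TensorFlow reports as valid. The following is a minimal sketch that extracts them from a log line. `parse_ordinal_error` and `OrdinalError` are hypothetical names made up for illustration, not part of faceswap or TensorFlow, and the device count assumes the reported range is inclusive.

```python
import re
from typing import NamedTuple, Optional


class OrdinalError(NamedTuple):
    """Numbers reported by an 'Invalid device ordinal value' message."""
    requested: int  # ordinal that was asked for
    low: int        # lowest ordinal reported as valid
    high: int       # highest ordinal reported as valid

    @property
    def valid_device_count(self) -> int:
        # Assumption: the reported range is inclusive at both ends.
        return self.high - self.low + 1


# Matches the message text exactly as it appears in the output above.
_PATTERN = re.compile(
    r"Invalid device ordinal value \((\d+)\)\. Valid range is \[(\d+), (\d+)\]"
)


def parse_ordinal_error(line: str) -> Optional[OrdinalError]:
    """Return the parsed numbers, or None if the line is not this error."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    requested, low, high = (int(group) for group in match.groups())
    return OrdinalError(requested, low, high)


if __name__ == "__main__":
    message = (
        "tensorflow.python.framework.errors_impl.InvalidArgumentError: "
        "Invalid device ordinal value (1). Valid range is [0, 0]."
    )
    error = parse_ordinal_error(message)
    if error is not None:
        print("requested ordinal:", error.requested)
        print("valid devices:", error.valid_device_count)
```

Under that reading, the message above says ordinal 1 was requested while only ordinal 0 was considered valid, even though the system information in the crash report lists eight active GPUs.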

The crash log:

05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b8474110>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_12_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b8474110>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv2d_64_12
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b8474110>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b8474110> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       res_block                 DEBUG    input_tensor: Tensor("residual_64_12_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: residual_64_13
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_13_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_13_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b848d050>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_13_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b848d050>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv2d_64_13
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b848d050>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b848d050> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       res_block                 DEBUG    input_tensor: Tensor("residual_64_13_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: residual_64_14
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_14_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_14_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b84a8050>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_14_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b84a8050>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv2d_64_14
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b84a8050>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b84a8050> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       res_block                 DEBUG    input_tensor: Tensor("residual_64_14_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: residual_64_15
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_15_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_15_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b84420d0>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_15_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b84420d0>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv2d_64_15
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b84420d0>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b84420d0> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       res_block                 DEBUG    input_tensor: Tensor("residual_64_15_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: residual_64_16
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_16_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_16_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b845b0d0>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_16_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b845b0d0>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv2d_64_16
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b845b0d0>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b845b0d0> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       res_block                 DEBUG    input_tensor: Tensor("residual_64_16_leakyrelu_3/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: residual_64_17
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_17_leakyrelu_0/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'residual_64_17_conv2d_0', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.RandomNormal object at 0x7f4148532150> to <keras.initializers.VarianceScaling object at 0x7f40b83f8050>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("residual_64_17_leakyrelu_1/LeakyRelu:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f40b83f8050>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv2d_64_17
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f40b83f8050>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _switch_kernel_initializer DEBUG    Switched kernel_initializer from <keras.initializers.VarianceScaling object at 0x7f40b83f8050> to <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv                      DEBUG    input_tensor: Tensor("add_23/add:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv_64_0
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("add_23/add:0", shape=(?, 64, 64, 128), dtype=float32), filters: 128, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_64_0_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv                      DEBUG    input_tensor: Tensor("pixel_shuffler_1/Reshape_1:0", shape=(?, 64, 64, 32), dtype=float32), filters: 128, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv_64_1
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("pixel_shuffler_1/Reshape_1:0", shape=(?, 64, 64, 32), dtype=float32), filters: 128, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_64_1_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv                      DEBUG    input_tensor: Tensor("pixel_shuffler_2/Reshape_1:0", shape=(?, 64, 64, 32), dtype=float32), filters: 128, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv_64_2
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("pixel_shuffler_2/Reshape_1:0", shape=(?, 64, 64, 32), dtype=float32), filters: 128, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_64_2_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv_sep                  DEBUG    input_tensor: Tensor("conv_64_2_leakyrelu/LeakyRelu:0", shape=(?, 32, 32, 128), dtype=float32), filters: 256, kernel_size: 5, strides: 2, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: separableconv2d_32_0
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv                      DEBUG    input_tensor: Tensor("separableconv2d_32_0_relu/Relu:0", shape=(?, 16, 16, 256), dtype=float32), filters: 512, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: conv_16_0
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("separableconv2d_32_0_relu/Relu:0", shape=(?, 16, 16, 256), dtype=float32), filters: 512, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_16_0_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv_sep                  DEBUG    input_tensor: Tensor("conv_16_0_leakyrelu/LeakyRelu:0", shape=(?, 8, 8, 512), dtype=float32), filters: 1024, kernel_size: 5, strides: 2, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: separableconv2d_8_0
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       upscale                   DEBUG    input_tensor: Tensor("reshape_1/Reshape:0", shape=(?, 8, 8, 1024), dtype=float32), filters: 512, kernel_size: 3, use_instance_norm: False, kwargs: {'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _get_name                 DEBUG    Generating block name: upscale_8_0
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       conv2d                    DEBUG    input_tensor: Tensor("reshape_1/Reshape:0", shape=(?, 8, 8, 1024), dtype=float32), filters: 2048, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_8_0_conv2d', 'kernel_initializer': <keras.initializers.RandomNormal object at 0x7f4148532150>})
05/21/2020 07:28:25 MainProcess     _training_0     nn_blocks       _set_default_initializer  DEBUG    Using model specified initializer: <keras.initializers.RandomNormal object at 0x7f4148532150>
05/21/2020 07:28:25 MainProcess     _training_0     _base           add_network               DEBUG    network_type: 'encoder', side: 'None', network: '<keras.engine.training.Model object at 0x7f40b8388d90>', is_output: False
05/21/2020 07:28:25 MainProcess     _training_0     _base           name                      DEBUG    model name: 'villain'
05/21/2020 07:28:25 MainProcess     _training_0     _base           add_network               DEBUG    name: 'encoder', filename: 'villain_encoder.h5'
05/21/2020 07:28:25 MainProcess     _training_0     _base           __init__                  DEBUG    Initializing NNMeta: (filename: '/home/ubuntu/myfolder/faceswap-project/models/face1face2/villain_encoder.h5', network_type: 'encoder', side: 'None', network: <keras.engine.training.Model object at 0x7f40b8388d90>, is_output: False
05/21/2020 07:28:26 MainProcess     _training_0     _base           __init__                  DEBUG    Initialized NNMeta
05/21/2020 07:28:26 MainProcess     _training_0     original        add_networks              DEBUG    Added networks
05/21/2020 07:28:26 MainProcess     _training_0     _base           load_models               DEBUG    Load model: (swapped: False)
05/21/2020 07:28:26 MainProcess     _training_0     _base           models_exist              DEBUG    Pre-existing models exist: True
05/21/2020 07:28:26 MainProcess     _training_0     _base           models_exist              DEBUG    Pre-existing models exist: True
05/21/2020 07:28:26 MainProcess     _training_0     module_wrapper  _tfmw_add_deprecation_warning DEBUG    From /home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.\n
05/21/2020 07:28:26 MainProcess     _training_0     module_wrapper  _tfmw_add_deprecation_warning DEBUG    From /home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:98: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.\n
05/21/2020 07:28:26 MainProcess     _training_0     _base           map_models                DEBUG    Map models: (swapped: False)
05/21/2020 07:28:26 MainProcess     _training_0     _base           map_models                DEBUG    Mapped models: (models_map: {'a': {'decoder': '/home/ubuntu/myfolder/faceswap-project/models/face1face2/villain_decoder_A.h5'}, 'b': {'decoder': '/home/ubuntu/myfolder/faceswap-project/models/face1face2/villain_decoder_B.h5'}})
05/21/2020 07:28:26 MainProcess     _training_0     _base           load                      DEBUG    Loading model: '/home/ubuntu/myfolder/faceswap-project/models/face1face2/villain_decoder_A.h5'
05/21/2020 07:28:27 MainProcess     _training_0     multithreading  run                       DEBUG    Error in thread (_training_0): Invalid device ordinal value (1). Valid range is [0, 0].\n	while setting up XLA_GPU_JIT device number 1
05/21/2020 07:28:28 MainProcess     MainThread      train           _monitor                  DEBUG    Thread error detected
05/21/2020 07:28:28 MainProcess     MainThread      train           _monitor                  DEBUG    Closed Monitor
05/21/2020 07:28:28 MainProcess     MainThread      train           _end_thread               DEBUG    Ending Training thread
05/21/2020 07:28:28 MainProcess     MainThread      train           _end_thread               CRITICAL Error caught! Exiting...
05/21/2020 07:28:28 MainProcess     MainThread      multithreading  join                      DEBUG    Joining Threads: '_training'
05/21/2020 07:28:28 MainProcess     MainThread      multithreading  join                      DEBUG    Joining Thread: '_training_0'
05/21/2020 07:28:28 MainProcess     MainThread      multithreading  join                      ERROR    Caught exception in thread: '_training_0'
Traceback (most recent call last):
  File "/home/ubuntu/faceswap/lib/cli/launcher.py", line 155, in execute_script
    process.process()
  File "/home/ubuntu/faceswap/scripts/train.py", line 161, in process
    self._end_thread(thread, err)
  File "/home/ubuntu/faceswap/scripts/train.py", line 201, in _end_thread
    thread.join()
  File "/home/ubuntu/faceswap/lib/multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "/home/ubuntu/faceswap/lib/multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/faceswap/scripts/train.py", line 226, in _training
    raise err
  File "/home/ubuntu/faceswap/scripts/train.py", line 214, in _training
    model = self._load_model()
  File "/home/ubuntu/faceswap/scripts/train.py", line 255, in _load_model
    predict=False)
  File "/home/ubuntu/faceswap/plugins/train/model/villain.py", line 25, in __init__
    super().__init__(*args, **kwargs)
  File "/home/ubuntu/faceswap/plugins/train/model/original.py", line 25, in __init__
    super().__init__(*args, **kwargs)
  File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 125, in __init__
    self.build()
  File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 244, in build
    self.load_models(swapped=False)
  File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 456, in load_models
    is_loaded = network.load(fullpath=model_mapping[network.side][network.type])
  File "/home/ubuntu/faceswap/plugins/train/model/_base.py", line 834, in load
    network = load_model(self.filename, custom_objects=get_custom_objects())
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/engine/saving.py", line 287, in _deserialize_model
    K.batch_set_value(weight_value_tuples)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2470, in batch_set_value
    get_session().run(assign_ops, feed_dict=feed_dict)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 186, in get_session
    _SESSION = tf.Session(config=config)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1585, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/ubuntu/anaconda3/envs/faceswap/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 699, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid device ordinal value (1). Valid range is [0, 0].
	while setting up XLA_GPU_JIT device number 1

============ System Information ============
encoding:            UTF-8
git_branch:          master
git_commits:         ac40b0f Remove subpixel upscaling option (#1024)
gpu_cuda:            10.0
gpu_cudnn:           7.6.5
gpu_devices:         GPU_0: Tesla K80, GPU_1: Tesla K80, GPU_2: Tesla K80, GPU_3: Tesla K80, GPU_4: Tesla K80, GPU_5: Tesla K80, GPU_6: Tesla K80, GPU_7: Tesla K80
gpu_devices_active:  GPU_0, GPU_1, GPU_2, GPU_3, GPU_4, GPU_5, GPU_6, GPU_7
gpu_driver:          440.33.01
gpu_vram:            GPU_0: 11441MB, GPU_1: 11441MB, GPU_2: 11441MB, GPU_3: 11441MB, GPU_4: 11441MB, GPU_5: 11441MB, GPU_6: 11441MB, GPU_7: 11441MB
os_machine:          x86_64
os_platform:         Linux-5.3.0-1017-aws-x86_64-with-debian-buster-sid
os_release:          5.3.0-1017-aws
py_command:          /home/ubuntu/faceswap/faceswap.py train -A /home/ubuntu/myfolder/faceswap-project/face1/output -ala /home/ubuntu/myfolder/faceswap-project/face1/face1.fsa -B /home/ubuntu/myfolder/faceswap-project/face2/output -alb /home/ubuntu/myfolder/faceswap-project/face2/face2.fsa -m /home/ubuntu/myfolder/faceswap-project/models/face1face2 -t villain -bs 100 -it 1000000 -g 1 -s 50 -ss 25000 -ps 50 -ag -wl -L INFO -w
py_conda_version:    conda 4.8.3
py_implementation:   CPython
py_version:          3.7.7
py_virtual_env:      True
sys_cores:           32
sys_processor:       x86_64
sys_ram:             Total: 491594MB, Available: 485096MB, Used: 1709MB, Free: 481857MB

=============== Pip Packages ===============
absl-py==0.9.0
astor==0.8.0
certifi==2020.4.5.1
cloudpickle==1.4.1
cycler==0.10.0
cytoolz==0.10.1
dask==2.16.0
decorator==4.4.2
fastcluster==1.1.26
ffmpy==0.2.2
gast==0.2.2
google-pasta==0.2.0
grpcio==1.27.2
h5py==2.9.0
imageio==2.6.1
imageio-ffmpeg==0.4.2
joblib==0.14.1
Keras==2.2.4
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.2.0
Markdown==3.1.1
matplotlib==3.1.3
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
networkx==2.4
numpy==1.17.4
nvidia-ml-py3==7.352.1
olefile==0.46
opencv-python==4.1.2.30
opt-einsum==3.1.0
pathlib==1.0.1
Pillow==6.2.1
protobuf==3.11.4
psutil==5.7.0
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
PyWavelets==1.1.1
PyYAML==5.3.1
scikit-image==0.16.2
scikit-learn==0.22.1
scipy==1.4.1
six==1.14.0
tensorboard==1.15.0
tensorflow==1.15.0
tensorflow-estimator==1.15.1
termcolor==1.1.0
toolz==0.10.0
toposort==1.5
tornado==6.0.4
tqdm==4.46.0
webencodings==0.5.1
Werkzeug==0.16.1
wrapt==1.12.1

============== Conda Packages ==============
# packages in environment at /home/ubuntu/anaconda3/envs/faceswap:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_tflow_select 2.1.0 gpu
absl-py 0.9.0 py37_0
astor 0.8.0 py37_0
blas 1.0 mkl
bzip2 1.0.8 h516909a_2 conda-forge
c-ares 1.15.0 h7b6447c_1001
ca-certificates 2020.1.1 0
certifi 2020.4.5.1 py37_0
cloudpickle 1.4.1 py_0
cudatoolkit 10.0.130 0
cudnn 7.6.5 cuda10.0_0
cupti 10.0.130 0
cycler 0.10.0 py37_0
cytoolz 0.10.1 py37h7b6447c_0
dask-core 2.16.0 py_0
dbus 1.13.14 hb2f20db_0
decorator 4.4.2 py_0
expat 2.2.6 he6710b0_0
fastcluster 1.1.26 py37hb3f55d8_0 conda-forge
ffmpeg 4.2 h167e202_0 conda-forge
ffmpy 0.2.2 pypi_0 pypi
fontconfig 2.13.0 h9420a91_0
freetype 2.9.1 h8a8886c_1
gast 0.2.2 py37_0
git 2.23.0 pl526hacde149_0
glib 2.63.1 h3eb4bd4_1
gmp 6.2.0 he1b5a44_2 conda-forge
gnutls 3.6.5 hd3a4fd2_1002 conda-forge
google-pasta 0.2.0 py_0
grpcio 1.27.2 py37hf8bcb03_0
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb31296c_0
h5py 2.9.0 py37h7918eee_0
hdf5 1.10.4 hb1b8bf9_0
icu 58.2 he6710b0_3
imageio 2.6.1 py37_0
imageio-ffmpeg 0.4.2 py_0 conda-forge
intel-openmp 2020.1 217
joblib 0.14.1 py_0
jpeg 9b h024ee3a_2
keras 2.2.4 0
keras-applications 1.0.8 py_0
keras-base 2.2.4 py37_0
keras-preprocessing 1.1.0 py_1
kiwisolver 1.2.0 py37hfd86e86_0
krb5 1.17.1 h173b8e3_0
lame 3.100 h14c3975_1001 conda-forge
ld_impl_linux-64 2.33.1 h53a641e_7
libcurl 7.69.1 h20c2e04_0
libedit 3.1.20181209 hc058e9b_0
libffi 3.3 he6710b0_1
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libiconv 1.15 h516909a_1006 conda-forge
libpng 1.6.37 hbc83047_0
libprotobuf 3.11.4 hd408876_0
libssh2 1.9.0 h1ba5d50_1
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.1.0 h2733197_0
libuuid 1.0.3 h1bed415_2
libxcb 1.13 h1bed415_1
libxml2 2.9.9 hea5a465_1
markdown 3.1.1 py37_0
matplotlib 3.1.1 py37h5429711_0
matplotlib-base 3.1.3 py37hef1b27d_0
mkl 2020.1 217
mkl-service 2.3.0 py37he904b0f_0
mkl_fft 1.0.15 py37ha843d7b_0
mkl_random 1.1.0 py37hd6b4f25_0
ncurses 6.2 he6710b0_1
nettle 3.4.1 h1bed415_1002 conda-forge
networkx 2.4 py_0
numpy 1.17.4 py37hc1035e2_0
numpy-base 1.17.4 py37hde5b4d6_0
nvidia-ml-py3 7.352.1 pypi_0 pypi
olefile 0.46 py37_0
opencv-python 4.1.2.30 pypi_0 pypi
openh264 1.8.0 hdbcaa40_1000 conda-forge
openssl 1.1.1g h7b6447c_0
opt_einsum 3.1.0 py_0
pathlib 1.0.1 py37_1
pcre 8.43 he6710b0_0
perl 5.26.2 h14c3975_0
pillow 6.2.1 py37h34e0f95_0
pip 20.0.2 py37_3
protobuf 3.11.4 py37he6710b0_0
psutil 5.7.0 py37h7b6447c_0
pyparsing 2.4.7 py_0
pyqt 5.9.2 py37h05f1152_2
python 3.7.7 hcff3b4d_5
python-dateutil 2.8.1 py_0
python_abi 3.7 1_cp37m conda-forge
pytz 2020.1 py_0
pywavelets 1.1.1 py37h7b6447c_0
pyyaml 5.3.1 py37h7b6447c_0
qt 5.9.7 h5867ecd_1
readline 8.0 h7b6447c_0
scikit-image 0.16.2 py37h0573a6f_0
scikit-learn 0.22.1 py37hd81dba3_0
scipy 1.4.1 py37h0b6359f_0
setuptools 46.4.0 py37_0
sip 4.19.8 py37hf484d3e_0
six 1.14.0 py37_0
sqlite 3.31.1 h62c20be_1
tensorboard 1.15.0 pyhb230dea_0
tensorflow 1.15.0 gpu_py37h0f0df58_0
tensorflow-base 1.15.0 gpu_py37h9dcbed7_0
tensorflow-estimator 1.15.1 pyh2649769_0
tensorflow-gpu 1.15.0 h0d30ee6_0
termcolor 1.1.0 py37_1
tk 8.6.8 hbc83047_0
toolz 0.10.0 py_0
toposort 1.5 py_3 conda-forge
tornado 6.0.4 py37h7b6447c_1
tqdm 4.46.0 py_0
webencodings 0.5.1 py37_1
werkzeug 0.16.1 py_0
wheel 0.34.2 py37_0
wrapt 1.12.1 py37h7b6447c_1
x264 1!152.20180806 h14c3975_0 conda-forge
xz 5.2.5 h7b6447c_0
yaml 0.1.7 had09818_2
zlib 1.2.11 h7b6447c_3
zstd 1.3.7 h0b5b093_0

=============== State File =================
{
  "name": "villain",
  "sessions": {
    "1": {
      "timestamp": 1589747179.0678897,
      "no_logs": false,
      "pingpong": false,
      "loss_names": {
        "a": ["face_loss"],
        "b": ["face_loss"]
      },
      "batchsize": 32,
      "iterations": 617,
      "config": {
        "learning_rate": 5e-05
      }
    },
    "2": {
      "timestamp": 1589752564.8719282,
      "no_logs": false,
      "pingpong": false,
      "loss_names": {
        "a": ["face_loss"],
        "b": ["face_loss"]
      },
      "batchsize": 32,
      "iterations": 15701,
      "config": {
        "learning_rate": 5e-05
      }
    },
    "3": {
      "timestamp": 1589915467.228661,
      "no_logs": false,
      "pingpong": false,
      "loss_names": {
        "a": ["face_loss"],
        "b": ["face_loss"]
      },
      "batchsize": 32,
      "iterations": 8451,
      "config": {
        "learning_rate": 5e-05
      }
    }
  },
  "lowest_avg_loss": {
    "a": 0.011524430494755506,
    "b": 0.013505328968167305
  },
  "iterations": 24769,
  "inputs": {
    "face_in:0": [128, 128, 3],
    "mask_in:0": [128, 128, 1]
  },
  "training_size": 256,
  "config": {
    "coverage": 100.0,
    "mask_type": "vgg-clear",
    "mask_blur_kernel": 3,
    "mask_threshold": 4,
    "learn_mask": false,
    "icnr_init": false,
    "conv_aware_init": false,
    "reflect_padding": false,
    "penalized_mask_loss": true,
    "loss_function": "mae",
    "learning_rate": 5e-05,
    "lowmem": false
  }
}

================= Configs ==================
--------- convert.ini ---------

[mask.mask_blend]
type: normalized
kernel_size: 3
passes: 4
threshold: 4
erosion: 0.0

[mask.box_blend]
type: gaussian
distance: 11.0
radius: 5.0
passes: 1

[color.color_transfer]
clip: True
preserve_paper: True

[color.manual_balance]
colorspace: HSV
balance_1: 0.0
balance_2: 0.0
balance_3: 0.0
contrast: 0.0
brightness: 0.0

[color.match_hist]
threshold: 99.0

[scaling.sharpen]
method: unsharp_mask
amount: 150
radius: 0.3
threshold: 5.0

[writer.ffmpeg]
container: mp4
codec: libx264
crf: 23
preset: medium
tune: none
profile: auto
level: auto

[writer.gif]
fps: 25
loop: 0
palettesize: 256
subrectangles: False

[writer.opencv]
format: png
draw_transparent: False
jpg_quality: 75
png_compress_level: 3

[writer.pillow]
format: png
draw_transparent: False
optimize: False
gif_interlace: True
jpg_quality: 75
png_compress_level: 3
tif_compression: tiff_deflate

--------- .faceswap ---------
backend: nvidia

--------- extract.ini ---------

[global]
allow_growth: False

[mask.vgg_obstructed]
batch-size: 2

[mask.vgg_clear]
batch-size: 6

[mask.unet_dfl]
batch-size: 8

[align.fan]
batch-size: 12

[detect.mtcnn]
minsize: 20
threshold_1: 0.6
threshold_2: 0.7
threshold_3: 0.7
scalefactor: 0.709
batch-size: 8

[detect.s3fd]
confidence: 70
batch-size: 4

[detect.cv2_dnn]
confidence: 50

--------- train.ini ---------

[global]
coverage: 100
mask_type: vgg-clear
mask_blur_kernel: 3
mask_threshold: 4
learn_mask: True
icnr_init: False
conv_aware_init: False
reflect_padding: False
penalized_mask_loss: True
loss_function: mae
learning_rate: 5e-05

[trainer.original]
preview_images: 14
zoom_amount: 5
rotation_range: 10
shift_range: 5
flip_chance: 50
color_lightness: 30
color_ab: 8
color_clahe_chance: 50
color_clahe_max_size: 4

[model.dfl_sae]
input_size: 128
clipnorm: True
architecture: df
autoencoder_dims: 0
encoder_dims: 42
decoder_dims: 21
multiscale_decoder: False

[model.dfl_h128]
lowmem: False

[model.realface]
input_size: 64
output_size: 128
dense_nodes: 1536
complexity_encoder: 128
complexity_decoder: 512

[model.villain]
lowmem: False

[model.original]
lowmem: False

[model.unbalanced]
input_size: 128
lowmem: False
clipnorm: True
nodes: 1024
complexity_encoder: 128
complexity_decoder_a: 384
complexity_decoder_b: 512

[model.dlight]
features: best
details: good
output_size: 256

torzdf
Posts: 2636
Joined: Fri Jul 12, 2019 12:53 am
Answers: 156
Has thanked: 128 times
Been thanked: 614 times

Re: Invalid device ordinal value (1). Valid range is [0, 0]

Post by torzdf »

Without knowing the ins and outs of how AWS build their VM images, I'm not going to be able to diagnose this.

However, this is a TensorFlow issue rather than a Faceswap one, so googling around the error will hopefully find you a solution. You can start here:
https://github.com/tensorflow/tensorflow/issues/32793
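
One workaround often suggested for this class of error is to hide the unwanted GPUs from CUDA entirely, so that TensorFlow and its XLA devices all see the same device list. This is an assumption about the cause and is not confirmed anywhere in this thread. A minimal sketch follows. The helper name limit_visible_gpus is hypothetical and is not part of Faceswap or TensorFlow; only the CUDA_VISIBLE_DEVICES environment variable is a real CUDA setting.

```python
import os


def limit_visible_gpus(num_gpus):
    """Expose only the first `num_gpus` CUDA devices to this process.

    Hypothetical helper for illustration only; not a Faceswap or
    TensorFlow function.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Build a comma-separated list of device indices, e.g. "0" or "0,1".
    device_ids = ",".join(str(idx) for idx in range(num_gpus))
    # CUDA reads this variable when it initialises, so it has to be set
    # before TensorFlow is imported anywhere in the process.
    os.environ["CUDA_VISIBLE_DEVICES"] = device_ids
    return device_ids


if __name__ == "__main__":
    # Mirror the `-g 1` from the original command: expose a single GPU.
    limit_visible_gpus(1)
    print(os.environ["CUDA_VISIBLE_DEVICES"])
```

The same thing can be tried without touching any code by prefixing the original command with CUDA_VISIBLE_DEVICES=0 in the shell. Whether this clears the error on the p2.8xlarge image is untested here.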

My word is final
