UnicodeDecodeError: 'gbk' codec can't decode byte 0xa5 in position 230: illegal multibyte sequence

adventurer
Posts: 3
Joined: Thu Jan 16, 2020 4:21 am

Post by adventurer »

I have already tried to debug this myself, for example by adding an encoding argument to subprocess.Popen(), but that did not work either. I am quite confused by it.
My log is attached below.
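
To narrow it down, here is a minimal sketch that seems to reproduce the same kind of failure without faceswap or ffmpeg. It assumes the cause is that the child process emits bytes that are not valid GBK. The child command (CHILD_CODE) and the read_lines helper are hypothetical stand-ins written for this post, not faceswap code. Decoding with 'gbk' (the codec Python uses for cp936) raises the error, while 'utf-8' with errors='replace' reads through, so that may be a workaround worth trying.

```python
import subprocess
import sys

# Hypothetical stand-in for ffmpeg: writes one line containing a byte sequence
# (0xa5 followed by a space) that is not valid GBK, then a progress-style line.
CHILD_CODE = (
    "import sys;"
    "sys.stdout.buffer.write(b'title: \\xa5 \\nframe=  120 fps=0.0\\n')"
)


def read_lines(encoding, errors=None):
    """Run the stand-in process and return its output lines decoded as text."""
    process = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                               stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT,
                               universal_newlines=True,
                               encoding=encoding,
                               errors=errors)
    try:
        # Decoding happens lazily while iterating, so a bad byte raises here.
        return [line.strip() for line in process.stdout]
    finally:
        process.stdout.close()
        process.wait()


if __name__ == "__main__":
    try:
        read_lines("gbk")
    except UnicodeDecodeError as err:
        print("gbk failed:", err)
    # Undecodable bytes are replaced instead of raising an exception.
    print(read_lines("utf-8", errors="replace"))
```

If this assumption is right, the lines that count_frames actually parses ("Duration:" and "frame=") are plain ASCII, so replacing undecodable bytes elsewhere in the output should not change the frame count.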

Code: Select all

01/16/2020 12:10:19 MainProcess     MainThread      logger          log_setup                 INFO     Log level set to: INFO
01/16/2020 12:10:19 MainProcess     MainThread      cli             execute_script            DEBUG    Executing: extract. PID: 4568
01/16/2020 12:10:20 MainProcess     MainThread      cli             test_for_tf_version       DEBUG    Installed Tensorflow Version: 1.15
01/16/2020 12:10:20 MainProcess     MainThread      queue_manager   __init__                  DEBUG    Initializing QueueManager
01/16/2020 12:10:20 MainProcess     MainThread      queue_manager   __init__                  DEBUG    Initialized QueueManager
01/16/2020 12:10:20 MainProcess     MainThread      extract         __init__                  DEBUG    Initializing Extract: (args: Namespace(aligner='fan', alignments_path=None, colab=False, configfile=None, debug_landmarks=False, detector='s3fd', extract_every_n=1, filter=None, func=<bound method ScriptExecutor.execute_script of <lib.cli.ScriptExecutor object at 0x0000027239A4D288>>, input_dir='C:\\Healthy\\faceswap\\input\\input_vid1.mp4', logfile=None, loglevel='INFO', masker='extended', min_size=20, nfilter=None, normalization='hist', output_dir='C:\\Healthy\\faceswap\\output', redirect_gui=True, ref_threshold=0.4, rotate_images=None, save_interval=0, singleprocess=False, size=256, skip_existing=False, skip_faces=False)
01/16/2020 12:10:20 MainProcess     MainThread      utils           get_folder                DEBUG    Requested path: 'C:\Healthy\faceswap\output'
01/16/2020 12:10:20 MainProcess     MainThread      utils           get_folder                DEBUG    Returning: 'C:\Healthy\faceswap\output'
01/16/2020 12:10:20 MainProcess     MainThread      extract         __init__                  INFO     Output Directory: C:\Healthy\faceswap\output
01/16/2020 12:10:20 MainProcess     MainThread      image           __init__                  DEBUG    Initializing ImagesLoader: (path: C:\Healthy\faceswap\input\input_vid1.mp4, queue_size: 8, load_with_hash: False, fast_count: True, skip_list: None)
01/16/2020 12:10:20 MainProcess     MainThread      image           __init__                  DEBUG    Initializing ImagesLoader: (path: C:\Healthy\faceswap\input\input_vid1.mp4, queue_size: 8, args: (False,))
01/16/2020 12:10:20 MainProcess     MainThread      queue_manager   get_queue                 DEBUG    QueueManager getting: 'ImagesLoader'
01/16/2020 12:10:20 MainProcess     MainThread      queue_manager   add_queue                 DEBUG    QueueManager adding: (name: 'ImagesLoader', maxsize: 8)
01/16/2020 12:10:20 MainProcess     MainThread      queue_manager   add_queue                 DEBUG    QueueManager added: (name: 'ImagesLoader')
01/16/2020 12:10:20 MainProcess     MainThread      queue_manager   get_queue                 DEBUG    QueueManager got: 'ImagesLoader'
01/16/2020 12:10:20 MainProcess     MainThread      image           _check_for_video          DEBUG    Input 'C:\Healthy\faceswap\input\input_vid1.mp4' is_video: True
01/16/2020 12:10:20 MainProcess     MainThread      image           count_frames              DEBUG    filename: C:\Healthy\faceswap\input\input_vid1.mp4, fast: True
01/16/2020 12:10:20 MainProcess     MainThread      image           count_frames              DEBUG    FFMPEG Command: 'C:\ProgramData\Anaconda3\envs\faceswap\Library\bin\ffmpeg.exe -i C:\Healthy\faceswap\input\input_vid1.mp4 -map 0:v:0 -c copy -f null -'
Traceback (most recent call last):
  File "C:\Users\Panda\faceswap\lib\cli.py", line 127, in execute_script
    process = script(arguments)
  File "C:\Users\Panda\faceswap\scripts\extract.py", line 45, in __init__
    self._images = ImagesLoader(self._args.input_dir, load_with_hash=False, fast_count=True)
  File "C:\Users\Panda\faceswap\lib\image.py", line 494, in __init__
    self._get_count_and_filelist(fast_count)
  File "C:\Users\Panda\faceswap\lib\image.py", line 569, in _get_count_and_filelist
    self._count = int(count_frames(self.location, fast=fast_count))
  File "C:\Users\Panda\faceswap\lib\image.py", line 324, in count_frames
    output = process.stdout.readline().strip()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa5 in position 230: illegal multibyte sequence

============ System Information ============
encoding:            cp936
git_branch:          master
git_commits:         a3493a7 add encoding. ff76461 lib.cli - Add dfaker tooltip and typo fix.
gpu_cuda:            No global version found. Check Conda packages for Conda Cuda
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: GeForce MX250
gpu_devices_active:  GPU_0
gpu_driver:          417.81
gpu_vram:            GPU_0: 2048MB
os_machine:          AMD64
os_platform:         Windows-10-10.0.17763-SP0
os_release:          10
py_command:          C:\Users\Panda\faceswap\faceswap.py extract -i C:/Healthy/faceswap/input/input_vid1.mp4 -o C:/Healthy/faceswap/output -D s3fd -A fan -M extended -nm hist -min 20 -l 0.4 -een 1 -sz 256 -si 0 -L INFO -gui
py_conda_version:    conda 4.8.1
py_implementation:   CPython
py_version:          3.7.6
py_virtual_env:      True
sys_cores:           8
sys_processor:       Intel64 Family 6 Model 142 Stepping 11, GenuineIntel
sys_ram:             Total: 8000MB, Available: 903MB, Used: 7097MB, Free: 903MB

=============== Pip Packages ===============
absl-py==0.8.1
astor==0.8.0
certifi==2019.11.28
cloudpickle==1.2.2
cycler==0.10.0
cytoolz==0.10.1
dask==2.9.1
decorator==4.4.1
fastcluster==1.1.25
gast==0.2.2
google-pasta==0.1.8
grpcio==1.16.1
h5py==2.9.0
imageio==2.6.1
imageio-ffmpeg==0.3.0
joblib==0.14.1
Keras==2.2.4
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
Markdown==3.1.1
matplotlib==3.1.1
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
networkx==2.4
numpy==1.17.4
nvidia-ml-py3==7.352.1
olefile==0.46
opencv-python==4.1.2.30
opt-einsum==3.1.0
pathlib==1.0.1
Pillow==6.2.1
protobuf==3.11.2
psutil==5.6.7
pyparsing==2.4.6
pyreadline==2.1
python-dateutil==2.8.1
pytz==2019.3
PyWavelets==1.1.1
pywin32==227
PyYAML==5.2
scikit-image==0.15.0
scikit-learn==0.22.1
scipy==1.3.2
six==1.13.0
tensorboard==2.0.0
tensorflow==1.15.0
tensorflow-estimator==1.15.1
termcolor==1.1.0
toolz==0.10.0
toposort==1.5
tornado==6.0.3
tqdm==4.41.1
Werkzeug==0.16.0
wincertstore==0.2
wrapt==1.11.2

============== Conda Packages ==============
# packages in environment at C:\ProgramData\Anaconda3\envs\faceswap:
#
# Name                    Version                   Build  Channel
_tflow_select             2.1.0                       gpu  
absl-py                   0.8.1                    py37_0  
astor                     0.8.0                    py37_0  
blas                      1.0                         mkl  
ca-certificates           2019.11.27                    0  
certifi                   2019.11.28               py37_0  
cloudpickle               1.2.2                      py_0  
cudatoolkit               10.0.130                      0  
cudnn                     7.6.5                cuda10.0_0  
cycler                    0.10.0                   py37_0  
cytoolz                   0.10.1           py37he774522_0  
dask-core                 2.9.1                      py_0  
decorator                 4.4.1                      py_0  
fastcluster               1.1.25          py37he350917_1000    conda-forge
ffmpeg                    4.2                  h6538335_0    conda-forge
freetype                  2.9.1                ha9979f8_1  
gast                      0.2.2                    py37_0  
git                       2.23.0               h6bb4b03_0  
google-pasta              0.1.8                      py_0  
grpcio                    1.16.1           py37h351948d_1  
h5py                      2.9.0            py37h5e291fa_0  
hdf5                      1.10.4               h7ebc959_0  
icc_rt                    2019.0.0             h0cc432a_1  
icu                       58.2                 ha66f8fd_1  
imageio                   2.6.1                    py37_0  
imageio-ffmpeg            0.3.0                      py_0    conda-forge
intel-openmp              2019.4                      245  
joblib                    0.14.1                     py_0  
jpeg                      9b                   hb83a4c4_2  
keras                     2.2.4                         0  
keras-applications        1.0.8                      py_0  
keras-base                2.2.4                    py37_0  
keras-preprocessing       1.1.0                      py_1  
kiwisolver                1.1.0            py37ha925a31_0  
libmklml                  2019.0.5                      0  
libpng                    1.6.37               h2a8f88b_0  
libprotobuf               3.11.2               h7bd577a_0  
libtiff                   4.1.0                h56a325e_0  
markdown                  3.1.1                    py37_0  
matplotlib                3.1.1            py37hc8f65d3_0  
mkl                       2019.4                      245  
mkl-service               2.3.0            py37hb782905_0  
mkl_fft                   1.0.15           py37h14836fe_0  
mkl_random                1.1.0            py37h675688f_0  
networkx                  2.4                        py_0  
numpy                     1.17.4           py37h4320e6b_0  
numpy-base                1.17.4           py37hc3f5095_0  
nvidia-ml-py3             7.352.1                  pypi_0    pypi
olefile                   0.46                     py37_0  
opencv-python             4.1.2.30                 pypi_0    pypi
openssl                   1.1.1d               he774522_3  
opt_einsum                3.1.0                      py_0  
pathlib                   1.0.1                    py37_1  
pillow                    6.2.1            py37hdc69c19_0  
pip                       19.3.1                   py37_0  
protobuf                  3.11.2           py37h33f27b4_0  
psutil                    5.6.7            py37he774522_0  
pyparsing                 2.4.6                      py_0  
pyqt                      5.9.2            py37h6538335_2  
pyreadline                2.1                      py37_1  
python                    3.7.6                h60c2a47_2  
python-dateutil           2.8.1                      py_0  
pytz                      2019.3                     py_0  
pywavelets                1.1.1            py37he774522_0  
pywin32                   227              py37he774522_0  
pyyaml                    5.2              py37he774522_0  
qt                        5.9.7            vc14h73c81de_0  
scikit-image              0.15.0           py37ha925a31_0  
scikit-learn              0.22.1           py37h6288b17_0  
scipy                     1.3.2            py37h29ff71c_0  
setuptools                44.0.0                   py37_0  
sip                       4.19.8           py37h6538335_0  
six                       1.13.0                   py37_0  
sqlite                    3.30.1               he774522_0  
tensorboard               2.0.0              pyhb38c66f_1  
tensorflow                1.15.0          gpu_py37hc3743a6_0  
tensorflow-base           1.15.0          gpu_py37h1afeea4_0  
tensorflow-estimator      1.15.1             pyh2649769_0  
tensorflow-gpu            1.15.0               h0d30ee6_0  
termcolor                 1.1.0                    py37_1  
tk                        8.6.8                hfa6e2cd_0  
toolz                     0.10.0                     py_0  
toposort                  1.5                        py_3    conda-forge
tornado                   6.0.3            py37he774522_0  
tqdm                      4.41.1                     py_0  
vc                        14.1                 h0510ff6_4  
vs2015_runtime            14.16.27012          hf0eaf9b_1  
werkzeug                  0.16.0                     py_0  
wheel                     0.33.6                   py37_0  
wincertstore              0.2                      py37_0  
wrapt                     1.11.2           py37he774522_0  
xz                        5.2.4                h2fa13f4_4  
yaml                      0.1.7                hc54c509_2  
zlib                      1.2.11               h62dcd97_3  
zstd                      1.3.7                h508b16e_0  

================= Configs ==================
--------- .faceswap ---------
backend:                  nvidia

--------- convert.ini ---------

[color.color_transfer]
clip:                     True
preserve_paper:           True

[color.manual_balance]
colorspace:               HSV
balance_1:                0.0
balance_2:                0.0
balance_3:                0.0
contrast:                 0.0
brightness:               0.0

[color.match_hist]
threshold:                99.0

[mask.box_blend]
type:                     gaussian
distance:                 11.0
radius:                   5.0
passes:                   1

[mask.mask_blend]
type:                     normalized
kernel_size:              3
passes:                   4
threshold:                4
erosion:                  0.0

[scaling.sharpen]
method:                   unsharp_mask
amount:                   150
radius:                   0.3
threshold:                5.0

[writer.ffmpeg]
container:                mp4
codec:                    libx264
crf:                      23
preset:                   medium
tune:                     none
profile:                  auto
level:                    auto

[writer.gif]
fps:                      25
loop:                     0
palettesize:              256
subrectangles:            False

[writer.opencv]
format:                   png
draw_transparent:         False
jpg_quality:              75
png_compress_level:       3

[writer.pillow]
format:                   png
draw_transparent:         False
optimize:                 False
gif_interlace:            True
jpg_quality:              75
png_compress_level:       3
tif_compression:          tiff_deflate

--------- extract.ini ---------

[global]
allow_growth:             False

[align.fan]
batch-size:               12

[detect.cv2_dnn]
confidence:               50

[detect.mtcnn]
minsize:                  20
threshold_1:              0.6
threshold_2:              0.7
threshold_3:              0.7
scalefactor:              0.709
batch-size:               8

[detect.s3fd]
confidence:               70
batch-size:               4

[mask.unet_dfl]
batch-size:               8

[mask.vgg_clear]
batch-size:               6

[mask.vgg_obstructed]
batch-size:               2

--------- gui.ini ---------

[global]
fullscreen:               False
tab:                      extract
options_panel_width:      30
console_panel_height:     20
icon_size:                14
font:                     default
font_size:                9
autosave_last_session:    prompt
timeout:                  120
auto_load_model_stats:    True

--------- train.ini ---------

[global]
coverage:                 68.75
mask_type:                none
mask_blur_kernel:         3
mask_threshold:           4
learn_mask:               False
icnr_init:                False
conv_aware_init:          False
subpixel_upscaling:       False
reflect_padding:          False
penalized_mask_loss:      True
loss_function:            mae
learning_rate:            5e-05

[model.dfl_h128]
lowmem:                   False

[model.dfl_sae]
input_size:               128
clipnorm:                 True
architecture:             df
autoencoder_dims:         0
encoder_dims:             42
decoder_dims:             21
multiscale_decoder:       False

[model.dlight]
features:                 best
details:                  good
output_size:              256

[model.original]
lowmem:                   False

[model.realface]
input_size:               64
output_size:              128
dense_nodes:              1536
complexity_encoder:       128
complexity_decoder:       512

[model.unbalanced]
input_size:               128
lowmem:                   False
clipnorm:                 True
nodes:                    1024
complexity_encoder:       128
complexity_decoder_a:     384
complexity_decoder_b:     512

[model.villain]
lowmem:                   False

[trainer.original]
preview_images:           14
zoom_amount:              5
rotation_range:           10
shift_range:              5
flip_chance:              50
color_lightness:          30
color_ab:                 8
color_clahe_chance:       50
color_clahe_max_size:     4
Here is my image.py:

Code: Select all

#!/usr/bin/env python3
""" Utilities for working with images and videos """

import logging
import subprocess
import os

from concurrent import futures
from hashlib import sha1

import cv2
import imageio
import imageio_ffmpeg as im_ffm
import numpy as np
from tqdm import tqdm

from lib.multithreading import MultiThread
from lib.queue_manager import queue_manager, QueueEmpty
from lib.utils import convert_to_secs, FaceswapError, _video_extensions, get_image_paths

logger = logging.getLogger(__name__)  # pylint:disable=invalid-name

# ################### #
# <<< IMAGE UTILS >>> #
# ################### #


# <<< IMAGE IO >>> #

def read_image(filename, raise_error=False, with_hash=False):
    """ Read an image file from a file location.

    Extends the functionality of :func:`cv2.imread()` by ensuring that an image was actually
    loaded. Errors can be logged and ignored so that the process can continue on an image load
    failure.

    Parameters
    ----------
    filename: str
        Full path to the image to be loaded.
    raise_error: bool, optional
        If ``True`` then any failures (including the returned image being ``None``) will be
        raised. If ``False`` then an error message will be logged, but the error will not be
        raised. Default: ``False``
    with_hash: bool, optional
        If ``True`` then returns the image's sha1 hash with the image. Default: ``False``

    Returns
    -------
    numpy.ndarray or tuple
        If :attr:`with_hash` is ``False`` then returns a `numpy.ndarray` of the image in `BGR`
        channel order. If :attr:`with_hash` is ``True`` then returns a `tuple` of (`numpy.ndarray`
        of the image in `BGR`, `str` of `sha1` hash of image)
    Example
    -------
    >>> image_file = "/path/to/image.png"
    >>> try:
    >>>    image = read_image(image_file, raise_error=True, with_hash=False)
    >>> except:
    >>>     raise ValueError("There was an error")
    """
    logger.trace("Requested image: '%s'", filename)
    success = True
    image = None
    try:
        image = cv2.imread(filename)
        if image is None:
            raise ValueError
    except TypeError:
        success = False
        msg = "Error while reading image (TypeError): '{}'".format(filename)
        logger.error(msg)
        if raise_error:
            raise Exception(msg)
    except ValueError:
        success = False
        msg = ("Error while reading image. This is most likely caused by special characters in "
               "the filename: '{}'".format(filename))
        logger.error(msg)
        if raise_error:
            raise Exception(msg)
    except Exception as err:  # pylint:disable=broad-except
        success = False
        msg = "Failed to load image '{}'. Original Error: {}".format(filename, str(err))
        logger.error(msg)
        if raise_error:
            raise Exception(msg)
    logger.trace("Loaded image: '%s'. Success: %s", filename, success)
    retval = (image, sha1(image).hexdigest()) if with_hash else image
    return retval


def read_image_batch(filenames):
    """ Load a batch of images from the given file locations.

    Leverages multi-threading to load multiple images from disk at the same time
    leading to vastly reduced image read times.

    Notes
    -----
    Images are loaded concurrently, so the order of the returned batch will likely not be the same
    as the order of the input filenames. Filenames are returned with the batch in the correct order
    corresponding to the returned batch.

    Parameters
    ----------
    filenames: list
        A list of ``str`` full paths to the images to be loaded.

    Returns
    -------
    list
        Filenames in the correct order as they are returned
    numpy.ndarray
        The batch of images in `BGR` channel order.

    Notes
    -----
    As the images are compiled into a batch, they must be all of the same dimensions.

    Example
    -------
    >>> image_filenames = ["/path/to/image_1.png", "/path/to/image_2.png", "/path/to/image_3.png"]
    >>> filenames, images = read_image_batch(image_filenames)
    """
    logger.trace("Requested batch: '%s'", filenames)
    executor = futures.ThreadPoolExecutor()
    with executor:
        images = {executor.submit(read_image, filename, raise_error=True): filename
                  for filename in filenames}
        batch = []
        filenames = []
        for future in futures.as_completed(images):
            batch.append(future.result())
            filenames.append(images[future])
        batch = np.array(batch)
    logger.trace("Returning images: (filenames: %s, batch shape: %s)", filenames, batch.shape)
    return filenames, batch


def read_image_hash(filename):
    """ Return the `sha1` hash of an image saved on disk.

    Parameters
    ----------
    filename: str
        Full path to the image to be loaded.

    Returns
    -------
    str
        The :func:`hashlib.hexdigest()` representation of the `sha1` hash of the given image.
    Example
    -------
    >>> image_file = "/path/to/image.png"
    >>> image_hash = read_image_hash(image_file)
    """
    img = read_image(filename, raise_error=True)
    image_hash = sha1(img).hexdigest()
    logger.trace("filename: '%s', hash: %s", filename, image_hash)
    return image_hash


def read_image_hash_batch(filenames):
    """ Return the `sha1` hash of a batch of images

    Leverages multi-threading to load multiple images from disk at the same time
    leading to vastly reduced image read times. Creates a generator to retrieve filenames
    with their hashes as they are calculated.

    Notes
    -----
    The order of returned values is non-deterministic so will most likely not be returned in the
    same order as the filenames

    Parameters
    ----------
    filenames: list
        A list of ``str`` full paths to the images to be loaded.

    Yields
    -------
    tuple: (`filename`, :func:`hashlib.hexdigest()` representation of the `sha1` hash of the image)
    Example
    -------
    >>> image_filenames = ["/path/to/image_1.png", "/path/to/image_2.png", "/path/to/image_3.png"]
    >>> for filename, hash in read_image_hash_batch(image_filenames):
    >>>         <do something>
    """
    logger.trace("Requested batch: '%s'", filenames)
    executor = futures.ThreadPoolExecutor()
    with executor:
        logger.debug("Submitting %s items to executor", len(filenames))
        read_hashes = {executor.submit(read_image_hash, filename): filename
                       for filename in filenames}
        logger.debug("Successfully submitted %s items to executor", len(filenames))
        for future in futures.as_completed(read_hashes):
            retval = (read_hashes[future], future.result())
            logger.trace("Yielding: %s", retval)
            yield retval


def encode_image_with_hash(image, extension):
    """ Encode an image, and get the encoded image back with its `sha1` hash.

    Parameters
    ----------
    image: numpy.ndarray
        The image to be encoded in `BGR` channel order.
    extension: str
        A compatible `cv2` image file extension that the final image is to be saved to.

    Returns
    -------
    image_hash: str
        The :func:`hashlib.hexdigest()` representation of the `sha1` hash of the encoded image
    encoded_image: bytes
        The image encoded into the correct file format

    Example
    -------
    >>> image_file = "/path/to/image.png"
    >>> image = read_image(image_file)
    >>> image_hash, encoded_image = encode_image_with_hash(image, ".jpg")
    """
    encoded_image = cv2.imencode(extension, image)[1]
    image_hash = sha1(cv2.imdecode(encoded_image, cv2.IMREAD_UNCHANGED)).hexdigest()
    return image_hash, encoded_image


def batch_convert_color(batch, colorspace):
    """ Convert a batch of images from one color space to another.

    Converts a batch of images by reshaping the batch prior to conversion rather than iterating
    over the images. This leads to a significant speed up in the convert process.

    Parameters
    ----------
    batch: numpy.ndarray
        A batch of images.
    colorspace: str
        The OpenCV Color Conversion Code suffix. For example for BGR to LAB this would be
        ``'BGR2LAB'``.
        See https://docs.opencv.org/4.1.1/d8/d01/group__imgproc__color__conversions.html for a full
        list of color codes.

    Returns
    -------
    numpy.ndarray
        The batch converted to the requested color space.

    Example
    -------
    >>> images_bgr = numpy.array([image1, image2, image3])
    >>> images_lab = batch_convert_color(images_bgr, "BGR2LAB")

    Notes
    -----
    This function is only compatible for color space conversions that have the same image shape
    for source and destination color spaces.

    If you use :func:`batch_convert_color` with 8-bit images, the conversion will have some
    information lost. For many cases, this will not be noticeable but it is recommended
    to use 32-bit images in cases that need the full range of colors or that convert an image
    before an operation and then convert back.
    """
    logger.trace("Batch converting: (batch shape: %s, colorspace: %s)", batch.shape, colorspace)
    original_shape = batch.shape
    batch = batch.reshape((original_shape[0] * original_shape[1], *original_shape[2:]))
    batch = cv2.cvtColor(batch, getattr(cv2, "COLOR_{}".format(colorspace)))
    return batch.reshape(original_shape)


# ################### #
# <<< VIDEO UTILS >>> #
# ################### #

def count_frames(filename, fast=False):
    """ Count the number of frames in a video file

    There is no guaranteed accurate way to get a count of video frames without iterating through
    a video and decoding every frame.

    :func:`count_frames` can return an accurate count (albeit fairly slowly) or a possibly less
    accurate count, depending on the :attr:`fast` parameter. A progress bar is displayed.

    Parameters
    ----------
    filename: str
        Full path to the video to return the frame count from.
    fast: bool, optional
        Whether to count the frames without decoding them. This is significantly faster but
        accuracy is not guaranteed. Default: ``False``.

    Returns
    -------
    int:
        The number of frames in the given video file.

    Example
    -------
    >>> filename = "/path/to/video.mp4"
    >>> frame_count = count_frames(filename)
    """
    logger.debug("filename: %s, fast: %s", filename, fast)
    assert isinstance(filename, str), "Video path must be a string"

    cmd = [im_ffm.get_ffmpeg_exe(), "-i", filename, "-map", "0:v:0"]
    if fast:
        cmd.extend(["-c", "copy"])
    cmd.extend(["-f", "null", "-"])

    logger.debug("FFMPEG Command: '%s'", " ".join(cmd))
    process = subprocess.Popen(cmd,
                               stderr=subprocess.STDOUT,
                               stdout=subprocess.PIPE,
                               universal_newlines=True,
                               encoding='cp936')
    pbar = None
    duration = None
    init_tqdm = False
    update = 0
    frames = 0
    while True:
        output = process.stdout.readline().strip()
        if output == "" and process.poll() is not None:
            break

        if output.startswith("Duration:"):
            logger.debug("Duration line: %s", output)
            idx = output.find("Duration:") + len("Duration:")
            duration = int(convert_to_secs(*output[idx:].split(",", 1)[0].strip().split(":")))
            logger.debug("duration: %s", duration)
        if output.startswith("frame="):
            logger.debug("frame line: %s", output)
            if not init_tqdm:
                logger.debug("Initializing tqdm")
                pbar = tqdm(desc="Counting Video Frames", leave=False, total=duration, unit="secs")
                init_tqdm = True
            time_idx = output.find("time=") + len("time=")
            frame_idx = output.find("frame=") + len("frame=")
            frames = int(output[frame_idx:].strip().split(" ")[0].strip())
            vid_time = int(convert_to_secs(*output[time_idx:].split(" ")[0].strip().split(":")))
            logger.debug("frames: %s, vid_time: %s", frames, vid_time)
            prev_update = update
            update = vid_time
            pbar.update(update - prev_update)
    if pbar is not None:
        pbar.close()
    return_code = process.poll()
    logger.debug("Return code: %s, frames: %s", return_code, frames)
    return frames


class ImageIO():
    """ Perform disk IO for images or videos in a background thread.

    This is the parent thread for :class:`ImagesLoader` and :class:`ImagesSaver` and should not
    be called directly.

    Parameters
    ----------
    path: str or list
        The path to load or save images to/from. For loading this can be a folder which contains
        images, video file or a list of image files. For saving this must be an existing folder.
    queue_size: int
        The amount of images to hold in the internal buffer.
    args: tuple, optional
        The arguments to be passed to the loader or saver thread. Default: ``None``

    See Also
    --------
    lib.image.ImagesLoader : Background Image Loader inheriting from this class.
    lib.image.ImagesSaver : Background Image Saver inheriting from this class.
    """

    def __init__(self, path, queue_size, args=None):
        logger.debug("Initializing %s: (path: %s, queue_size: %s, args: %s)",
                     self.__class__.__name__, path, queue_size, args)

        self._args = tuple() if args is None else args

        self._location = path
        self._check_location_exists()

        self._queue = queue_manager.get_queue(name=self.__class__.__name__, maxsize=queue_size)
        self._thread = None

    @property
    def location(self):
        """ str: The folder or video that was passed in as the :attr:`path` parameter. """
        return self._location

    def _check_location_exists(self):
        """ Check whether the input location exists.

        Raises
        ------
        FaceswapError
            If the given location does not exist
        """
        if isinstance(self.location, str) and not os.path.exists(self.location):
            raise FaceswapError("The location '{}' does not exist".format(self.location))
        if isinstance(self.location, (list, tuple)) and not all(os.path.exists(location)
                                                                for location in self.location):
            raise FaceswapError("Not all locations in the input list exist")

    def _set_thread(self):
        """ Set the load/save thread """
        logger.debug("Setting thread")
        if self._thread is not None and self._thread.is_alive():
            logger.debug("Thread pre-exists and is alive: %s", self._thread)
            return
        self._thread = MultiThread(self._process,
                                   self._queue,
                                   name=self.__class__.__name__,
                                   thread_count=1)
        logger.debug("Set thread: %s", self._thread)
        self._thread.start()

    def _process(self, queue):
        """ Image IO process to be run in a thread. Override for loader/saver process.

        Parameters
        ----------
        queue: queue.Queue()
            The ImageIO Queue
        """
        raise NotImplementedError

    def close(self):
        """ Closes down and joins the internal threads """
        logger.debug("Received Close")
        if self._thread is not None:
            self._thread.join()
        logger.debug("Closed")


class ImagesLoader(ImageIO):
    """ Perform image loading from a folder of images or a video.

    Images will be loaded and returned in the order that they appear in the folder, or in the video
    to ensure deterministic ordering. Loading occurs in a background thread, caching 8 images at a
    time so that other processes do not need to wait on disk reads.

    See also :class:`ImageIO` for additional attributes.

    Parameters
    ----------
    path: str or list
        The path to load images from. This can be a folder which contains images, a video file or
        a list of image files.
    queue_size: int, optional
        The amount of images to hold in the internal buffer. Default: 8.
    load_with_hash: bool, optional
        Set to ``True`` to return the sha1 hash of the image along with the image.
        Default: ``False``.
    fast_count: bool, optional
        When loading from video, the video needs to be parsed frame by frame to get an accurate
        count. This can be done quite quickly without guaranteed accuracy, or slower with
        guaranteed accuracy. Set to ``True`` to count quickly, or ``False`` to count slower
        but accurately. Default: ``True``.
    skip_list: list, optional
        Optional list of frame/image indices to not load. Any indices provided here will be skipped
        when reading images from the given location. Default: ``None``

    Examples
    --------
    Loading from a video file:

    >>> loader = ImagesLoader('/path/to/video.mp4')
    >>> for filename, image in loader.load():
    >>>     <do processing>

    Loading faces with their sha1 hash:

    >>> loader = ImagesLoader('/path/to/faces/folder', load_with_hash=True)
    >>> for filename, image, sha1_hash in loader.load():
    >>>     <do processing>
    """

    def __init__(self, path, queue_size=8, load_with_hash=False, fast_count=True, skip_list=None):
        logger.debug("Initializing %s: (path: %s, queue_size: %s, load_with_hash: %s, "
                     "fast_count: %s, skip_list: %s)", self.__class__.__name__, path, queue_size,
                     load_with_hash, fast_count, skip_list)

        args = (load_with_hash, )
        super().__init__(path, queue_size=queue_size, args=args)
        self._skip_list = set() if skip_list is None else set(skip_list)

        self._is_video = self._check_for_video()

        self._count = None
        self._file_list = None
        self._get_count_and_filelist(fast_count)

    @property
    def count(self):
        """ int: The number of images or video frames in the source location. This count includes
        any files that will ultimately be skipped if a :attr:`skip_list` has been provided. See
        also: :attr:`process_count`"""
        return self._count

    @property
    def process_count(self):
        """ int: The number of images or video frames to be processed (IE the total count less
        items that are to be skipped from the :attr:`skip_list`)"""
        return self._count - len(self._skip_list)

    @property
    def is_video(self):
        """ bool: ``True`` if the input is a video, ``False`` if it is not """
        return self._is_video

    @property
    def file_list(self):
        """ list: A full list of files in the source location. This includes any files that will
        ultimately be skipped if a :attr:`skip_list` has been provided. If the input is a video
        then this is a list of dummy filenames as corresponding to an alignments file """
        return self._file_list

    def add_skip_list(self, skip_list):
        """ Add a skip list to this :class:`ImagesLoader`

        Parameters
        ----------
        skip_list: list
            A list of indices corresponding to the frame indices that should be skipped
        """
        logger.debug(skip_list)
        self._skip_list = set(skip_list)

    def _check_for_video(self):
        """ Check whether the input is a video

        Returns
        -------
        bool: 'True' if input is a video 'False' if it is a folder.

        Raises
        ------
        FaceswapError
            If the given location is a file and does not have a valid video extension.

        """
        if os.path.isdir(self.location):
            retval = False
        elif os.path.splitext(self.location)[1].lower() in _video_extensions:
            retval = True
        else:
            raise FaceswapError("The input file '{}' is not a valid video".format(self.location))
        logger.debug("Input '%s' is_video: %s", self.location, retval)
        return retval

    def _get_count_and_filelist(self, fast_count):
        """ Set the count of images to be processed and set the file list

            If the input is a video, a dummy file list is created for checking against an
            alignments file, otherwise it will be a list of full filenames.

        Parameters
        ----------
        fast_count: bool
            When loading from video, the video needs to be parsed frame by frame to get an accurate
            count. This can be done quite quickly without guaranteed accuracy, or slower with
            guaranteed accuracy. Set to ``True`` to count quickly, or ``False`` to count slower
            but accurately.
        """
        if self._is_video:
            self._count = int(count_frames(self.location, fast=fast_count))
            self._file_list = [self._dummy_video_framename(i) for i in range(self.count)]
        else:
            if isinstance(self.location, (list, tuple)):
                self._file_list = self.location
            else:
                self._file_list = get_image_paths(self.location)
            self._count = len(self.file_list)

        logger.debug("count: %s", self.count)
        logger.trace("filelist: %s", self.file_list)

    def _process(self, queue):
        """ The load thread.

        Loads from a folder of images or from a video and puts to a queue

        Parameters
        ----------
        queue: queue.Queue()
            The ImageIO Queue
        """
        iterator = self._from_video if self._is_video else self._from_folder
        logger.debug("Load iterator: %s", iterator)
        for retval in iterator():
            filename, image = retval[:2]
            if image is None or (not image.any() and image.ndim not in (2, 3)):
                # All black frames will return not numpy.any() so check dims too
                logger.warning("Unable to open image. Skipping: '%s'", filename)
                continue
            logger.trace("Putting to queue: %s", [v.shape if isinstance(v, np.ndarray) else v
                                                  for v in retval])
            queue.put(retval)
        logger.trace("Putting EOF")
        queue.put("EOF")

    def _from_video(self):
        """ Generator for loading frames from a video

        Yields
        ------
        filename: str
            The dummy filename of the loaded video frame.
        image: numpy.ndarray
            The loaded video frame.
        """
        logger.debug("Loading frames from video: '%s'", self.location)
        reader = imageio.get_reader(self.location, "ffmpeg")
        for idx, frame in enumerate(reader):
            if idx in self._skip_list:
                logger.trace("Skipping frame %s due to skip list", idx)
                continue
            # Convert to BGR for cv2 compatibility
            frame = frame[:, :, ::-1]
            filename = self._dummy_video_framename(idx)
            logger.trace("Loading video frame: '%s'", filename)
            yield filename, frame
        reader.close()

    def _dummy_video_framename(self, index):
        """ Return a dummy filename for video files

        Parameters
        ----------
        index: int
            The index number for the frame in the video file

        Notes
        -----
        Indexes start at 0, frame numbers start at 1, so index is incremented by 1
        when creating the filename

        Returns
        -------
        str: A dummied filename for a video frame """
        vidname = os.path.splitext(os.path.basename(self.location))[0]
        return "{}_{:06d}.png".format(vidname, index + 1)

    def _from_folder(self):
        """ Generator for loading images from a folder

        Yields
        ------
        filename: str
            The filename of the loaded image.
        image: numpy.ndarray
            The loaded image.
        sha1_hash: str, optional
            The sha1 hash of the loaded image. Only yielded if :class:`ImagesLoader` was
            initialized with :attr:`load_with_hash` set to ``True`` and the :attr:`location`
            is a folder of images.
        """
        with_hash = self._args[0]
        logger.debug("Loading images from folder: '%s'. with_hash: %s", self.location, with_hash)
        for idx, filename in enumerate(self.file_list):
            if idx in self._skip_list:
                logger.trace("Skipping frame %s due to skip list", idx)
                continue
            image_read = read_image(filename, raise_error=False, with_hash=with_hash)
            if with_hash:
                retval = filename, *image_read
            else:
                retval = filename, image_read
            if retval[1] is None:
                logger.debug("Image not loaded: '%s'", filename)
                continue
            yield retval

    def load(self):
        """ Generator for loading images from the given :attr:`location`

        If :class:`ImagesLoader` was initialized with :attr:`load_with_hash` set to ``True`` then
        the sha1 hash of the image is added as the final item in the output `tuple`.

        Yields
        ------
        filename: str
            The filename of the loaded image.
        image: numpy.ndarray
            The loaded image.
        sha1_hash: str, optional
            The sha1 hash of the loaded image. Only yielded if :class:`ImagesLoader` was
            initialized with :attr:`load_with_hash` set to ``True`` and the :attr:`location`
            is a folder of images.
        """
        logger.debug("Initializing Load Generator")
        self._set_thread()
        while True:
            self._thread.check_and_raise_error()
            try:
                retval = self._queue.get(True, 1)
            except QueueEmpty:
                continue
            if retval == "EOF":
                logger.trace("Got EOF")
                break
            logger.trace("Yielding: %s", [v.shape if isinstance(v, np.ndarray) else v
                                          for v in retval])
            yield retval
        logger.debug("Closing Load Generator")
        self._thread.join()


class ImagesSaver(ImageIO):
    """ Perform image saving to a destination folder.

    Images are saved in a background ThreadPoolExecutor to allow for concurrent saving.
    See also :class:`ImageIO` for additional attributes.

    Parameters
    ----------
    path: str
        The folder to save images to. This must be an existing folder.
    queue_size: int, optional
        The amount of images to hold in the internal buffer. Default: 8.
    as_bytes: bool, optional
        ``True`` if the image is already encoded to bytes, ``False`` if the image is a
        :class:`numpy.ndarray`. Default: ``False``.

    Examples
    --------

    >>> saver = ImagesSaver('/path/to/save/folder')
    >>> for filename, image in <image_iterator>:
    >>>     saver.save(filename, image)
    >>> saver.close()
    """

    def __init__(self, path, queue_size=8, as_bytes=False):
        logger.debug("Initializing %s: (path: %s, queue_size: %s, as_bytes: %s)",
                     self.__class__.__name__, path, queue_size, as_bytes)

        super().__init__(path, queue_size=queue_size)
        self._as_bytes = as_bytes

    def _check_location_exists(self):
        """ Check whether the output location exists and is a folder

        Raises
        ------
        FaceswapError
            If the given location does not exist or the location is not a folder
        """
        if not isinstance(self.location, str):
            raise FaceswapError("The output location must be a string not a "
                                "{}".format(type(self.location)))
        super()._check_location_exists()
        if not os.path.isdir(self.location):
            raise FaceswapError("The output location '{}' is not a folder".format(self.location))

    def _process(self, queue):
        """ Saves images from the save queue to the given :attr:`location` inside a thread.

        Parameters
        ----------
        queue: queue.Queue()
            The ImageIO Queue
        """
        executor = futures.ThreadPoolExecutor(thread_name_prefix=self.__class__.__name__)
        while True:
            item = queue.get()
            if item == "EOF":
                logger.debug("EOF received")
                break
            logger.trace("Submitting: '%s'", item[0])
            executor.submit(self._save, *item)
        executor.shutdown()

    def _save(self, filename, image):
        """ Save a single image inside a ThreadPoolExecutor

        Parameters
        ----------
        filename: str
            The filename of the image to be saved. NB: Any folders passed in with the filename
            will be stripped and replaced with :attr:`location`.
        image: numpy.ndarray
            The image to be saved
        """
        filename = os.path.join(self.location, os.path.basename(filename))
        try:
            if self._as_bytes:
                with open(filename, "wb") as out_file:
                    out_file.write(image)
            else:
                cv2.imwrite(filename, image)
            logger.trace("Saved image: '%s'", filename)
        except Exception as err:  # pylint: disable=broad-except
            logger.error("Failed to save image '%s'. Original Error: %s", filename, err)

    def save(self, filename, image):
        """ Save the given image in the background thread

        Ensure that :func:`close` is called once all save operations are complete.

        Parameters
        ----------
        filename: str
            The filename of the image to be saved
        image: numpy.ndarray
            The image to be saved
        """
        self._set_thread()
        logger.trace("Putting to save queue: '%s'", filename)
        self._queue.put((filename, image))

    def close(self):
        """ Signal to the Save Threads that they should be closed and cleanly shutdown
        the saver """
        logger.debug("Putting EOF to save queue")
        self._queue.put("EOF")
        super().close()

torzdf
Posts: 719
Joined: Fri Jul 12, 2019 12:53 am
Answers: 101
Has thanked: 19 times
Been thanked: 147 times

Re: UnicodeDecodeError: 'gbk' codec can't decode byte 0xa5 in position 230: illegal multibyte sequence

Post by torzdf »

Can you post the raw output from this command:

Code: Select all

C:\ProgramData\Anaconda3\envs\faceswap\Library\bin\ffmpeg.exe -i C:\Healthy\faceswap\input\input_vid1.mp4 -map 0:v:0 -c copy -f null -
My word is final

adventurer

Re: UnicodeDecodeError: 'gbk' codec can't decode byte 0xa5 in position 230: illegal multibyte sequence

Post by adventurer »

Here is the raw output from that command:

Code: Select all

ffmpeg version 4.2 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 9.1.1 (GCC) 20190807
  configuration: --disable-static --enable-shared --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\Healthy\faceswap\input\input_vid1.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp41isom
    creation_time   : 2020-01-15T15:58:52.000000Z
    artist          : Microsoft Game DVR
    title           : 銆愯浆銆戦灎濠х鏃跺皻YA閲囪瑙嗛_鍝斿摡鍝斿摡 (銈?銈?銇ゃ儹 骞叉澂~-bilibili - Google Chrome
  Duration: 00:01:07.27, start: 0.000000, bitrate: 3623 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 3503 kb/s, 30 fps, 30 tbr, 30k tbn, 60 tbc (default)
    Metadata:
      creation_time   : 2020-01-15T16:36:56.000000Z
      handler_name    : VideoHandler
      encoder         : AVC Coding
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 117 kb/s (default)
    Metadata:
      creation_time   : 2020-01-15T16:36:56.000000Z
      handler_name    : SoundHandler
Output #0, null, to 'pipe:':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp41isom
    title           : 銆愯浆銆戦灎濠х鏃跺皻YA閲囪瑙嗛_鍝斿摡鍝斿摡 (銈?銈?銇ゃ儹 骞叉澂~-bilibili - Google Chrome
    artist          : Microsoft Game DVR
    encoder         : Lavf58.29.100
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 3503 kb/s, 30 fps, 30 tbr, 30k tbn, 30k tbc (default)
    Metadata:
      creation_time   : 2020-01-15T16:36:56.000000Z
      handler_name    : VideoHandler
      encoder         : AVC Coding
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
frame= 2018 fps=0.0 q=-1.0 Lsize=N/A time=00:01:07.23 bitrate=N/A speed= 723x
video:28767kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Thank you!

torzdf

Re: UnicodeDecodeError: 'gbk' codec can't decode byte 0xa5 in position 230: illegal multibyte sequence

Post by torzdf »

The problem looks to be the title embedded in the video's metadata.

Specifically, this line:
銆愯浆銆戦灎濠х鏃跺皻YA閲囪瑙嗛_鍝斿摡鍝斿摡 (銈?銈?銇ゃ儹 骞叉澂~-bilibili - Google Chrome
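
That garbling is what UTF-8 bytes typically look like when they are read with the GBK codec, and a byte sequence that GBK does not define raises the error in the thread title. A minimal Python sketch of a more tolerant decode, assuming the title bytes are UTF-8 (the helper name `decode_output` is hypothetical and not part of faceswap):

```python
def decode_output(raw, encodings=("utf-8", "gbk")):
    """Decode console output without raising UnicodeDecodeError.

    Hypothetical helper for illustration only. Each encoding is tried
    strictly, in order. If none fits, the bytes are decoded as UTF-8 with
    undecodable bytes replaced by U+FFFD.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    title = "【转】bilibili"
    # UTF-8 bytes decode cleanly on the first attempt.
    print(decode_output(title.encode("utf-8")))
    # Bytes that fit neither codec are replaced instead of raising.
    print(decode_output(b"\xff\xfe bilibili"))
```

This only hides the symptom when reading the output. The title stored in the file is unchanged.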

If you look at that line, you will see that some characters cannot be converted properly. You should find a way to edit the metadata of your video and change or strip the title.
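
One possible way to strip the title is with ffmpeg itself. The sketch below is only an illustration: the function names and the output path are hypothetical, and it assumes `ffmpeg` is on the PATH. It copies every stream without re-encoding and sets the global `title` tag to an empty value, which ffmpeg treats as deleting the key.

```python
import subprocess


def build_strip_title_command(src, dst, ffmpeg="ffmpeg"):
    """Build an ffmpeg command that copies src to dst with no title tag.

    Hypothetical helper for illustration only.
    """
    return [
        ffmpeg,
        "-i", src,
        "-map", "0",            # keep every stream from the input
        "-c", "copy",           # copy streams, no re-encoding
        "-metadata", "title=",  # an empty value deletes the title tag
        dst,
    ]


def strip_title(src, dst, ffmpeg="ffmpeg"):
    """Run the command and raise CalledProcessError if ffmpeg fails."""
    subprocess.run(build_strip_title_command(src, dst, ffmpeg), check=True)


if __name__ == "__main__":
    # The output filename is an assumption; pick any new path.
    print(build_strip_title_command(
        r"C:\Healthy\faceswap\input\input_vid1.mp4",
        r"C:\Healthy\faceswap\input\input_vid1_notitle.mp4"))
```

Calling `strip_title(...)` with the same two paths writes the new file. Point the extract at that file afterwards.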

adventurer

Re: UnicodeDecodeError: 'gbk' codec can't decode byte 0xa5 in position 230: illegal multibyte sequence

Post by adventurer »

Thanks for your kind help!
