Faceswap installer broken on GCE (and possibly elsewhere)?

Want to use Faceswap in The Cloud? This is not directly supported by the Devs, but you may find community support here


Locked
Replicon
Posts: 50
Joined: Mon Mar 22, 2021 4:24 pm
Been thanked: 2 times

Post by Replicon »

I created a freshly installed image using my normal flow (basically as shown in the instructions here).

The installation process itself looked odd: it reported that it "failed with initial frozen solve" and, after recovering, spent a long time downloading about 1.5GB of CUDA packages.

The install succeeded, but when I ran an extract, it eventually crashed with:

Code:

tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: initialization error

Did something change with the package repos? I'm not too familiar with Conda, but it definitely looked like the install process wasn't behaving right. I tried this twice, and it misbehaved similarly both times.

Some relevant info from the crash log follows; I can give you more, but it likely wouldn't help:

Code:

============ System Information ============
encoding:            UTF-8
git_branch:          Not Found
git_commits:         Not Found
gpu_cuda:            No global version found. Check Conda packages for Conda Cuda
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: Tesla T4
gpu_devices_active:  GPU_0
gpu_driver:          440.100
gpu_vram:            GPU_0: 15109MB
os_machine:          x86_64
os_platform:         Linux-5.4.0-1021-gcp-x86_64-with-glibc2.17
os_release:          5.4.0-1021-gcp
py_command:          /home/[[REDACTED]]/faceswap/faceswap.py extract -i /home/[[REDACTED]].mp4 -o /home/[[REDACTED]] -D s3fd -A fan -nm hist -rf 8 -min 0 -l 0.4 -sz 512 -een 3 -si 0 -L INFO
py_conda_version:    Conda is used, but version not found
py_implementation:   CPython
py_version:          3.8.11
py_virtual_env:      True
sys_cores:           4
sys_processor:       x86_64
sys_ram:             Total: 15005MB, Available: 14245MB, Used: 482MB, Free: 13759MB

=============== Pip Packages ===============
absl-py==0.13.0
astunparse==1.6.3
cachetools==4.2.2
certifi==2021.5.30
charset-normalizer==2.0.6
clang==5.0
cycler==0.10.0
fastcluster==1.1.26
ffmpy==0.2.3
flatbuffers==1.12
gast==0.4.0
google-auth==1.35.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.40.0
h5py==3.1.0
idna==3.2
imageio @ file:///tmp/build/80754af9/imageio_1617700267927/work
imageio-ffmpeg @ file:///home/conda/feedstock_root/build_artifacts/imageio-ffmpeg_1629987409325/work
joblib @ file:///tmp/build/80754af9/joblib_1613502643832/work
keras==2.6.0
Keras-Preprocessing==1.1.2
kiwisolver @ file:///tmp/build/80754af9/kiwisolver_1612282420641/work
Markdown==3.3.4
matplotlib @ file:///tmp/build/80754af9/matplotlib-base_1592846008246/work
mkl-fft==1.3.0
mkl-random==1.1.1
mkl-service==2.3.0
numpy @ file:///tmp/build/80754af9/numpy_and_numpy_base_1603570489231/work
nvidia-ml-py==11.470.66
oauthlib==3.1.1
olefile @ file:///Users/ktietz/demo/mc3/conda-bld/olefile_1629805411829/work
opencv-python==4.5.3.56
opt-einsum==3.3.0
Pillow @ file:///tmp/build/80754af9/pillow_1625655817137/work
protobuf==3.18.0
psutil @ file:///tmp/build/80754af9/psutil_1612298023621/work
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing @ file:///home/linux1/recipes/ci/pyparsing_1610983426697/work
python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
requests==2.26.0
requests-oauthlib==1.3.0
rsa==4.7.2
scikit-learn @ file:///tmp/build/80754af9/scikit-learn_1621370412049/work
scipy @ file:///tmp/build/80754af9/scipy_1616703172749/work
sip==4.19.13
six==1.15.0
tensorboard==2.6.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.0
tensorflow-estimator==2.6.0
tensorflow-gpu==2.6.0
termcolor==1.1.0
threadpoolctl @ file:///Users/ktietz/demo/mc3/conda-bld/threadpoolctl_1629802263681/work
tornado @ file:///tmp/build/80754af9/tornado_1606942300299/work
tqdm @ file:///tmp/build/80754af9/tqdm_1631818572807/work
typing-extensions==3.7.4.3
urllib3==1.26.6
Werkzeug==2.0.1
wrapt==1.12.1

============== Conda Packages ==============
Could not get package list
Replicon
Posts: 50
Joined: Mon Mar 22, 2021 4:24 pm
Been thanked: 2 times

Post by Replicon »

OK.

So I intercepted the installer, and added a

Code:

git checkout c7d85f89e69c74e97bf7485b064c07487d31faae

in the right place.

This rolls back the TensorFlow 2.6 change, which looked like a really suspicious potential root cause.

And it works again!

I was wrong about the "failed frozen..." stuff, which happens regardless, but I was right that it didn't need to download 1.5GB of CUDA libraries, which leads me to think that stuff was already set up on the GCE image, with compatible drivers.

It may be that the GCE base image needs to be updated, with drivers and libraries that make everything compatible again, so fresh installs can keep getting the latest and greatest.
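
A rough way to sanity-check the driver theory: the crash log above reports gpu_driver 440.100 alongside tensorflow-gpu==2.6.0. As far as I know, TensorFlow 2.6 is built against CUDA 11.2, and NVIDIA lists 450.80.02 as the minimum Linux driver for CUDA 11.x, but treat both figures as assumptions to verify against the current docs. The sketch below only compares version strings; `MIN_DRIVER`, `version_ge` and `check_driver` are names made up for illustration.

```shell
#!/usr/bin/env bash
# Minimum Linux driver assumed for CUDA 11.x (an assumption; check NVIDIA's docs).
MIN_DRIVER="450.80.02"

# Hypothetical helper: succeeds when version $1 >= version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical helper: report whether driver version $1 meets MIN_DRIVER.
check_driver() {
    if version_ge "$1" "$MIN_DRIVER"; then
        echo "driver $1 looks new enough for CUDA 11.x"
    else
        echo "driver $1 is older than $MIN_DRIVER; CUDA 11.x builds may fail to initialise"
    fi
}

# Only query the GPU when nvidia-smi is actually present.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
    check_driver "$driver"
fi
```

Running `check_driver 440.100` by hand reports the driver as older than the assumed minimum, which would be consistent with the cudaGetDevice() initialization error, though it doesn't prove that is the cause.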

torzdf
Posts: 2636
Joined: Fri Jul 12, 2019 12:53 am
Answers: 156
Has thanked: 128 times
Been thanked: 614 times

Post by torzdf »

This kind of thing doesn't surprise me too much. Unfortunately the workaround to getting 30xx support into faceswap is a bit messy, so I'm not surprised there are conflicts.

@pfakanator may be able to update this image, or (hopefully) this issue will resolve in time when Conda get 30xx support working properly with their builds and I can move our install back to a less hacky solution.

Replicon
Posts: 50
Joined: Mon Mar 22, 2021 4:24 pm
Been thanked: 2 times

Post by Replicon »

Hey, welcome back!

I'm not sure I understand where we're at right now. What's the current "hacky" solution? I'm not aware of running RTX 30xx on the cloud, but I don't know much about GPU lingo... I see it says it's RTX-capable, and maybe that's what you're referring to.

Are we talking about two separate problems? If Conda figures out their build issues, will the latest faceswap versions (with tensorflow 2.6) start working on the provided GCE image, or is it just that the latest versions of software used by faceswap are simply not backwards compatible with the aging drivers on the GCE image? Even before, one thing I had to do with my GCE install was to disable auto-updates, because as soon as things get updated (or I run update/upgrade and reboot), it starts to fail to detect the video card.

The answer might just be to install the latest drivers on a fresh image every time, as part of the overall faceswap install.

While I'm sure it's probably a bit more complicated than literally running "apt-get install <short-list-of-packages>" on a clean base, I'd expect the driver install to be really well documented, or nobody would use these GPU images.

torzdf
Posts: 2636
Joined: Fri Jul 12, 2019 12:53 am
Answers: 156
Has thanked: 128 times
Been thanked: 614 times

Post by torzdf »

Honestly, I don't know, sorry. I don't use GCE/cloud at all.

nigelwang
Posts: 1
Joined: Sun Oct 24, 2021 6:28 am

Post by nigelwang »

Hi, running into the same issue. Can you be more specific about how to fix it? I'm not sure how to change the installer.
Thank you!!

Replicon wrote: Tue Sep 21, 2021 3:07 pm

OK.

So I intercepted the installer, and added a

Code:

git checkout c7d85f89e69c74e97bf7485b064c07487d31faae

in the right place.

This rolls back the TensorFlow 2.6 change, which looked like a really suspicious potential root cause.

And it works again!

I was wrong about the "failed frozen..." stuff, which happens regardless, but I was right that it didn't need to download 1.5GB of CUDA libraries, which leads me to think that stuff was already set up on the GCE image, with compatible drivers.

It may be that the GCE base image needs to be updated, with drivers and libraries that make everything compatible again, so fresh installs can keep getting the latest and greatest.

Replicon
Posts: 50
Joined: Mon Mar 22, 2021 4:24 pm
Been thanked: 2 times

Post by Replicon »

nigelwang wrote: Sun Oct 24, 2021 6:40 am

Hi, running into the same issue. Can you be more specific about how to fix it? I'm not sure how to change the installer.
Thank you!!

Haven't logged in in a while so only seeing this now.

If you inspect the setup script from the image setup, all it does is download and run the install script.

If you download and inspect the install script, it downloads the latest faceswap from github using the git CLI.

The code change I linked broke faceswap on the provided cloud base image, possibly because of driver incompatibilities.

So, to temporarily mitigate the issue, you can hack up the setup script to grab the state of the faceswap codebase from just BEFORE the breaking change (which is the git checkout command I linked).

I honestly don't remember exactly how and where I made the change, but I really just replaced or added a git command to make it so it's using that older code.

Like I said, the more correct way to solve this will likely be to just update the NVIDIA drivers on the base image. Unfortunately, doing a basic 'upgrade' just breaks it and causes it to not see the NVIDIA driver at all, and I haven't yet figured out how to do an upgrade that won't completely break the image. It could be as simple as running dist-upgrade instead of upgrade, but I didn't play around with it too much. If I'm bored for a couple of days with nothing to do, maybe I'll try to script a base image setup from scratch that creates an image that works consistently, even through upgrades.
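
Until a proper driver upgrade path is worked out, one possible stopgap (untested on this image, so very much a guess) is to stop apt from touching the packages that seem to break GPU detection. The sketch below is a dry run that only lists candidates. The name patterns assume Ubuntu-style package naming, and they will match nothing if the driver on the image was not installed through apt. `select_driver_sensitive` is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: from package names on stdin (one per line), keep the
# ones that look like NVIDIA driver or kernel packages. The patterns are an
# assumption about Ubuntu-style naming, not something taken from the image.
select_driver_sensitive() {
    grep -E '^(nvidia-|libnvidia-|linux-image-|linux-headers-)' || true
}

# Dry run: show what would be held, without changing anything.
if command -v dpkg-query >/dev/null 2>&1 && command -v apt-mark >/dev/null 2>&1; then
    pkgs=$(dpkg-query -W -f='${Package}\n' | select_driver_sensitive)
    if [ -n "$pkgs" ]; then
        echo "Candidate packages to hold:"
        echo "$pkgs"
        # To actually hold them, uncomment the next line:
        # echo "$pkgs" | xargs sudo apt-mark hold
    fi
fi
```

Held packages can be released again with `apt-mark unhold`. Holding kernel packages also means missing kernel security updates, so this only makes sense for short-lived instances.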

Replicon
Posts: 50
Joined: Mon Mar 22, 2021 4:24 pm
Been thanked: 2 times

Post by Replicon »

Just so happens I wanted to kick one off, so I sshed into my image to find my old setup script.

When you edit your setup script, you want to change the clone_faceswap function to look something like this:

Code:

clone_faceswap() {
    # Clone the faceswap repo
    delete_faceswap
    info "Downloading Faceswap..."
    yellow ; git clone "$DL_FACESWAP" "$DIR_FACESWAP"
    # Pin to the last commit before the TensorFlow 2.6 change
    (cd "$DIR_FACESWAP" && git checkout c7d85f89e69c74e97bf7485b064c07487d31faae)
}

Again, this is just a hack to get going again. If you end up finding out the right incantation to properly upgrade drivers on the image without hosing everything, please do reply here. :)
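
To confirm the hack took effect after the installer finishes, a small check along these lines should work. It is only a sketch: `is_pinned` is a made-up helper, and the default `$HOME/faceswap` path is an assumption based on the path in the crash log, so set `DIR_FACESWAP` to wherever your install actually lives.

```shell
#!/usr/bin/env bash
# Commit hash from the post above; the default path is an assumption.
PINNED_COMMIT="c7d85f89e69c74e97bf7485b064c07487d31faae"
DIR_FACESWAP="${DIR_FACESWAP:-$HOME/faceswap}"

# Hypothetical helper: succeeds when the repo in $1 has commit $2 checked out.
is_pinned() {
    [ "$(git -C "$1" rev-parse HEAD 2>/dev/null)" = "$2" ]
}

if is_pinned "$DIR_FACESWAP" "$PINNED_COMMIT"; then
    echo "faceswap is on the pinned commit"
else
    echo "faceswap is NOT on the pinned commit (or $DIR_FACESWAP is not a git repo)"
fi
```

If faceswap's own updater is run later, it may move the checkout off this commit, so the check is worth repeating after any update.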
