Just a little note to anyone trying to get the 30 series to work.
This is unsupported. It feels like Iron Chef type stuff to get the software to work.
It's been said many times before: we can't do a whole lot until the upstream dependencies like Tensorflow are fixed and Nvidia quits fiddling with its driver so much.
I'm personally now having trouble even getting Ubuntu to install with a 3060 Ti as the primary display.
So trust me, all of you who are having trouble, we all feel your frustration.
[Guide] Using Faceswap on Nvidia RTX 30xx cards
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
I dunno what I'm doing
2X RTX 3090 : RTX 3080 : RTX: 2060 : 2x RTX 2080 Super : Ghetto 1060
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
No.
Most likely there is an issue with the Cuda/cuDNN install.
If you run in "Verbose" mode, it will generate a load of information, which may tell you why it's not working.
My word is final
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
Thanks for the tip! I checked verbose mode and saw this warning:
2021-01-31 17:00:22.981621: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
I had installed cuda 11.1 and it only came with cusolver64_11.dll.
I duplicated this library and renamed the duplicate "cusolver64_10.dll" and faceswap now works!
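For anyone hunting for the same kind of warning in a long verbose log, here is a minimal Python sketch that pulls out the library names Tensorflow could not load. The function name `missing_libraries` is made up for illustration, and the pattern only assumes the "Could not load dynamic library" wording shown in the warning above.

```python
import re

# Matches the "Could not load dynamic library" warnings Tensorflow prints,
# capturing the library name between the single quotes.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the unique library names from the warnings, in order seen."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# The warning line quoted in this post, split only to keep lines short.
sample = (
    "2021-01-31 17:00:22.981621: W tensorflow/stream_executor/platform/"
    "default/dso_loader.cc:60] Could not load dynamic library "
    "'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found"
)
print(missing_libraries(sample))
```

Pasting a whole log into `sample` lists every library that failed to load, which gives a quick view of which CUDA or cuDNN files the installed Tensorflow build expects.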
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
I would highly recommend sticking with Cuda 11.0 then. As per original post:
To use a version for Cuda 11.1 then you will need to compile Tensorflow yourself, which is well outside of the scope of these instructions. (Compiling Tensorflow from source: https://www.tensorflow.org/install/source). Alternatively you could try some pre-compiled wheels (Google is your friend. your mileage may vary), although we cannot vouch for 3rd party compiled versions of Tensorflow.
Your solution may work (in which case, great!) but you may also come unstuck at some point.
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
For RTX A6000 on Windows:
Install cuda_11.2.0_460.89_win10
Install faceswap
Open an Anaconda prompt and run:
    conda activate faceswap
    conda remove tensorflow
    conda install brotli
    conda install urllib3
    pip install tensorflow-gpu==2.4
Download the cuDNN build compatible with CUDA 11.2 (you need to register with Nvidia): cudnn-11.2-windows-x64-v8.1.0.77
Copy the extracted files into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2 (DLLs into bin)
You should be fine!
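If it does not work, one cheap check is whether the cuDNN files actually landed in the CUDA bin folder. The sketch below is only an illustration: `missing_files` is a made-up helper, and the DLL names in the demo are assumptions (check the names in the cuDNN archive you downloaded). The demo runs against a throwaway folder so it works on any machine.

```python
import tempfile
from pathlib import Path


def missing_files(directory, expected_names):
    """Return the names from expected_names that are not files in directory."""
    folder = Path(directory)
    return [name for name in expected_names if not (folder / name).is_file()]


# Demo on a temporary folder; the DLL names here are assumed examples.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cudnn64_8.dll").touch()
    demo_result = missing_files(tmp, ["cudnn64_8.dll", "cusolver64_10.dll"])
print(demo_result)
```

For a real check, pass the bin folder from the steps above (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin) as `directory`.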
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
How do you turn on Verbose Mode?
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
461.40
cuda_11.0.2 (with Visual Studio 2019)
cudnn-11.0-v8.0.2.39
conda install -c anaconda urllib3 (this resolved the brotli error when installing Tensorflow 2.4 from cmd)
It works!
However, during training, VRAM utilization is close to 99%.
If anything on the screen moves even slightly, GPU load increases and the card throttles.
Is there a way to limit VRAM utilization, or is something else causing this?
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
I have an RTX 3070.
I keep getting this error while extracting:
E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
But if I simply ignore this error, the extraction works, but then I get this error:
tensorflow.python.framework.errors_impl.InternalError: dnn PoolForward launch failed
And face images are generated but alignments are not generated.
Does anyone know how to fix this?
Edit: Managed to fix by lowering the batch size to 3 (from 5), but this is pretty strange because I have 8 gigs of GPU memory, so there shouldn't be a memory problem.
extraction speeds
Hi,
What is a reasonable extraction speed with the following hardware?
Ryzen 5 3600
RTX 3080
I am getting 1.6it/s and want to check whether this is normal or something is going wrong. It seems slow!
CPU not being utilized. Missing Values?
I've noticed that I may not have been using my GPU this entire time. I tried the "30xx Series configuration", where you delete the config file and set the backend yourself, and I am currently trying again after a fresh install of faceswap.
I assume I've done everything correctly thus far, as I have gotten to the training part, but my EG/s are very low. I assume it has to do with "git_branch" not having a value, and I can't find any documentation that would help me with that.
Any push in the right direction would be very much appreciated.
============ System Information ============
encoding: cp1252
git_branch: Not Found
git_commits: Not Found
gpu_cuda: 11.0
gpu_cudnn: 8.0.1
gpu_devices: GPU_0: GeForce RTX 3070
gpu_devices_active: GPU_0
gpu_driver: 461.72
gpu_vram: GPU_0: 8192MB
os_machine: AMD64
os_platform: Windows-10-10.0.19041-SP0
os_release: 10
py_command: C:\Users\STARBUSTER\faceswap/faceswap.py gui
py_conda_version: conda 4.9.2
py_implementation: CPython
py_version: 3.8.8
py_virtual_env: True
sys_cores: 12
sys_processor: Intel64 Family 6 Model 165 Stepping 3, GenuineIntel
sys_ram: Total: 16314MB, Available: 9813MB, Used: 6501MB, Free: 9813MB
=============== Pip Packages ===============
absl-py @ file:///tmp/build/80754af9/absl-py_1607439979954/work
aiohttp @ file:///C:/ci/aiohttp_1614361024229/work
astunparse==1.6.3
async-timeout==3.0.1
attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work
blinker==1.4
brotlipy==0.7.0
cachetools @ file:///tmp/build/80754af9/cachetools_1611600262290/work
certifi==2020.12.5
cffi @ file:///C:/ci/cffi_1613247279197/work
chardet @ file:///C:/ci/chardet_1605303225733/work
click @ file:///home/linux1/recipes/ci/click_1610990599742/work
coverage @ file:///C:/ci/coverage_1614615074147/work
cryptography @ file:///C:/ci/cryptography_1613401566183/work
cycler==0.10.0
Cython @ file:///C:/ci/cython_1614014958194/work
fastcluster==1.1.26
ffmpy==0.2.3
gast @ file:///tmp/build/80754af9/gast_1597433534803/work
google-auth @ file:///tmp/build/80754af9/google-auth_1614883971544/work
google-auth-oauthlib @ file:///tmp/build/80754af9/google-auth-oauthlib_1614894617465/work
google-pasta==0.2.0
grpcio @ file:///C:/ci/grpcio_1614884412260/work
h5py==2.10.0
idna @ file:///home/linux1/recipes/ci/idna_1610986105248/work
imageio @ file:///tmp/build/80754af9/imageio_1594161405741/work
imageio-ffmpeg @ file:///home/conda/feedstock_root/build_artifacts/imageio-ffmpeg_1609799311556/work
importlib-metadata @ file:///tmp/build/80754af9/importlib-metadata_1602276842396/work
joblib @ file:///tmp/build/80754af9/joblib_1613502643832/work
Keras-Applications @ file:///tmp/build/80754af9/keras-applications_1594366238411/work
Keras-Preprocessing @ file:///tmp/build/80754af9/keras-preprocessing_1612283640596/work
kiwisolver @ file:///C:/ci/kiwisolver_1612282606037/work
Markdown @ file:///C:/ci/markdown_1614364121613/work
matplotlib @ file:///C:/ci/matplotlib-base_1592837548929/work
mkl-fft==1.3.0
mkl-random==1.1.1
mkl-service==2.3.0
multidict @ file:///C:/ci/multidict_1607362065515/work
numpy @ file:///C:/ci/numpy_and_numpy_base_1603466732592/work
nvidia-ml-py3 @ git+https://github.com/deepfakes/nvidia-ml-py3.git@6fc29ac84b32bad877f078cb4a777c1548a00bf6
oauthlib==3.1.0
olefile==0.46
opencv-python==4.5.1.48
opt-einsum==3.1.0
pathlib==1.0.1
Pillow @ file:///C:/ci/pillow_1614711642219/work
protobuf==3.14.0
psutil @ file:///C:/ci/psutil_1612298324802/work
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
PyJWT==1.7.1
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1608057966937/work
pyparsing @ file:///home/linux1/recipes/ci/pyparsing_1610983426697/work
pyreadline==2.1
PySocks @ file:///C:/ci/pysocks_1605287845585/work
python-dateutil @ file:///home/ktietz/src/ci/python-dateutil_1611928101742/work
pywin32==227
requests @ file:///tmp/build/80754af9/requests_1608241421344/work
requests-oauthlib==1.3.0
rsa @ file:///tmp/build/80754af9/rsa_1614366226499/work
scikit-learn @ file:///C:/ci/scikit-learn_1614446896245/work
scipy @ file:///C:/ci/scipy_1614023125644/work
sip==4.19.13
six @ file:///C:/ci/six_1605187374963/work
tensorboard @ file:///home/builder/ktietz/aggregate/tensorflow_recipes/ci_te/tensorboard_1614593728657/work/tmp_pip_dir
tensorboard-plugin-wit==1.6.0
tensorflow==2.3.0
tensorflow-estimator @ file:///tmp/build/80754af9/tensorflow-estimator_1599136169057/work/whl_temp/tensorflow_estimator-2.3.0-py2.py3-none-any.whl
termcolor==1.1.0
threadpoolctl @ file:///tmp/tmp9twdgx9k/threadpoolctl-2.1.0-py3-none-any.whl
tornado @ file:///C:/ci/tornado_1606942392901/work
tqdm @ file:///tmp/build/80754af9/tqdm_1611857934208/work
typing-extensions @ file:///tmp/build/80754af9/typing_extensions_1611751222202/work
urllib3 @ file:///tmp/build/80754af9/urllib3_1611694770489/work
Werkzeug @ file:///home/ktietz/src/ci/werkzeug_1611932622770/work
win-inet-pton @ file:///C:/ci/win_inet_pton_1605306167264/work
wincertstore==0.2
wrapt==1.12.1
yarl @ file:///C:/ci/yarl_1606940076464/work
zipp @ file:///tmp/build/80754af9/zipp_1604001098328/work
============== Conda Packages ==============
# packages in environment at C:\Users\STARBUSTER\MiniConda3\envs\faceswa:
#
# Name Version Build Channel
_tflow_select 2.3.0 gpu
absl-py 0.11.0 pyhd3eb1b0_1
aiohttp 3.7.4 py38h2bbff1b_1
astunparse 1.6.3 py_0
async-timeout 3.0.1 py38haa95532_0
attrs 20.3.0 pyhd3eb1b0_0
blas 1.0 mkl
blinker 1.4 py38haa95532_0
brotlipy 0.7.0 py38h2bbff1b_1003
ca-certificates 2021.1.19 haa95532_0
cachetools 4.2.1 pyhd3eb1b0_0
certifi 2020.12.5 py38haa95532_0
cffi 1.14.5 py38hcd4344a_0
chardet 3.0.4 py38haa95532_1003
click 7.1.2 pyhd3eb1b0_0
coverage 5.5 py38h2bbff1b_2
cryptography 3.3.1 py38hcd4344a_1
cudatoolkit 10.1.243 h74a9793_0
cudnn 7.6.5 cuda10.1_0
cycler 0.10.0 py38_0
cython 0.29.22 py38hd77b12b_0
fastcluster 1.1.26 py38h251f6bf_2 conda-forge
ffmpeg 4.3.1 ha925a31_0 conda-forge
ffmpy 0.2.3 pypi_0 pypi
freetype 2.10.4 hd328e21_0
gast 0.4.0 py_0
git 2.23.0 h6bb4b03_0
google-auth 1.27.1 pyhd3eb1b0_0
google-auth-oauthlib 0.4.3 pyhd3eb1b0_0
google-pasta 0.2.0 py_0
grpcio 1.36.1 py38hc60d5dd_1
h5py 2.10.0 py38h5e291fa_0
hdf5 1.10.4 h7ebc959_0
icc_rt 2019.0.0 h0cc432a_1
icu 58.2 ha925a31_3
idna 2.10 pyhd3eb1b0_0
imageio 2.9.0 py_0
imageio-ffmpeg 0.4.3 pyhd8ed1ab_0 conda-forge
importlib-metadata 2.0.0 py_1
intel-openmp 2020.2 254
joblib 1.0.1 pyhd3eb1b0_0
jpeg 9b hb83a4c4_2
keras-applications 1.0.8 py_1
keras-preprocessing 1.1.2 pyhd3eb1b0_0
kiwisolver 1.3.1 py38hd77b12b_0
libpng 1.6.37 h2a8f88b_0
libprotobuf 3.14.0 h23ce68f_0
libtiff 4.1.0 h56a325e_1
lz4-c 1.9.3 h2bbff1b_0
markdown 3.3.4 py38haa95532_0
matplotlib 3.2.2 0
matplotlib-base 3.2.2 py38h64f37c6_0
mkl 2020.2 256
mkl-service 2.3.0 py38h196d8e1_0
mkl_fft 1.3.0 py38h46781fe_0
mkl_random 1.1.1 py38h47e9c7a_0
multidict 5.1.0 py38h2bbff1b_2
numpy 1.19.2 py38hadc3359_0
numpy-base 1.19.2 py38ha3acd2a_0
nvidia-ml-py3 7.352.1 pypi_0 pypi
oauthlib 3.1.0 py_0
olefile 0.46 py_0
opencv-python 4.5.1.48 pypi_0 pypi
openssl 1.1.1j h2bbff1b_0
opt_einsum 3.1.0 py_0
pathlib 1.0.1 py_1
pillow 8.1.1 py38h4fa10fc_0
pip 21.0.1 py38haa95532_0
protobuf 3.14.0 py38hd77b12b_1
psutil 5.8.0 py38h2bbff1b_1
pyasn1 0.4.8 py_0
pyasn1-modules 0.2.8 py_0
pycparser 2.20 py_2
pyjwt 1.7.1 py38_0
pyopenssl 20.0.1 pyhd3eb1b0_1
pyparsing 2.4.7 pyhd3eb1b0_0
pyqt 5.9.2 py38ha925a31_4
pyreadline 2.1 py38_1
pysocks 1.7.1 py38haa95532_0
python 3.8.8 hdbf39b2_4
python-dateutil 2.8.1 pyhd3eb1b0_0
python_abi 3.8 1_cp38 conda-forge
pywin32 227 py38he774522_1
qt 5.9.7 vc14h73c81de_0
requests 2.25.1 pyhd3eb1b0_0
requests-oauthlib 1.3.0 py_0
rsa 4.7.2 pyhd3eb1b0_1
scikit-learn 0.24.1 py38hf11a4ad_0
scipy 1.6.1 py38h14eb087_0
setuptools 52.0.0 py38haa95532_0
sip 4.19.13 py38ha925a31_0
six 1.15.0 py38haa95532_0
sqlite 3.33.0 h2a8f88b_0
tensorboard 2.4.0 pyhc547734_0
tensorboard-plugin-wit 1.6.0 py_0
tensorflow 2.3.0 mkl_py38h1fcfbd6_0
tensorflow-base 2.3.0 gpu_py38h7339f5a_0
tensorflow-estimator 2.3.0 pyheb71bc4_0
tensorflow-gpu 2.3.0 he13fc11_0
termcolor 1.1.0 py38haa95532_1
threadpoolctl 2.1.0 pyh5ca1d4c_0
tk 8.6.10 he774522_0
tornado 6.1 py38h2bbff1b_0
tqdm 4.56.0 pyhd3eb1b0_0
typing-extensions 3.7.4.3 hd3eb1b0_0
typing_extensions 3.7.4.3 pyh06a4308_0
urllib3 1.26.3 pyhd3eb1b0_0
vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
werkzeug 1.0.1 pyhd3eb1b0_0
wheel 0.36.2 pyhd3eb1b0_0
win_inet_pton 1.1.0 py38haa95532_0
wincertstore 0.2 py38_0
wrapt 1.12.1 py38he774522_1
xz 5.2.5 h62dcd97_0
yarl 1.6.3 py38h2bbff1b_0
zipp 3.4.0 pyhd3eb1b0_0
zlib 1.2.11 h62dcd97_4
zstd 1.4.5 h04227a9_0
================= Configs ==================
--------- .faceswap ---------
backend: nvidia
--------- convert.ini ---------
[color.color_transfer]
clip: True
preserve_paper: True
[color.manual_balance]
colorspace: HSV
balance_1: 0.0
balance_2: 0.0
balance_3: 0.0
contrast: 0.0
brightness: 0.0
[color.match_hist]
threshold: 99.0
[mask.box_blend]
type: gaussian
distance: 11.0
radius: 5.0
passes: 1
[mask.mask_blend]
type: normalized
kernel_size: 3
passes: 4
threshold: 4
erosion: 0.0
[scaling.sharpen]
method: none
amount: 150
radius: 0.3
threshold: 5.0
[writer.ffmpeg]
container: mp4
codec: libx264
crf: 23
preset: medium
tune: none
profile: auto
level: auto
skip_mux: False
[writer.gif]
fps: 25
loop: 0
palettesize: 256
subrectangles: False
[writer.opencv]
format: png
draw_transparent: False
jpg_quality: 75
png_compress_level: 3
[writer.pillow]
format: png
draw_transparent: False
optimize: False
gif_interlace: True
jpg_quality: 75
png_compress_level: 3
tif_compression: tiff_deflate
--------- extract.ini ---------
[global]
allow_growth: False
[align.fan]
batch-size: 12
[detect.cv2_dnn]
confidence: 50
[detect.mtcnn]
minsize: 20
threshold_1: 0.6
threshold_2: 0.7
threshold_3: 0.7
scalefactor: 0.709
batch-size: 8
[detect.s3fd]
confidence: 70
batch-size: 4
[mask.unet_dfl]
batch-size: 8
[mask.vgg_clear]
batch-size: 6
[mask.vgg_obstructed]
batch-size: 2
--------- gui.ini ---------
[global]
fullscreen: False
tab: extract
options_panel_width: 30
console_panel_height: 20
icon_size: 14
font: default
font_size: 9
autosave_last_session: prompt
timeout: 120
auto_load_model_stats: True
--------- train.ini ---------
[global]
centering: face
coverage: 68.75
icnr_init: False
conv_aware_init: False
optimizer: adam
learning_rate: 5e-05
reflect_padding: False
allow_growth: False
mixed_precision: False
convert_batchsize: 16
[global.loss]
loss_function: ssim
mask_loss_function: mse
l2_reg_term: 100
eye_multiplier: 3
mouth_multiplier: 2
penalized_mask_loss: True
mask_type: extended
mask_blur_kernel: 3
mask_threshold: 4
learn_mask: False
[model.dfaker]
output_size: 128
[model.dfl_h128]
lowmem: False
[model.dfl_sae]
input_size: 128
clipnorm: True
architecture: df
autoencoder_dims: 0
encoder_dims: 42
decoder_dims: 21
multiscale_decoder: False
[model.dlight]
features: best
details: good
output_size: 256
[model.original]
lowmem: False
[model.realface]
input_size: 64
output_size: 128
dense_nodes: 1536
complexity_encoder: 128
complexity_decoder: 512
[model.unbalanced]
input_size: 128
lowmem: False
clipnorm: True
nodes: 1024
complexity_encoder: 128
complexity_decoder_a: 384
complexity_decoder_b: 512
[model.villain]
lowmem: False
[trainer.original]
preview_images: 14
zoom_amount: 5
rotation_range: 10
shift_range: 5
flip_chance: 50
color_lightness: 30
color_ab: 8
color_clahe_chance: 50
color_clahe_max_size: 4
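One thing a dump like this can be checked for is which tensorflow, cudatoolkit and cudnn builds conda installed, since the advice in this thread for 30xx cards is Tensorflow 2.4.x on CUDA 11.x. The Python sketch below reads a few rows copied from the "Conda Packages" listing above. The helper names are made up for illustration, and the "CUDA 11 or newer" threshold is this thread's advice, not something the script knows.

```python
# A few rows copied from the "Conda Packages" listing in the dump.
CONDA_ROWS = """\
cudatoolkit 10.1.243 h74a9793_0
cudnn 7.6.5 cuda10.1_0
tensorflow 2.3.0 mkl_py38h1fcfbd6_0
tensorflow-gpu 2.3.0 he13fc11_0
"""


def package_versions(listing, names):
    """Map each requested package name to the version found in the listing."""
    found = {}
    for line in listing.splitlines():
        parts = line.split()
        # Skip comment rows and rows without a name and a version column.
        if len(parts) >= 2 and not line.startswith("#") and parts[0] in names:
            found[parts[0]] = parts[1]
    return found


def major(version):
    """Return the leading integer of a dotted version string."""
    return int(version.split(".")[0])


versions = package_versions(CONDA_ROWS, {"cudatoolkit", "cudnn", "tensorflow"})
print(versions)
# Threshold taken from the advice in this thread for 30xx cards.
print("cudatoolkit is CUDA 11 or newer:", major(versions["cudatoolkit"]) >= 11)
```

For the rows above this reports cudatoolkit 10.1.243, cudnn 7.6.5 and tensorflow 2.3.0, so the environment in the dump is not the Tensorflow 2.4 / CUDA 11 combination described elsewhere in this thread.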
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
Hello. Hi!
Anyone here anymore? Has anyone tried RTX 30XX with the new Tensorflow 2.4.1 from Anaconda?
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
I think Bryan said it was only compiled for CUDA 10.0 and we need 11?
I'm at work and often have to look this stuff up before I sound informed.
I should mention that by following the first post in this forum I've installed for the 30 series with no problem on Windows.
Linux decided it doesn't want to behave for me, unlike other people.
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
Ahh looking closer it appears that only the Linux version was bumped to 2.4.1.
linux-64 v2.4.1
win-64 v2.3.0
osx-64 v2.0.0
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
I found my way but that was not so easy...
Linux Fedora 32, RTX 3070 FE
Don't expect the conda tensorflow version to be OK on Linux. Even though it's a 2.4 version, the cuDNN version is 7.1, not 8.x, and you need an 8.x version! So...
First, I installed CUDA from rpmfusion (CUDA 11.1) and the RPM package from the Nvidia website: "cuDNN Runtime Library for RedHat/Centos 8.1 x86_64 (RPM)"
Then I did exactly what is explained in the first post:
- install with the faceswap_setup bash file
- select CPU
- conda activate faceswap
- conda remove tensorflow
- pip install tensorflow-gpu==2.4.1 (not 2.4, it will core dump)
- remove config/.faceswap file
But... I had an error...
Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
EDIT:
You should change preferences in Extract AND Training:
- Allow Growth to True
- (Optional) Mixed Precision to True
A training run is in progress, so I will edit this post to say whether this option also fixes other models.
Note: it's very fast, faster than my old 1060
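For the curious: in Tensorflow 2.x itself, growing GPU memory on demand (instead of grabbing nearly all VRAM at start-up) is requested with `tf.config.experimental.set_memory_growth`. Whether faceswap's Allow Growth option calls exactly this is an assumption on my part. The sketch below takes the `tf` module as a parameter and uses a stand-in object for the demo, so it runs without Tensorflow installed. `enable_memory_growth` and `fake_tf` are made-up names.

```python
from types import SimpleNamespace


def enable_memory_growth(tf):
    """Ask Tensorflow to grow GPU memory on demand; return the GPU count."""
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be called before the GPUs are first used by Tensorflow.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


# Stand-in for the real module, recording the calls made on it.
calls = []
fake_tf = SimpleNamespace(
    config=SimpleNamespace(
        list_physical_devices=lambda kind: ["GPU:0"] if kind == "GPU" else [],
        experimental=SimpleNamespace(
            set_memory_growth=lambda gpu, flag: calls.append((gpu, flag))
        ),
    )
)
print(enable_memory_growth(fake_tf), calls)
```

With the real library you would `import tensorflow as tf` and call `enable_memory_growth(tf)` at the very start of the script.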
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
That's great you got it to work!
"Allow growth" is often needed on these Nvidia cards. I don't know why; it seems random. It happens more often on 20X0 and 30X0.
It's no big deal.
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
Yes, I did it for my 1060 card earlier, but didn't remember to turn it on for my brand new 3070 FE.
So the solution is:
- install Cuda >= 11.1
- install CuDnn >= 8.1
- make the installation for "CPU"
- conda activate faceswap
- remove tensorflow with "conda remove tensorflow"
- install tensorflow-gpu==2.4.1
- remove config/.faceswap*
- start faceswap
- go in Settings > Configure Settings
- in Extract: Allow Growth checked
- in Train, set Allow Growth and optionally Mixed Precision
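The last three steps go through the GUI, but the same switch shows up as `allow_growth` under `[global]` in the extract.ini and train.ini dumps posted earlier in this thread. As a rough sketch only, this is how that value could be flipped with Python's configparser. It works on a small text fragment rather than a real file, and `set_allow_growth` is a made-up helper.

```python
import configparser
import io

# Fragment in the same "key: value" layout as the extract.ini dump.
EXTRACT_INI = """\
[global]
allow_growth: False
[detect.s3fd]
confidence: 70
batch-size: 4
"""


def set_allow_growth(ini_text, enabled=True):
    """Return ini_text with [global] allow_growth set to the given value."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    parser["global"]["allow_growth"] = str(enabled)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


updated = set_allow_growth(EXTRACT_INI)
# Parse the result again to confirm the change and that other keys survive.
check = configparser.ConfigParser()
check.read_string(updated)
print(check.getboolean("global", "allow_growth"))
```

Note that writing a file back through configparser drops any comments it contained, so the Settings dialog remains the safer route for the real config files.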
Note: Tkinter is not very "pretty" on Linux. I made some changes to add ttkthemes and the necessary preferences entry, so I now have a nicer interface. I will try to send a PR.
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
@freedzy is this a single card install, or do you have other cards installed? Like a 20x0 series as well?
Re: [Guide] Using Faceswap on Nvidia RTX 30xx cards
That's mainly not what I did. I changed the code to use ttkthemes, which gives very pretty themes and a selectable theme list in the configuration.