Troubleshooting

crYOLO is very slow and does not use the GPUs to the full potential
So far, this problem has only been reported for CentOS/Rocky Linux, but other distributions might be affected as well.
It was caused by problematic entries in LD_LIBRARY_PATH. Check whether your path contains many entries:
echo $LD_LIBRARY_PATH
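
A long LD_LIBRARY_PATH is hard to read as a single line. The following is a minimal sketch that prints one entry per line and flags entries whose path contains "cuda". The helper name list_ld_entries is made up for this example and is not part of crYOLO, and the flag is only a hint based on the path name.

```shell
# Print each LD_LIBRARY_PATH entry on its own line and flag likely CUDA entries.
# list_ld_entries is a hypothetical helper name, not a crYOLO command.
list_ld_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
        # Skip empty entries (e.g. from a leading or trailing colon).
        [ -z "$entry" ] && continue
        case "$entry" in
            *[Cc][Uu][Dd][Aa]*) printf 'CUDA?  %s\n' "$entry" ;;
            *)                  printf '       %s\n' "$entry" ;;
        esac
    done
}

list_ld_entries "${LD_LIBRARY_PATH:-}"
```

If nothing is printed, LD_LIBRARY_PATH is empty or unset.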
CUDA entries from other software packages are especially problematic. If LD_LIBRARY_PATH is not empty, check whether the following prefix fixes the problem: put LD_LIBRARY_PATH='' in front of the cryolo_train.py / cryolo_predict.py command, e.g.:
LD_LIBRARY_PATH='' cryolo_train.py -c your_config.json -w 5
If that works, you can make the fix permanent. The following instructions set LD_LIBRARY_PATH to an empty value when the cryolo environment is activated and restore the old LD_LIBRARY_PATH once the environment is deactivated. I assume that your crYOLO environment is named cryolo. Do the following:
conda activate cryolo
conda env config vars set LD_LIBRARY_PATH=
conda deactivate
conda activate cryolo
If you now run
echo $LD_LIBRARY_PATH
it should be empty. If you do
conda deactivate
and run
echo $LD_LIBRARY_PATH
you should see all your libraries again.
Now the fix should be permanent, and you no longer need to add LD_LIBRARY_PATH='' in front of the crYOLO commands.

crYOLO crashed with glibc errors

Note
This problem should be solved with crYOLO 1.8.2
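
To see whether your system is affected at all, you can compare the installed glibc version with the 2.29 mentioned under Alternative B. The following is a rough sketch, assuming a Linux system with GNU coreutils (for sort -V) and getconf. The helper name version_ge is made up for this example.

```shell
# Return success if version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper; it relies on GNU sort's -V option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# On glibc systems getconf prints e.g. "glibc 2.31"; keep only the version number.
# "|| true" keeps the script going if getconf does not know this variable.
current=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}') || true

if [ -n "$current" ] && version_ge "$current" 2.29; then
    echo "glibc $current is recent enough (>= 2.29)"
else
    echo "glibc ${current:-unknown} is older than 2.29 or could not be detected"
fi
```

If the reported version is already 2.29 or newer, compiling glibc manually is probably not needed.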

Alternative A:
Within your crYOLO environment run:
pip install nvidia-tensorflow==1.15.5+nv22.1
Alternative B:
The current CUDA 11 instructions require a fairly recent glibc version (>= 2.29), which not all systems provide.
However, you can compile it manually and force crYOLO to use it:

Download and compile a recent glibc (>= 2.29)

wget http://ftp.gnu.org/gnu/libc/glibc-2.34.tar.xz
tar xvf glibc-2.34.tar.xz
mkdir glibc-2.34/build
cd glibc-2.34/build
sudo mkdir /opt/glibc-2.34
../configure --prefix=/opt/glibc-2.34
make -j 8
sudo make install

Add an environment variable to the cryolo environment (I assume the environment is named “cryolo”):

conda activate cryolo
conda env config vars set LD_PRELOAD=/opt/glibc-2.34/lib/libm.so.6

Reload your environment

conda deactivate
conda activate cryolo
Now you should be able to run crYOLO with CUDA 11.
Thanks to Wolfgang Lugmayr for the instructions!

crYOLO freezes

Note
This problem has been solved since crYOLO 1.7.4, in which multithreading replaced multiprocessing.

On some machines crYOLO freezes during or at the end of training. The problem occurs in connection with
multiprocessing and lies deep inside one of the libraries we use. You can solve it by using
multithreading instead of multiprocessing. To do so, either use the option --use_multithreading
or make the change permanent by setting the following environment variables in your crYOLO environment:
conda activate cryolo
conda env config vars set CRYOLO_MP_START="fork"
conda env config vars set CRYOLO_USE_MULTITHREADING="True"
You need to reactivate your environment for the changes to take effect:
conda activate cryolo
Now multithreading is used instead of multiprocessing.
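
To confirm that the two variables are visible in the reactivated environment, a small check like the following can help. This is only a sketch. The helper name report_var is made up for this example and is not part of crYOLO or conda.

```shell
# report_var is a hypothetical helper: print a variable's value,
# or a note if the variable is not set in the current shell.
report_var() {
    eval "value=\${$1-__unset__}"
    if [ "$value" = "__unset__" ]; then
        echo "$1 is not set"
    else
        echo "$1=$value"
    fi
}

report_var CRYOLO_MP_START
report_var CRYOLO_USE_MULTITHREADING
```

With a reasonably recent conda, running conda env config vars list inside the environment should show the same values.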

crYOLO has memory problems
crYOLO can crash during training because of memory problems.
In those cases you can try the following:

Reduce the batch_size. I recommend reducing it stepwise by 1 and would not choose a value below 3. You can find batch_size in your configuration file or in the Training options tab of the config Action.
Reduce the input_size. Instead of 1024 you can choose any multiple of 32, so the next smaller input size would be 31*32=992. Don’t go too low (< 768), as you might run into problems with very small particles. You can find input_size in your configuration file or in the Model options tab of the config Action.
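
The multiple-of-32 rule for input_size can be worked out with a few lines of shell. The sketch below lists the candidate sizes from 1024 down to the suggested lower limit of 768. The helper name next_smaller_input_size is made up for this example and is not a crYOLO command.

```shell
# Largest multiple of 32 that is strictly smaller than the given input size.
# next_smaller_input_size is a hypothetical helper, not a crYOLO command.
next_smaller_input_size() {
    echo $(( ( ($1 - 1) / 32 ) * 32 ))
}

# Walk down from 1024 to the suggested lower limit of 768.
size=1024
while [ "$size" -ge 768 ]; do
    echo "$size"
    size=$(next_smaller_input_size "$size")
done
```

Each printed value is a valid input_size to try, starting with the largest.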

I need more help
Find help at our mailing list!