Skip to content
This repository has been archived by the owner on Jan 16, 2023. It is now read-only.

Don't suppress errors when using threads #210

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

lepilepi
Copy link

@lepilepi lepilepi commented Oct 14, 2021

When I was using collectfast in a CI system, some of the assets were not copied.
Sometimes it copied all, sometimes just a subset without any errors.

This is what I experienced:

~ $ python manage.py collectstatic --noinput --disable-collectfast
1249 static files copied, 59 unmodified.
~ $
~ $ echo 'Deleting the files from AWS now'
Deleting the files from AWS now
~ $
~ $ python manage.py collectstatic --noinput
790 static files copied.
~ $ python manage.py collectstatic --noinput
518 static files copied.
~ $ python manage.py collectstatic --noinput
0 static files copied.
~ $ python manage.py collectstatic --noinput
0 static files copied.
~ $
~ $ echo 'Deleting the files from AWS now'
Deleting the files from AWS now
~ $
~ $ python manage.py collectstatic --noinput
697 static files copied.
~ $ python manage.py collectstatic --noinput
611 static files copied.
~ $ python manage.py collectstatic --noinput
0 static files copied.
~ $ python manage.py collectstatic --noinput
0 static files copied.
~ $
~ $

It turned out, that the redis cache I used has some connection limits, so in the background I was having redis.exceptions.ConnectionError: max number of clients reached errors without being notified.

The collectstatic build step succeeded all times while having these errors and not copying some files.

The proposed fix simply waits for all the results and re-raises any errors, so the collectstatic command will fail loudly when an error happens in at least one thread.

(I wanted to write a test for this, but wasn't able to get the test suite running so I gave it up.)

@antonagestam
Copy link
Owner

@lepilepi Hmm, yeah looks like django-storages has removed .gzip from the S3 class. So that would have to be fixed in a separate PR first.

The expected behavior for failures should be to always fall back to the builtin collectstatic command instead of falling over. It sounds like something else has happened since your files weren't copied. Step 1 would probably be to reproduce that with a test.

I have limited time to work on this project at the moment and probably won't be able to help out much, but will review PRs with passing tests.

@lepilepi
Copy link
Author

I don't think anything else happened: I simply had redis.exceptions.ConnectionError errors in some of the threads, which was entirely swallowed by the ThreadPoolExecutor().map() so in the end there were no errors displayed, marked as success while missing several files.
If any other error might happen in those threads. those will be masked too, resulting as success.

At the moment, I simply reduced the number of the threads in COLLECTFAST_THREADS, hoping with crossed fingers that we won't hit it again. If so, we won't even notice it during the build.

I don't have much time to work on this either unfortunately. If I could run the test in local properly, I could add one to reproduce this issue easily but I don't have the time to figure out and fix the already failing tests in master, neither to figure out the why I can't run the tests in local based on the README (and update that as well if needed). (Something might be with the dependency versions).

@antonagestam
Copy link
Owner

which was entirely swallowed by the ThreadPoolExecutor().map() so in the end there were no errors displayed, marked as success while missing several files.

Yeah no I agree, I'm saying that's not to be expected. What would be expected is for those to have been retried using builtin collectstatic and whatever behavior is implemented there (which I assume is to fail loudly, but it's honestly too long ago for me to remember).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants