Add TODO file cleanup to avoid a single package blocking the entire sync process #1434

Open
89ao opened this issue Apr 24, 2023 · 5 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@89ao
Contributor

89ao commented Apr 24, 2023

As the log below shows, I first noticed that the packages "oreo4" and "spanishconjugator" were not being updated. Checking the log, I found that "oreo4" no longer exists on PyPI and "spanishconjugator" fails the serial check (stale serial).
The problem is that a failure to update individual packages should not block the overall sync. Otherwise, the sync stays stuck in a loop on these two packages forever.

# cat /yum/pip/todo
17825673
oreo4 17825509
spanishconjugator 17825562
2023-04-24 14:05:42 bandersnatch.package: INFO Fetching metadata for package: oreo4 (serial 17825509)
2023-04-24 14:05:42 bandersnatch.package: INFO Fetching metadata for package: spanishconjugator (serial 17825562)
2023-04-24 14:05:42 bandersnatch.package: ERROR Stale serial for package spanishconjugator - Attempt 1
2023-04-24 14:05:42 bandersnatch.package: INFO oreo4 no longer exists on PyPI
2023-04-24 14:05:43 bandersnatch.package: INFO Fetching metadata for package: spanishconjugator (serial 17825562)
2023-04-24 14:05:43 bandersnatch.package: ERROR Stale serial for package spanishconjugator - Attempt 2
2023-04-24 14:05:45 bandersnatch.package: INFO Fetching metadata for package: spanishconjugator (serial 17825562)
2023-04-24 14:05:45 bandersnatch.package: ERROR Stale serial for package spanishconjugator - Attempt 3
2023-04-24 14:05:45 bandersnatch.package: ERROR Stale serial for spanishconjugator (17825562) not updating. Giving up.
2023-04-24 14:05:45 bandersnatch.mirror: ERROR Error syncing package: spanishconjugator@17825562
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/bandersnatch/package.py", line 61, in update_metadata
    self._metadata = await master.get_package_metadata(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/bandersnatch/master.py", line 220, in get_package_metadata
    metadata_response = await metadata_generator.asend(None)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/bandersnatch/master.py", line 138, in get
    await self.check_for_stale_cache(path, required_serial, got_serial)
  File "/usr/local/lib/python3.11/site-packages/bandersnatch/master.py", line 117, in check_for_stale_cache
    raise StalePage(
bandersnatch.master.StalePage: Expected PyPI serial 17825562 for request https://pypi.org//pypi/spanishconjugator/json but got 17825558. We can no longer issue a PURGE. Report issue to PyPA Warehouse GitHub if it persists ...

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/bandersnatch/mirror.py", line 129, in package_syncer
    await package.update_metadata(self.master, attempts=3)
  File "/usr/local/lib/python3.11/site-packages/bandersnatch/package.py", line 86, in update_metadata
    raise error_class(package_name=self.name, attempts=attempts)
bandersnatch.errors.StaleMetadata: Stale serial for spanishconjugator after 3 attempts
2023-04-24 14:05:45 bandersnatch.simple: INFO Generating global index page.
2023-04-24 14:05:49 bandersnatch.mirror: INFO 0 packages had changes

Please help me resolve these issues.

@cooperlees
Contributor

Howdy - can we also get your bandersnatch.conf added to the PR so that your usage can be confirmed, added to tests, etc., please?

@cooperlees cooperlees added bug Something isn't working help wanted Extra attention is needed labels Apr 24, 2023
@89ao
Contributor Author

89ao commented Apr 25, 2023

@cooperlees here is my bandersnatch.conf:

[mirror]
directory = /opt/bandersnatch
storage-backend = filesystem
master = https://pypi.org/
json = true
timeout = 300
workers = 3
hash-index = false
stop-on-error = false
delete-packages = true
compare-method = stat
log-config = /conf/bandersnatch-log.conf


[plugins]
enabled =
    blocklist_project
    blocklist_release
    regex_project


[blocklist]
packages =
    uselesscapitalquiz
    tf-nightly-gpu
    tf-nightly
    tensorflow-io-nightly
    tf-nightly-cpu
    pyagrum-nightly
    appium
[filter_regex]
packages =
    .+-nightly.*

@89ao
Contributor Author

89ao commented May 31, 2023

The issue has happened again. When I check banderlogfile.log, there appear to be no errors, but the /yum/pip/todo file always keeps these "no longer exists" packages, which prevents new synchronization tasks from running.

Recently our PyPI repo has been out of sync many times. Please help me solve this problem. @cooperlees

bandersnatch version: 6.0.0

[root@VM_21_104_centos /data/home/motorao/bandersnatch]# tail -n 10 /yum/pip/banderlogfile.log
2023-05-31 19:02:01 bandersnatch.package: INFO zhanlan1 no longer exists on PyPI
2023-05-31 19:02:01 bandersnatch.package: INFO Fetching metadata for package: zlkj (serial 18121307)
2023-05-31 19:02:01 bandersnatch.package: INFO zhanlanpkg no longer exists on PyPI
2023-05-31 19:02:01 bandersnatch.package: INFO Fetching metadata for package: zwhrce (serial 18119235)
2023-05-31 19:02:01 bandersnatch.package: INFO zhanlanu no longer exists on PyPI
2023-05-31 19:02:01 bandersnatch.package: INFO zlkj no longer exists on PyPI
2023-05-31 19:02:01 bandersnatch.package: INFO zwhrce no longer exists on PyPI
2023-05-31 19:02:01 bandersnatch.simple: INFO Generating global index page.
2023-05-31 19:02:06 bandersnatch.mirror: INFO 0 packages had changes
2023-05-31 19:02:06 bandersnatch.mirror: INFO Writing diff file to mirrored-files

todo.zip

@cooperlees
Contributor

cooperlees commented Jun 1, 2023

So, what I believe is happening here is that you ran a sync and it failed somehow, and between that sync and the next, some of the packages in your todo file got deleted from PyPI. So it seems we get stuck in a loop of always checking whether each one has "come back".

I think we could introduce behavior to get out of this loop. But since this has been the behavior for a long time I think we have to gate it via a config option or CLI option.

I'd accept a config/CLI option (like --cleanup-todo) that deletes these packages from the todo list if they raise PackageNotFound (i.e., are no longer found on PyPI).

  • The config option could also be a boolean: cleanup_todo = false/true (default: false)
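
To make the proposal concrete, here is a minimal sketch of how a gated cleanup might behave. It is not bandersnatch's actual code: the `TodoSync` class, its fields, and the local `PackageNotFound` stand-in are hypothetical, and "examplepkg" with its serial is made up for the demo. Only the `cleanup_todo` name and its default of false come from the suggestion above.

```python
from dataclasses import dataclass, field


class PackageNotFound(Exception):
    """Local stand-in for the error raised when a package is gone from PyPI."""


@dataclass
class TodoSync:
    # Hypothetical model of one sync run; not bandersnatch's real Mirror class.
    todo: dict[str, int]            # package name -> serial still to sync
    deleted_upstream: set[str]      # names that no longer exist on PyPI
    cleanup_todo: bool = False      # proposed option, off by default
    synced: list[str] = field(default_factory=list)

    def fetch(self, name: str) -> None:
        # Simulates the metadata request failing for a deleted package.
        if name in self.deleted_upstream:
            raise PackageNotFound(name)

    def run(self) -> dict[str, int]:
        """Process every todo entry and return what is left for the next run."""
        remaining: dict[str, int] = {}
        for name, serial in self.todo.items():
            try:
                self.fetch(name)
            except PackageNotFound:
                if not self.cleanup_todo:
                    # Behaviour described above: the entry stays and is retried.
                    remaining[name] = serial
                continue
            self.synced.append(name)
        return remaining


if __name__ == "__main__":
    todo = {"oreo4": 17825509, "examplepkg": 1}  # "examplepkg" is made up
    print(TodoSync(dict(todo), {"oreo4"}).run())
    print(TodoSync(dict(todo), {"oreo4"}, cleanup_todo=True).run())
```

With the option off, the deleted package is carried into the next run's todo list; with it on, the entry is dropped and the list can drain.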

Your manual workaround for now is to just remove all the "no longer exists on PyPI" packages from your todo file or just remove your todo file.
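
The first workaround could be scripted roughly as follows. This is a sketch under assumptions: the todo layout (a first line holding only a serial, then "name serial" lines) is inferred from the `cat /yum/pip/todo` output earlier in this thread, the log pattern is taken from the "no longer exists on PyPI" lines shown above, and the function names are hypothetical helpers, not part of bandersnatch.

```python
import re
from pathlib import Path

# Matches log lines such as:
#   ... bandersnatch.package: INFO oreo4 no longer exists on PyPI
GONE = re.compile(r"INFO (\S+) no longer exists on PyPI")


def missing_packages(log_text: str) -> set[str]:
    """Collect package names the log reports as gone from PyPI."""
    return set(GONE.findall(log_text))


def prune_todo(todo_text: str, missing: set[str]) -> str:
    """Drop 'name serial' entries for missing packages; keep everything else."""
    kept = []
    for line in todo_text.splitlines():
        parts = line.split()
        # The first line holds only a serial (one field), so it is always kept.
        if len(parts) == 2 and parts[0] in missing:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n" if kept else ""


def prune_todo_file(todo_path: Path, log_path: Path) -> None:
    """Rewrite the todo file in place, after saving a .bak copy next to it."""
    todo_text = todo_path.read_text()
    todo_path.with_name(todo_path.name + ".bak").write_text(todo_text)
    missing = missing_packages(log_path.read_text())
    todo_path.write_text(prune_todo(todo_text, missing))
```

For the setup in this issue, that would be `prune_todo_file(Path("/yum/pip/todo"), Path("/yum/pip/banderlogfile.log"))`, run while no sync is in progress. It only removes entries the log reports as deleted; a package stuck on a stale serial, like spanishconjugator above, stays in the file.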

@89ao
Contributor Author

89ao commented Jun 1, 2023

Looks good to me, looking forward to the update. :)

@cooperlees cooperlees changed the title from "some package cannot be updated ,which is blocking the process of the whole syncing task" to "Add TODO file cleanup to avoid a single package blocking the entire sync process" Sep 22, 2023
2 participants