This repository has been archived by the owner on Oct 18, 2022. It is now read-only.

Workaround for Too Many Requests (HTTP error 429) #29

Open
mitm001 opened this issue Jul 17, 2020 · 6 comments

mitm001 commented Jul 17, 2020

I use Antora to build my doc site. When it builds pages, it adds the same table of contents and header to every page, and every link in them has the same class, nav-link. As my site has more than 250 pages, there are thousands of these duplicated links.

Sites start rejecting requests after so many hits, so with the default of 512 concurrent HTTP requests I get thousands of "Too Many Requests (HTTP error 429)" errors. I reduced this to 32 to slow things down, which brings the errors down to the hundreds.

I use a regex to skip the header links that are never going to change, but the ones in the TOC are always changing.

Are there any other configurations I could take advantage of to reduce these errors from the TOC? For example, skipping links based on the class of the anchor element?
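
For reference, a minimal sketch of the setup described above: the regex skip corresponds to Liche's `-x, --exclude <regex>` option, combined here with the reduced concurrency. The URL pattern is a hypothetical placeholder, and quoting or escaping inside `args` may need adjusting for a real pattern. Liche matches the regex against the link URL, so this does not filter by CSS class.

```yaml
- name: Link Checker
  uses: peter-evans/link-checker@v1
  with:
    # -c 32: lower the concurrency from the default of 512
    # -x:    skip links matching the regex (example.com/header/ is a placeholder)
    args: -v -r -c 32 -x ^https://example.com/header/ *
```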

@peter-evans
Owner

Hi @mitm001, I'm afraid I don't know of any configuration that could help you. As you know, this action is a simple wrapper around Liche, so it would probably be best to raise this issue there instead. Perhaps it's related to raviqqe/liche#37.

@mitm001
Author

mitm001 commented Jul 18, 2020

Yep, that's what I will do.

@ionut-arm

Hi,

We're using your GitHub Action for our documentation as well (thanks!) and have started seeing this problem with github.io links. Looking at the Liche repo, I noticed a CLI option:

-c, --concurrency <num-requests>  Set max number of concurrent HTTP requests. [default: 512]

If that were configurable through the GitHub Action YAML, it could probably help with the Too Many Requests error. Sure, if you have thousands of links to check you'll end up with a long run, but that's kind of what rate limiting is meant to do...

I'm not sure whether issue 37 applies to us, because rate limiting on GitHub's side sounds deterministic, while our errors are not.

@peter-evans
Owner

Hi @ionut-arm

Good point. Liche arguments are configurable via the args input, so I think the following example should work. I don't know what a suitable number of concurrent requests would be to avoid this issue, though. That will require some experimentation.

    - name: Link Checker
      uses: peter-evans/link-checker@v1
      with:
        args: -v -r -c 48 *

@MichaIng

Even with a concurrency of 1 it fails, as it seems to be not (only) about the number of concurrent connections but about the number of connections within a specific time range: raviqqe/liche#42
Probably due to keep-alive requests. Basically, it would require adding a delay before checking the same host again 🤔.

@peter-evans
Owner

Liche was recently deprecated and as a result I've also decided to deprecate this action in favour of lychee-action, which is a fork of this project based on lychee. Please consider using that action.

According to the readme:

For GitHub links, it can optionally use a GITHUB_TOKEN to avoid getting blocked by the rate limiter.
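
A migration might look roughly like the sketch below. It assumes the `lycheeverse/lychee-action` action with an `args` input and lychee's `--max-concurrency` option. The version tag, the file globs, and the value 32 are illustrative assumptions rather than recommendations, so check the lychee-action readme for current usage.

```yaml
- name: Link Checker
  uses: lycheeverse/lychee-action@v1  # version tag is an assumption
  with:
    # --max-concurrency caps parallel requests; 32 is an arbitrary starting point
    args: --verbose --no-progress --max-concurrency 32 **/*.md **/*.html
  env:
    # Lets lychee authenticate GitHub link checks to avoid the rate limiter
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

The token only helps with links to GitHub itself; for other hosts that return 429, lowering the concurrency or excluding the links are the remaining options.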
