This happens because c.OnResponse is called 5 times in the loop. Each call appends the function you pass in to c.responseCallbacks, and every goroutine runs all the functions in c.responseCallbacks when its request completes, so each response is handled once per registered callback.
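To make the effect easier to see, here is a rough, stdlib-only sketch of the behaviour described above. It is not colly's code: miniCollector, Visit, Wait and countHandled are hypothetical names made up for the illustration, and the only thing assumed about the real library is what this comment states (OnResponse appends to a callback list, and every finished request runs the whole list).

```go
package main

import (
	"fmt"
	"sync"
)

// miniCollector is a hypothetical toy model, not the real colly Collector.
type miniCollector struct {
	mu                sync.Mutex
	responseCallbacks []func(url string)
	wg                sync.WaitGroup
}

// OnResponse appends the callback; earlier registrations are never replaced.
func (c *miniCollector) OnResponse(f func(url string)) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.responseCallbacks = append(c.responseCallbacks, f)
}

// Visit simulates an async request: a goroutine "completes" the request
// and then runs every registered callback for that one response.
func (c *miniCollector) Visit(url string) {
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		c.mu.Lock()
		callbacks := append([]func(string){}, c.responseCallbacks...)
		c.mu.Unlock()
		for _, f := range callbacks {
			f(url)
		}
	}()
}

// Wait blocks until all simulated requests have finished.
func (c *miniCollector) Wait() { c.wg.Wait() }

// countHandled registers the same callback `registrations` times, visits
// `visits` URLs, and returns how many times the callback fired in total.
func countHandled(visits, registrations int) int {
	c := &miniCollector{}
	var mu sync.Mutex
	handled := 0
	for i := 0; i < registrations; i++ {
		c.OnResponse(func(url string) {
			mu.Lock()
			handled++
			mu.Unlock()
		})
	}
	for i := 0; i < visits; i++ {
		c.Visit(fmt.Sprintf("https://example.com/?n=%d", i))
	}
	c.Wait()
	return handled
}

func main() {
	fmt.Println("registered 5 times:", countHandled(5, 5)) // prints 25
	fmt.Println("registered once:", countHandled(5, 1))    // prints 5
}
```

If the model matches what the crawler does, registering the callback once, before the loop that starts the requests, should leave one result per response.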
Hello guys, I was recently using the crawler to crawl some stuff and it was taking quite a lot of time, so I decided to use async mode. While using async mode I noticed a lot of duplicates in my results; in particular, the number of duplicates matched the number of threads I was launching my crawler with.
Here is a quick example, based on one from the official docs: https://github.com/gocolly/colly/blob/master/_examples/rate_limit/rate_limit.go
If we run this code, we see the following results:
A lot of text here with http body response
As you can see, there are duplicates in the results. Maybe I'm doing something wrong or not setting up the crawler properly, but I highly doubt this is intended behaviour. Anyway, I would appreciate any help.