Async cannot scale #39
Comments
That's an issue with the underlying getdns library, which can handle a larger number of asynchronous queries on a single context but which will also start getting timeouts on a very large number. I'll discuss it with the team on Monday.
I am concerned that the problem does not have to do with the number of domains, but rather with the wait time between initializing the requests table and submitting the requests. Most async libraries I have worked with submit requests as they come in (first come, first served). Thank you for your attention! You have been very helpful and I really appreciate your commitment to the project!
I actually do hand the requests off to libgetdns as soon as they come in, and they are dispatched immediately. If you jump into Wireshark or some such and run the Python interpreter interactively, you can watch the queries go out immediately; there's no waiting around. That said, there clearly are some scaling issues. I should add that one thing that's been on the back burner, but should probably be moved up, is exposing a file descriptor on a Context() that can be polled by external async libraries like Twisted. But I am relatively certain that it won't improve the number of queries you can spin off without getting timeouts.
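For reference, here is a minimal sketch of how an external event loop could drive such a file descriptor once it is exposed. This is plain Python stdlib, not getdns API: a socketpair stands in for the hypothetical context fd, and on_ready stands in for whatever processing call the bindings would provide.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
r, w = socket.socketpair()   # stand-in for the (not yet exposed) context fd

def on_ready(sock):
    # In the real case this would call something like "process pending
    # getdns events"; here we just read what arrived on the fd.
    return sock.recv(16)

# Register the readable end with the external event loop.
sel.register(r, selectors.EVENT_READ, on_ready)

# Simulate the library signalling "a reply is ready".
w.send(b"reply")

# The external loop polls and dispatches when the fd becomes readable.
received = [key.data(key.fileobj) for key, _ in sel.select(timeout=1)]

r.close()
w.close()
```

The same registration would work with Twisted's reactor or any other loop that accepts a raw file descriptor, which is the point of exposing one on Context().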
I'd like to second this issue, and also point out that, contrary to your comment above, getdns sends all the queries it has in one go, despite the limit_outstanding_queries parameter. I've been playing with different values and capturing the queries, and it always sends the full list at once. Like @panagiotious, I'm trying to resolve millions of names against a local resolver.
Hi Sebastian and panagiotious, this issue is indeed related to the underlying C library. I've imported the issue there. I don't think you'll be subscribed automatically; you probably need to leave a comment there first. Here is the new issue: getdnsapi/getdns#257

For completeness I'll include my response to that issue here too:

Indeed, the limit_outstanding_queries parameter affects full recursion only. We simply forgot/missed the implementation in stub resolution mode. This needs to be addressed quickly (before the 1.1 release).

Also, note that using an external event loop is strongly advised when using getdns with many simultaneous queries. The default event loop is based on select and can handle only a limited number of simultaneous queries. This is documented as a known issue in the README.md of the C library, by the way. Neilcook currently has a pull request that replaces select with poll in the default event loop extension. I intend to polish it up a little before merging (use of custom memory functions, turning it into another event loop extension for platforms that don't have poll), but it might be worthwhile to try it out already if you want to schedule many simultaneous queries without using an external event library right now.

@panagiotious Indeed, since UDP has no buffer for outgoing messages, it is conventional (in other asynchronous libraries) to write a message out immediately. This does not work for TCP and TLS, which need handshakes etc. before the socket can be written to. I don't think it matters much, to be honest, but I'm willing to have a look at whether writing to UDP sockets immediately can be implemented without too much difficulty.
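Until limit_outstanding_queries works in stub mode, the same effect can be approximated on the application side. This is a sketch of the general pattern only, using plain asyncio rather than the getdns API; resolve() is a dummy stand-in for a real asynchronous query, and MAX_OUTSTANDING is an arbitrary cap.

```python
import asyncio

MAX_OUTSTANDING = 100  # arbitrary cap, analogous to limit_outstanding_queries

async def resolve(name):
    # Stand-in for a real async DNS lookup; returns a dummy answer
    # from the TEST-NET-1 documentation range.
    await asyncio.sleep(0)
    return (name, "192.0.2.1")

async def bounded_resolve(sem, name):
    # The semaphore blocks here once MAX_OUTSTANDING queries are in flight,
    # so timeout clocks only start for a bounded number of queries at a time.
    async with sem:
        return await resolve(name)

async def main(names):
    sem = asyncio.Semaphore(MAX_OUTSTANDING)
    return await asyncio.gather(*(bounded_resolve(sem, n) for n in names))

results = asyncio.run(main(f"host{i}.example" for i in range(1000)))
```

The key point is that all million names can be *queued* up front, but only MAX_OUTSTANDING of them hold a live query (and a running timeout timer) at any moment.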
Hello,
After spending several hours debugging, rewriting, and implementing a few ideas, I think I can safely conclude that the asynchronous functionality does not scale. When items are added to the Context object via calls to address(), the timeout counter starts immediately; that's devastating! When resolving a handful of domains where precision doesn't matter, it is not an issue, but when the resolution requests are on the order of millions, it is impossible to scale. In the trivial case, resolving 10,000 domain names ends up with a few hundred timing out, and the more QNAMEs are added, the higher the timeout count. Here is a small example:
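As a workaround sketch for the behaviour described above (plain Python, not getdns API; the batch size is arbitrary and the Context.run() usage mentioned in the comments is hypothetical), queueing names in fixed-size chunks bounds how many timeout timers run concurrently:

```python
def chunks(seq, size):
    # Yield consecutive fixed-size slices of seq; the last one may be short.
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

names = [f"host{i}.example" for i in range(10)]
batches = list(chunks(names, 4))

# Hypothetical usage: hand each batch to Context.address(..., callback=...)
# and drain it with Context.run() before queueing the next batch, so only
# one batch's worth of timeout counters is ever ticking.
```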