stubby: truncating large answers #16347

Closed
raidenii opened this issue Aug 12, 2021 · 13 comments

@raidenii

raidenii commented Aug 12, 2021

Maintainer: @jonathanunderwood
Environment: x86-64, OpenWrt 19.07.8

Description:

Stubby 0.3.0

Stubby on OpenWrt specifically appears to truncate large answers; this does not happen on other Linux distros. An example:

root@openwrt:~# drill -p 5353 @127.0.0.1 mobileappcommunicator.auth.microsoft.com
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 20347
;; flags: qr tc rd ra ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 
;; QUESTION SECTION:
;; mobileappcommunicator.auth.microsoft.com.    IN      A

;; ANSWER SECTION:

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 301 msec
;; SERVER: 127.0.0.1
;; WHEN: Thu Aug 12 15:34:22 2021
;; MSG SIZE  rcvd: 58

;; WARNING: The answer packet was truncated; you might want to
;; query again with TCP (-t argument), or EDNS0 (-b for buffer size)

When -t (TCP) is specified, the answers are returned properly. On other Linux distros (e.g., Arch), however, the query goes through fine even without -t.

@jamesmacwhite
Contributor

jamesmacwhite commented Aug 13, 2021

@raidenii Just an FYI: stubby currently has no active maintainer, and the version and Makefile info in openwrt-19.07 are slightly out of date compared to master.

Interestingly, as I also use stubby I tried the same command:

root@linksys-wrt3200acm:~# drill -p 5453 @127.0.0.1 mobileappcommunicator.auth.microsoft.com
;; ->>HEADER<<- opcode: QUERY, rcode: SERVFAIL, id: 46542
;; flags: qr rd ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; mobileappcommunicator.auth.microsoft.com.    IN      A

;; ANSWER SECTION:

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 5000 msec
;; SERVER: 127.0.0.1
;; WHEN: Fri Aug 13 08:54:32 2021
;; MSG SIZE  rcvd: 58

I do not get a truncated answer.

Edit: Actually, after running the command a few times, the answer was truncated; the initial DNS lookups, however, were not.

@jamesmacwhite
Contributor

Possibly related: getdnsapi/getdns#495

@dhewg
Contributor

dhewg commented Aug 13, 2021

root@linksys-wrt3200acm:~# drill -p 5453 @127.0.0.1 mobileappcommunicator.auth.microsoft.com
;; ->>HEADER<<- opcode: QUERY, rcode: SERVFAIL, id: 46542

But that's an actual error.

Without EDNS, a truncated UDP answer is to be expected; clients are supposed to use EDNS or retry over TCP. You know, like drill tells you at the end ;)

First random search result: https://dnsinstitute.com/documentation/dnssec-guide/ch03s05.html
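
The behaviour described here (advertise a larger UDP buffer via EDNS, or retry over TCP when the TC flag is set) can be sketched in Python as follows. This is a minimal illustration using only the standard library, not how drill or stubby implement it; the function names `build_query`, `is_truncated`, `recv_exact` and `resolve` are hypothetical, and the sketch assumes an A/IN query against a resolver listening on both UDP and TCP.

```python
import socket
import struct

TC_FLAG = 0x0200  # "truncated" bit in the DNS header flags word


def build_query(name, query_id=0x1234, edns_udp_size=None):
    """Build an A/IN query; append an EDNS0 OPT record if a size is given."""
    arcount = 1 if edns_udp_size else 0
    # Flags 0x0100 = recursion desired; one question, optional additional RR.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    msg = header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    if edns_udp_size:
        # OPT pseudo-RR: root name, TYPE 41, CLASS carries the UDP size,
        # TTL 0 (no extended rcode or flags), empty RDATA.
        msg += b"\x00" + struct.pack("!HHIH", 41, edns_udp_size, 0, 0)
    return msg


def is_truncated(response):
    """Return True if the TC bit is set in a raw DNS response."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_FLAG)


def recv_exact(sock, count):
    """Read exactly `count` bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return data


def resolve(server, port, name, edns_udp_size=None, timeout=5.0):
    """Query over UDP and fall back to TCP if the answer is truncated."""
    msg = build_query(name, edns_udp_size=edns_udp_size)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(msg, (server, port))
        reply, _ = udp.recvfrom(65535)
    if not is_truncated(reply):
        return reply
    # DNS over TCP prefixes each message with a 2-byte length.
    with socket.create_connection((server, port), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(msg)) + msg)
        (length,) = struct.unpack("!H", recv_exact(tcp, 2))
        return recv_exact(tcp, length)
```

Calling `resolve("127.0.0.1", 5453, "mobileappcommunicator.auth.microsoft.com")` would mirror the plain drill invocation quoted above, while passing `edns_udp_size=1232` roughly corresponds to drill's `-b` option.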

@jamesmacwhite
Contributor

Ah yes, my apologies, please ignore me!

@dhewg
Contributor

dhewg commented Aug 13, 2021

But for the record, I don't see that issue on master:

# drill -p 5453 @127.0.0.1 mobileappcommunicator.auth.microsoft.com
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 2125
;; flags: qr rd ra ; QUERY: 1, ANSWER: 10, AUTHORITY: 0, ADDITIONAL: 0 
;; QUESTION SECTION:
;; mobileappcommunicator.auth.microsoft.com.	IN	A

;; ANSWER SECTION:
mobileappcommunicator.auth.microsoft.com.	300	IN	CNAME	prda.aadg.msidentity.com.
prda.aadg.msidentity.com.	184	IN	CNAME	www.tm.a.prd.aadg.akadns.net.
www.tm.a.prd.aadg.akadns.net.	184	IN	A	40.126.31.143
www.tm.a.prd.aadg.akadns.net.	184	IN	A	40.126.31.135
www.tm.a.prd.aadg.akadns.net.	184	IN	A	40.126.31.141
www.tm.a.prd.aadg.akadns.net.	184	IN	A	20.190.159.134
www.tm.a.prd.aadg.akadns.net.	184	IN	A	40.126.31.1
www.tm.a.prd.aadg.akadns.net.	184	IN	A	20.190.159.136
www.tm.a.prd.aadg.akadns.net.	184	IN	A	20.190.159.138
www.tm.a.prd.aadg.akadns.net.	184	IN	A	40.126.31.139

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 157 msec
;; EDNS: version 0; flags: ; udp: 512
;; SERVER: 127.0.0.1
;; WHEN: Fri Aug 13 13:07:49 2021
;; MSG SIZE  rcvd: 274

I'm not sure if that's because of a version difference, but at least there's an EDNS line in the output.

# drill -v
drill version 1.7.1 (ldns version 1.7.1)
# stubby -V
Stubby 0.4.0

@jamesmacwhite
Contributor

I believe the version of stubby in master is 0.4.0, whereas in both 21.02 and 19.07 it's 0.3.0. Maybe a PR to backport 0.4.0 to these branches?

@jamesmacwhite
Contributor

I've created a PR for 0.4.0 on 21.02, given it will be the new stable soon.

#16358

@raidenii
Author

raidenii commented Aug 14, 2021

Thanks James, although I suspect this might be OpenWrt-specific... I tried on a Debian Buster host with an even older stubby:

user@host:~$ drill -v
drill version 1.7.0 (ldns version 1.7.0)
Written by NLnet Labs.

Copyright (c) 2004-2008 NLnet Labs.
Licensed under the revised BSD license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.
user@host:~$ stubby -V
0.2.5
user@host:~$ drill -p 5353 @127.0.0.1 mobileappcommunicator.auth.microsoft.com
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 4344
;; flags: qr rd ra ; QUERY: 1, ANSWER: 10, AUTHORITY: 0, ADDITIONAL: 0 
;; QUESTION SECTION:
;; mobileappcommunicator.auth.microsoft.com.    IN      A

;; ANSWER SECTION:
mobileappcommunicator.auth.microsoft.com.       3600    IN      CNAME   prda.aadg.msidentity.com.
prda.aadg.msidentity.com.       300     IN      CNAME   www.tm.a.prd.aadg.trafficmanager.net.
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.24.82
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.24.146
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.24.84
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       20.190.152.19
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.24.147
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       20.190.152.20
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.24.81
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.24.149

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 168 msec
;; EDNS: version 0; flags: ; udp: 1232
;; SERVER: 127.0.0.1
;; WHEN: Sat Aug 14 08:52:36 2021
;; MSG SIZE  rcvd: 637

Any way to check stubby's build flags? My guess is that EDNS support is somehow broken in this build.

@jamesmacwhite
Contributor

OK, probably worth bumping to 0.4.0 anyway given it's the latest and has been in master for a while.

EDNS issue will need further investigation by the sounds of it.

@raidenii
Author

raidenii commented Sep 9, 2021

The issue seems to persist with 21.02 and Stubby 0.4.0:

root@openwrt:/etc/config# stubby -V
Stubby 0.4.0
root@openwrt:/etc/config# drill -p 5353 @127.0.0.1 mobileappcommunicator.auth.microsoft.com
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 46637
;; flags: qr tc rd ra ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 
;; QUESTION SECTION:
;; mobileappcommunicator.auth.microsoft.com.    IN      A

;; ANSWER SECTION:

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 28 msec
;; SERVER: 127.0.0.1
;; WHEN: Thu Sep  9 13:19:48 2021
;; MSG SIZE  rcvd: 58

;; WARNING: The answer packet was truncated; you might want to
;; query again with TCP (-t argument), or EDNS0 (-b for buffer size)

@raidenii
Author

@jamesmacwhite @dhewg Which getdns version are you running where stubby works as expected? The Stubby 0.4 package in the OpenWrt 21.02 repo is built against the older getdns 1.6.0, which I suppose might be related. I tried on my Arch Linux machine, which also has Stubby 0.4 but compiled against getdns 1.7.0, and it does not have this issue.

@itzViking

itzViking commented Dec 12, 2021

Apart from bumping stubby to 0.4.0, getdns also needs to be bumped to 1.7.0 to get DNS name compression (as per https://getdnsapi.net/releases/getdns-1-7-0/).

With getdns 1.7.0, the response is smaller than 512 bytes thanks to name compression, so no truncation occurs in most situations:

root@OpenWrt:/tmp# stubby -V
Stubby 0.4.0
root@OpenWrt:/tmp# getdns_query -v
Version 1.7.0

root@OpenWrt:/tmp# drill -p 5453 @127.0.0.1 mobileappcommunicator.auth.microsoft.com
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 43411
;; flags: qr rd ra ; QUERY: 1, ANSWER: 10, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; mobileappcommunicator.auth.microsoft.com.    IN      A

;; ANSWER SECTION:
mobileappcommunicator.auth.microsoft.com.       3600    IN      CNAME   prda.aadg.msidentity.com.
prda.aadg.msidentity.com.       300     IN      CNAME   www.tm.a.prd.aadg.trafficmanager.net.
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       20.190.144.165
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       20.190.144.162
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.16.166
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.16.165
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       40.126.16.167
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       20.190.144.163
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       20.190.144.164
www.tm.a.prd.aadg.trafficmanager.net.   300     IN      A       20.190.144.161

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 336 msec
;; EDNS: version 0; flags: ; udp: 1232
;; SERVER: 127.0.0.1
;; WHEN: Sun Dec 12 23:48:03 2021
;; MSG SIZE  rcvd: 282
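
To illustrate why name compression matters here, the following Python sketch estimates the wire size of the answer shown above with and without compression. It is a simplified model of RFC 1035 style suffix compression, not getdns code; the names `encode_name`, `response_size`, `QNAME`, `TARGET` and `ANSWERS` are hypothetical, and the sketch assumes only A and CNAME records and leaves out the EDNS OPT record.

```python
import socket
import struct


def encode_name(name, offsets, pos, compress):
    """Encode a domain name that will be written at byte offset `pos`.

    With `compress` set, a suffix already present in the message is
    replaced by a 2-byte pointer, and new suffixes are remembered.
    """
    labels = name.rstrip(".").lower().split(".")
    out = b""
    for i, label in enumerate(labels):
        suffix = ".".join(labels[i:])
        if compress and suffix in offsets:
            # Pointer: top two bits set, lower 14 bits hold the offset.
            return out + struct.pack("!H", 0xC000 | offsets[suffix])
        if compress:
            offsets[suffix] = pos + len(out)
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def response_size(question, answers, compress):
    """Return the size in bytes of a response with the given records."""
    offsets = {}
    msg = bytes(12)  # fixed-size DNS header (contents irrelevant for size)
    msg += encode_name(question, offsets, len(msg), compress)
    msg += struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    for owner, rtype, value in answers:
        msg += encode_name(owner, offsets, len(msg), compress)
        if rtype == "CNAME":
            # RDATA starts after TYPE, CLASS, TTL and RDLENGTH (10 bytes).
            rdata = encode_name(value, offsets, len(msg) + 10, compress)
            type_code = 5
        else:
            rdata = socket.inet_aton(value)
            type_code = 1
        msg += struct.pack("!HHIH", type_code, 1, 300, len(rdata)) + rdata
    return len(msg)


QNAME = "mobileappcommunicator.auth.microsoft.com"
TARGET = "www.tm.a.prd.aadg.trafficmanager.net"
ANSWERS = [
    (QNAME, "CNAME", "prda.aadg.msidentity.com"),
    ("prda.aadg.msidentity.com", "CNAME", TARGET),
] + [
    (TARGET, "A", ip)
    for ip in (
        "20.190.144.165", "20.190.144.162", "40.126.16.166", "40.126.16.165",
        "40.126.16.167", "20.190.144.163", "20.190.144.164", "20.190.144.161",
    )
]

if __name__ == "__main__":
    print("uncompressed:", response_size(QNAME, ANSWERS, compress=False))
    print("compressed:  ", response_size(QNAME, ANSWERS, compress=True))
```

For these records the sketch gives 626 bytes uncompressed, which exceeds the classic 512-byte UDP limit, and 271 bytes compressed; adding an 11-byte OPT record to the latter gives the 282 bytes that drill reports.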

With the previous getdns version, 1.6.0:
(although 'truncated' is still a valid response, and the client should retry in TCP mode at this point)

root@OpenWrt:/tmp# stubby -V
Stubby 0.4.0
root@OpenWrt:/tmp# getdns_query -v
Version 1.6.0

root@OpenWrt:/tmp# drill -p 5453 @127.0.0.1 mobileappcommunicator.auth.microsoft.com
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 2296
;; flags: qr tc rd ra ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; mobileappcommunicator.auth.microsoft.com.    IN      A

;; ANSWER SECTION:

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 248 msec
;; SERVER: 127.0.0.1
;; WHEN: Sun Dec 12 23:50:14 2021
;; MSG SIZE  rcvd: 58

;; WARNING: The answer packet was truncated; you might want to
;; query again with TCP (-t argument), or EDNS0 (-b for buffer size)

Pull request submitted to upgrade getdns to 1.7.0: #17317

@raidenii
Author

Confirmed working with getdns 1.7.0-1 and stubby 0.4.0.
