NSID option on the RIPE Atlas SOA measurements of the root servers
Dear colleagues,

RIPE Atlas is currently running a series of DNS SOA built-in measurements* towards all of the root servers from all probes. During the recent DNS measurements hackathon** it became apparent that for some use cases it would be useful to have SOA queries from all probes with the NSID EDNS option set, in order to be able to match up responses with the particular responding instances in an anycasted (or load balanced) setup.

We would like to ask for feedback on two alternative options for implementing this change. They are:

1) Enable the NSID option for the existing built-in measurements towards the nine root servers which support it.

2) Start a new set of built-in measurements towards the nine root servers which support NSID.

The advantages of option (1) are that it will be possible to compare and contrast the two sets of results, and that historical data for the existing built-ins will remain consistent with the current results. A very simple analysis shows that there is no overall increase in query error rates through enabling NSID, but there are bound to be individual marginal cases where queries fail or produce different results with NSID set but succeed without it.

The advantages of option (2) are that there will be no increase in result storage usage -- the current IPv4+IPv6 UDP SOA built-ins towards the nine supporting root servers add up to about 80 results per second (~1.5% of the total results in the system). This could potentially be mitigated by reducing the frequency for these new measurements, but perhaps more important than the slightly increased load is the potential user confusion caused by generating and providing two very similar sets of measurements.

Please let us know if you have any preference for which way we go on this, particularly if you have a (current or future) use case for this kind of data.

Kind regards,
Chris Amin
RIPE NCC

* https://atlas.ripe.net/docs/built-in/
** https://labs.ripe.net/Members/becha/results-dns-measurements-hackathon
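To make the proposal concrete, here is a minimal sketch of what a SOA query for the root zone with the NSID EDNS option set looks like on the wire. It uses only the Python standard library. The function names, the query ID and the advertised UDP payload size are illustrative assumptions; this is not how the RIPE Atlas probes build their queries.

```python
import struct

NSID_OPTION_CODE = 3   # EDNS option code assigned to NSID (RFC 5001)
TYPE_SOA = 6
TYPE_OPT = 41
CLASS_IN = 1


def encode_name(name: str) -> bytes:
    """Encode a domain name in DNS wire format ("." becomes a single zero byte)."""
    out = b""
    for label in name.strip(".").split("."):
        if label:
            out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"


def build_soa_query_with_nsid(name: str = ".", query_id: int = 0x1234,
                              udp_payload_size: int = 4096) -> bytes:
    """Build a SOA query whose OPT record carries an empty NSID option."""
    # Header: ID, flags (0 = standard query, RD clear), QDCOUNT=1,
    # ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (the OPT pseudo-record).
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", TYPE_SOA, CLASS_IN)
    # An NSID request is the option code followed by a zero-length payload.
    nsid_option = struct.pack("!HH", NSID_OPTION_CODE, 0)
    # OPT pseudo-record: root owner name, TYPE=41, CLASS=UDP payload size,
    # TTL=0 (extended RCODE, version and flags), RDLENGTH, then the options.
    opt_rr = (b"\x00"
              + struct.pack("!HHIH", TYPE_OPT, udp_payload_size, 0, len(nsid_option))
              + nsid_option)
    return header + question + opt_rr


query = build_soa_query_with_nsid()
```

A server that supports NSID would be expected to echo option code 3 in the OPT record of its response, with an operator-chosen identifier as the payload. Sending the query and parsing the reply are left out of this sketch.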
Hi Chris,

Chris Amin wrote:
RIPE Atlas is currently running a series of DNS SOA built-in measurements* towards all of the root servers from all probes. During the recent DNS measurements hackathon** it became apparent that for some use cases it would be useful to have SOA queries from all probes with the NSID EDNS option set, in order to be able to match up responses with the particular responding instances in an anycasted (or load balanced) setup.
We would like to ask for feedback on two alternative options for implementing this change. They are:
1) Enable the NSID option for the existing built-in measurements towards the nine root servers which support it.
2) Start a new set of built-in measurements towards the nine root servers which support NSID.
The advantages of option (1) are that it will be possible to compare and contrast the two sets of results, and that historical data for the existing built-ins will remain consistent with the current results. A very simple analysis shows that there is no overall increase in query error rates through enabling NSID, but there are bound to be individual marginal cases where queries fail or produce different results with NSID set but succeed without it.
The advantages of option (2) are that there will be no increase in result storage usage -- the current IPv4+IPv6 UDP SOA built-ins towards the nine supporting root servers add up to about 80 results per second (~1.5% of the total results in the system). This could potentially be mitigated by reducing the frequency for these new measurements, but perhaps more important than the slightly increased load is the potential user confusion caused by generating and providing two very similar sets of measurements.
Please let us know if you have any preference for which way we go on this, particularly if you have a (current or future) use case for this kind of data.
I don't quite see how option 2 does not result in an increase in storage requirements, and you seem to contradict this by then talking about reducing the measurement frequency. Perhaps I misunderstand what you're saying here.

From a research perspective, I would argue that it would at least temporarily make sense to have the two (slightly) different measurements (i.e. the old without and the new one with NSID enabled) running in parallel, just to flesh out whether any significant differences occur. If storage and measurement performance are not a (serious) issue, then running two separate measurements would be preferable, in my opinion, to safeguard continuity of the existing measurements. In my experience such longitudinal datasets keep increasing in value as time progresses, and are more valuable if they have a consistent measurement methodology. Based on that argument, discontinuing or altering an existing measurement should only be done if there are good reasons for it.

Best regards,
Roland

-- Roland M. van Rijswijk - Deij
-- SURFnet bv
-- w: http://www.surf.nl/en/about-surf/subsidiaries/surfnet
-- e: roland.vanrijswijk@surfnet.nl
On 24/07/2017 09:04, Roland van Rijswijk - Deij wrote:
Please let us know if you have any preference for which way we go on this, particularly if you have a (current or future) use case for this kind of data.
I don't quite see how option 2 does not result in an increase in storage requirements, and you seem to contradict this by then talking about reducing the measurement frequency. Perhaps I misunderstand what you're saying here.
My mistake, I had the option numbers the wrong way round. Option 2 (new measurements) slightly increases storage requirements, but comes with the kinds of benefits that you advocate for below.
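The storage cost of running a second set of measurements can be estimated from the two figures quoted in this thread (about 80 results per second, about 1.5% of all results). The sketch below is back-of-the-envelope arithmetic only; the function names and the `frequency_factor` parameter are illustrative assumptions, and it assumes the new measurements would produce results at the same rate as the existing ones when run at the same frequency.

```python
def added_load_share(current_rate: float, current_share: float,
                     frequency_factor: float = 1.0) -> float:
    """Extra results, as a fraction of today's total result rate.

    current_rate: results per second of the existing built-ins (about 80).
    current_share: their share of all results in the system (about 0.015).
    frequency_factor: 1.0 for the same frequency, 0.5 for half as often.
    """
    total_rate = current_rate / current_share   # implied system-wide rate
    extra_rate = current_rate * frequency_factor
    return extra_rate / total_rate


def extra_results_per_day(current_rate: float,
                          frequency_factor: float = 1.0) -> float:
    """Extra results stored per day for the duplicated measurements."""
    return current_rate * frequency_factor * 86400


same_frequency = added_load_share(80.0, 0.015)        # about 1.5% more results
half_frequency = added_load_share(80.0, 0.015, 0.5)   # about 0.75% more results
```

Under these assumptions, duplicating the measurements at the current frequency adds roughly 6.9 million results per day, and halving the frequency of the new set halves that figure.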
From a research perspective, I would argue that it would at least temporarily make sense to have the two (slightly) different measurements (i.e. the old without and the new one with NSID enabled) running in parallel, just to flesh out whether any significant differences occur. If storage and measurement performance are not a (serious) issue, then running two separate measurements would be preferable, in my opinion, to safeguard continuity of the existing measurements. In my experience such longitudinal datasets keep increasing in value as time progresses, and are more valuable if they have a consistent measurement methodology. Based on that argument, discontinuing or altering an existing measurement should only be done if there are good reasons for it.
ACK
On Thu, Jul 20, 2017 at 02:20:39PM +0200, Chris Amin <camin@ripe.net> wrote a message of 90 lines which said:
it would be useful to have SOA queries from all probes with the NSID EDNS option set, in order to be able to match up responses with the particular responding instances
It is also useful to detect rogue root name servers (quite common with anycast) or transparent DNS proxies. (Measurement #9209448 finds several probes querying a rogue L-root, which has no NSID support, or located behind a middlebox which strips NSID. Check probes 23621, 19770, 24890, 26328, 27059, 27080, 27843, 33806, 21570, 14272, 13660, 17775, 17841, 26587, 30847, 11410, 23438, 29814, 13719, 21140, 25189, 25197. For some, the SOA serial number is so old that it is probably a rogue root name server. Also, one probe, 28846, finds a server replying with an abnormal NSID, which is not the normal one from L-root.)
1) Enable the NSID option for the existing built-in measurements towards the nine root servers which support it.
Why only these? Activating it for all servers would help if the last non-NSID servers suddenly switch to NSID. And it would also be useful to find rogue servers if they have NSID enabled (probe 28846 is behind a proxy which always adds dummy NSID replies).
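The checks described above (missing NSID, unexpected NSID, very old SOA serial) can be sketched as a small classifier. This is only an illustration under stated assumptions: the dictionary keys `nsid` and `serial` are hypothetical and are not the RIPE Atlas result format, `expected_suffix` is a placeholder for whatever naming scheme the operator really uses, the two-day threshold is arbitrary, and the serial is assumed to follow the date-based YYYYMMDDNN convention.

```python
from datetime import date


def serial_date(serial: int) -> date:
    """Interpret a SOA serial of the form YYYYMMDDNN as a calendar date."""
    s = f"{serial:010d}"
    return date(int(s[0:4]), int(s[4:6]), int(s[6:8]))


def classify(result: dict, today: date, expected_suffix: str,
             max_age_days: int = 2) -> list:
    """Return a list of reasons why a single result looks suspicious.

    result is a hypothetical record such as
    {"nsid": "instance.example", "serial": 2017072400}.
    """
    flags = []
    nsid = result.get("nsid")
    if not nsid:
        # No NSID at all: a server without NSID support, or a middlebox
        # stripping the option.
        flags.append("no-nsid")
    elif not nsid.endswith(expected_suffix):
        # An NSID that does not match the operator's usual naming.
        flags.append("unexpected-nsid")
    if (today - serial_date(result["serial"])).days > max_age_days:
        # The zone copy served is much older than expected.
        flags.append("stale-serial")
    return flags
```

For example, with made-up input, `classify({"nsid": None, "serial": 2017010100}, date(2017, 7, 24), ".example")` returns both `no-nsid` and `stale-serial`. A flag is a hint for manual inspection, not proof that a server is rogue.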
On Thu, Jul 20, 2017 at 02:20:39PM +0200, Chris Amin <camin@ripe.net> wrote a message of 90 lines which said:
it would be useful to have SOA queries from all probes with the NSID EDNS option set, in order to be able to match up responses with the particular responding instances
It is also useful to detect rogue root name servers (quite common with anycast) or transparent DNS proxies. (Measurement #9209448 finds several probes querying a rogue L-root, which has no NSID support, or located behind a middlebox which strips NSID. Check probes 23621, 19770, 24890, 26328, 27059, 27080, 27843, 33806, 21570, 14272, 13660, 17775, 17841, 26587, 30847, 11410, 23438, 29814, 13719, 21140, 25189, 25197. For some, the SOA serial number is so old that it is probably a rogue root name server. Also, one probe, 28846, finds a server replying with an abnormal NSID, which is not the normal one from L-root.)
I also find it useful to match id.server./hostname.bind. queries against the NSID results (à la nsidenumerator, see flag --id-server, https://github.com/insomniacslk/nsidenumerator )
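One way to do such a comparison is to pull the NSID payload out of the OPT record of a response and compare it with the string returned by a CHAOS TXT query for id.server. or hostname.bind. The sketch below is not taken from nsidenumerator. The function names are illustrative, and it assumes the caller has already extracted the raw OPT RDATA and the TXT string. Treating the two identifiers as equal when they match case-insensitively is also an assumption, since operators are free to publish different strings in each.

```python
import struct

NSID_OPTION_CODE = 3  # EDNS option code assigned to NSID (RFC 5001)


def parse_edns_options(rdata: bytes) -> dict:
    """Split OPT RDATA into a mapping of option code to raw payload."""
    options = {}
    offset = 0
    while offset + 4 <= len(rdata):
        # Each option is a 16-bit code, a 16-bit length, then the payload.
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        options[code] = rdata[offset:offset + length]
        offset += length
    return options


def identities_match(opt_rdata: bytes, txt_identity: str) -> bool:
    """Compare the NSID payload with an id.server/hostname.bind answer."""
    nsid = parse_edns_options(opt_rdata).get(NSID_OPTION_CODE)
    if nsid is None:
        return False  # no NSID in the response, nothing to compare
    return nsid.decode("ascii", errors="replace").lower() == txt_identity.lower()
```

A mismatch between the two answers, or an NSID that is present when the TXT answer is not, would be a reason to look more closely at the path between the probe and the server.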
participants (4)
- Andrea Barberio
- Chris Amin
- Roland van Rijswijk - Deij
- Stephane Bortzmeyer