Severe performance problems with RIPE whois server
So the new kid on the block for issuing IRR queries is bgpq3:
http://snar.spb.ru/prog/bgpq3/
(Thanks to Job Snijders for pointing it out).
Here are some comparative results for bgpq3 and irrtoolset using a random large-ish as-set with ~5000 ASNs:
ubuntu1:/home/nick/bgpq3-0.1.16> time bgpq3 AS-NEOT > /dev/null
0.144u 0.032s 0:04.26 3.9% 0+0k 0+0io 0pf+0w
ubuntu1:/home/nick/trunk/src/peval> time ./peval -h whois.radb.net -protocol irrd AS-NEOT > /dev/null
8.600u 0.516s 10:14.14 1.4% 0+0k 0+0io 0pf+0w
ubuntu1:/home/nick/trunk/src/peval> time ./peval -h whois.ripe.net -protocol ripe AS-NEOT > /dev/null
2.900u 0.304s 12:21.72 0.4% 0+0k 0+0io 0pf+0w
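As a quick check on these figures, here is a small Python sketch that converts the elapsed-time field of the csh-style time output into seconds and compares each run against bgpq3. The function name and the dictionary labels are made up for illustration; the three elapsed values are copied from the output above.

```python
def elapsed_seconds(field: str) -> float:
    """Convert a csh-style elapsed field such as '12:21.72' into seconds."""
    seconds = 0.0
    for part in field.split(":"):
        # Each colon-separated part is worth 60x the part to its right.
        seconds = seconds * 60 + float(part)
    return seconds


# Elapsed (wall-clock) fields taken from the three runs above.
runs = {
    "bgpq3": "0:04.26",
    "peval via whois.radb.net": "10:14.14",
    "peval via whois.ripe.net": "12:21.72",
}

baseline = elapsed_seconds(runs["bgpq3"])
for name, field in runs.items():
    secs = elapsed_seconds(field)
    print(f"{name}: {secs:.2f}s, {secs / baseline:.0f}x the bgpq3 time")
```

The ratios come out at roughly 144 for the RADB run and 174 for the RIPE run, which is in line with the "175 times faster" figure used in this message.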
Due to its use of pipelining, bgpq3 can return IRRD query results ridiculously quickly, and it's pretty clear that irrtoolset is being completely trashed by comparison because it doesn't support pipelining. The difference between the RIPE and RADB whois servers is related to the fact that the RADB server supports recursive as-set expansion.
In fact, a more interesting question is not why bgpq3 is 175 times faster than irrtoolset, but rather why it takes as long as 4.26 seconds to complete an inverse query for the as-set in question. If I ran a SQL query on a single term with indexed lookups which returned 50k results, I'd wonder why it was so slow. In this case, it turns out that the answer mostly relates to the amount of data returned (~750k) and the RTT to the whois server from my query box (~150ms), rather than to the query latency.
Anyway, it should be clear that for large query sets, the RIPE server is extremely slow because of its lack of command pipelining and lack of support for recursive expansion of certain types of objects.
This directly affects INEX because we run a full rebuild of our route server configuration three times daily. Our route server as-set (AS-SET-INEX-RS) contains about 5400 ASNs and the consequent serialisation delays are currently causing our rebuild scripts to take slightly more than 30 minutes to complete. This varies depending on client policies: earlier this year, rebuilds were taking about 2 hours.
In fact, for performance reasons, we need to query whois.radb.net instead of RIPE for the larger AS-SETs. This is because of the recursive as-set performance issue which affects queries to whois.ripe.net.
In order to solve this problem, what we'd really love to see is:
1. support for command pipelining
2. support for recursive expansion of an as-set to a list of ASNs
3. support for recursive inverse key expansion of an as-set to a list of prefixes
These would mean a massive performance gain for INEX's use case (and a bunch of other IXPs around Europe) and #3 would mean that the RIPE whois server could leapfrog IRRD in terms of performance.
Would it be possible for the DB people to look into this?
Nick
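To illustrate what command pipelining means in this context, here is a minimal Python sketch. It is not bgpq3's code. It assumes the IRRd-style query framing as I understand it: "!!" keeps the session open, "!gAS..." asks for an AS's route prefixes, and replies are framed as "A<length>", the data, then "C", with "D" meaning no data. The fake server, its lookup table, and the AS numbers and prefixes (all from documentation ranges) are hypothetical stand-ins so that the example runs without network access.

```python
import socket
import threading


def read_irrd_response(reader) -> bytes:
    """Read one reply: 'A<len>' + data + 'C', or a bare 'C'/'D' for no data."""
    header = reader.readline().rstrip(b"\n")
    if header.startswith(b"A"):
        data = reader.read(int(header[1:]))  # length covers the data's newline
        reader.readline()                    # consume the trailing 'C' line
        return data.rstrip(b"\n")
    if header in (b"C", b"D"):
        return b""
    raise RuntimeError(f"query failed: {header!r}")


def pipelined(sock, queries):
    """Send every query before reading any reply, then read replies in order."""
    reader = sock.makefile("rb")
    batch = b"!!\n" + b"".join(q.encode() + b"\n" for q in queries)
    sock.sendall(batch)  # one burst: no round trip spent per query
    return [read_irrd_response(reader) for _ in queries]


def fake_server(conn):
    """Hypothetical stand-in for a whois server, answering from a tiny table."""
    table = {
        b"!gAS64500": b"192.0.2.0/24",
        b"!gAS64501": b"198.51.100.0/24 203.0.113.0/24",
    }
    with conn, conn.makefile("rb") as lines:
        for line in lines:
            query = line.strip()
            if query == b"!!":
                continue  # multi-command mode: no reply expected
            if query == b"!q":
                break
            answer = table.get(query)
            if answer is None:
                conn.sendall(b"D\n")
            else:
                payload = answer + b"\n"
                conn.sendall(b"A%d\n" % len(payload) + payload + b"C\n")


client, server = socket.socketpair()
threading.Thread(target=fake_server, args=(server,), daemon=True).start()
results = pipelined(client, ["!gAS64500", "!gAS64501", "!gAS64502"])
client.sendall(b"!q\n")
client.close()
print(results)
```

The point of the design is that the client pays the round-trip time roughly once for the whole batch, instead of once per query as in a send-wait-send-wait loop.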
Hello Nick,
Sure, these seem to be a very useful set of improvements. We will look into the implementation of these features and will update the list.
All the best, Kaveh.
--- Kaveh Ranjbar, RIPE NCC Database Group Manager
On Dec 27, 2012, at 12:12 AM, Nick Hilliard <nick@inex.ie> wrote:
> In order to solve this problem, what we'd really love to see is:
> 1. support for command pipelining
> 2. support for recursive expansion of an as-set to a list of ASNs
> 3. support for recursive inverse key expansion of an as-set to a list of prefixes
> These would mean a massive performance gain for INEX's use case (and a bunch of other IXPs around Europe) and #3 would mean that the RIPE whois server could leapfrog IRRD in terms of performance.
> Would it be possible for the DB people to look into this?
Hi all,
I support Nick's feature requests, as they would speed up a lot of tools in the operational community. I'd really love to see server-side expansion of AS-SETs/AUTNUMs.
Kind regards, Job
On Dec 27, 2012, at 9:05 AM, Kaveh Ranjbar <kranjbar@ripe.net> wrote:
> Hello Nick,
> Sure, these seem to be a very useful set of improvements. We will look into the implementation of these features and will update the list.
> All the best, Kaveh.
> --- Kaveh Ranjbar, RIPE NCC Database Group Manager
> On Dec 27, 2012, at 12:12 AM, Nick Hilliard <nick@inex.ie> wrote:
>> In order to solve this problem, what we'd really love to see is:
>> 1. support for command pipelining
>> 2. support for recursive expansion of an as-set to a list of ASNs
>> 3. support for recursive inverse key expansion of an as-set to a list of prefixes
>> These would mean a massive performance gain for INEX's use case (and a bunch of other IXPs around Europe) and #3 would mean that the RIPE whois server could leapfrog IRRD in terms of performance.
>> Would it be possible for the DB people to look into this?
We had this discussion back when dinosaurs roamed the Earth, and the conclusion at the time was that this would be better done on the client rather than the server side. This came mainly from the fact that:
- client-side resources are more plentiful than server-side ones (e.g. CPU)
- the client can apply smart filters rather than do full expansion
- you can keep the connection open using the -k flag and issue commands in quick succession. A big part of the slowness was due to server process (or thread in the new one) initiation, and the -k flag allowed persistent connections.
- you can also narrow down the record types that are returned by the server to those that are relevant to policy, so that you don't have to parse all the irrelevant info when using RPSL policy (like all the *-c info)
The server was designed to make combined use of these last two features very efficient.
Of course hardware limitations on servers were quite different than they are now (though a DoS is still very easy if you push more work onto the server).
I would however encourage people to try out the above before putting more crud into the server-side
Joao
On 27 Dec 2012, at 00:12, Nick Hilliard <nick@inex.ie> wrote:
> So the new kid on the block for issuing IRR queries is bgpq3:
> http://snar.spb.ru/prog/bgpq3/
> (Thanks to Job Snijders for pointing it out).
> Here are some comparative results for bgpq3 and irrtoolset using a random large-ish as-set with ~5000 ASNs:
> ubuntu1:/home/nick/bgpq3-0.1.16> time bgpq3 AS-NEOT > /dev/null
> 0.144u 0.032s 0:04.26 3.9% 0+0k 0+0io 0pf+0w
> ubuntu1:/home/nick/trunk/src/peval> time ./peval -h whois.radb.net -protocol irrd AS-NEOT > /dev/null
> 8.600u 0.516s 10:14.14 1.4% 0+0k 0+0io 0pf+0w
> ubuntu1:/home/nick/trunk/src/peval> time ./peval -h whois.ripe.net -protocol ripe AS-NEOT > /dev/null
> 2.900u 0.304s 12:21.72 0.4% 0+0k 0+0io 0pf+0w
> Due to its use of pipelining, bgpq3 can return IRRD query results ridiculously quickly, and it's pretty clear that irrtoolset is being completely trashed by comparison because it doesn't support pipelining. The difference between the RIPE and RADB whois servers is related to the fact that the RADB server supports recursive as-set expansion.
> In fact, a more interesting question is not why bgpq3 is 175 times faster than irrtoolset, but rather why it takes as long as 4.26 seconds to complete an inverse query for the as-set in question. If I ran a SQL query on a single term with indexed lookups which returned 50k results, I'd wonder why it was so slow. In this case, it turns out that the answer mostly relates to the amount of data returned (~750k) and the RTT to the whois server from my query box (~150ms), rather than to the query latency.
> Anyway, it should be clear that for large query sets, the RIPE server is extremely slow because of its lack of command pipelining and lack of support for recursive expansion of certain types of objects.
> This directly affects INEX because we run a full rebuild of our route server configuration three times daily. Our route server as-set (AS-SET-INEX-RS) contains about 5400 ASNs and the consequent serialisation delays are currently causing our rebuild scripts to take slightly more than 30 minutes to complete. This varies depending on client policies: earlier this year, rebuilds were taking about 2 hours.
> In fact, for performance reasons, we need to query whois.radb.net instead of RIPE for the larger AS-SETs. This is because of the recursive as-set performance issue which affects queries to whois.ripe.net.
> In order to solve this problem, what we'd really love to see is:
> 1. support for command pipelining
> 2. support for recursive expansion of an as-set to a list of ASNs
> 3. support for recursive inverse key expansion of an as-set to a list of prefixes
> These would mean a massive performance gain for INEX's use case (and a bunch of other IXPs around Europe) and #3 would mean that the RIPE whois server could leapfrog IRRD in terms of performance.
> Would it be possible for the DB people to look into this?
> Nick
On 27/12/2012 15:44, Joao Damas wrote:
> - client side resources are more plentiful than server-side (e.g. CPU)
this will always be true.
> - the client can apply smart filters rather than do full expansion
> - you can keep the connection open using the -k flag and issue commands in quick succession. A big part of the slowness was due to server process (or thread in the new one) initiation and the -k flag allowed persistent connections.
this sounds like a problem associated with fork() efficiency before the widespread use of vfork() and copy-on-write memory handling techniques.
> - you can also narrow down the record types that are returned by the server to those that are relevant to policy so that you don't have to parse all the irrelevant info when using RPSL policy (like all the *-c info)
this is effectively available using the -K option.
> The server was designed to make combined use of these last two features very efficient.
> Of course hardware limitations on servers were quite different than they are now (though a DoS is still very easy if you push more work onto the server).
I suspect things have moved on with more recent versions of the front / back-end split in the whois server mechanism. The back-end mechanism uses SQL and it should be easy to implement recursive as-set expansion using either direct sql joins or else a set of efficient
> I would however encourage people to try out the above before putting more crud into the server-side
irrtoolset / bgpq3 / rpsltool and everything else already use these techniques. The problem looks like it's related almost entirely to serialisation delays, not client-side delays or raw server performance delays. Looking at the 10 minute / 4 second figures, the only way to get around this with the RIPE whois server is to open up multiple connections to the whois server and issue parallel whois queries. This technique will reduce the overall latency by a factor equal to the number of parallel open sessions. I don't think anyone in the NCC is going to thank me for opening up 175 parallel connections in order to get my end-user performance similar to Merit IRRD levels :-)
Nick
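The serialisation argument above can be put into numbers with a rough model. This is a back-of-the-envelope sketch only: it assumes one query per ASN, that each query costs exactly one round trip, and that the ~150ms RTT quoted earlier in the thread also applies to the server being queried. Server processing and data transfer time are ignored. The function name is made up for illustration.

```python
import math


def serial_latency(queries: int, rtt_s: float, sessions: int = 1) -> float:
    """Lower bound on wall-clock time when every query waits one full RTT.

    Queries are split evenly across the parallel sessions, and each session
    issues its share strictly one after another.
    """
    per_session = math.ceil(queries / sessions)
    return per_session * rtt_s


asns = 5000   # "~5000 ASNs" in the as-set, from the thread
rtt = 0.150   # "~150ms" round-trip time, from the thread

one = serial_latency(asns, rtt)
many = serial_latency(asns, rtt, sessions=175)
print(f"1 session:    {one:.0f}s (about {one / 60:.1f} minutes)")
print(f"175 sessions: {many:.2f}s")
```

Under these assumptions a single serialised session needs about 750 seconds, close to the 12:21 measured against whois.ripe.net, while 175 parallel sessions would bring the bound down to about 4.35 seconds, close to the 4.26 seconds bgpq3 achieves with pipelining on one connection.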