Hi Christian,

Thanks for sending this out; it is a very interesting paper. I agree with you that RIPE should acknowledge its support for this research. I have a few observations to add before everyone agrees to drop damping from their configurations:

1) It's hard to dispute that NO damping is better than the suppression of constantly oscillating routes from my perspective, as a steady state *will* always be reached. As Randy states, it may require less aggressive parameters to achieve this.

2) Damping originated a while back, when routing overhead had a far greater effect on the packet forwarding of routers. Now that the Internet backbone is mostly made up of much more powerful routing devices with separate routing and forwarding engines, it could be argued that the effect is less harmful. I guess it could also be argued that emerging areas of the Internet will still use less powerful routing engines? It doesn't aid the stability argument other than to speed up convergence, though. How important is it in this context to still have damping in place?

3) The key to utopian damping would be the exact identification of the originating withdrawal, i.e. the source. By this I mean that if any device receiving a withdrawal were intelligent enough (and the protocol had the feature, which it does not readily have) to identify that it had received multiple withdrawals for the same prefix from the same source for the same flap, damping might work as expected. This would require some kind of sequence number! Maybe each withdrawal could add a specific transitive extended community, but that seems a lot of work. In any case, this is a big redesign of the protocol specifically for one new feature, and hence a non-starter.

4) More empirical data is surely required from research to quantify the global nature of the phenomenon? Large-scale simulations of many thousands of systems with damping on and off should be theoretically possible with today's computing power. A research project for someone? ;-)

5) Section 7 of http://www.cs.berkeley.edu/~zmao/Papers/sig02.pdf, 'Selective Route Flap Damping', makes some really interesting proposals that sound promising. There is some similarity with my point 3 above, and it contains what may be an allusion to the long-forgotten transitive attribute DPA that never got off the ground. I particularly like i) only treating withdrawals as bad events and ii) waiting until the next announcement before triggering further penalties. For the authors: would these alterations alone to the damping spec have a large positive effect on the overall stability of BGP systems?

6) The fact that the major router vendors have no BCP to follow other than RIPE-229 (which is really an operator BCP rather than a vendor-targeted one) means that we already have differences in the backbone even when only default values are used, i.e. JNX vs CSCO, 4 vs 3 withdrawals. We will never fix the issue of timer differences, as individual organisations will always have different ideas and agendas.

Tony Barber

At 06:02 PM 17/09/02 +0200, Christian Panigl, ACOnet/VIX/UniVie wrote:
Dear RIPE Routing-WG members,
At the RIPE 43 meeting in Rhodes last week, Randy Bush gave an interesting presentation on recent observations of counterproductive effects of current implementations of BGP Route Flap Damp(en)ing.
The presentation is available in Acrobat Reader format at:
http://www.ripe.net/ripe/meetings/archive/ripe-43/presentations/ripe43-routing-flap.pdf
Brief summary:
Because of "Cascaded Withdrawals", a single original flap of a prefix may cause the route flap penalty to exceed the suppress threshold at some points (routers) in the Internet. The main reasons for this behaviour are the characteristics of BGP (path vector), variations in BGP timers and delays, and the algorithm, implementation and parameters of current Route Flap Damping.
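To make the summary above concrete, here is a minimal sketch of a damping penalty with exponential decay. It is not taken from the presentation or the paper. The function names, the 15-minute half-life, the 1000-point penalty per withdrawal, the suppress threshold of 2000 and the 30-second spacing between cascaded withdrawals are all assumptions chosen for illustration, not a statement about any particular vendor's implementation.

```python
# Illustrative sketch only: all constants and names below are assumptions.
HALF_LIFE_S = 15 * 60            # assumed penalty half-life, in seconds
PENALTY_PER_WITHDRAWAL = 1000.0  # assumed penalty added per withdrawal
SUPPRESS_THRESHOLD = 2000.0      # assumed suppress threshold


def decay(penalty, elapsed_s, half_life_s=HALF_LIFE_S):
    """Exponentially decay a penalty over elapsed_s seconds."""
    return penalty * 0.5 ** (elapsed_s / half_life_s)


def peak_penalty(withdrawal_times_s, per_withdrawal=PENALTY_PER_WITHDRAWAL):
    """Return the highest penalty reached for one prefix at one router.

    withdrawal_times_s lists the arrival times (in seconds) of the
    withdrawals this router sees for the prefix.
    """
    penalty = 0.0
    peak = 0.0
    last_t = None
    for t in sorted(withdrawal_times_s):
        if last_t is not None:
            # Decay the accumulated penalty since the previous withdrawal.
            penalty = decay(penalty, t - last_t)
        penalty += per_withdrawal
        peak = max(peak, penalty)
        last_t = t
    return peak


def is_suppressed(withdrawal_times_s, threshold=SUPPRESS_THRESHOLD):
    """True if the peak penalty exceeds the suppress threshold."""
    return peak_penalty(withdrawal_times_s) > threshold


# One real flap seen as a single withdrawal: not suppressed.
single = [0]
# The same flap seen as three cascaded withdrawals, 30 s apart.
cascaded = [0, 30, 60]

print(is_suppressed(single), is_suppressed(cascaded))
```

Under these assumed values a single withdrawal stays at 1000, while the three cascaded withdrawals peak at roughly 2930 because very little decay happens within a minute, so the route is suppressed although the prefix flapped only once.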
This phenomenon is also (though less clearly) described in RIPE-229 (Flap Damping Parameter Recommendation), as it was in its predecessors since early 1998:
http://www.ripe.net/ripe/docs/ripe-229.html#4.1
In late July this year I again observed severe occurrences of this phenomenon and started a conversation with Philip Smith (Cisco) and others.
Now there is a SIGCOMM 2002 publication (August 2002), "Route Flap Damping Exacerbates Internet Routing Convergence", by Z. Mao and others:
http://www.cs.berkeley.edu/~zmao/Papers/sig02.pdf
I would like to thank the authors for the excellent research and, as one of the authors of RIPE-229 and its predecessors, I acknowledge that we haven't yet been able to analyse this specific problem in detail, but have communicated it to, e.g., the MERIT/IPMA project team.
I'm therefore more than grateful that Z. Mao and co-authors have finally investigated it, and I would like to propose to the RIPE Routing-WG that we formally acknowledge their work and specifically support their conclusion.
Comments are of course welcome!
Kind regards
CP
---
--- Christian Panigl : Vienna University Computer Center - ACOnet ---
--- VUCC - ACOnet - VIX :
--- Universitaetsstrasse 7 : Mail: Panigl@CC.UniVie.ac.at (CP8-RIPE)
--- A-1010 Vienna / Austria : Tel: +43 1 4277-14032 (Fax: -9140) ---
---