Long AS paths causing commotion
You might have experienced it, or read about it on Monday on the NANOG
and Cisco mailing lists: widespread routing instability caused by a single AS. The beauty of a distributed routing system.
Arbor Security has published a good technical explanation of the event.
A few weeks ago, BGP sessions started flapping because of invalid data (AS_CONFED_SEQUENCE and AS_CONFED_SET) in the AS4_PATH field. This is actually a feature, not a bug, or, if you will, a bug in the RFC.
The RFC described that a BGP speaker should tear down the session if it sees such an update. This was done to isolate the problem as much as possible, so that only direct neighbors would be affected. As it turns out, in some cases the direct neighbor does not detect this and propagates the update, and as a result routers further into the core start flapping.
The problem here is comparable: a single announcement is able to tear down BGP sessions all over the Internet, not just those of its direct neighbors. This results in lots of BGP updates and global BGP instability.
The above, I guess, could be described as a BGP denial of service attack.
However, it is important to realize that in one case the flaw was actually in the RFC, and this is being fixed. In the case we saw this week it is a software bug. As with many of the BGP-related events we have seen lately, most are unintentional. Nevertheless, the impact can be huge.
Looking back at the BGP data that I collect for BGPmon.net, I notice that at the exact moment one of our upstream providers started to experience similar problems last week, an AS path with a length of 257 was detected. In that case AS45307 had prepended its AS 251 times.

What happened?
AS47868 (SuproNet) apparently was experimenting with traffic engineering using AS path prepending. AS path prepending is a frequently used method to make a certain announcement a bit less preferable by making the AS path longer. It helps network administrators influence which peering is preferred for traffic to a certain prefix. This is done by prepending your own AS one, two or maybe a few times. I guess it's fair to say that prepends of up to, let's say, 5 are fairly common; you will see longer ones as well, but in normal scenarios that shouldn't be necessary.

Impact
AS47868 was prepending its AS many times, up to 252 times, resulting in an AS path length of 256. Although this is an insanely high number, considering that the average AS path length is about 4.3, it should definitely not cause the behavior we observed Monday. A number of routers, apparently running older software, were not able to handle these long AS paths. As a result a fair number of BGP sessions started to flap, which caused a wave of updates (many times higher than normal) and instability. A good technical explanation can be found at Renesys.
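To make the prepending mechanism concrete, here is a minimal Python sketch. It only models the AS path length comparison; real BGP best-path selection looks at other attributes (such as local preference) first. The upstream AS numbers 64500 and 64501 are made up for illustration (they come from the range reserved for documentation), and the function names are my own.

```python
from typing import List


def prepend(as_path: List[int], own_as: int, times: int) -> List[int]:
    """Return the AS path with `own_as` added `times` extra times in front."""
    return [own_as] * times + as_path


def prefer_shorter(path_a: List[int], path_b: List[int]) -> List[int]:
    """Pick the shorter AS path (ties go to path_a).

    This is only the path-length step; everything else a router
    considers is left out of this sketch.
    """
    return path_a if len(path_a) <= len(path_b) else path_b


origin = 47868

# Hypothetical upstreams 64500 (primary) and 64501 (backup).
via_primary = [64500] + prepend([origin], origin, 0)  # no prepending
via_backup = [64501] + prepend([origin], origin, 3)   # 3 prepends

print(len(via_primary), len(via_backup))
print(prefer_shorter(via_primary, via_backup) == via_primary)

# 252 prepends on top of the origin AS itself, before any
# upstream has added its own AS number to the path.
long_path = prepend([origin], origin, 252)
print(len(long_path))
```

With three prepends the backup path is already clearly less attractive than the primary one, which is why hundreds of prepends add nothing in terms of traffic engineering.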

So what?
The key thing here is that a single AS, announcing a single odd announcement, was able to influence many BGP routers, resulting in worldwide instability. So who do we 'blame'? Well, I would not blame AS47868 in this case; the real cause is the ASes with buggy BGP implementations. A single odd announcement should never be able to impact so many others.

Monitoring for long AS paths
I added some extra functionality to the BGPmon software. It now collects the longest AS paths seen each day. It will also display the AS path and additional information. Check it out here: http://bgpmon.net/maxASpath.php

Related incidents
Interestingly, we did see a similar issue a few weeks ago as well, when invalid data in the AS4_PATH field caused BGP sessions to flap.
Same kind of incident last week
Last week one of our upstream providers in Vancouver experienced a similar problem, causing some routing instability for them and all their customers. According to the post-mortem we received, one of their peers sent them BGP updates with malformed AS paths; this is the exact same behavior many people experienced on Monday. Looking back at some of the BGP data that I collect for BGPmon.net, I found an unusually long AS path at the exact moment the problems started.
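For anyone who wants to look for long, heavily prepended AS paths in their own BGP data, the idea can be sketched in a few lines of Python. This is not the BGPmon code, just an illustration under some assumptions: updates are assumed to be already parsed into (day, AS path) pairs, the day labels and AS numbers below are made up, and the function names are my own.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def longest_per_day(
    updates: Iterable[Tuple[str, List[int]]]
) -> Dict[str, List[int]]:
    """Keep, for every day, the longest AS path seen in any update."""
    best: Dict[str, List[int]] = {}
    for day, path in updates:
        if day not in best or len(path) > len(best[day]):
            best[day] = path
    return best


def max_prepend(as_path: List[int]) -> Tuple[int, int]:
    """Return (asn, occurrences) for the AS seen most often in the path.

    Assumes a non-empty path. Occurrences minus one is the number
    of prepends for that AS.
    """
    return Counter(as_path).most_common(1)[0]


# Made-up sample data: documentation AS numbers, placeholder days.
updates = [
    ("day-1", [64500, 64501, 64510]),
    ("day-1", [64500] + [64510] * 6),  # 64510 prepended 5 times
    ("day-2", [64502, 64511]),
]

longest = longest_per_day(updates)
for day, path in sorted(longest.items()):
    asn, seen = max_prepend(path)
    print(day, "length", len(path), "AS", asn, "seen", seen, "times")
```

On real data the same loop would make a path of 250+ hops stand out immediately next to an average of around 4.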
3 comments
It was not an “older IOS software” problem, many (if not all) recent IOS releases were affected. See http://blog.ioshints.info/2009/02/oversized-as-paths-cisco-ios-bug.html for details.
The best workaround to get rid of the problem is to use the allowed max-as setting, as mentioned by Ivan, or there is another workaround available.
regards
shivlu jain
[…] http://wiki.nil.com/Limit_the_maximum_BGP_AS-path_length http://bgpmon.net/maxASpath.php http://www.cisco.com/en/US/docs/ios/iproute/command/reference/irp_bgp1.html#wp1013932 […]