When BGP AS-Override goes the wrong way

Sunday, 13 Jan 2019

BGP AS-Override

Much like my post on when BGP SoO goes the wrong way, I seem to have a problem with directionality of commands on Cisco IOS - this time, with BGP AS-Override. I came across this in an Enterprise Network (the same kind where we say "MPLS" but actually mean "IP VPN we buy from someone else"), where the ISP we used had an offering they called "Shared Access" - which basically means they'll let you hook an Access Circuit into someone else's IP VPN/VRF with them, as long as you, the ISP and the "VRF Owning Company" co-sign an agreement saying it's allowed.

Why might you want to do this? Think along the lines of Extranets, and furthering the idea that "Everything is just a Line Card" across Company boundaries; particularly useful if you work in the Large Enterprise and Public Sector space, as here there are often strange agreements where multiple Managed Service Providers (MSPs), Systems Integrators (SIs) and sometimes even Service Providers (SPs) (reluctantly) come together to offer a common "Service" back to either the General Public, or perhaps some large Industry Sector. Regardless of the why, the problem is normally the same old BGP-over-VRF limitation - if you use the same ISP for multiple IP VPNs/VRFs, and have end-to-end BGP reachability, BGP doesn't know to turn off it's split-horizon-based-on-ASN functionality; because it just sees the same ASN twice in the AS_PATH, rather than "knowing" that the AS_PATH consists of two differing VRFs/Routing Domains.

The Scenario Topology

undefined

This is the Scenario Network Topology, showing:

  • 2x My Network MPLS CE Network Customer Edge (CE) Router
  • 4x MPLS SP Network Provider Edge (PE) Routers
    • 2x Connected to My Company Network IP VPN/VRF VPNN123456
    • 2x Connected to Other Company Network IP VPN/VRF VPNN654321
  • 2x eBGP Peering from My Company Network < -> SP MPLS PE Router, connected to My Company IP VPN/VRF VPNN123456
  • 2x eBGP Peering from Other Company Network <-> SP MPLS PE Router, connected to Other Company IP VPN/VRF VPNN654321
    • 1x "Foreign Network" CE Router @ My Company Data Centre
  • AS-Override applied on My Network MPLS CE Network (CE) Router (towards My Company IP VPN/VRF VPNN123456)
    • Note that I am "piggy-in-the-middle"

Some notes on SP Terminology

As some of this is specific to using a Third Party SP's MPLS Network, through a "wires-only" IP VPN offering - here's a quick primer on some terminology I'm using, as this will differ between varying SP's:

  • "wires-only" - Means the SP drops a NTE/NTU in My Company's Premises, to which I attach my self-managed CE Router
    • The SP does not manage any of CE Router; I eBGP Peer direct from a Private ASN to the SP's Public ASN (or whatever they use)
    • I'm told this model is more popular in the USA than Europe (but I'm in the UK, so there are exceptions to the rule...)
  • VPNNxxxxxx - The SP-allocated IP VPN/VRF Identifier, so that they can differentiate between their various Customers (they could name their VRF instances by Company Name, but what happens when the Company changes name, or two different Companies have the same/similar names...)
  • ASN Numbers - Those on the left-hand side are My Network ones; those on the right-hand side are "Foreign" (Other Company Network) ones
    • Just like between IPsec Encryption Domains, it's a good idea to make sure these don't conflict (tricky when everyone is using the same Private BGP ASN Range)
    • It is the same Core ASN/PE-CE Peering ASN that the SP uses for all Customers
  • CE Devices - I am the Customer (or one of two), and not the SP here; I have no visibility or access to any of the PE's in this topology
    • This is a very different slant to most write-ups and blog posts I've read on the matter; everyone seems to work for an SP bar me!
  • AS-Override - This is applied at My Company end only; the "Foreign" Company are not performing AS-Override
    • So the AS_PATH they "advertise" to me contains the raw SP ASN for their own CE-PE Peering,Their CE1 <->  PE2 and Their CE15 <-> PE66

What I thought would happen

Caveat - apparently, Cisco IOS doesn't let you use AS-Override in the Global Routing Table (GRT, y'know, the one that's not in an "address-family" command); but it sometimes does (worked on my ASR1K's), and that's not the point of this post.

Focussing on My Company Data Centre - and ignoring the "Southbound" eBGP Peering from this DC into MPLS IP VPN/VRF VPNN123456 - here's an example of the Prefix I'm looking at, received from "Foreign" Company:

CE1#172.31.0.0/24 via <DC-Router1>, AS_PATH: 65007 1234 64999

Now, if we look at the "Southbound" eBGP Peering towards My Company IP VPN/VRF VPNN123456, I want to re-advertise "Foreign" Company Prefix 172.31.0.0/24 onward, via VPNN123456, into My Company Other Campus DEF (bottom-right). Given the "as-override" command is applied towards the SP's PE Router, I expected the "find-and-replace" operation to work in a similar (outbound) manner. That is, for this configuration on my CE1 Router @ My Company Network, Data Centre ABC:

CE1#
router bgp 65432
 neighbor 192.168.0.1 remote-as 1234
 neighbor 192.168.0.1 as-override

I thought my CE1 Router would therefore rewrite it's own AS65432 (Local ASN, CE1 Router) with the SP's AS1234 (Foreign ASN, CE1 Router perspective) - so an AS_PATH that actually looks like this, to the downstream PE1 (and any other Routers) on VPNN123456:

PE1(VRF "VPNN123456") or CE99#172.31.0.0/24 via 192.168.0.2, AS_PATH: 65432 65439 65007 1234 64999

 ...but that's not how AS-Override works here.

What actually happens

It transpires the "find-and-replace" behaviour isn't working with the "find" parameter I think it is. If I use some colouring here, this will be easier to see. If we show the entire AS_PATH (including the Routers at either end, which you normally wouldn't see in BGP outputs), here's what you've got for Prefix 172.31.0.0/24 going all the way to CE1 @ My Company Data Centre ABC:

  • 64999 1234 65007 65439 65432 1234 65430

I appreciate this runs inverse/reverse to the AS_PATH that CE1 actually sees; but bear with my incorrect directional thinking here. So the part I'm focusing in on is between CE1 <-> PE1, or this part:

  • ...65432 1234...

At this point, in my head, I'm thinking "The neighbour command is applied outbound to the 192.168.0.1 SP PE1 peering, so it must use this relationship in the find-replace activity", so I'm thinking, after the AS-Override rewrite, it looks like this:

  • 64999 1234 65007 65439 65432 65432 65430

Here's the kicker

The reality is that AS-Override doesn't care about eBGP Peering relationships; it acts as a dumb "find-replace" algorithm, but it uses the eBGP Peering configuration to get it's "find" parameter, by looking at the ASN value after the "remote-as" command, so here for CE1:

  • router bgp 65432
     
    neighbor 192.168.0.1 remote-as 1234

What it then "dumbly" does is looks at the entire AS_PATH it already has, and simply replaces the <REMOTE_AS> value with it's <LOCAL_AS>, before "advertising" this out, so for CE1 it would do this instead:

  • 64999 65432 65007 65439 65432 65432 65430

Which completely broke my thinking, as I hadn't appreciated that a downstream Router could overwrite an AS_PATH entry that happened much earlier-on in the formation of the AS_PATH (i.e. for a Peering Association it wasn't involved in, so how could it dare overwrite anything to do with that?).

So what next

For the example given, we actually ended up moving all this entirely, such that we had a PE-like Router where we could control ingress/egress into both IP VPNs (and AS-Override in both directions, between both IP VPNs) - but this isn't always possible. Technologically, it's easy to look dismissively at the Scenario Topology; but if you step back a bit, you appreciate our hand was forced. As I described earlier, this is a politically complex setup, with various MSPs and SIs - and as you can see, although CE15 sits in "our" DC (actually an MSP, but anyway...), it's actually a CE Router of our "Foreign" (think Extranet) Company's IP VPN (VPNN654321); which they just so happen to have with the same SP that we have Our Company IP VPN (VPNN123456) with.

Sure, this isn't a great place to be - but (in that time-honoured phrase), "It is, what it is"; looking longingly at CCNP and CCIE Greenfield Exam Topologies isn't making this self-rectify. We were fortunate because we had the capability to entirely redesign this (something for another blog post), but if we hadn't, there's a whole manner of constraints here causing pain, such as:

  • SP won't let us reconfigure their PEs on either IP VPN/VRF (so no quick-win "Bang AS-Override on PE66 and PE1" for you)
  • Commercials mean we can't collapse-out the CE15 <-> PE66 arrangement
  • CE1 / Data Centre ABC doesn't just exist for this flow (so no quick-win "Bang the VPNN123456 eBGP Peering into a VRF Lite instance, instead of the GRT"

What's the point then?

Ignoring the goal of getting this working, this was a useful real-world exercise, as it taught me:

  1. BGP AS-Override is dumb, and will quite happily assume the <REMOTE_AS> to <LOCAL_AS> Peering is the only one that contains the <REMOTE_AS>, which couldn't possibly already be in the AS_PATH
  2. BGP is not VRF-aware; it's rules of split-horizon are there to annoy me and rob me of sleep
  3. Stop reading "neighbor" commands and assuming they imply the directionality of the thing they are doing
  4. Googling for issues like this throws up limited results, because everyone else seems to be able to access the SP PE Routers
  5. I need to flip-round the way I think about AS_PATH as "Destination-to-Source" rather than "Source-to-Destination"

When BGP SoO Site of Origin goes the wrong way

Sunday, 12 Aug 2018

BGP Site of Origin (SoO)

I have a scenario where an "internal" Service Provider (SP) MPLS Network interfaces with a Third Party's MPLS Network, as an IPVPN - rather than a true MP-BGP Handoff; or in other words, "I happen to know it's underpinned by MPLS so I'll call it that, even though technically it's not MPLS Presentation to me" (the same way most Enterprise Network shops refer to their WAN as "MPLS").

Unlike the base assumption of most Cisco articles on SoO, I don't actually control the Provider Edge (PE) Routers on this Third Party (let's say, "BT") MPLS Network; and nor am I the Third Party themselves. What I'd like to do is identify Prefixes I have on my IPVPN "Overlay" MPLS Network, from CE Routers on said IPVPN Overlay Network that I do control, and block them from coming back into my own SP MPLS Network. I thought BGP Site of Origin (SoO) might be my friend here...

The Scenario Topology

undefined

This is the Scenario Network Topology, showing:

  • 2x My MPLS SP Network Provider Edge (PE) Routers
  • 1x My MPLS SP Network Route Reflector (RR) Router
  • 2x BT MPLS PE Routers (Location Unknown)
  • 2x eBGP Peerings from My <-> BT MPLS PE Routers
  • 2x Repeated BGP SoO Communities (65432:999), applied to VRF (IPVPN) "BLAH"

What I thought would happen

Given the BGP SoO Attribute is applied towards the BT (Third Party MPLS Network) PE Router, I thought I'd be able to jump on one of my MPLS Enterprise Network CE Routers, and see the SoO Attribute 65432:999 applied, as it made sense to me that this configuration would "advertise" the BGP SoO Extended Community from Me -> BT:

PE1#
router bgp 65432
 address-family ipv4 vrf BLAH
  neighbor 192.168.0.1 remote-as 2856
  neighbor 192.168.0.1 send-community both
  neighbor 192.168.0.1 soo 65432:999

Where I then duly hop onto my CE1 Router and issue the following, expecting to see the SoO Tag for my VRF "BLAH" 99.99.99.99/32 Network:

undefined

But, no dice - it's just a normal IPv4 BGP Prefix with no Extended Communities. What gives?

What actually happens

At this point, confused, I start to wonder if I'm misunderstanding what SoO does and is, and getting confused between the simplistic Cisco and Internet examples of SoO (which are aimed at Service Providers, from their perspective, towards a singular Customer Edge/Customer) - so I poke around. The first point I poke around on is a "non-SoO Tagging PE Router', PE99 - which has an attachment to VRF "BLAH" (an ingress/attachment point into My MPLS Network, for VRF "BLAH"; but performs no SoO tagging) - and see what I see:

PE99#sh ip bgp vpnv4 vrf VLAH 99.99.99.99/32
<snip>
99.99.99.99/32
  ...Extended Community: SoO:65432:999 RT...

Which starts to make it click - the SoO must be applied "the wrong-way-around" from what I thought, and be an "ingress only" behaviour, as otherwise I wouldn't see it this side of the MPLS CE-PE Network Handoff fence. Or more succinctly, this is the direction/Routing Domain the SoO Tag is applied into:

undefined

Even though I first looked at this part of the SoO command as an "advertise-out"/egress behaviour; it's actually a "Match Packets from this BT Peer, mark them with this when advertised deeper back to us" behaviour. Because I looked at this part of the config as an "advertise SoO to neighbour" behaviour:

PE1#
  neighbor 192.168.0.1 soo 65432:999

Which is different to what I first expected ("advertise-out" behaviour), which would have been this:

undefined

What have I learned?

In effect, then, depending on your "directional thinking", based on the Cisco IOS syntax, you might be unpleasantly surprised by how this works - to my mind, it's working the wrong-way-around from what the config syntax would suggest. What actually happens, as a result of just one line of config, is:

  1. On PE1# (My<->BT Network 1st CE-PE Handoff)
    1. Apply SoO Tag 65432:999 inbound/ingress (BT->Me) for all BT-side Prefixes
    2. Advertise this SoO Tag 65432:999 deeper back to My SP Network (not BT's at all)
  2. On PE99# (Any other CE-like Router on My Network; Attachment Point into VRF "BLAH")
    1. See SoO Tag 65432:999 on BT-native Prefixes
    2. Do nothing about it; "advertise on" SoO Prefix (don't strip it out on re-advertise to another PE/Router)
  3. On PE2# (My<->BT Network 2nd CE-PE Handoff)
    1. See SoO Tag 65432:999 pre-egress/outbound (just before Me->BT) for the BT-side Prefix
    2. Because the Me<->BT eBGP Peer has the same SoO Tag set, don't allow it out to BT Router ("reverse behaviour")

The same would then happen for BT->PE2->My MPLS->PE1, performing an overlapping-behaviour for the secondary/dual-homed path between My Network<->BT's Network.

So these bits are wrong then

Which means, given SoO is an "inbound behaviour", not an "outbound" behaviour, the whole concept of tagging these with 65432:999 as the SoO Tag doesn't make sense; it probably should be 2856:999, to show these are BT-native, not My Network-native.

It also means I should re-think why I'm using SoO here, as a Tag+Block/Route Map technique, using bog-standard BGP Communities, might be a better fit for the behaviour I wanted.

I've been here before

Sadly, I've fallen victim to this presumed "outbound behaviour of the config" before with my friend BGP AS-Override, which also has a strange "not the way you might expect" behaviour, but that's one for another blog post. Key points here are:

  • Trust nothing
  • Lab everything
  • Assume that Cisco Support Forums write-up that looked exactly like your scenario was too good to be true

The difference between BGP RD and RT

Friday, 10 Aug 2018

BGP Route Descriptor and Route Target

Let me caveat this post by saying I'm not a Service Provider (SP) kid by trade; I spend my life doing Enterprise, Data Centre and Wireless - so all this MPLSery is new territory for me, and my imaginary sidekick-dog friend ("Hi Jake!") - which means this might be technically incorrect, but this is how the concepts of RD and RT finally "clicked" for me.

How it was explained to me

undefined

When I first starting Googling for Dear Life (TM) about this (because I needed to spin up a new VRF/IPVPN/L3VPN on our MPLS Network), and looked at a few existing config excerpts, I thought they were both the same thing, which seems valid:

vrf definition ADVENTURE-TIME-VRF
 rd 192.168.0.1:999
 route-target export 65432:999
 route-target import 65432:999

I didn't really question the fact that the Export/Import Route Target (RT) was the same (and didn't know about "Full Mesh VRF" vs "Hub-and-Spoke VRF"), but it did strike me as odd that the RD wasn't the same as the RT, given all the explanation I'd read said things like:

The RD is used to keep all prefixes in the BGP table unique between Customers or VRFs...

Which I read thinking:

"Hmm, that makes sense; BGP will just append the RD in-front of the Prefix, to identify the VRF it belongs to. But wouldn't that mean the RD should be the same for each PE Router, the same for each instantiation of that VRF/Customer across the network?"

So then why the differing RD from the RTs?

Why bother with the extra admin work of creating a different value each time, between the RD and RT?

How I now understand it

undefined

When I started exploring Full Mesh VRF vs Hub-and-Spoke VRF, it started to click into place - the RT and RD aren't really related, and I think there's some missing text from the common definition of how RD's are enacted:

  • RD = Route(r) Descriptor
  • RT = Rout(ing Table) Target

When I looked around the configs we had elsewhere, the pattern become clear; it decomposed like this:

vrf definition <VRF Human-friendly Name>
 rd <Router Loopback0>:<VRF RT No>
 route-target export <Router ASN>:<VRF RT No>
 route-target import <Router ASN>:<VRF RT No>

It's starting to click

Then you step back a bit more, and realise the VRF Name and RT/RD have pretty much no association (and then it suddenly clicks what they mean when they say "Locally Significant"...), and we - as humans - use the same VRF Name everywhere because it's easier for us, like a sort of "Poor Man's DNS for VRF RTs". So there's no reason this config wouldn't just stitch VRF "Bob" to VRF "Jane" between two Routers in the same MPLS Domain - but it'd be a pain in the arse to troubleshoot when it scaled to more than a few Routers:

Router_PE1#vrf definition Bob
 rd 192.168.0.1:999
 route-target export 65432:999
 route-target import 65432:999

Router_PE2#vrf definition Jane
 rd 192.168.0.2:999
 route-target export 65432:999
 route-target import 65432:999

Great Scott! He's got it!

Which is when it clicks - when you look at two Router's configurations and realise the RT is the same, but the RD changed; within what we've established is the same VRF "Container" (even though we renamed it across Routers, to cause pain to that guy in Ops that looked at our wife wrong during that Christmas Do, yeah - "Bob"...). So roughly then:

  • An RD can be thought of as the "Router Descriptor"
    • i.e. "Who injected that Prefix into my VRF?"
    • Probably makes sense to use a Loopback, or unique attribute of a Router; then you can jump on your Route Reflector (RR) and have a quick "Whodunnit?"
      • Router_RR1#sh ip bgp vpnv4 all | sec <Router Loopback0>:<VRF RT No>
  • An RT can be thought of as the "Routing Table Target"
    • i.e. "So that's just a VLAN-equivalent Tag for a VRF Container on the MPLS Domain then..."
    • If it's the same RT you're import/exporting everywhere, we're rocking Full Mesh; if it's not (or I'm suddenly doing loads of import statements/one export statement, or vice versa), we're looking at a pesky Hub-and-Spoke
      • Got multiple RT Import statements and one Export? You're probably on a Hub Router (for that VRF)
      • Got one RT Import statement and multiple Exports? You're probably on a Spoke Router (for that VRF)

Am I right here?

That's how I understand all this MPLS VRFery anyway; if I'm wrong, why not:

  • Tweet me @notworkd and tell me "U iz well wrong, Bruv..."
  • Write a comment below and tell me "Dude, do you even MPLS, Bro?"

 

I'm not Technical, but...

Sunday, 22 Jul 2018

A day in the life of

There you are, describing the latest solution/thesis you have to a problem, Project or task, and then someone comes along and says:

I'm not technical, but...

And just like that, you're thrown sideways. Disparaged. Condescended. Belittled. Siderailed.

My problem here isn't the content that's about to follow - it could well be valid (it could well not, too), and could change the direction of the idea to the right direction. My problem is the derision and disdainful manner this is normally delivered to me in. I mean, why even include the prelude and the "but"; if you've got something to contribute, jump in - it's what you'd do in a normal business conversation

You wouldn't say that to a Doctor

Would you say the same to another field or vocation that you were more familiar with - maybe a Doctor or Healthcare Professional? What about another field you popularly know to exist, but probably don't have exposure to - maybe a Nuclear Physicist?

It's not that your opinion is invalid; you have the right to opine about anything, in the same way I would. It's that your presentation of said opinion is disrespectful - not just of me, but also of:

  • My chosen career profession
  • The financial and personal background
  • The context of your and my employment
  • Our respective positions to our employer

But for some reason, you think that the inclusion of this prelude with a little "but" counteracts all these.

Respect that I'm here for a reason

  • I'm not a Project Manager; I'm not paid to manage time and cost of delivery (but I can cost-up and cook a three-course meal)
  • I'm not a Business Analyst; I'm not paid to analyse business requirements against our employer's strategy (but I can work out the difference to my family of buying a shiny new MacBook vs getting that larger house we could all do with)
  • I'm not a Commercial Analyst; I'm not paid to understand common costings or understanding market economics (but I do recognise that buying artisanal gluten-free bread should cost more than a standard white loaf)
  • I'm not a Legal Consultant; I'm not paid to understand the laws my employer is subject to, or loopholes within them (but I do recognise that torrenting Adobe Photoshop is a lesser crime than killing a man)

We all know how to do things we're not paid to do; that's why opinions can be valid. We're not paid to know all the things we can do; that's why you need to respect that I exist here for a reason, the same way I respect you exist.

If you don't understand or accept something, feel free to question and interject; but don't add an insulting prelude before you do, and recognise that the person you're talking to was not only employed to do the very thing you're about to question, but has also legitimately built a career upon it.

I'm not business; but...

Butt out.

Use fallocate and SCP to quickly test Cisco network throughput

Monday, 16 Apr 2018

A rubber band, a paper clip and a drinking straw...

In a pinch where you can't use iPerf to test network throughput - either because you can't access/RDP onto a Windows/Linux host, or maybe can't download iPerf from the big bad Interwebs?  I've been here before, and armed with the following tools (like a Network MacGyver), you can use Secure Copy (the FTP of the SSH world) and a bit of Linux standard binary to do much the same job:

  • Cisco (or Juniper) Switch or Router
    • RW/Admin/Priv 15 access to write a bit of config
  • Linux Box (for fallocate binary)
  • Network connectivity between said Linux Box and Cisco Router/Switch

Method

On the Linux Box

  1. Login to your Linux Box
  2. Issue the following command to create a 1 GB test file called "1gb-test.bin":
    1. fallocate -l 1G 1gb-test.bin
  3. Check your file is 1 GB in size:
    1. ls -l 1gb-test.bin

On the Cisco Switch/Router

  1. Login to your Cisco Router or Switch
  2. Enable SCP File Copy:
    1. # Cisco ASA Firewalls
      conf t
      scp copy enable
      end
      
      # Cisco Switches/Routers
      conf t
      ip scp server enable
      end
  3. (If ASA Firewall) Check your Management Firewall Rules allow SCP (TCP/22) transfer to the Cisco Switch/Router
  4. Get the Disk identifier for your SD Card/NVRAM (usually it's "disk0:" or "bootflash:") and check you have enough free space (1 GB in this example, or 1,073,741,824 bytes) for the file transfer:
    1. dir

Back on the Linux Box

  1. Copy your 1 GB test file "1gb-test.bin" from the Linux Box onto the Cisco Switch/Router/Firewall (in this example, my Router is running Disk0: as the NVRAM/Flash Volume):
    1. scp -v 1gb-test.bin <USERNAME>@<CISCO_IP_ADDRESS>:disk0:1gb-test.bin
  2. Watch the SCP File Transfer statistics, and/or...

Back on the Cisco Switch/Router

  1. Check the tx/rx rate on the interface you are expecting traffic to come in on:
    1. sh int | i Interface|Desc|rate|tx|rx
  2. (ProTip) If you've not change the stock setting, it's only sampled every 5 minutes; change that to 30 seconds (or similarly more-frequent) with:
    1. conf t
      interface <X>/<Y>
       load-interval 30
      end

Results

That's it; if you're using the correct interface you are bothered about, you are now doing a TCP/22 (SSH/SCP) transfer of a 1 GB test file from a Linux Box to your Cisco Router/Firewall/Switch. Bear in mind that it might not be the throughput rate you are expecting (or what the LAN/WAN Link can actually perform at), due to a few limiting factors:

  • NVRAM/Flash Medium transfer speed (microSD Cards in ASR's are faster than, say, CompactFlash in older ISR's)
  • CoPP (Control Plane Policing)
    • Technically, SCP is Control Plane operation to the Router/Switch rather than Data Plane through the Router/Switch, so your SCP copy might be being rate-limited by this

Don't forget to delete your test file when you're done, and note if the file copy doesn't complete, the file isn't pre-allocated on IOS - so a subsequent "dir" will show no data written to disk.

I've personally found this useful for minimal "stress-testing", or to try and invoke some legitimate LAN/WAN traffic to show up on a NetFlow Collector or SNMP Polling NMS (maybe something like LibreNMS).

Home