Disappearing DNS entries when your CNAME TTL differs from your PaaS Provider's

Saturday, 26 Jan 2019

The dreaded "Page can't be displayed" error

Most people in the field of IT or Networking will have seen this lovely Internet Explorer error, and immediately recognised their day was about to change course away from the schedule:


The why can vary massively; for this blog post, we'll look at one case in point - what happens when the DNS Time to Live (TTL) on your CNAME record doesn't match up with the TTL on your Platform as a Service (PaaS) provider's A Name (A record). But first, a bit of background - names changed to protect the innocent.

The Scenario with the PaaS Provider

We've got a Web Application, which used to be on-premises (or "on-prem" for you cool Cloud Kids), that we've decided to farm out to a PaaS Provider. It's very important to the Business, but in terms of the technology employed it's nothing special - think an HTTPS Website, where the PaaS provider does DNS-based "Elastic" (boing!) Load Balancing - also known as GSLB, but the new Cloudy World has to re-invent the terms we're already used to... *grumble* *grumble*

Let's throw in some made-up pseudonyms to anonymise this a bit, and add some context:

  • My Employer (Enterprise Business, or "the Business")
    • Name - MyCompany Ltd
    • Main External URL/Domain - mycompany.com
    • Main Internal URL/Domain - prod.mycompany.uk
  • PaaS Provider
    • Name - PaaS Co. Ltd
    • Main PaaS URL/Domain - paasco.com
    • Cloud Environment Name - PaaSCloud
    • Use Load Balancers from - BigAssLoadBalancers (Vendor)

Because the Business (rightly) thinks that a new PaaS URL of https://bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com might not be as easy-to-remember as the old on-prem (yes, I'm trying to bait you with that phrase) one of https://appname.prod.mycompany.uk; and because we've got no choice about the PaaS URL, we've taken the decision to make a new sub-domain of *.paascloud.mycompany.com. While we're there, we think we'll sort out the outmoded concept of Internal (prod.mycompany.uk) vs External (mycompany.com) URLs, because this is all hosted off-prem anyway; so it's technically no longer part of our "internal" Domain.

Regardless of PaaS Co, MyCompany uses Internal DNS that sits on Active Directory Domain Controllers; for the sake of ease, I'll call this "Internal DNS". MyCompany outsources its Internet DMZ Data Centres to another MSP; we'll call them MSPCo. MSPCo's only relevance here is that they run our External DNS/Domain (from Internet-facing ns1.mspco.com DNS Servers), whereas we run our Internal DNS/Domain AD-DC DNS Servers. Or, in short:

  • MyCompany
    • Run Internal DNS Servers (i.e. pdc1.mycompany.uk) that are authoritative (but not advertised to Internet) for *.mycompany.uk
  • MSPCo
    • Run External DNS Servers (i.e. ns1.mspco.com) that are authoritative for *.mycompany.com

To give us an easy-to-remember FQDN for the AppName Web Application, we've set up the following, which means it will be https://appname.paascloud.mycompany.com:

  • Sub-Domain Space (for all Apps on PaaS Co)
    • *.paascloud.mycompany.com
  • Current PaaS Web App (one of the Apps on PaaS Co)
    • appname.paascloud.mycompany.com
  • Internal DNS (MyCompany, i.e. pdc1.mycompany.uk)
    • Authoritatively Resolve requests for *.prod.mycompany.uk
    • Conditional Forward requests for *.paascloud.mycompany.com to ns1.mspco.com
  • External DNS (MSPCo, i.e. ns1.mspco.com)
    • Authoritatively Resolve requests for *.paascloud.mycompany.com

The Problem with DNS Recursion

All that we've achieved above is a series of "forwarders", such that, for the worst case (Internal Client), they'll do this:

  1. Lookup appname.paascloud.mycompany.com against Internal AD-DC DNS (i.e. pdc1.mycompany.uk)
  2. Internal AD-DC DNS Conditionally Forwards this to MSPCo External DNS (i.e. ns1.mspco.com)
  3. MSPCo External DNS (i.e. ns1.mspco.com) resolves this to a CNAME of bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com
    1. MSPCo External DNS (i.e. ns1.mspco.com) then Recursively Resolves this against its upstream DNS Provider (let's say dns1.bigisp.com)...
    2. ...Which queries the Root DNS Servers (i.e. a.root-servers.net), which (by way of the .com TLD Servers) tell it to ask the PaaS Co Authoritative DNS Servers (i.e. ns1.paasco.com) for the A Name associated with bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com...
    3. ...Which comes back from PaaS Co DNS Servers (i.e. ns1.paasco.com) as Public IP Address 203.0.113.234 (not real, check out RFC 5737 - IPv4 Address Blocks Reserved for Documentation)
  4. Internal AD-DC DNS replies back to the Internal Client, for a request of appname.paascloud.mycompany.com, with:
    1. (The CNAME) bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com
    2. (The A Name) 203.0.113.234
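To make the chain above a bit more concrete, here's a minimal Python sketch of the "follow the CNAME until you hit an A Name" logic. It is only a toy: the RECORDS dictionary and the resolve() function are made up for illustration (using the pretend names from this post), and there are no real DNS Servers, forwarders or caches involved.

```python
# Toy record store - these are the made-up names and addresses from this post.
RECORDS = {
    "appname.paascloud.mycompany.com": (
        "CNAME",
        "bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com",
    ),
    "bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com": (
        "A",
        "203.0.113.234",
    ),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAMEs until an A record is found; return (chain, address)."""
    chain = []
    for _ in range(max_hops):
        rtype, value = records[name]
        if rtype == "A":
            return chain, value
        chain.append(value)  # remember each alias target we pass through
        name = value
    raise RuntimeError("CNAME chain too long")


chain, address = resolve("appname.paascloud.mycompany.com")
print(chain, address)
```

The client ends up with both halves of the answer (the alias target and the address), which is exactly the pair of records the nslookup output shows later on.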

Phew, there's a lot of steps eh? But at least we're out of the woods now, the client has the IPv4 Address it needs, so what's the "Page not Displayed" thing all about?

Pesky DNS TTLs

Here's the bit where the hierarchy of recursion in DNS starts to 1-up you, and the bad day kicks in - perhaps as known all-too-well by these graffiti artists:


Firstly, a caveat - all of the below may be different for your scenario, depending on how MSPCo DNS Recursion is/isn't setup.

If we make use of the lovely nslookup tool on Windows, here's what we can deduce for our good response (i.e. when the page actually displays, rather than the dreaded IE "Page not Displayed" error). Remember that pdc1.mycompany.uk is my Internal DNS Server (for this example anyway, in reality AD has a Parent/Child Regional Domain Controller hierarchy, so each Client uses a different AD-DC):

C:\Users\NervousAdmin>nslookup
> set debug
> server pdc1.mycompany.uk
<snip - goes off and resolves pdc1.mycompany.uk to IP 10.0.1.99>
> appname.paascloud.mycompany.com.
Server: pdc1.mycompany.uk
Address: 10.0.1.99

------------
Got answer:
    HEADER:
        opcode = QUERY, id = 24, rcode = NOERROR
        header flags:  response, want recursion, recursion avail.
        questions = 1,  answers = 2,  authority records = 0,  additional = 0

    QUESTIONS:
        appname.paascloud.mycompany.com, type = A, class = IN
    ANSWERS:
    ->  appname.paascloud.mycompany.com
        canonical name = bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com
        ttl = 7200 (2 hours)
    ->  bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com
        internet address = 203.0.113.234
        ttl = 60 (1 min)
<snip>
------------
Name: bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com
Address: 203.0.113.234
Aliases: appname.paascloud.mycompany.com

Given the response above is good (when everything is working), what does the above tell you? If we focus on the TTL sections, you'll see Windows has cached two responses here:

  1. appname.paascloud.mycompany.com -[CNAME]-> bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com, cached for 7200 seconds (or 2 hours)
  2. bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com -[A Name]-> 203.0.113.234, cached for 60 seconds (1 min)

So what happens in 60 seconds, when that A Name expires? Let's find out - the ">" shows you are within nslookup, so just hit the Up key, then Enter, to re-lookup "appname.paascloud.mycompany.com." (as per prior posts, the appended dot means "just this exact FQDN, and no additional DNS Suffixes"). Eventually, you'll notice the bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com section goes to a TTL of 0:

> appname.paascloud.mycompany.com.
Server:  pdc1.mycompany.uk
Address:  10.0.1.99
<snip - only interested in the A Name ttl section>
ANSWERS:
<snip>
-> bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com 
 internet address = 203.0.113.234
 ttl = 0

But you'll notice your browser access to https://appname.paascloud.mycompany.com works fine during these tests; until you do the nslookup again, after the "ttl = 0" response. Now, there be dragons.

Uh-oh, where's my response gone?

When you refresh again, your heart will drop, your bum will tighten, your browser access to https://appname.paascloud.mycompany.com will stop working, and you'll see this:

C:\Users\NervousAdmin>nslookup
> set debug
> server pdc1.mycompany.uk
<snip - goes off and resolves pdc1.mycompany.uk to IP 10.0.1.99>
> appname.paascloud.mycompany.com.
Server:  pdc1.mycompany.uk
Address:  10.0.1.99

------------
Got answer:
    HEADER:
        opcode = QUERY, id = 28, rcode = NOERROR
        header flags:  response, want recursion, recursion avail.
        questions = 1,  answers = 1,  authority records = 0,  additional = 0

    QUESTIONS:
        appname.paascloud.mycompany.com, type = A, class = IN
    ANSWERS:
    ->  appname.paascloud.mycompany.com
        canonical name = bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com
        ttl = 6926 (1 hour 55 mins 26 secs)

<snip>
------------
Name:    appname.paascloud.mycompany.com

Which will give you your dreaded "Page not Displayed" friend, for exactly another 1 hour, 55 minutes and 26 seconds.

And how do I know that? Because that's what the TTL says that CNAME entry will stay in your cache for - regardless of the fact your Windows Client hasn't had a recursive response of the actual IP Address that it ultimately resolves to (203.0.113.234).
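If it helps to see that play out, here's a small Python sketch of a TTL cache that behaves the way the responses above suggest. Treat it as a toy model of what I observed, not of how every DNS Server or resolver works: ToyCache and its methods are invented for this example, and the key assumption (flagged in the code) is that nothing re-queries upstream once the A Name has expired but the CNAME is still cached.

```python
class ToyCache:
    """Tiny TTL cache: name -> (record type, value, expiry time in seconds)."""

    def __init__(self):
        self.entries = {}

    def put(self, name, rtype, value, ttl, now):
        self.entries[name] = (rtype, value, now + ttl)

    def get(self, name, now):
        entry = self.entries.get(name)
        if entry is None or now >= entry[2]:
            return None  # never cached, or TTL has run out
        return entry

    def answer(self, name, now, max_hops=8):
        """Records still servable for name, as (type, value, remaining TTL)."""
        answers = []
        for _ in range(max_hops):
            entry = self.get(name, now)
            if entry is None:
                break  # ASSUMPTION: nothing re-queries upstream for an expired target
            rtype, value, expiry = entry
            answers.append((rtype, value, expiry - now))
            if rtype != "CNAME":
                break
            name = value  # follow the alias to its target
        return answers


ALIAS = "appname.paascloud.mycompany.com"
TARGET = "bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com"

cache = ToyCache()
cache.put(ALIAS, "CNAME", TARGET, ttl=7200, now=0)
cache.put(TARGET, "A", "203.0.113.234", ttl=60, now=0)

fresh = cache.answer(ALIAS, now=0)    # CNAME and A Name both present
stale = cache.answer(ALIAS, now=274)  # A Name expired, CNAME has 6926 seconds left
print(fresh)
print(stale)
```

At 274 seconds in, the only thing left to hand back is the CNAME with 6926 seconds on the clock, which lines up with the "answers = 1" response above.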

So what's the fix? Firstly, let's touch on DNS TTL. Conceptually, this isn't much different to IPv4 TTL (although it counts down in seconds, rather than hops); it just means that, once the TTL hits 0, the entry will be purged from your local DNS Cache. What happens next is the crucial part, dictated by the "DNS Response Hierarchy" your response had; if it's just a straight single-level hierarchy (i.e. domain.com -> 203.0.113.1), then your Client will go off and issue a fresh DNS Request to resolve domain.com to an IP Address.

But our case is different, and not in a good way - our "DNS Response Hierarchy" looks like this:

  1. (Parent) Fetch appname.paascloud.mycompany.com
    1. (Child) If you got here, now fetch bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com

But our TTLs look like this:

  1. (Parent) appname.paascloud.mycompany.com = TTL <bigger than "Child">
    1. (Child) bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com = TTL <smaller than "Parent">

That's not what we want at all; given these are two differing DNS Administrative Domains (owned and operated by two differing Companies - MSPCo for appname.paascloud.mycompany.com and PaaS Co for bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com), we (MyCompany) don't have any direct control over these. Regardless though, we need them to flip-it-around so that this happens:

  1. (Parent) appname.paascloud.mycompany.com = TTL <smaller than (or same as) "Child">
    1. (Child) bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com = TTL <bigger than (or same as) "Parent">

This way, when the "Parent" (initial, or root, or "actual FQDN I wanted the IP for") TTL expires, it will remove the "Child" (the CNAME's target) entry with it; which means the DNS Lookup process will re-occur, and we'll happily get an IPv4 Address back. Technically simple, but you try and explain that to MSPCo and PaaS Co, and you'll find your "shouty voice TTL" quickly gets towards that precious 0...
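As a quick way of spotting this before it bites, here's a hedged Python sketch that walks a CNAME chain and flags any link where the "Parent" TTL is bigger than the "Child" TTL. The risky_links() function and the (name, ttl) tuple layout are my own invention for illustration; you'd have to feed it the TTLs yourself (e.g. copied from the nslookup output).

```python
def risky_links(chain):
    """chain: list of (name, ttl) pairs, from the first alias to the final A Name.

    Returns (parent, child, gap) for every link where the parent's TTL is
    bigger than the child's, where gap is how many seconds the parent can
    outlive the child in a cache.
    """
    problems = []
    for (parent, p_ttl), (child, c_ttl) in zip(chain, chain[1:]):
        if p_ttl > c_ttl:
            problems.append((parent, child, p_ttl - c_ttl))
    return problems


ALIAS = "appname.paascloud.mycompany.com"
TARGET = "bigassloadbalancers-appnameprodmycompany-paascloud.paasco.com"

# The TTLs seen in this post: CNAME 7200 seconds, A Name 60 seconds.
print(risky_links([(ALIAS, 7200), (TARGET, 60)]))

# The flipped-around arrangement (example values only).
print(risky_links([(ALIAS, 60), (TARGET, 300)]))
```

For the TTLs in this post, the gap comes out at 7140 seconds, i.e. the best part of two hours where the CNAME can sit in a cache without its A Name.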

Remotely changing the Management SVI on a Cisco 3524XL

Friday, 25 Jan 2019

A Cisco 35-what-what now?

You probably haven't heard of a Cisco 3524XL. You're possibly sat reading this thinking: "I've heard of the Nexus 3K, sure, but WTF is a 3520-Series, am I behind already?". The answer is no, you aren't (or yes, you are if you're unfortunate enough to know what a C3524XL is) - but don't take my word for it, let's ask what Danny Dyer thinks:


Why are you blogging about a Cisco Switch that went EoL over a decade ago?

Indeed, the Cisco Catalyst 3524XL went End of Life in 2002 - far before I even started working in the field of Networking. So why am I talking about it here? Well, a few reasons:

  1. @DarrenFullwel challenged me to on Twitter
  2. It's got lessons to teach us all
  3. History needs to remind us that banging on the suffix "XL" should only be confined to fast food and t-shirts

Let's focus on what it can teach us - first, a little primer on my chief bugbear with it as a "capable Layer 3 Campus Access Switch".

The C3524XL only supports one SVI

That's not too bad you might think; you probably only want to give the SVI a Management IP Address, and let something more capable handle inter-VLAN Routing. But what happens when you want to do something like this:

  1. Remotely re-IP Address the Management IP (and the boss won't let you hire a van and take the day to drive to the arse-end of nowhere)
  2. Remotely change the configuration your colleague left with it using VLAN1 as the SVI, but everywhere else uses VLAN55 for Switch Management (and the boss still won't let you hire that van)

Any ideas on how you're going to sort that out, remotely? Let me introduce you to the age-old Network Engineering practice of...

Squeaky bum time


There's nothing for it, soldier; we've got two basic choices to do this remotely, and we're gonna need a stock of toilet roll for both:

  1. Use an SNMP-based config upload tool like Network Billy (coincidentally the finest thing to have come out of a GeoCities website)
  2. Use a TFTP-based config upload tool (like TFTPd32)
  3. Keep hassling the boss for that van

I went for option two, TFTP-based; but the basic concepts are the same. Firstly, we're going to double-check what we want to achieve; for my scenario, that's two things:

  1. Disable VLAN1
  2. Migrate the Management IP to VLAN55 (172.31.0.0/24)
    1. I'll also have to change this upstream, so that my L3 Default Gateway Switch/Router moves 172.31.0.0/24 from VLAN1 to VLAN55, or have both co-exist for a while and VRF Lite one VLAN off from the other; but that's for another blog post

To do this interactively, I'd want to do something like the following:

conf t
int vlan1
 no ip address
 no desc
 shut
vlan 55
 name Mgmt_VLAN
int vlan55
 desc Management VLAN
 ip address 172.31.0.99 255.255.255.0
 no shutdown
ip default-gateway 172.31.0.1
end
wr mem

But we don't have that luxury, so we'll go for a three-step approach.

Step 1 - The interactive bit

We need to set up the VLAN (just at Layer 2) ready to go; as we're talking about an archaic C3524XL, depending on the age of IOS on the Switch, that's either going to be the "new Cisco way" (as above), or if you're as unlucky as Dyer thinks, the old VLAN Database method, like this:

C3524XL#vlan database
vlan 55
exit

Regardless of which, we'll then check we've got the VLAN ready to go, and if necessary, add it to any 802.1q Trunk interfaces up to the Core (L3) Switch:

C3524XL#sh vlan id 55
C3524XL#sh int trunk | inc Span|Port|55

Now onward to the offline part.

Step 2 - The offline bit

Firstly, we need to grab the config file off the C3524XL. If you've got TFTPd32 running on your PC (which needs to be accessible from the existing C3524XL VLAN1 SVI IP Address, say your PC is 10.0.0.99), this is just a matter of turning TFTPd32 on, configuring it to a directory and ensuring Winblows Firewall isn't blocking inbound TFTP (UDP/69). Then log in to your C3524XL, and do something like this to copy the config from the Switch to your PC:

C3524XL#copy run tftp://10.0.0.99/c3524xl-confg
yes

Now you have the file locally, we'll edit a copy of it in a text editor to make the changes above, and turn it into the startup-config (the upload in Step 3 assumes the edited copy is saved as c3524xl-startup.txt). For the sake of space, I'm only showing the changed lines; the rest of the config needs to be there, as you are only Find-Replacing these sections:

<snip - rest of config removed, but would be there>
hostname C3524XL
<snip - rest of config removed, but would be there>
int vlan1
 no ip address
 no desc
 shut
int vlan55
 desc Management VLAN
 ip address 172.31.0.99 255.255.255.0
 no shutdown
<snip - rest of config removed, but would be there>
ip default-gateway 172.31.0.1
<snip - rest of config removed, but would be there>

A few handy hints here:

  • Make sure all your interconnects, Trunks and the Management SVI VLAN55 are set to "no shutdown"
  • Triple-check that, in your scenario, it is actually VLAN 55 for Management; that the IP Address is correct and doesn't conflict; and that VLAN55 exists and would be allowed on the Trunk
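If you'd rather not trust your own eyeballs for those checks, here's a rough Python sketch of an offline "pre-flight" check against the edited config text. It's a sketch under assumptions, not a proper IOS config parser: interface_block() and preflight() are names I've made up, it only understands the simple layout shown above (an "int"/"interface" line followed by indented lines), and it deliberately insists on an explicit "no shutdown" as per the hint above. It can't tell you whether the IP Address conflicts with anything, or whether the VLAN is allowed on the Trunk.

```python
import ipaddress


def interface_block(config_text, name):
    """Return the indented lines found under 'int <name>' or 'interface <name>'."""
    block, inside = [], False
    for line in config_text.splitlines():
        words = line.split()
        if not line.startswith(" "):
            # A non-indented line starts a new top-level section
            inside = (
                len(words) == 2
                and words[0].lower() in ("int", "interface")
                and words[1].lower() == name.lower()
            )
            continue
        if inside:
            block.append(line.strip())
    return block


def preflight(config_text, vlan_id):
    """Return a list of problems spotted in the edited config (empty if none)."""
    problems = []
    block = interface_block(config_text, f"vlan{vlan_id}")
    if not any(line in ("no shutdown", "no shut") for line in block):
        problems.append(f"vlan{vlan_id} has no explicit 'no shutdown'")
    address = None
    for line in block:
        words = line.split()
        if words[:2] == ["ip", "address"] and len(words) == 4:
            address = ipaddress.ip_interface(f"{words[2]}/{words[3]}")
    if address is None:
        problems.append(f"vlan{vlan_id} has no IP address")
    gateway = None
    for line in config_text.splitlines():
        words = line.split()
        if words[:2] == ["ip", "default-gateway"] and len(words) == 3:
            gateway = ipaddress.ip_address(words[2])
    if gateway is None:
        problems.append("no 'ip default-gateway' line")
    elif address is not None and gateway not in address.network:
        problems.append(f"gateway {gateway} is outside {address.network}")
    return problems


# The changed lines from this post, without the rest of the config.
EDITED = """\
hostname C3524XL
int vlan1
 no ip address
 no desc
 shut
int vlan55
 desc Management VLAN
 ip address 172.31.0.99 255.255.255.0
 no shutdown
ip default-gateway 172.31.0.1
"""

print(preflight(EDITED, 55))
```

An empty list back means it found nothing to moan about; it doesn't mean the change is safe, so keep the brew and the toilet roll handy.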

Nothing left now but to execute our actions and make rocket go now!

Step 3 - The bit you make a calming brew beforehand for

Now it's crunch time. You've obviously got an RFC Change Request that's approved to do this (because you wouldn't "Lab on Live", would you?), so what's to fear, eh?

Firstly, we upload the amended config file, straight into startup-config:

C3524XL#copy tftp://10.0.0.99/c3524xl-startup.txt startup-config

Then we get paranoid and double-check it copied everything correctly, that we're definitely Trunking that VLAN55 and we've set the Management VLAN 55 to "no shut":

C3524XL#sh start
C3524XL#sh vlan id 55
C3524XL#sh int trunk | inc Span|Port|55

And finally we sup-up that brew, clench the derriere, and invoke the outage-causing Management IP switchover:

C3524XL#reload
yes

Then we wait, and nervously set our local PC Command Prompt "ping -t" going, waiting for it to pop back up with the new Management IP address:

C:\Users\NervousAdmin>ping -t 172.31.0.99

Pinging 172.31.0.99 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
<2-3 nervous minutes later>
Reply from 172.31.0.99: bytes=32 time=13ms TTL=64
Reply from 172.31.0.99: bytes=32 time=13ms TTL=64
[CTRL+C]

Wrapping it up

And there we go; remotely changing the Management VLAN and IP Address of a Switch that's older than time - and hopefully a useful tip if you have a similar single-SVI-only piece of sh... kit. Enjoy!

When BGP AS-Override goes the wrong way

Sunday, 13 Jan 2019

BGP AS-Override

Much like my post on when BGP SoO goes the wrong way, I seem to have a problem with directionality of commands on Cisco IOS - this time, with BGP AS-Override. I came across this in an Enterprise Network (the same kind where we say "MPLS" but actually mean "IP VPN we buy from someone else"), where the ISP we used had an offering they called "Shared Access" - which basically means they'll let you hook an Access Circuit into someone else's IP VPN/VRF with them, as long as you, the ISP and the "VRF Owning Company" co-sign an agreement saying it's allowed.

Why might you want to do this? Think along the lines of Extranets, and furthering the idea that "Everything is just a Line Card" across Company boundaries; particularly useful if you work in the Large Enterprise and Public Sector space, as here there are often strange agreements where multiple Managed Service Providers (MSPs), Systems Integrators (SIs) and sometimes even Service Providers (SPs) (reluctantly) come together to offer a common "Service" back to either the General Public, or perhaps some large Industry Sector. Regardless of the why, the problem is normally the same old BGP-over-VRF limitation - if you use the same ISP for multiple IP VPNs/VRFs, and have end-to-end BGP reachability, BGP doesn't know to turn off its split-horizon-based-on-ASN functionality; because it just sees the same ASN twice in the AS_PATH, rather than "knowing" that the AS_PATH consists of two differing VRFs/Routing Domains.
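That "split-horizon-based-on-ASN" check is simple enough to sketch in a couple of lines of Python. This is just an illustration of the general eBGP loop-prevention rule (drop anything that already has your own ASN in the AS_PATH); the accepts() function is made up, and the ASNs are the pretend ones used later in this post.

```python
def accepts(local_as, as_path):
    """eBGP loop prevention: refuse any update whose AS_PATH already contains our ASN."""
    return local_as not in as_path


# A Router in the SP's AS1234 being offered a path that has already crossed AS1234
# once, via the other IP VPN/VRF.
print(accepts(1234, [65432, 65439, 65007, 1234, 64999]))

# A Router in AS65432 being offered a path that has never been through AS65432.
print(accepts(65432, [1234, 64999]))
```

The check has no idea the earlier 1234 belonged to a different VRF/Routing Domain; an ASN match is an ASN match.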

The Scenario Topology


This is the Scenario Network Topology, showing:

  • 2x My Network MPLS Customer Edge (CE) Routers
  • 4x MPLS SP Network Provider Edge (PE) Routers
    • 2x Connected to My Company Network IP VPN/VRF VPNN123456
    • 2x Connected to Other Company Network IP VPN/VRF VPNN654321
  • 2x eBGP Peering from My Company Network <-> SP MPLS PE Router, connected to My Company IP VPN/VRF VPNN123456
  • 2x eBGP Peering from Other Company Network <-> SP MPLS PE Router, connected to Other Company IP VPN/VRF VPNN654321
    • 1x "Foreign Network" CE Router @ My Company Data Centre
  • AS-Override applied on My Network MPLS Customer Edge (CE) Router (towards My Company IP VPN/VRF VPNN123456)
    • Note that I am "piggy-in-the-middle"

Some notes on SP Terminology

As some of this is specific to using a Third Party SP's MPLS Network, through a "wires-only" IP VPN offering, here's a quick primer on some terminology I'm using, as this will differ between SPs:

  • "wires-only" - Means the SP drops a NTE/NTU in My Company's Premises, to which I attach my self-managed CE Router
    • The SP does not manage any of the CE Router; I eBGP Peer direct from a Private ASN to the SP's Public ASN (or whatever they use)
    • I'm told this model is more popular in the USA than Europe (but I'm in the UK, so there are exceptions to the rule...)
  • VPNNxxxxxx - The SP-allocated IP VPN/VRF Identifier, so that they can differentiate between their various Customers (they could name their VRF instances by Company Name, but what happens when the Company changes name, or two different Companies have the same/similar names...)
  • ASN Numbers - Those on the left-hand side are My Network ones; those on the right-hand side are "Foreign" (Other Company Network) ones
    • Just like between IPsec Encryption Domains, it's a good idea to make sure these don't conflict (tricky when everyone is using the same Private BGP ASN Range)
    • It is the same Core ASN/PE-CE Peering ASN that the SP uses for all Customers
  • CE Devices - I am the Customer (or one of two), and not the SP here; I have no visibility of, or access to, any of the PEs in this topology
    • This is a very different slant to most write-ups and blog posts I've read on the matter; everyone seems to work for an SP bar me!
  • AS-Override - This is applied at My Company end only; the "Foreign" Company are not performing AS-Override
    • So the AS_PATH they "advertise" to me contains the raw SP ASN for their own CE-PE Peerings (Their CE1 <-> PE2 and Their CE15 <-> PE66)

What I thought would happen

Caveat - apparently, Cisco IOS doesn't let you use AS-Override in the Global Routing Table (GRT, y'know, the one that's not in an "address-family" command); but it sometimes does (worked on my ASR1K's), and that's not the point of this post.

Focussing on My Company Data Centre - and ignoring the "Southbound" eBGP Peering from this DC into MPLS IP VPN/VRF VPNN123456 - here's an example of the Prefix I'm looking at, received from "Foreign" Company:

CE1#172.31.0.0/24 via <DC-Router1>, AS_PATH: 65007 1234 64999

Now, if we look at the "Southbound" eBGP Peering towards My Company IP VPN/VRF VPNN123456, I want to re-advertise "Foreign" Company Prefix 172.31.0.0/24 onward, via VPNN123456, into My Company Other Campus DEF (bottom-right). Given the "as-override" command is applied towards the SP's PE Router, I expected the "find-and-replace" operation to work in a similar (outbound) manner. That is, for this configuration on my CE1 Router @ My Company Network, Data Centre ABC:

CE1#
router bgp 65432
 neighbor 192.168.0.1 remote-as 1234
 neighbor 192.168.0.1 as-override

I thought my CE1 Router would therefore rewrite its own AS65432 (Local ASN, CE1 Router) with the SP's AS1234 (Foreign ASN, CE1 Router perspective) - so an AS_PATH that actually looks like this, to the downstream PE1 (and any other Routers) on VPNN123456:

PE1(VRF "VPNN123456") or CE99#172.31.0.0/24 via 192.168.0.2, AS_PATH: 65432 65439 65007 1234 64999

 ...but that's not how AS-Override works here.

What actually happens

It transpires the "find-and-replace" behaviour isn't working with the "find" parameter I think it is. If I use some colouring here, this will be easier to see. If we show the entire AS_PATH (including the Routers at either end, which you normally wouldn't see in BGP outputs), here's what you've got for Prefix 172.31.0.0/24 going all the way to CE1 @ My Company Data Centre ABC:

  • 64999 1234 65007 65439 65432 1234 65430

I appreciate this runs inverse/reverse to the AS_PATH that CE1 actually sees; but bear with my incorrect directional thinking here. So the part I'm focusing in on is between CE1 <-> PE1, or this part:

  • ...65432 1234...

At this point, in my head, I'm thinking "The neighbour command is applied outbound to the 192.168.0.1 SP PE1 peering, so it must use this relationship in the find-replace activity", so I'm thinking, after the AS-Override rewrite, it looks like this:

  • 64999 1234 65007 65439 65432 65432 65430

Here's the kicker

The reality is that AS-Override doesn't care about eBGP Peering relationships; it acts as a dumb "find-replace" algorithm, but it uses the eBGP Peering configuration to get its "find" parameter, by looking at the ASN value after the "remote-as" command, so here for CE1:

router bgp 65432
 neighbor 192.168.0.1 remote-as 1234

What it then "dumbly" does is look at the entire AS_PATH it already has, and simply replace the <REMOTE_AS> value with its <LOCAL_AS>, before "advertising" this out, so for CE1 it would do this instead:

  • 64999 65432 65007 65439 65432 65432 65430

Which completely broke my thinking, as I hadn't appreciated that a downstream Router could overwrite an AS_PATH entry that happened much earlier-on in the formation of the AS_PATH (i.e. for a Peering Association it wasn't involved in, so how could it dare overwrite anything to do with that?).
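For anyone who thinks better in code, the "dumb find-replace" described above boils down to something like this Python sketch. The as_override() function is purely my illustration of the behaviour I saw (it is not Cisco's implementation), and the AS_PATH is written in the same left-to-right order I've been using in the bullets above.

```python
def as_override(as_path, remote_as, local_as):
    """Swap every occurrence of the neighbour's ASN for our own, wherever it sits."""
    return [local_as if asn == remote_as else asn for asn in as_path]


# The full path from the bullets above, in the same (reversed) order.
full_path = [64999, 1234, 65007, 65439, 65432, 1234, 65430]

# CE1: local ASN 65432, neighbour 192.168.0.1 has remote-as 1234.
print(as_override(full_path, remote_as=1234, local_as=65432))
```

Note there's nothing in there about which Peering the 1234 came from; both occurrences get swapped, including the one from the "Foreign" side that CE1 was never party to.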

So what next

For the example given, we actually ended up moving all this entirely, such that we had a PE-like Router where we could control ingress/egress into both IP VPNs (and AS-Override in both directions, between both IP VPNs) - but this isn't always possible. Technologically, it's easy to look dismissively at the Scenario Topology; but if you step back a bit, you appreciate our hand was forced. As I described earlier, this is a politically complex setup, with various MSPs and SIs - and as you can see, although CE15 sits in "our" DC (actually an MSP, but anyway...), it's actually a CE Router of our "Foreign" (think Extranet) Company's IP VPN (VPNN654321); which they just so happen to have with the same SP that we have Our Company IP VPN (VPNN123456) with.

Sure, this isn't a great place to be - but (in that time-honoured phrase), "It is, what it is"; looking longingly at CCNP and CCIE Greenfield Exam Topologies isn't making this self-rectify. We were fortunate because we had the capability to entirely redesign this (something for another blog post), but if we hadn't, there's a whole manner of constraints here causing pain, such as:

  • SP won't let us reconfigure their PEs on either IP VPN/VRF (so no quick-win "Bang AS-Override on PE66 and PE1" for you)
  • Commercials mean we can't collapse-out the CE15 <-> PE66 arrangement
  • CE1 / Data Centre ABC doesn't just exist for this flow (so no quick-win "Bang the VPNN123456 eBGP Peering into a VRF Lite instance, instead of the GRT")

What's the point then?

Ignoring the goal of getting this working, this was a useful real-world exercise, as it taught me:

  1. BGP AS-Override is dumb, and will quite happily assume the <REMOTE_AS> to <LOCAL_AS> Peering is the only one that contains the <REMOTE_AS>, which couldn't possibly already be in the AS_PATH
  2. BGP is not VRF-aware; its rules of split-horizon are there to annoy me and rob me of sleep
  3. Stop reading "neighbor" commands and assuming they imply the directionality of the thing they are doing
  4. Googling for issues like this throws up limited results, because everyone else seems to be able to access the SP PE Routers
  5. I need to flip-round the way I think about AS_PATH as "Destination-to-Source" rather than "Source-to-Destination"

When BGP SoO Site of Origin goes the wrong way

Sunday, 12 Aug 2018

BGP Site of Origin (SoO)

I have a scenario where an "internal" Service Provider (SP) MPLS Network interfaces with a Third Party's MPLS Network, as an IPVPN - rather than a true MP-BGP Handoff; or in other words, "I happen to know it's underpinned by MPLS so I'll call it that, even though technically it's not MPLS Presentation to me" (the same way most Enterprise Network shops refer to their WAN as "MPLS").

Unlike the base assumption of most Cisco articles on SoO, I don't actually control the Provider Edge (PE) Routers on this Third Party (let's say, "BT") MPLS Network; and nor am I the Third Party themselves. What I'd like to do is identify Prefixes I have on my IPVPN "Overlay" MPLS Network, from CE Routers on said IPVPN Overlay Network that I do control, and block them from coming back into my own SP MPLS Network. I thought BGP Site of Origin (SoO) might be my friend here...

The Scenario Topology


This is the Scenario Network Topology, showing:

  • 2x My MPLS SP Network Provider Edge (PE) Routers
  • 1x My MPLS SP Network Route Reflector (RR) Router
  • 2x BT MPLS PE Routers (Location Unknown)
  • 2x eBGP Peerings from My <-> BT MPLS PE Routers
  • 2x Repeated BGP SoO Communities (65432:999), applied to VRF (IPVPN) "BLAH"

What I thought would happen

Given the BGP SoO Attribute is applied towards the BT (Third Party MPLS Network) PE Router, I thought I'd be able to jump on one of my MPLS Enterprise Network CE Routers, and see the SoO Attribute 65432:999 applied, as it made sense to me that this configuration would "advertise" the BGP SoO Extended Community from Me -> BT:

PE1#
router bgp 65432
 address-family ipv4 vrf BLAH
  neighbor 192.168.0.1 remote-as 2856
  neighbor 192.168.0.1 send-community both
  neighbor 192.168.0.1 soo 65432:999

Where I then duly hop onto my CE1 Router and issue the following, expecting to see the SoO Tag for my VRF "BLAH" 99.99.99.99/32 Network:


But, no dice - it's just a normal IPv4 BGP Prefix with no Extended Communities. What gives?

What actually happens

At this point, confused, I start to wonder if I'm misunderstanding what SoO does and is, and getting confused between the simplistic Cisco and Internet examples of SoO (which are aimed at Service Providers, from their perspective, towards a singular Customer Edge/Customer) - so I poke around. The first point I poke around on is a "non-SoO Tagging PE Router", PE99 - which has an attachment to VRF "BLAH" (an ingress/attachment point into My MPLS Network, for VRF "BLAH"; but performs no SoO tagging) - and see what I see:

PE99#sh ip bgp vpnv4 vrf BLAH 99.99.99.99/32
<snip>
99.99.99.99/32
  ...Extended Community: SoO:65432:999 RT...

Which starts to make it click - the SoO must be applied "the wrong-way-around" from what I thought, and be an "ingress only" behaviour, as otherwise I wouldn't see it this side of the MPLS CE-PE Network Handoff fence. Or more succinctly, this is the direction/Routing Domain the SoO Tag is applied into:


Even though I first looked at this part of the SoO command as an "advertise-out"/egress behaviour, it's actually a "Match Prefixes from this BT Peer, mark them with this when advertised deeper back to us" behaviour. In other words, I'd read this part of the config as an "advertise SoO to neighbour" behaviour:

PE1#
  neighbor 192.168.0.1 soo 65432:999

Which is different to what I first expected ("advertise-out" behaviour), which would have been this:


What have I learned?

In effect, then, depending on your "directional thinking", based on the Cisco IOS syntax, you might be unpleasantly surprised by how this works - to my mind, it's working the wrong-way-around from what the config syntax would suggest. What actually happens, as a result of just one line of config, is:

  1. On PE1# (My<->BT Network 1st CE-PE Handoff)
    1. Apply SoO Tag 65432:999 inbound/ingress (BT->Me) for all BT-side Prefixes
    2. Advertise this SoO Tag 65432:999 deeper back to My SP Network (not BT's at all)
  2. On PE99# (Any other CE-like Router on My Network; Attachment Point into VRF "BLAH")
    1. See SoO Tag 65432:999 on BT-native Prefixes
    2. Do nothing about it; "advertise on" SoO Prefix (don't strip it out on re-advertise to another PE/Router)
  3. On PE2# (My<->BT Network 2nd CE-PE Handoff)
    1. See SoO Tag 65432:999 pre-egress/outbound (just before Me->BT) for the BT-side Prefix
    2. Because the Me<->BT eBGP Peer has the same SoO Tag set, don't allow it out to BT Router ("reverse behaviour")

The same would then happen for BT->PE2->My MPLS->PE1, performing an overlapping-behaviour for the secondary/dual-homed path between My Network<->BT's Network.
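
If it helps, the two rules in the list above can be modelled as a toy in Python. This is only a sketch of the behaviour as I've described it, not how IOS implements anything; the Route and Neighbor classes, the function names and the second BT-facing neighbour address (192.168.0.5) are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    soo: Optional[str] = None  # Site of Origin tag carried by the route, if any


@dataclass(frozen=True)
class Neighbor:
    address: str
    soo: Optional[str] = None  # value from "neighbor <ip> soo <value>", if configured


def receive_from(neighbor: Neighbor, route: Route) -> Route:
    """Inbound rule: routes learned from a SoO-configured neighbour get tagged."""
    if neighbor.soo is None:
        return route
    return Route(route.prefix, neighbor.soo)


def may_advertise_to(neighbor: Neighbor, route: Route) -> bool:
    """Outbound rule: never send a route to a neighbour with the same SoO."""
    return neighbor.soo is None or route.soo != neighbor.soo


# PE1 and PE2 both face BT and both use SoO 65432:999 (addresses are examples)
bt_at_pe1 = Neighbor("192.168.0.1", soo="65432:999")
bt_at_pe2 = Neighbor("192.168.0.5", soo="65432:999")  # hypothetical address

learned = receive_from(bt_at_pe1, Route("99.99.99.99/32"))
print(learned.soo)                           # 65432:999 (tagged on the way in)
print(may_advertise_to(bt_at_pe2, learned))  # False (blocked on the way back out)
```

The point of the toy is that the tagging happens in receive_from, not in the advertise step, which is exactly the bit I had backwards.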

So these bits are wrong then

Which means, given SoO is an "inbound behaviour", not an "outbound" behaviour, the whole concept of tagging these with 65432:999 as the SoO Tag doesn't make sense; it probably should be 2856:999, to show these are BT-native, not My Network-native.

It also means I should re-think why I'm using SoO here, as a Tag+Block/Route Map technique, using bog-standard BGP Communities, might be a better fit for the behaviour I wanted.

I've been here before

Sadly, I've fallen victim to this presumed "outbound behaviour of the config" before with my friend BGP AS-Override, which also has a strange "not the way you might expect" behaviour, but that's one for another blog post. Key points here are:

  • Trust nothing
  • Lab everything
  • Assume that Cisco Support Forums write-up that looked exactly like your scenario was too good to be true

The difference between BGP RD and RT

Friday, 10 Aug 2018

BGP Route Distinguisher and Route Target

Let me caveat this post by saying I'm not a Service Provider (SP) kid by trade; I spend my life doing Enterprise, Data Centre and Wireless - so all this MPLSery is new territory for me, and my imaginary sidekick-dog friend ("Hi Jake!") - which means this might be technically incorrect, but this is how the concepts of RD and RT finally "clicked" for me.

How it was explained to me


When I first started Googling for Dear Life (TM) about this (because I needed to spin up a new VRF/IPVPN/L3VPN on our MPLS Network), and looked at a few existing config excerpts, I thought they were both the same thing, which seems valid:

vrf definition ADVENTURE-TIME-VRF
 rd 192.168.0.1:999
 route-target export 65432:999
 route-target import 65432:999

I didn't really question the fact that the Export/Import Route Target (RT) was the same (and didn't know about "Full Mesh VRF" vs "Hub-and-Spoke VRF"), but it did strike me as odd that the RD wasn't the same as the RT, given all the explanation I'd read said things like:

The RD is used to keep all prefixes in the BGP table unique between Customers or VRFs...

Which I read thinking:

"Hmm, that makes sense; BGP will just prepend the RD in front of the Prefix, to identify the VRF it belongs to. But wouldn't that mean the RD should be the same on each PE Router, the same for each instantiation of that VRF/Customer across the network?"

So then why the differing RD from the RTs?

Why bother with the extra admin work of creating a different value each time, between the RD and RT?

How I now understand it


When I started exploring Full Mesh VRF vs Hub-and-Spoke VRF, it started to click into place - the RT and RD aren't really related, and I think there's some missing text from the common definition of how RDs are enacted:

  • RD = Route(r) Distinguisher
  • RT = Rout(ing Table) Target

When I looked around the configs we had elsewhere, the pattern became clear; it decomposed like this:

vrf definition <VRF Human-friendly Name>
 rd <Router Loopback0>:<VRF RT No>
 route-target export <Router ASN>:<VRF RT No>
 route-target import <Router ASN>:<VRF RT No>
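
To make the pattern concrete, here's a minimal Python sketch that fills in that template. The render_vrf function and its parameter names are my own invention for illustration, and it assumes the "Loopback0 for the RD, ASN for the RT" convention described above rather than anything IOS mandates.

```python
def render_vrf(name: str, loopback0: str, asn: int, vrf_no: int) -> str:
    """Build a VRF stanza using the <Loopback0>:<No> RD / <ASN>:<No> RT convention."""
    return "\n".join([
        f"vrf definition {name}",
        f" rd {loopback0}:{vrf_no}",                # unique per Router
        f" route-target export {asn}:{vrf_no}",     # same on every Router in the VRF
        f" route-target import {asn}:{vrf_no}",
    ])


print(render_vrf("ADVENTURE-TIME-VRF", "192.168.0.1", 65432, 999))
```

Run it for a second Router (say Loopback0 192.168.0.2) and only the rd line changes, which is the whole point.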

It's starting to click

Then you step back a bit more, and realise the VRF Name and RT/RD have pretty much no association (and then it suddenly clicks what they mean when they say "Locally Significant"...), and we - as humans - use the same VRF Name everywhere because it's easier for us, like a sort of "Poor Man's DNS for VRF RTs". So there's no reason this config wouldn't just stitch VRF "Bob" to VRF "Jane" between two Routers in the same MPLS Domain - but it'd be a pain in the arse to troubleshoot when it scaled to more than a few Routers:

Router_PE1#vrf definition Bob
 rd 192.168.0.1:999
 route-target export 65432:999
 route-target import 65432:999

Router_PE2#vrf definition Jane
 rd 192.168.0.2:999
 route-target export 65432:999
 route-target import 65432:999
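
A rough way to convince yourself of this is to model the import decision in Python. This is a sketch under my own assumptions: the Vrf class and the imports_from function are hypothetical names, and the only rule modelled is "a route is accepted if any RT it was exported with matches an RT the receiver imports".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vrf:
    name: str              # locally significant; never compared below
    rd: str
    exports: frozenset     # RTs attached to routes leaving this VRF
    imports: frozenset     # RTs this VRF is willing to accept


def imports_from(receiver: Vrf, sender: Vrf) -> bool:
    """True if the receiver would import routes exported by the sender."""
    return bool(sender.exports & receiver.imports)


bob = Vrf("Bob", "192.168.0.1:999",
          frozenset({"65432:999"}), frozenset({"65432:999"}))
jane = Vrf("Jane", "192.168.0.2:999",
           frozenset({"65432:999"}), frozenset({"65432:999"}))

print(imports_from(jane, bob))  # True: names and RDs differ, the RTs match
print(imports_from(bob, jane))  # True
```

Note that neither the name nor the rd field is ever looked at when deciding what gets imported.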

Great Scott! He's got it!

Which is when it clicks - when you look at two Routers' configurations and realise the RT is the same, but the RD changed; within what we've established is the same VRF "Container" (even though we renamed it across Routers, to cause pain to that guy in Ops that looked at our wife wrong during that Christmas Do, yeah - "Bob"...). So roughly then:

  • An RD can be thought of as the "Router Distinguisher"
    • i.e. "Who injected that Prefix into my VRF?"
    • Probably makes sense to use a Loopback, or unique attribute of a Router; then you can jump on your Route Reflector (RR) and have a quick "Whodunnit?"
      • Router_RR1#sh ip bgp vpnv4 all | sec <Router Loopback0>:<VRF RT No>
  • An RT can be thought of as the "Routing Table Target"
    • i.e. "So that's just a VLAN-equivalent Tag for a VRF Container on the MPLS Domain then..."
    • If it's the same RT you're import/exporting everywhere, we're rocking Full Mesh; if it's not (or I'm suddenly doing loads of import statements/one export statement, or vice versa), we're looking at a pesky Hub-and-Spoke
      • Got multiple RT Import statements and one Export? You're probably on a Hub Router (for that VRF)
      • Got one RT Import statement and multiple Exports? You're probably on a Spoke Router (for that VRF)
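
The "Whodunnit?" idea can be sketched in Python too. This assumes the RD follows the <Router Loopback0>:<VRF RT No> convention from earlier (an RD built from an ASN instead would not parse as an address), and who_injected is just a name I've made up.

```python
import ipaddress


def who_injected(rd: str):
    """Split an RD of the form <Loopback0>:<VRF No> into (loopback, vrf_no)."""
    admin, _, assigned = rd.rpartition(":")
    # Raises ValueError if the first half isn't an IP address,
    # i.e. if the RD doesn't follow the Loopback0 convention.
    return ipaddress.ip_address(admin), int(assigned)


loopback, vrf_no = who_injected("192.168.0.2:999")
print(loopback, vrf_no)  # 192.168.0.2 999
```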

Am I right here?

That's how I understand all this MPLS VRFery anyway; if I'm wrong, why not:

  • Tweet me @notworkd and tell me "U iz well wrong, Bruv..."
  • Write a comment below and tell me "Dude, do you even MPLS, Bro?"

 

I'm not Technical, but...

Sunday, 22 Jul 2018

A day in the life of

There you are, describing the latest solution/thesis you have to a problem, Project or task, and then someone comes along and says:

I'm not technical, but...

And just like that, you're thrown sideways. Disparaged. Condescended. Belittled. Siderailed.

My problem here isn't the content that's about to follow - it could well be valid (it could well not be, too), and could steer the idea in the right direction. My problem is the derisive and disdainful manner in which this is normally delivered to me. I mean, why even include the prelude and the "but"? If you've got something to contribute, jump in - it's what you'd do in a normal business conversation.

You wouldn't say that to a Doctor

Would you say the same to another field or vocation that you were more familiar with - maybe a Doctor or Healthcare Professional? What about another field you popularly know to exist, but probably don't have exposure to - maybe a Nuclear Physicist?

It's not that your opinion is invalid; you have the right to opine about anything, in the same way I would. It's that your presentation of said opinion is disrespectful - not just of me, but also of:

  • My chosen career profession
  • The financial and personal background
  • The context of your and my employment
  • Our respective positions to our employer

But for some reason, you think that the inclusion of this prelude with a little "but" counteracts all these.

Respect that I'm here for a reason

  • I'm not a Project Manager; I'm not paid to manage time and cost of delivery (but I can cost-up and cook a three-course meal)
  • I'm not a Business Analyst; I'm not paid to analyse business requirements against our employer's strategy (but I can work out the difference to my family of buying a shiny new MacBook vs getting that larger house we could all do with)
  • I'm not a Commercial Analyst; I'm not paid to understand common costings or market economics (but I do recognise that buying artisanal gluten-free bread should cost more than a standard white loaf)
  • I'm not a Legal Consultant; I'm not paid to understand the laws my employer is subject to, or loopholes within them (but I do recognise that torrenting Adobe Photoshop is a lesser crime than killing a man)

We all know how to do things we're not paid to do; that's why opinions can be valid. We're not paid to know all the things we can do; that's why you need to respect that I exist here for a reason, the same way I respect you exist.

If you don't understand or accept something, feel free to question and interject; but don't add an insulting prelude before you do, and recognise that the person you're talking to was not only employed to do the very thing you're about to question, but has also legitimately built a career upon it.

I'm not business; but...

Butt out.
