For some reason, routes carrying our own ASN in the AS path are not
imported into the default VRF. Maybe other VRFs too. These are the
routes that forward packets through the firewalls.
As long as both exits are up this is not a problem, because routes
arriving via the peer exit don’t contain this exit’s own ASN.
If the peer goes down, all remaining routes sent by the firewalls carry
our own ASN and are not imported into the default VRF, so L3 servers
lose connectivity to internal networks.
If the exit strips its own ASN from received routes, importing works
again. We strip both our own and the peer’s ASN so that path lengths
stay equal.
This has involved an indecent amount of poking knobs and knobbing
pokes, and it might cause other issues elsewhere.
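A toy model of the behaviour, in Python rather than BGP config (64512,
64513 and 65010 are placeholder ASNs, not our real ones):

```python
# Toy model of the import behaviour described above.
OWN_ASN = 64512    # this exit (placeholder)
PEER_ASN = 64513   # the peer exit (placeholder)

def importable_into_default_vrf(as_path):
    # Standard BGP loop prevention: a path that already contains our own
    # ASN looks like a loop and is rejected on import.
    return OWN_ASN not in as_path

def strip_exit_asns(as_path):
    # Strip both exits' ASNs so paths learned via either exit keep the
    # same length and stay importable when the peer is down.
    return [asn for asn in as_path if asn not in (OWN_ASN, PEER_ASN)]

# A route from a firewall while the peer exit is down:
path = [OWN_ASN, 65010]
assert not importable_into_default_vrf(path)
assert importable_into_default_vrf(strip_exit_asns(path))
```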
Routed through and mostly dropped by the firewall, of course. So we
don’t necessarily have to do NAT for everything that comes from the
old / USI network.
Turns out that while Cumulus supports “up to” 255 VRFs, no switch it
runs on supports more than 64. So we have to turn down the paranoia and
put each tenant’s internal networks in the same VRF.
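A guard like this in the config generation would catch the limit early.
A sketch only; the per-tenant naming and the data shape are made up:

```python
# Hypothetical guard for generated switch configs: one VRF per tenant
# for internal networks, and a hard stop at the 64-VRF hardware limit.
HW_VRF_LIMIT = 64

def tenant_vrfs(networks):
    # One VRF per tenant instead of one per network,
    # e.g. {"tenant": "acme", ...} -> "vrf-acme".
    return {"vrf-" + net["tenant"] for net in networks}

def check_vrf_budget(networks):
    vrfs = tenant_vrfs(networks)
    if len(vrfs) > HW_VRF_LIMIT:
        raise ValueError(
            f"{len(vrfs)} VRFs exceed the hardware limit of {HW_VRF_LIMIT}"
        )
```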
This commit just ensures VRF definitions are not duplicated on exits.
Ten minutes to set up and ten hours to convince Ansible to not be
quite so obstinate. The list2dict filter seems to be the (or another)
missing piece. Now let’s rewrite everything else using it. Or not.
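For reference, the shape of the transformation. The actual filter runs
in Jinja; this Python sketch with made-up VRF data just shows why
keying a list by name deduplicates:

```python
def list2dict(items, key="name"):
    # Turning a list into a dict keyed by name collapses duplicate
    # definitions collected from different sources.
    return {item[key]: item for item in items}

vrfs = [
    {"name": "vrf-acme", "table": 1001},
    {"name": "vrf-acme", "table": 1001},   # same VRF seen via another VLAN
    {"name": "vrf-globex", "table": 1002},
]
assert len(list2dict(vrfs)) == 2   # the duplicate collapses
```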
And group them into vrf_prefixes for VLAN networks and bgp_prefixes for
servers plugged directly into the fabric.
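The split, roughly, assuming the deciding attribute is whether the
prefix has a VLAN assigned in NetBox (names and shapes are
illustrative):

```python
def group_prefixes(prefixes):
    # Assumed criterion: a prefix bound to a VLAN is reached through its
    # VRF; one without a VLAN belongs to a server speaking BGP directly
    # to the fabric.
    vrf_prefixes = [p for p in prefixes if p.get("vlan")]
    bgp_prefixes = [p for p in prefixes if not p.get("vlan")]
    return vrf_prefixes, bgp_prefixes
```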
This should reduce the number of queries to NetBox when configuring
firewalls and exit switches. Not sure, but I think set_fact helps avoid
repeated queries because it stores the resolved value, whereas a lookup
in group_vars is re-evaluated every time it is referenced.
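The idea, sketched with pynetbox rather than the playbooks’ lookup
plugins (URL and token are placeholders): fetch once, filter locally,
reuse.

```python
import pynetbox

nb = pynetbox.api("https://netbox.example.com", token="...")  # placeholders

# Fetch prefixes once and filter locally instead of re-running a lookup
# for every host and template that needs them; set_fact gives the
# playbooks the same fetch-once effect.
all_prefixes = list(nb.ipam.prefixes.all())
vrf_prefixes = [p for p in all_prefixes if p.vlan]
bgp_prefixes = [p for p in all_prefixes if not p.vlan]
```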
… instead of generating them from prefixes. A NetBox script can be
used to create and configure all necessary data for a new VLAN.
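Roughly what such a custom script could look like; the fields and the
VRF naming scheme are made up, only the Script plumbing is NetBox’s:

```python
from extras.scripts import Script, StringVar, IntegerVar, IPNetworkVar
from ipam.models import VLAN, VRF, Prefix

class NewVlan(Script):
    class Meta:
        name = "New VLAN"
        description = "Create a VLAN plus its VRF and prefix in one step"

    vid = IntegerVar(description="802.1Q VLAN ID")
    vlan_name = StringVar()
    prefix = IPNetworkVar()

    def run(self, data, commit):
        # Naming scheme and field set are illustrative, not our real ones.
        vrf = VRF.objects.create(name="vrf-" + data["vlan_name"])
        vlan = VLAN.objects.create(vid=data["vid"], name=data["vlan_name"])
        Prefix.objects.create(prefix=data["prefix"], vrf=vrf, vlan=vlan)
        self.log_success(f"Created {vlan} with {data['prefix']} in {vrf}")
```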
Instead of VLAN roles “inside” and “outside” we now create separate
VRFs for inside VLANs to match the actual exit/firewall configuration.
The “outside” VRF is for all VLANs that are directly accessible from
the internet.