# FRI network
Ansible playbooks to configure the FRI network. Network configuration resides in [NetBox](https://netbox.fri.uni-lj.si); an overview of core switches and servers can be found in the [topology view](https://netbox.fri.uni-lj.si/plugins/netbox_topology_views/topology/?filter_id=2&show_cables=on&show_logical_connections=on).
## Setup
Install dependencies with `pip install --user -r requirements.txt` or with your system package manager. Since querying the NetBox API is slow, it helps to set up the Ansible inventory cache, for example by adding the following to `~/.profile` or similar:

    export ANSIBLE_INVENTORY_CACHE=True
    export ANSIBLE_INVENTORY_CACHE_PLUGIN=jsonfile
    export ANSIBLE_CACHE_PLUGIN_CONNECTION=~/.ansible/cache
Devices are accessible on a separate network, reachable through a WireGuard tunnel. For device access an SSH key is required, with the public key authorized for `root` on each device.
## Usage
Create a read-only token in NetBox. Set the variables required to access NetBox:

    # one for nb_inventory and one for nb_lookup
    export NETBOX_API_KEY=<token>
    export NETBOX_TOKEN="${NETBOX_API_KEY}"
    # same for both
    export NETBOX_API=<netbox API endpoint>
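Before running a playbook it can be useful to confirm that these variables are present. The following is a minimal sketch, not part of the repository; the helper name `check_netbox_env` is hypothetical, and it assumes both token variables are meant to hold the same value, as the export lines above suggest.

```python
import os

# Variable names are taken from the export lines above.
REQUIRED_VARS = ("NETBOX_API", "NETBOX_API_KEY", "NETBOX_TOKEN")


def check_netbox_env(env=None):
    """Return a list of problems with the NetBox variables (empty if none)."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    key, token = env.get("NETBOX_API_KEY"), env.get("NETBOX_TOKEN")
    # Assumption: nb_inventory and nb_lookup are given the same token.
    if key and token and key != token:
        problems.append("NETBOX_API_KEY and NETBOX_TOKEN differ")
    return problems
```

Calling `check_netbox_env()` without arguments inspects the current environment; passing a dictionary makes the check easy to try out in isolation.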
Run one-off tasks with (add `--key-file` or other options as necessary):

    ansible -i inventory.yml -m ping 'spine-*'

Run a playbook with:

    ansible-playbook setup.yml -i inventory.yml -l 'spine-*'
## NetBox data
The following values are used throughout the network and should be defined in a site-wide [config context](https://netbox.fri.uni-lj.si/extras/config-contexts/?q=fri):
* `dhcp`: DHCP server address
* `dns`: list of DNS server IPv4 addresses
* `dns6`: list of DNS server IPv6 addresses
* `domain`: site domain
* `nat`: list of IPv4 ranges used for SNAT and DNAT
* `ntp`: list of NTP server addresses
* `wg_ip`: public IPv4 address for WireGuard connections, anycast between firewall nodes
* `wg_net`: client WireGuard IPv4 addresses are assigned from this range
* `wg_net6`: client WireGuard IPv6 addresses are assigned from this range
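A quick way to catch a missing or mistyped key is to check an exported config context against the list above. This is a hedged sketch: the function name `check_site_context` is hypothetical, and the split into list-valued and single-valued keys is inferred from the descriptions rather than taken from the playbooks.

```python
# Keys described as lists in the overview above (an inference, not a schema).
LIST_KEYS = {"dns", "dns6", "nat", "ntp"}
# Keys described as single values.
SCALAR_KEYS = {"dhcp", "domain", "wg_ip", "wg_net", "wg_net6"}


def check_site_context(context):
    """Return a list of problems found in a site-wide config context."""
    problems = []
    for key in sorted(LIST_KEYS | SCALAR_KEYS):
        if key not in context:
            problems.append(f"missing key: {key}")
        elif key in LIST_KEYS and not isinstance(context[key], list):
            problems.append(f"{key} should be a list")
    return problems
```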
### Common setup
For most devices a management interface must be defined to run Ansible scripts, with at least the IP address and default gateway set:

    {
      "name": "eth0", "type": { "value": "1000base-t" },
      "mgmt_only": true,
      "mac_address": "98:03:9B:9C:2D:10",
      "ip_addresses": [ { "address": "10.20.30.40/24" } ],
      "custom_fields": { "gateway": { "address": "10.20.30.1/24" } }
    }
The MAC address is only used in some playbooks to set the interface name. All JSON samples in this document are subsets of the inventory as returned by Ansible. Omitted values should be set to null or empty unless stated otherwise (except for foreign keys such as `type` and `role`, where some values are omitted for brevity).
#### L1 setup
To break out a port, create the appropriately named interfaces and disable the original interface:

    { "name": "swp14", "enabled": false },
    { "name": "swp14s0", "type": { "value": "25gbase-x-sfp28" } },
    { "name": "swp14s1", "type": { "value": "25gbase-x-sfp28" } },
    { "name": "swp14s2", "type": { "value": "25gbase-x-sfp28" } },
    { "name": "swp14s3", "type": { "value": "25gbase-x-sfp28" } }
Note that for SN2700 switches only odd‐numbered ports may be broken out; the next even‐numbered port must be disabled as well as the original port in this case. The new ports can be used normally in further configuration.
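The breakout rules above can be sketched as a small generator for the interface records. This is an illustration only: the function `breakout` and its parameters are hypothetical, and the `swpNsM` naming simply follows the sample above.

```python
def breakout(port, lane_type, lanes=4, sn2700=False):
    """Return interface records for breaking out switch port number `port`."""
    if sn2700 and port % 2 == 0:
        raise ValueError("on SN2700 only odd-numbered ports may be broken out")
    # The original port is always disabled.
    records = [{"name": f"swp{port}", "enabled": False}]
    if sn2700:
        # On SN2700 the next even-numbered port must be disabled as well.
        records.append({"name": f"swp{port + 1}", "enabled": False})
    records += [
        {"name": f"swp{port}s{lane}", "type": {"value": lane_type}}
        for lane in range(lanes)
    ]
    return records
```

For example, `breakout(14, "25gbase-x-sfp28")` yields the five records shown in the sample, while `breakout(13, "25gbase-x-sfp28", sn2700=True)` additionally disables `swp14`.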
#### L3 setup
For L3 devices the `asn` custom field must be set. For the fabric and core servers we use [private ASNs above 65000](https://netbox.fri.uni-lj.si/search/?q=65%5B0-9%5D%2B&obj_types=ipam.asn&lookup=iregex).
Each L3 node should define IPv4 and IPv6 addresses on the loopback interface. These are displayed e.g. by traceroute. The IPv4 loopback address is also used as the BGP router ID. For MLAG switches specify the same [VXLAN anycast IP](https://docs.nvidia.com/networking-ethernet-software/cumulus-linux/Network-Virtualization/VXLAN-Active-Active-Mode/) on both peers with the anycast role.

    {
      "name": "lo", "type": { "value": "virtual" },
      "ip_addresses": [
        { "address": "10.34.0.8/32", "role": { "value": "loopback" } },
        { "address": "2001:1470:fffd:3400::8/128", "role": { "value": "loopback" } },
        { "address": "10.34.0.7/32", "role": { "value": "anycast" } }
      ]
    }
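To illustrate how the loopback addresses are interpreted, the sketch below splits them by role and picks the IPv4 loopback address as the router ID. It only uses the standard `ipaddress` module; the function name `loopback_summary` is hypothetical and the playbooks may derive these values differently.

```python
import ipaddress


def loopback_summary(iface):
    """Split loopback addresses by role and pick the BGP router ID."""
    by_role = {}
    for entry in iface.get("ip_addresses", []):
        role = (entry.get("role") or {}).get("value")
        address = ipaddress.ip_interface(entry["address"])
        by_role.setdefault(role, []).append(address)
    loopbacks = by_role.get("loopback", [])
    v4 = [a for a in loopbacks if a.version == 4]
    v6 = [a for a in loopbacks if a.version == 6]
    if not v4 or not v6:
        raise ValueError("loopback needs both an IPv4 and an IPv6 address")
    return {
        "router_id": str(v4[0].ip),  # IPv4 loopback doubles as the router ID
        "ipv6": str(v6[0].ip),
        "anycast": [str(a.ip) for a in by_role.get("anycast", [])],
    }
```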
Interfaces to L3 servers should have the tenant custom field defined:

    {
      "name": "swp9", "type": { "value": "100gbase-x-qsfp28" },
      "custom_fields": { "tenant": { "slug": "lrk" } }
    }
The tenant determines which prefixes can be received on this interface. It is important that all user‐facing ports either have a tenant defined or are disabled. Interfaces without a tenant are assumed to connect to fabric and allow all prefixes. TODO make previous sentence untrue and delete it
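Since an enabled interface without a tenant accepts all prefixes, it may be worth listing such interfaces for review. A minimal sketch, assuming that an interface with no `enabled` value counts as enabled (an assumption, not something the inventory guarantees); the function name `untenanted_interfaces` is hypothetical.

```python
def untenanted_interfaces(interfaces):
    """Names of enabled interfaces without a tenant (treated as fabric links)."""
    names = []
    for iface in interfaces:
        # Assumption: a missing "enabled" value means the interface is enabled.
        enabled = iface.get("enabled", True)
        tenant = (iface.get("custom_fields") or {}).get("tenant")
        if enabled and not tenant:
            names.append(iface["name"])
    return names
```

Every name returned should correspond to a fabric link; anything user-facing in the list needs either a tenant or to be disabled.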
#### L2 setup
For leaf switches providing L2 access we must add a single `bridge` interface. If no VLANs are explicitly set, the bridge will allow any VLAN allowed on at least one of its ports. Otherwise it will only allow the specified VLANs.

    {
      "name": "bridge", "type": { "value": "bridge" },
      "mode": { "value": "tagged" },
      "tagged_vlans": [
        { "name": "vlan-foo", "vid": 1234 },
        { "name": "vlan-bar", "vid": 1235 }
      ]
    }
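The rule for which VLANs the bridge allows can be sketched as follows. This is only an illustration of the rule as described here, not the template logic used by the playbooks; the function name `bridge_vids` is hypothetical.

```python
def bridge_vids(bridge, ports):
    """Return the set of VLAN IDs allowed on the bridge."""
    explicit = {vlan["vid"] for vlan in bridge.get("tagged_vlans") or []}
    if explicit:
        # VLANs set on the bridge itself take precedence.
        return explicit
    # Otherwise allow every VLAN allowed on at least one bridge port.
    vids = set()
    for port in ports:
        vids.update(vlan["vid"] for vlan in port.get("tagged_vlans") or [])
        untagged = port.get("untagged_vlan")
        if untagged:
            vids.add(untagged["vid"])
    return vids
```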
For dual-attached devices we form an MLAG between two leaf switches. Each leaf must have the `peer` context key set to the hostname of the other leaf. Create a bond named `peerlink` as one of the `bridge` ports, and assign it the interfaces for inter-switch links. For example [exit-1](https://netbox.fri.uni-lj.si/search/?q=exit-1&obj_types=dcim.device&lookup=iexact) with two links to [exit-2](https://netbox.fri.uni-lj.si/search/?q=exit-2&obj_types=dcim.device&lookup=iexact):

    {
      "name": "peerlink", "type": { "value": "lag" },
      "bridge": { "name": "bridge" },
      "mode": { "value": "tagged" }
    },
    {
      "name": "swp29",
      "lag": { "name": "peerlink" },
      "connected_endpoints": [ { "device": { "name": "exit-2" }, "name": "swp29" } ]
    },
    {
      "name": "swp30",
      "lag": { "name": "peerlink" },
      "connected_endpoints": [ { "device": { "name": "exit-2" }, "name": "swp30" } ]
    }
For each dual‐attached L2 device (server or switch) first create a bond on each leaf. Note that, on Cumulus Linux on Mellanox switches, a bond must be created even if a single interface is used on a particular switch. For example, the bond for [access-bdc-1](https://netbox.fri.uni-lj.si/search/?q=access-bdc-1&obj_types=dcim.device&lookup=iexact) on [exit-1](https://netbox.fri.uni-lj.si/search/?q=exit-1&obj_types=dcim.device&lookup=iexact):

    {
      "name": "access-bdc-1", "type": { "value": "lag" },
      "bridge": { "name": "bridge" }
    }

Assign the new bond all interfaces connecting to the device (here the bond has the name of the attached L2 switch `access-bdc-1`):

    {
      "name": "swp23s0",
      "lag": { "name": "access-bdc-1" },
      "connected_endpoints": [ { "device": { "name": "access-bdc-1" }, "name": "ethernet 1/0/49" } ]
    }
If a bond with the same name (except `peerlink`) exists on both peer switches, an [MLAG ID](https://docs.nvidia.com/networking-ethernet-software/cumulus-linux/Layer-2/Multi-Chassis-Link-Aggregation-MLAG/#basic-configuration) is assigned automatically. In this case the (same) [VXLAN anycast IP](https://docs.nvidia.com/networking-ethernet-software/cumulus-linux/Network-Virtualization/VXLAN-Active-Active-Mode/#configure-vxlan-active-active) should be set on each leaf’s loopback interface.
The bond interface can be set as an access or a tagged port by setting the `mode` attribute. Either `untagged_vlan` or `tagged_vlans` should be set as appropriate in this case. Otherwise the bond will allow all VLANs allowed by `bridge`.
The device on the other end of the bond should use the active‐active 802.3ad (LACP) mode.
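To see which bonds would be treated as dual-attached, one can compare the bond names defined on the two MLAG peers. The sketch below only finds the matching names; how the MLAG ID itself is derived is left to the playbooks, and the function name `mlag_bond_names` is hypothetical.

```python
def mlag_bond_names(leaf_a, leaf_b):
    """Bond names present on both peers, excluding the peerlink."""

    def bonds(interfaces):
        return {
            iface["name"]
            for iface in interfaces
            if (iface.get("type") or {}).get("value") == "lag"
            and iface["name"] != "peerlink"
        }

    return sorted(bonds(leaf_a) & bonds(leaf_b))
```

Both arguments are interface lists in the same form as the JSON samples in this section.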
### Access switches
Currently all [access switches](https://netbox.fri.uni-lj.si/search/?q=access-%5Bbcr%5Ddc-%28poe-%29%3F%5B0-9%5D%2B&obj_types=dcim.device&lookup=iregex) are D-Link DGS-1510. Connection parameters are set for those device types in a [config context](https://netbox.fri.uni-lj.si/extras/config-contexts/1/) and applied automatically by Ansible.
The config template supports configuring the port channels and tagging ports, but is otherwise limited to this setup. Further additions should attempt to preserve (fake) idempotency by filtering out unimportant differing lines.
To set up a bonded interface to exit switches, configure these interfaces:

    {
      "name": "port-channel 1", "type": { "value": "lag" },
      "mode": { "value": "tagged" },
      "tagged_vlans": [
        { "name": "vlan-foo", "vid": 1234 },
        { "name": "vlan-bar", "vid": 1235 }
      ]
    },
    {
      "name": "ethernet 1/0/49", "lag": { "name": "port-channel 1" },
      "link_peers": [ { "device": { "name": "exit-1" } } ]
    },
    {
      "name": "ethernet 1/0/50", "lag": { "name": "port-channel 1" },
      "link_peers": [ { "device": { "name": "exit-2" } } ]
    }
To enable an access interface, tag it with the appropriate VLAN(s), for example:

    {
      "name": "ethernet 1/0/10",
      "mode": { "value": "access" },
      "untagged_vlan": { "vid": 1234 }
    },
    {
      "name": "ethernet 1/0/11",
      "mode": { "value": "tagged" },
      "tagged_vlans": [{ "vid": 1234 }, { "vid": 1235 }]
    }
Interfaces marked as disabled are shut down.
### Firewall
The setup consists of two [firewall nodes](https://netbox.fri.uni-lj.si/search/?q=fw-%5B0-9%5D%2B&obj_types=dcim.device&lookup=iregex) and a [control node](https://netbox.fri.uni-lj.si/search/?q=zid&obj_types=virtualization.virtualmachine&lookup=iexact).
For the firewall nodes, configure `mgmt0` and `lo` as usual for L3 devices. Additionally, the firewall nodes should define the following interfaces:

    { "name": "lan0" },
    { "name": "lan1" },
    { "name": "mgmt1", "ip_addresses": [{ "address": "fe80::1/64" }] }
The MAC address should be defined for each interface, as they are renamed by the OS. The `mgmt1` interface is used for synchronizing connection-tracking information and should use the `fe80::1/64` and `fe80::2/64` addresses for the first and second firewall node, respectively.
Each firewall node should have a local config context with the keys `master` and `iface_sync` defining the names of the control node and the synchronization interface.