Consolidate base system and networking setup into debian role and BGP
configuration into frr role. Add facts role to collect data from NetBox
once to avoid many slow lookups. Also many other tweaks and cleanups.
Since OIDC auth doesn’t support groups, get them from AD over LDAP.
Add a script to fetch user and groups, and update /etc/pve/user.cfg. The
script is only installed on one node (first alphabetically), with a cron
job to run it daily.
The script is installed for clusters with the sync-ldap context key set
to a corresponding OIDC realm. The keys ldap_user and ldap_pass must be
present in the password store under cluster/<name>.
Firewall policy is set in NetBox as cluster services¹. For Proxmox we
have to manually allow communication between nodes when using L3,
since the default management ipset does not get populated correctly.
We also need to open VTEP communication between nodes, which the
default rules don’t. We allow all inter-node traffic, as SSH without
passwords must be permitted anyway.
This also adds some helper filters that are spectacularly annoying to
implement purely in templates.
¹ There is actually no such thing as as a cluster service (yet?), so
instead we create a fake VM for the cluster, define services for it,
and then add the same services to a custom field on the cluster.
Alternative would be to tie services to a specific node, but that
could be problematic if that node is replaced.
The Proxmox SDN feature does not play nice with our FRR and VXLAN setup.
With a single bridge we can’t have interface aliases. So use a bridge
for each VLAN. Actually don’t even have VLANs, just bridges mainlined
into VXLAN tunnels.
Read the list of VLANs carried by Proxmox nodes from a custom field on
the cluster in NetBox. Remove the vmbr0 device from individual nodes.
Set standardized interface names (mgmt0… for L2 management interfaces
and lan0… for L3 data interfaces speaking BGP). ASN is stored as a
custom field in netbox but that might change.