Post-Mortem
Motivation
The primary goal of the out-of-band (OOB) management network is to keep devices remotely manageable in the event of a disaster, when the rest of the network is not functional, so that service can be restored.
The secondary goal/benefit of the OOB management network is security. Isolating management to the OOB network alone significantly reduces the attack surface of the equipment.
Counter-motivation
The first goal is irrelevant, because the Wi-Fi network is an overlay. If the equipment is not reachable it is because the underlay is not working, and we'll be fixing that first. Two notes here:
- Administrators of the Wi-Fi network need some kind of network connectivity that isn't the VT Wi-Fi network, which is trivial. A wired adapter, home ISP, mobile hotspot, any of these will do.
- To address the case of a device with an unusable network configuration (e.g., the out-of-box config), they still need some kind of non-network access (i.e., serial), though that access can itself be reached through network resources. Indeed, a serial connection accessed through the OOB network is already part of our standard setup.
The second goal strongly implies (though doesn't strictly require) that the management of a device is isolated to that device. This is not the case with the Wi-Fi infrastructure: the configuration is all done on the MC, which pushes it to the MDs, which in turn push it to the APs.
More critically, the management plane needs a clear separation from the production and support networks. An overlay design does not lend itself to this, and sure enough, that separation does not exist in the wireless controllers. Notably, the controllers do not have multiple routing tables, which makes it extremely difficult, if not impossible, to separate the different network planes.
In particular:
- user traffic is carried to the MD inside a tunnel
- MDs in a cluster build a tunnel and have a host-specific route to each other
- MDs build a tunnel and have a host-specific route to the MCs
This means any wireless user can reach† the management of the MC and of any MD in the cluster they are connected to. This could be stopped with a client ACL, but that ACL must:
- be applied to every role
- enumerate every address (including IPv6 link local!) on every controller
This is obviously error-prone and a fair bit of work, all to accomplish a secondary goal. And we still end up with a design that gives only a weak assurance of that goal (e.g., have we found every path into the management plane? Probably not).
† Can reach the L4 management interface, that is. Obviously, L7 still needs auth(z).
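For illustration, here is a minimal sketch, roughly in ArubaOS session-ACL form, of what such a client ACL could look like. All addresses and role names are hypothetical placeholders; a real deployment would have to enumerate every address (legacy, v6 global, and v6 link-local) on every controller.
! All addresses and names below are hypothetical placeholders
ip access-list session acl-deny-controller-mgmt
  user host 192.0.2.11 any deny
  ipv6 user host 2001:db8:100::11 any deny
  ipv6 user host fe80::1 any deny
!
! Attach to every user role, e.g.:
user-role authenticated
  access-list session acl-deny-controller-mgmt
There is deliberately no trailing permit: unmatched traffic falls through to the role's existing ACLs. Even then, this only blocks L3/L4 reachability from client roles, which is exactly the weak assurance described above.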
Out-Of-Band Management
Logical Diagram
Data paths
- MDs join clusters with the in-band management address (a filled-in sketch follows this list)
lc-cluster group-profile "lcc-foo" controller-v6 <blue> priority 255 mcast-vlan 0 vrrp-ip-v6 <blue> vrrp-vlan <blue> group <#>
- APs connect to the cluster on in-band management
- In-band mgmt and user networks are trunked over the same port channel.
- The MD controller IP is the in-band mgmt address
masteripv6 ... interface-f vlan-f <blue> ... controller-ipv6 vlan <blue> address <blue>
- mgmt auth (i.e., netadmin) for MDs happens on OOB mgmt
- user auth (e.g., eduroam) happens on in-band mgmt
- MC-MD management happens inside the IPsec tunnel that gets built over the in-band management.
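As a rough, filled-in version of the two config lines above: the sketch below uses made-up documentation-prefix addresses and a hypothetical in-band VLAN 100, none of which are the real values.
! Hypothetical values: VLAN 100 = in-band mgmt, 2001:db8:100::/64 = in-band prefix
lc-cluster group-profile "lcc-foo" controller-v6 2001:db8:100::11 priority 255 mcast-vlan 0 vrrp-ip-v6 2001:db8:100::1 vrrp-vlan 100 group 1
lc-cluster group-profile "lcc-foo" controller-v6 2001:db8:100::12 priority 128 mcast-vlan 0 vrrp-ip-v6 2001:db8:100::1 vrrp-vlan 100 group 1
!
! Each MD's controller IP is its own in-band address
controller-ipv6 vlan 100 address 2001:db8:100::11
The point is that everything involved in cluster membership and the controller IP stays on the in-band side; only netadmin auth moves to OOB.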
Questions
- How do we prevent mgmt login from non-OOB mgmt networks? If we can't do this, we haven't actually done anything.
  - Force management to ports 22 and 4343, and only allow these on OOB. AP-MD and MD-MC management is done through a tunnel, thus not stopped by these ACLs. This is good for the purposes of getting things to work, but kinda violates the principles we are after to begin with.
  - Captive portals use ports 80 and 443, and we can force HTTPS management to exclusively 4343. This lets us expose an L7 distinction in L4. Again, this functions, but eww.
- How many captive portal users are legacy-only? Do we need this legacy address?
- Can we do no legacy addresses?
  - No. At the least, we need legacy addresses for RAPs.
- Can we add members to a cluster by an IP that is not the controller IP?
  - Yes.
- Do we want to keep a legacy address on in-band mgmt to give us time to migrate APs? (And to have fewer changes at once.) See the sketch after this list.
  - Yes. Let's make fewer changes at once.
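To make that concrete: keeping a legacy address alongside v6 on the in-band mgmt VLAN is just dual-addressing the interface. A minimal sketch, with a hypothetical VLAN ID and placeholder addresses:
! Hypothetical: VLAN 100 = in-band mgmt; both addresses are placeholders
interface vlan 100
  operstate up
  ip address 192.0.2.11 255.255.255.0
  ipv6 address 2001:db8:100::11/64
Removing the legacy address later is then a small, contained change once the APs have migrated.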
TODO
conehead/grub
- Add v6 addresses on the OOB mgmt [NISNETR-396]
- Accept netadmin auth from the MDs' OOB mgmt [NISNETR-399]
MM
Nothing?
MD
- Wire up MDs on OOB
- Address MDs on OOB
- Apply static route to OOB network
- Apply ACLs so that ports 4343 and 22 are only allowed on the OOB side [NISNETR-398]
- Change asr-conehead-netadmin to use the OOB v6 address on conehead [NISNETR-399] (see the sketch after this list)
- Change asr-grub-netadmin to use the OOB v6 address on grub [NISNETR-399]
- Figure out initial setup
- Remove remaining legacy addresses
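The asr-* items amount to repointing the existing netadmin auth-server profiles at the servers' OOB v6 addresses. A minimal sketch, assuming these are RADIUS auth-server profiles (use aaa authentication-server tacacs instead if they are TACACS) and leaving the actual addresses as placeholders:
aaa authentication-server radius "asr-conehead-netadmin"
  host <conehead OOB v6 address>
!
aaa authentication-server radius "asr-grub-netadmin"
  host <grub OOB v6 address>
The server side also has to accept requests sourced from the MDs' OOB addresses, which is the conehead/grub item above.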
Config changes
The MM is configured exactly the same as before. The MDs have additional configuration (using col-md-5.dev as an example):
interface gigabitethernet 0/0/0
no shutdown
!
vlan 301
description oob-mgmt
!
interface port-channel 1
gigabitethernet 0/0/0
switchport access vlan 301
switchport mode access
trusted
trusted vlan 1-4094
!
interface vlan 301
operstate up
ipv6 address 2607:b400:e1:4000:0:0:0:15/64
!
ipv6 route 2607:b400:e1:0:0:0:0:0/48 2607:b400:e1:4000:0:0:132:1
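Once an MD is wired, addressed, and routed as above, a quick sanity check from that MD could look like the following; the ping target is the OOB next-hop from the static route above. (Exact command availability should be checked against the running ArubaOS version.)
show vlan
show ipv6 route
ping ipv6 2607:b400:e1:4000:0:0:132:1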
Old ideas
These are things we are currently deciding against. They are noted here in case they turn out to be a good idea or lead to other useful ideas.
MC-MD connection:
- Static routes over OOB
- IPsec tunnel between MC and FW