Manual F5 GTM Wide-IP Pool Failover for topology load-balancing
The "problem"
When a Wide-IP uses Topology load balancing, pool status is not taken into account when a pool is selected. This means that when a pool is disabled or offline, traffic won't fail over to another pool.
This is explained well in K05446309:
- This is by design that the topology of wide IP does not take pool status into account, so the preferred topology takes effect but not the default one.
- There are load balancings of Preferred, Alternate, and Fallback. By default, the method for each of them is Round Robin, Round Robin, and Return to DNS, when the pool is offline, load balancings of Preferred and Alternate fail, so the Fallback returns all the member IPs as the resolution result.
And also in K39000314:
When Return to DNS has been configured and all pool members are down, the BIG-IP DNS will refer to the ZoneRunner for the zone name... and resolved the IPs configured in the zone file.
In other words, when an offline or disabled pool is selected, the default behavior is simply to return all the member IPs of that pool.
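The behavior described above can be sketched as a toy model. This is only an illustration of the Preferred/Alternate/Fallback chain as summarized in this note, not BIG-IP code; the function name `pool_answer`, its parameters, and the addresses are hypothetical, and round robin is reduced to "first available member".

```python
from typing import Dict, List, Optional


def pool_answer(members: Dict[str, bool],
                fallback: str = "return-to-dns") -> Optional[List[str]]:
    """Toy model of the answer a selected pool produces.

    members maps a member IP to True (up) or False (down).
    Only the fallback values "return-to-dns" and "none" are modeled.
    """
    up = [ip for ip, is_up in members.items() if is_up]
    if up:
        # Preferred (or Alternate) succeeds: one available member is returned.
        return [up[0]]
    # No member is available, so Preferred and Alternate both fail.
    if fallback == "return-to-dns":
        # Modeled as described in the text: every member IP is handed back.
        return list(members)
    # Fallback "none": this pool produces no answer at all.
    return None


offline_pool = {"10.0.0.1": False, "10.0.0.2": False}
print(pool_answer(offline_pool))                   # all member IPs
print(pool_answer(offline_pool, fallback="none"))  # None
```

With the default fallback, the offline pool still answers, which is why no other pool is tried.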
Solutions
- Ensure there are additional topology records with lower weights (scores) that match the available pools.
- Temporarily change the Alternate and Fallback methods for the pool under maintenance to either None or Topology.
  - None and Topology have the same effect when Topology is used as the pool-lb-mode on the Wide-IP.
  - You can also set this permanently on all pools to allow automatic failover between pools.
- Temporarily disable the pool that will be under maintenance
This will cause the pool to be "skipped" so that lower-scoring topology records can select a different pool instead.
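The last two steps could be scripted roughly as follows. This is a sketch under assumptions: it only builds command strings and does not run them, it assumes the BIG-IP v12+ tmsh syntax in which GTM pools are addressed by record type (`a` here), and the pool name `pool_dc1` is hypothetical. Check the property names against `tmsh help gtm pool` on your version before running anything.

```python
from typing import List


def maintenance_commands(pool: str, record_type: str = "a") -> List[str]:
    """Commands to take a pool out of rotation (assumed v12+ tmsh syntax)."""
    target = f"tmsh modify gtm pool {record_type} {pool}"
    return [
        # Let the Wide-IP move on to the next pool instead of answering from this one.
        f"{target} alternate-mode none fallback-mode none",
        # Disable the pool for the duration of the maintenance.
        f"{target} disabled",
    ]


def restore_commands(pool: str, record_type: str = "a",
                     alternate: str = "round-robin",
                     fallback: str = "return-to-dns") -> List[str]:
    """Commands to undo the change; the defaults are the ones quoted from K05446309."""
    target = f"tmsh modify gtm pool {record_type} {pool}"
    return [
        f"{target} enabled",
        f"{target} alternate-mode {alternate} fallback-mode {fallback}",
    ]


for command in maintenance_commands("pool_dc1"):  # hypothetical pool name
    print(command)
```

If None is kept permanently on all pools, only the `disabled` and `enabled` commands are needed around each maintenance window.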
See the GTM manual:
If the alternate load balancing method for a pool is None, BIG-IP GTM skips the alternate method and immediately tries the fallback method. If the fallback method is None, and there are multiple pools configured, BIG-IP GTM uses the next available pool. If all pools are unavailable, BIG-IP GTM returns an aggregate of the IP addresses of all pool members using BIND. Alternatively, when the preferred method for all pools is configured, but the alternate and fallback methods are set to None, if the preferred method fails, BIG-IP GTM uses the next available pool.
Also K13412:
The weight specifies the score that will be given to a destination object which matches the topology record. If a name resolution request matches more than one topology record, the BIG-IP DNS system uses the destination object with the highest weight to determine which statement it uses to load balance the request.
And finally, this forum post
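Putting the quoted rules together, the selection logic can be approximated with a small model. It is a simplification for illustration only: the names `Pool`, `TopologyRecord`, and `resolve` are hypothetical, a topology match is reduced to a string comparison on a made-up source label, and only the fallback values "return-to-dns" and "none" are modeled.

```python
from typing import Dict, List, NamedTuple


class Pool(NamedTuple):
    members: List[str]
    available: bool
    fallback: str = "return-to-dns"  # only "return-to-dns" and "none" are modeled


class TopologyRecord(NamedTuple):
    source: str  # hypothetical requester label, e.g. a region name
    pool: str
    weight: int


def resolve(source: str, records: List[TopologyRecord],
            pools: Dict[str, Pool]) -> List[str]:
    # The matching record with the highest weight is tried first.
    matching = sorted((r for r in records if r.source == source),
                      key=lambda r: r.weight, reverse=True)
    for record in matching:
        pool = pools[record.pool]
        if pool.available:
            return [pool.members[0]]   # preferred method picks a member
        if pool.fallback == "return-to-dns":
            return list(pool.members)  # an answer is produced here, so no failover
        # Fallback "none": skip this pool and try the next matching record.
    # Nothing was available: aggregate of all pool member addresses.
    return [ip for pool in pools.values() for ip in pool.members]


records = [TopologyRecord("eu", "pool_eu", 100), TopologyRecord("eu", "pool_us", 10)]
pools = {
    "pool_eu": Pool(["10.1.0.1", "10.1.0.2"], available=False),
    "pool_us": Pool(["10.2.0.1"], available=True),
}
print(resolve("eu", records, pools))  # offline pool_eu still answers
pools["pool_eu"] = pools["pool_eu"]._replace(fallback="none")
print(resolve("eu", records, pools))  # pool_eu is skipped, pool_us answers
```

The second call shows why the lower-weight record matters: without it, there is no other pool for the request to fall through to.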
Alternatives
Failover all traffic to one pool
- Temporarily change the load-balancing method on the Wide-IP to "Global-Availability"
- Temporarily disable the pool that will be under maintenance
  - If the pool under maintenance has a higher order number than the active pool (that is, it comes later in the pool order), disabling it is not strictly necessary.
This can be a simpler option, especially when only two pools are configured.
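A rough sketch of this alternative, again only building command strings rather than executing them. It assumes the BIG-IP v12+ tmsh syntax with typed Wide-IPs (`a` here); the Wide-IP name `www.example.com` and the helper names are hypothetical, so verify the properties with `tmsh help gtm wideip` before use.

```python
from typing import List


def failover_commands(wideip: str, record_type: str = "a") -> List[str]:
    """Switch the Wide-IP to Global Availability (assumed v12+ tmsh syntax)."""
    return [
        f"tmsh modify gtm wideip {record_type} {wideip} pool-lb-mode global-availability",
    ]


def restore_commands(wideip: str, record_type: str = "a") -> List[str]:
    """Switch the Wide-IP back to Topology once maintenance is finished."""
    return [
        f"tmsh modify gtm wideip {record_type} {wideip} pool-lb-mode topology",
    ]


def must_disable(maintenance_order: int, active_order: int) -> bool:
    """Global Availability uses the first available pool in order, so the
    maintenance pool only needs disabling if it comes before the active pool."""
    return maintenance_order < active_order


print(failover_commands("www.example.com")[0])  # hypothetical Wide-IP name
print(must_disable(maintenance_order=1, active_order=0))
```

The `must_disable` helper just restates the ordering rule from the last bullet: a pool that comes later in the order never receives traffic while an earlier pool is available.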