Internet2 recommended Bidirectional Forwarding Detection (BFD) guide
Internet2 recommends implementing BFD. BFD supports more granular tuning, which improves Layer3 forwarding assurance and ultimately increases site-to-site availability. BFD can be used in four scenarios, each with different parameters depending on the layers of protection available. This guide outlines BFD parameters for specific use cases and can also serve as a foundation for alternative cases.
Common use scenarios:
Internet2 IP Core to Site with AL2S redundancy
- BGP session configured between an Internet2 IP Core Router and any Site across AL2S Network
- BFD 600ms timeout (keepalive 200ms, multiplier 3)
Internet2 IP Core to Site without lower layer redundancy
- BGP session configured between an Internet2 IP Core Router and any Site without underlying redundancy (including direct physical connection to Internet2 IP Core Router or AL2S circuit with primary path only)
- BFD 360ms timeout (keepalive 120ms, multiplier 3)
Site to Site with AL2S redundancy
- BGP session configured between two sites (same entity or unique entities) across AL2S Network
- BFD 2100ms timeout (keepalive 700ms, multiplier 3)
- BFD values represent the maximum recommended configuration given the possibility of transcontinental latency
Site to Site without lower layer redundancy
- BGP session configured between two sites (same entity or unique entities) without underlying redundancy (including direct physical connection to Internet2 IP Core Router or AL2S circuit with primary path only)
- BFD 600ms timeout (keepalive 200ms, multiplier 3)
- BFD values represent the maximum recommended configuration given the possibility of transcontinental latency
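Each timeout above is the keepalive interval multiplied by the multiplier. The following is a minimal Python sketch of that arithmetic for the four scenarios. The class name, field names, and dictionary keys are hypothetical labels chosen for illustration; only the numeric values come from this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BfdProfile:
    """One recommended BFD setting from the scenarios in this guide."""
    scenario: str
    interval_ms: int  # the "keepalive" value in this guide
    multiplier: int

    @property
    def timeout_ms(self) -> int:
        # Timeout = keepalive interval x multiplier.
        return self.interval_ms * self.multiplier


# Keys are hypothetical short labels; values are copied from the guide.
RECOMMENDED = {
    "core-to-site-redundant": BfdProfile(
        "Internet2 IP Core to Site with AL2S redundancy", 200, 3),
    "core-to-site-unprotected": BfdProfile(
        "Internet2 IP Core to Site without lower layer redundancy", 120, 3),
    "site-to-site-redundant": BfdProfile(
        "Site to Site with AL2S redundancy", 700, 3),
    "site-to-site-unprotected": BfdProfile(
        "Site to Site without lower layer redundancy", 200, 3),
}

for key, profile in RECOMMENDED.items():
    print(f"{key}: {profile.interval_ms} ms x {profile.multiplier} "
          f"= {profile.timeout_ms} ms")
```

The same calculation applies to an alternative case: choose the interval and multiplier, then confirm that the resulting timeout suits the path in question.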
Additional factors may affect BFD parameters. Two of the scenarios described above use Internet2 AL2S, which can be configured to support redundancy at Layer2 (primary and secondary path). Two factors influence AL2S fail-over: the number of circuits affected and the latency from the affected area to the controller. AL2S processes circuit fail-overs in parallel, so the impact of the number of affected circuits is classified as low and uniform regardless of the affected area. The impact of latency between the controller and the core nodes affected by the failure is classified as moderate, because that latency varies with location.
Sites that prefer to tune BFD parameters for faster fail-over while using Layer2 redundancy on AL2S may observe timeouts that result in BGP session flaps or log messages indicating that keepalives were missed.
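The caution above can be turned into a simple check before deployment. The following is a minimal Python sketch under one assumption: the recommended timeout for the scenario (for example, 600ms for Internet2 IP Core to Site with AL2S redundancy) is used as the comparison point. This guide does not state an exact threshold below which flaps occur, and the function name and example values are hypothetical.

```python
def is_more_aggressive(interval_ms: int, multiplier: int,
                       recommended_timeout_ms: int) -> bool:
    """Return True when the proposed timeout (interval x multiplier)
    is shorter than the recommended timeout for the scenario."""
    return interval_ms * multiplier < recommended_timeout_ms


# Hypothetical proposal: 100 ms interval, multiplier 3 (300 ms timeout)
# on an AL2S-redundant path where the recommended timeout is 600 ms.
if is_more_aggressive(100, 3, recommended_timeout_ms=600):
    print("Proposed BFD timeout is shorter than the recommended value; "
          "BGP session flaps may occur during AL2S fail-over.")
```

A result of True does not mean the setting will fail. It only marks the configuration as one where the behavior described above may appear.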