Local fast failover routing with low stretch

Klaus Tycho Foerster; Yvonne Anne Pignolet; Stefan Schmid; Gilles Tredan

doi:10.1145/3211852.3211858

Publication: Contribution to journal › Journal article › Research › peer-review

24 Citations (Scopus)

Abstract

Network failures are frequent and disruptive, and can significantly reduce the throughput even in highly connected and regular networks such as datacenters. While many modern networks support some kind of local fast failover to quickly reroute flows encountering link failures to new paths, employing such mechanisms is known to be non-trivial, as conditional failover rules can only depend on local failure information. While over the last years, important insights have been gained on how to design failover schemes providing high resiliency, existing approaches have the shortcoming that the resulting failover routes may be unnecessarily long, i.e., they have a large stretch compared to the original route length. This is a serious drawback, as long routes entail higher latencies and introduce loads, which may cause the rerouted flows to interfere with existing flows and harm throughput. This paper presents the first deterministic local fast failover algorithms providing provable resiliency and failover route lengths, even in the presence of many concurrent failures. We present stretch-optimal failover algorithms for different network topologies, including multi-dimensional grids, hypercubes and Clos networks, as they are frequently deployed in the context of HPC clusters and datacenters. We show that the computed failover routes are optimal in the sense that no failover algorithm can provide shorter paths for a given number of link failures.

Original language	English
Journal	Computer Communication Review
Volume	48
Issue number	1
Pages (from-to)	35-41
Number of pages	7
ISSN	0146-4833
DOI	https://doi.org/10.1145/3211852.3211858
Status	Published - 1 Jan 2018

Access to document

10.1145/3211852.3211858

Other files and links

http://www.scopus.com/inward/record.url?scp=85047296815&partnerID=8YFLogxK

Citation formats

@article{50928f593f2341caaa89349caaa3e94e,

title = "Local fast failover routing with low stretch",

abstract = "Network failures are frequent and disruptive, and can significantly reduce the throughput even in highly connected and regular networks such as datacenters. While many modern networks support some kind of local fast failover to quickly reroute flows encountering link failures to new paths, employing such mechanisms is known to be non-trivial, as conditional failover rules can only depend on local failure information. While over the last years, important insights have been gained on how to design failover schemes providing high resiliency, existing approaches have the shortcoming that the resulting failover routes may be unnecessarily long, i.e., they have a large stretch compared to the original route length. This is a serious drawback, as long routes entail higher latencies and introduce loads, which may cause the rerouted flows to interfere with existing flows and harm throughput. This paper presents the first deterministic local fast failover algorithms providing provable resiliency and failover route lengths, even in the presence of many concurrent failures. We present stretch-optimal failover algorithms for different network topologies, including multi-dimensional grids, hypercubes and Clos networks, as they are frequently deployed in the context of HPC clusters and datacenters. We show that the computed failover routes are optimal in the sense that no failover algorithm can provide shorter paths for a given number of link failures.",

keywords = "Fast Reroute, Network Algorithms, Static Resiliency",

author = "Foerster, {Klaus Tycho} and Pignolet, {Yvonne Anne} and Stefan Schmid and Gilles Tredan",

year = "2018",

month = jan,

day = "1",

doi = "10.1145/3211852.3211858",

language = "English",

volume = "48",

pages = "35--41",

journal = "Computer Communication Review",

issn = "0146-4833",

publisher = "Association for Computing Machinery",

number = "1",

}

TY - JOUR

T1 - Local fast failover routing with low stretch

AU - Foerster, Klaus Tycho

AU - Pignolet, Yvonne Anne

AU - Schmid, Stefan

AU - Tredan, Gilles

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Network failures are frequent and disruptive, and can significantly reduce the throughput even in highly connected and regular networks such as datacenters. While many modern networks support some kind of local fast failover to quickly reroute flows encountering link failures to new paths, employing such mechanisms is known to be non-trivial, as conditional failover rules can only depend on local failure information. While over the last years, important insights have been gained on how to design failover schemes providing high resiliency, existing approaches have the shortcoming that the resulting failover routes may be unnecessarily long, i.e., they have a large stretch compared to the original route length. This is a serious drawback, as long routes entail higher latencies and introduce loads, which may cause the rerouted flows to interfere with existing flows and harm throughput. This paper presents the first deterministic local fast failover algorithms providing provable resiliency and failover route lengths, even in the presence of many concurrent failures. We present stretch-optimal failover algorithms for different network topologies, including multi-dimensional grids, hypercubes and Clos networks, as they are frequently deployed in the context of HPC clusters and datacenters. We show that the computed failover routes are optimal in the sense that no failover algorithm can provide shorter paths for a given number of link failures.

AB - Network failures are frequent and disruptive, and can significantly reduce the throughput even in highly connected and regular networks such as datacenters. While many modern networks support some kind of local fast failover to quickly reroute flows encountering link failures to new paths, employing such mechanisms is known to be non-trivial, as conditional failover rules can only depend on local failure information. While over the last years, important insights have been gained on how to design failover schemes providing high resiliency, existing approaches have the shortcoming that the resulting failover routes may be unnecessarily long, i.e., they have a large stretch compared to the original route length. This is a serious drawback, as long routes entail higher latencies and introduce loads, which may cause the rerouted flows to interfere with existing flows and harm throughput. This paper presents the first deterministic local fast failover algorithms providing provable resiliency and failover route lengths, even in the presence of many concurrent failures. We present stretch-optimal failover algorithms for different network topologies, including multi-dimensional grids, hypercubes and Clos networks, as they are frequently deployed in the context of HPC clusters and datacenters. We show that the computed failover routes are optimal in the sense that no failover algorithm can provide shorter paths for a given number of link failures.

KW - Fast Reroute

KW - Network Algorithms

KW - Static Resiliency

UR - http://www.scopus.com/inward/record.url?scp=85047296815&partnerID=8YFLogxK

U2 - 10.1145/3211852.3211858

DO - 10.1145/3211852.3211858

M3 - Journal article

AN - SCOPUS:85047296815

SN - 0146-4833

VL - 48

SP - 35

EP - 41

JO - Computer Communication Review

JF - Computer Communication Review

IS - 1

ER -
