Optimal Configuration of Fault-Tolerance Parameters for Distributed Server Access

Alessandro Daidone, Thibault Renier, Andrea Bondavalli, Hans-Peter Schwefel

Research output: Contribution to journal; Journal article; Research; peer-review

Abstract

Server replication is a common fault-tolerance strategy to improve transaction dependability for services in communications networks. In distributed architectures, fault-diagnosis and recovery are implemented via the interaction of the server replicas with the clients and other entities such as enhanced name servers. Such architectures provide an increased number of redundancy configuration choices. The influence of a (wide area) network connection can be quite significant and induce trade-offs between dependability and user-perceived performance. This paper develops a quantitative stochastic model using stochastic activity networks (SAN) for the evaluation of performance and dependability metrics of a generic transaction-based service implemented on a distributed replication architecture. The composite SAN model can be easily adapted to a wide range of client-server applications deployed in replicated server architectures. In order to obtain insight into the system behaviour, a set of relevant environment parameters and controllable fault-tolerance parameters are chosen and the dependability/performance trade-off is evaluated.
Original language: English
Journal: International Journal of Critical Computer-Based Systems
Volume: 4
Issue number: 2
Pages (from-to): 144-172
ISSN: 1757-8779
Publication status: Published, Oct 2013

Keywords

  • Dependability; availability; model-based evaluation; stochastic activity networks; distributed architectures; fault tolerance; replicated server access; server replication; fault diagnosis; fault recovery; wide area networks; WANs; stochastic modelling.
