Reliability and performance modeling and enhancement for storage area networks: a dissertation in Electrical Engineering
Dissertation   Open access

Guixiang Lyu
Doctor of Philosophy (PHD), University of Massachusetts Dartmouth
2025
DOI: https://doi.org/10.62791/20444

Abstract

Storage area networks (SANs) provide an effective solution to the rapidly growing demand for remote data storage and access. To deliver the desired quality of service, the reliability challenges of SANs must be addressed. A major threat to SAN reliability and performance is cascading failures, where a single incident triggers a chain reaction, causing extensive damage or even a crash of the entire system. In this dissertation research, we focus on overload-triggered cascading failures, where the overloading of one device (e.g., a switch) causes it to fail and its workload to be reallocated to other devices, which in turn become overloaded, leading to further failures in a domino effect. We first investigate the effects of data loading on the reliability of an individual switch device in SANs using the proportional-hazards model and the accelerated failure time model. We then investigate the effects of loading on the reliability of an entire SAN through analysis based on dynamic fault trees and binary decision diagrams. Furthermore, to enhance SAN reliability, we design proactive load redistribution-based mitigation strategies that aim to prevent cascading failures during the specified mission time, or at least to mitigate the consequences of such failures. Two triggering mechanisms, based on the overall SAN reliability and on switch loading, are considered. Load-based and reliability-based node selection rules are explored. Additionally, traffic reallocation strategies are investigated to enhance SAN performance in terms of load balancing and overall response time. The performance metrics of switch utilization, switch response time, and overall response time are analyzed using Jackson queueing networks. The application and effectiveness of the proposed mitigation strategies are demonstrated and compared through detailed case studies of SANs with a mesh topology.
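The domino effect described in the abstract can be illustrated with a small sketch. It is not the dissertation's model: it assumes a deterministic capacity threshold per switch (the dissertation instead uses proportional-hazards and accelerated failure time models) and an equal split of a failed switch's load among all surviving switches. The function name `simulate_cascade` and its parameters are hypothetical.

```python
from collections import deque


def simulate_cascade(loads, capacities, initial_failure):
    """Return switch indices in the order they fail.

    Assumptions (illustrative only, not taken from the dissertation):
    - a switch fails as soon as its load exceeds its fixed capacity;
    - a failed switch's load is split equally among surviving switches.
    """
    loads = [float(x) for x in loads]
    alive = set(range(len(loads)))
    failed = []
    pending = deque([initial_failure])  # switches that are about to fail

    while pending:
        node = pending.popleft()
        if node not in alive:
            continue
        alive.remove(node)
        failed.append(node)
        if not alive:
            break  # whole network is down; nowhere to move the load

        # Reallocate the failed switch's workload to the survivors.
        share = loads[node] / len(alive)
        loads[node] = 0.0
        for other in sorted(alive):
            loads[other] += share
            # A survivor pushed past its capacity fails next.
            if loads[other] > capacities[other] and other not in pending:
                pending.append(other)

    return failed


if __name__ == "__main__":
    # Enough spare capacity everywhere: the failure stays contained.
    print(simulate_cascade([60, 60, 60, 60], [100, 100, 100, 100], 0))  # [0]
    # One weaker switch (capacity 75) is enough to bring down the network.
    print(simulate_cascade([60, 60, 60, 60], [100, 75, 100, 100], 0))  # [0, 1, 2, 3]
```

In the second call, the single weaker switch turns a contained failure into a full cascade, which is the kind of outcome the proactive load redistribution strategies aim to prevent.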
PDF: Lyu G. COE PhD Dissertation 2025 (3.93 MB)
Open Access CC BY-NC-ND V4.0

