In this paper, we investigate data layout schemes and their impact on system reliability in a petabyte-scale storage system built from thousands of Object-Based Storage Devices (OBSDs). We examine two underlying data layout schemes: RAID 5 and RAID 5 mirroring. To accelerate data reconstruction, Fast Mirroring Copy is employed, whereby reconstructed objects are stored on different OBSDs throughout the system. To further improve system reliability, a SMART Reliability Mechanism (SRM) is introduced for very large-scale storage systems. Analysis results show that these schemes can assure the reliability of data storage and efficiently utilize disk resources while exerting minimal impact on overall system performance.
In this paper, we investigate reliability in a petabyte-scale storage system built from thousands of Object-Based Storage Devices (OBSDs) and study mechanisms to protect against data loss when disk failures happen. We examine two underlying redundancy mechanisms: 2-way mirroring and 3-way mirroring. To accelerate data reconstruction, Fast Mirroring Copy is employed, whereby reconstructed objects are stored on different OBSDs throughout the system. A SMART Reliability Mechanism for enhancing reliability in very large-scale storage systems is proposed. Results show that our SMART Reliability Mechanism can utilize spare resources (including processing, network, and storage resources) to improve reliability in very large storage systems.
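As a rough illustration of the Fast Mirroring Copy idea described above, the following sketch (function and device names are illustrative assumptions, not the authors' code) hashes each lost object's ID onto the surviving OBSDs, so that rebuild writes spread across the whole system rather than funneling into a single dedicated spare:

```python
import hashlib

def place_reconstructed(obj_id: str, obsds: list, failed: str) -> str:
    """Pick a surviving OBSD for a reconstructed object by hashing its ID.
    Spreading placements lets many devices absorb rebuild traffic in parallel."""
    survivors = [d for d in obsds if d != failed]
    h = int(hashlib.sha1(obj_id.encode()).hexdigest(), 16)
    return survivors[h % len(survivors)]

obsds = [f"obsd-{i}" for i in range(8)]
# Objects lost on the failed device scatter across all survivors.
targets = {place_reconstructed(f"obj-{i}", obsds, "obsd-3") for i in range(1000)}
assert "obsd-3" not in targets
```

Because reconstruction targets are spread out, no single device becomes the write bottleneck during recovery, which is what shortens the window of vulnerability to a second failure.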
With the rapid development of massive storage, traditional single-protocol RAID is increasingly unable to satisfy the varied demands of users. To keep down the investment in storage, we propose a multi-protocol RAID that utilizes existing storage devices. The multi-protocol RAID achieves storage integration by managing disks with different interfaces. This paper presents a framework for multi-protocol RAID and a prototype implementation of it: the proposed multi-protocol approach can not only unify storage devices of different types, but also provide different access channels (e.g., iSCSI, FC) to manage the heterogeneous RAID system, thus achieving the goal of centralized management. Our functional tests validate the feasibility and flexibility of the proposed RAID system. Comparison tests indicate that the multi-protocol RAID can attain even higher performance than single-protocol RAID, especially in aggregated bandwidth.
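The core of such a design is a uniform block interface over heterogeneous device drivers, with the RAID layer striping across them regardless of protocol. A minimal RAID-0-style sketch follows; `MemBackedDevice` stands in for real iSCSI/FC drivers, and all names and the striping layout are illustrative assumptions, not the paper's prototype:

```python
class MemBackedDevice:
    """Stand-in driver; a real system would speak iSCSI or FC here."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}
    def read(self, lba):
        return self.blocks.get(lba, b"\x00" * 512)
    def write(self, lba, data):
        self.blocks[lba] = data

class MultiProtocolRAID0:
    """Stripe blocks round-robin across devices, ignoring their protocol.
    The uniform read/write interface is what unifies heterogeneous disks."""
    def __init__(self, devices):
        self.devices = devices
    def _locate(self, lba):
        # Map a logical block to (device, device-local block).
        return self.devices[lba % len(self.devices)], lba // len(self.devices)
    def read(self, lba):
        dev, local = self._locate(lba)
        return dev.read(local)
    def write(self, lba, data):
        dev, local = self._locate(lba)
        dev.write(local, data)

raid = MultiProtocolRAID0([MemBackedDevice("iscsi-0"), MemBackedDevice("fc-0")])
raid.write(5, b"x" * 512)
assert raid.read(5) == b"x" * 512
```

Striping across devices reached over independent channels is also why aggregated bandwidth can exceed that of a single-protocol array: concurrent I/O fans out over multiple transports.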
The distribution of metadata is very important in mass storage systems. Many storage systems use subtree partitioning or hash algorithms to distribute metadata among a metadata server (MDS) cluster. Although these algorithms improve system access performance, most of them suffer from notable scalability problems. This paper proposes a new directory hash (DH) algorithm. It treats the directory as the hash key, implements concentrated storage of metadata, and adopts a dynamic load-balancing strategy. It improves the efficiency of metadata distribution and access in mass storage systems by hashing on directories and placing metadata together at directory granularity. The DH algorithm solves the scalability problems of file-hash algorithms, such as changing a directory's name or permissions, or adding or removing an MDS from the cluster. It reduces the number of additional requests and the scale of each data migration during scaling operations, remarkably enhancing the scalability of mass storage systems.
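To make the directory-granularity idea concrete, here is a minimal sketch (assumed names; a production DH would add consistent hashing and the dynamic load balancing described above). Hashing the parent directory rather than the full file path means renaming a file never relocates metadata, and all entries of a directory live on one MDS:

```python
import hashlib

def mds_for(path: str, mds_list: list) -> str:
    """Directory Hash sketch: hash the parent directory, not the file path,
    so all metadata in one directory lands on the same MDS."""
    directory = path.rsplit("/", 1)[0] or "/"
    h = int(hashlib.md5(directory.encode()).hexdigest(), 16)
    return mds_list[h % len(mds_list)]

mds = ["mds-0", "mds-1", "mds-2"]
# Files in the same directory always map to the same server,
# so a file rename inside that directory migrates nothing.
assert mds_for("/home/a/x.txt", mds) == mds_for("/home/a/y.txt", mds)
```

Under a plain file-hash scheme, renaming `/home/a` would rehash every file beneath it; under directory-granularity placement only directory-level mappings are affected, which is the source of the reduced migration scale the abstract claims.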
Structured overlay networks based on DHTs provide a decentralized, self-organizing substrate for building large distributed systems such as file sharing and data storage. However, in most of these systems many problems remain to be solved regarding system scalability, network proximity, and so on. In this paper, we present a novel routing protocol called PBHC. By combining a hierarchical DHT algorithm with a data proximity mechanism, PBHC minimizes inter-cluster access traffic and boosts the local access ratio in heterogeneous network environments. Simulation results show that PBHC can significantly improve the routing performance and scalability of P2P storage systems.
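The hierarchical routing idea reduces, in its simplest form, to a two-level lookup: try the local cluster's index first, and fall back to the global overlay only on a miss. A minimal sketch (names and the dict-based indexes are assumptions standing in for the DHT routing tables):

```python
def hierarchical_lookup(key, local_cluster, global_index):
    """Two-level lookup: most requests resolve inside the cluster,
    so inter-cluster traffic is only incurred on local misses."""
    if key in local_cluster:
        return ("local", local_cluster[key])
    return ("remote", global_index[key])

local = {"a": "node-1"}
global_idx = {"a": "node-1", "b": "node-9"}
assert hierarchical_lookup("a", local, global_idx) == ("local", "node-1")
assert hierarchical_lookup("b", local, global_idx) == ("remote", "node-9")
```

The higher the local access ratio, the fewer lookups cross cluster boundaries, which is the proximity benefit the simulation measures.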
The access time of disks/RAID has not improved as fast as memory performance, whose rate of improvement has been 25% per year; hence the disk access penalty increases considerably with each enhancement in memory architecture. To solve this problem, a new kind of storage hierarchy, the Volume Holographic Universal Storage Cache (VHUSC for short), is proposed. VHUSC acts as a layer between main memory and the disk or disk array. VHUSC can lower disk access latency and provide much higher I/O bandwidth and throughput, thus greatly improving the I/O performance of computer systems. In this paper, an application-independent model based on queuing theory is proposed for VHUSC performance evaluation. Based on this model, the performance of VHUSC and a traditional disk/RAID is analyzed and compared. Results show that in most cases VHUSC can improve disk read/write performance by one order of magnitude; when the hit rate exceeds 99%, the improvement can reach two orders of magnitude.
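In its simplest form, the hit-rate dependence above reduces to a weighted average of cache and disk latencies. The sketch below uses assumed latency figures (not measured numbers from the paper) purely to illustrate how the speedup grows with hit rate:

```python
def avg_access_time(hit_rate, cache_ms, disk_ms):
    """Expected access latency for a two-level hierarchy (cache in front of disk)."""
    return hit_rate * cache_ms + (1.0 - hit_rate) * disk_ms

# Illustrative latencies (assumptions, not figures from the paper):
DISK_MS = 10.0    # mechanical disk access
VHUSC_MS = 0.1    # holographic cache access

for hr in (0.90, 0.99, 0.999):
    t = avg_access_time(hr, VHUSC_MS, DISK_MS)
    print(f"hit rate {hr:.3f}: {t:.3f} ms, speedup {DISK_MS / t:.1f}x")
```

Even with a fast cache layer, the miss term `(1 - hit_rate) * disk_ms` dominates until the hit rate is very high, which is why the largest gains appear only above a 99% hit rate.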
This paper presents a novel storage architecture called the Volume Holographic Universal Storage Cache (VHUSC for short) for the purpose of optimizing disk I/O performance. The main idea of VHUSC is to use Volume Holographic Memory as a new layer between main memory and disk. VHUSC can lower disk access latency and provide much higher I/O bandwidth and throughput. An application-independent model based on queuing theory is proposed for performance comparison between VHUSC and a traditional disk. The results show performance improvements of up to one order of magnitude.