English Abstract
Abstract:
Data reliability has become a main challenge in cloud storage environments. For scientific applications that store large volumes of data, managing reliability with 3-replication in the cloud leads to high storage cost. An alternative approach is erasure coding, which provides higher reliability than 3-replication for the same storage volume. However, this encoding method poses several challenges, such as high repair bandwidth, repair I/O, the number of disks accessed during repair, and access latency. The Reed-Solomon code is the standard design, and its high repair cost is a frequently noted drawback. Reducing the computational cost of Reed-Solomon coding is therefore very important. Since Hadoop-EC uses the Reed-Solomon code, reducing its repair time is of particular importance.
In this thesis, we investigate different Reed-Solomon implementations and evaluate encoding and decoding performance in Hadoop-EC. We address the computational overhead of the Reed-Solomon code by implementing a GPU-accelerated version of the matrix-based Reed-Solomon coding algorithm. Our experimental results show 3.5x faster encoding and decoding for 64 MB HDFS blocks, reducing the performance overhead of erasure codes.
Keywords: Cloud Storage, Erasure code, 3-replication, Reed-Solomon, Hadoop.
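
As a point of orientation for the matrix-based Reed-Solomon coding mentioned in the abstract, the following is a minimal CPU-only sketch in Python. It is not the GPU implementation evaluated in the thesis, nor the Hadoop-EC coder. The field polynomial 0x11d, the Cauchy construction of the parity rows, and all function names are assumptions chosen for illustration. The sketch encodes k data shards into k + m shards by a matrix product over GF(2^8), and recovers the data from any k surviving shards by inverting the matching rows of the encoding matrix.

```python
# Illustrative sketch only: names, the polynomial 0x11d, and the Cauchy
# parity construction are assumptions, not taken from the thesis or Hadoop.

# Build exp/log tables for GF(2^8) with generator 2.
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    return EXP[255 - LOG[a]]


def encoding_matrix(k, m):
    """(k + m) x k matrix: identity on top, Cauchy parity rows below."""
    identity = [[1 if r == c else 0 for c in range(k)] for r in range(k)]
    cauchy = [[gf_inv((k + r) ^ c) for c in range(k)] for r in range(m)]
    return identity + cauchy


def mat_mul_shards(matrix, shards):
    """Multiply a coefficient matrix by a column of equal-length shards."""
    out = []
    for row in matrix:
        acc = bytearray(len(shards[0]))
        for coef, shard in zip(row, shards):
            for i, byte in enumerate(shard):
                acc[i] ^= gf_mul(coef, byte)  # addition in GF(2^8) is XOR
        out.append(bytes(acc))
    return out


def invert(matrix):
    """Gauss-Jordan inversion of a square matrix over GF(2^8)."""
    n = len(matrix)
    aug = [list(row) + [1 if r == c else 0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = gf_inv(aug[col][col])
        aug[col] = [gf_mul(inv, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [v ^ gf_mul(f, p) for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def encode(data_shards, m):
    """Return k data shards followed by m parity shards."""
    k = len(data_shards)
    return mat_mul_shards(encoding_matrix(k, m), data_shards)


def decode(survivors, k, m):
    """Recover the k data shards from a dict {shard_index: bytes}."""
    idx = sorted(survivors)[:k]
    full = encoding_matrix(k, m)
    sub = [full[i] for i in idx]
    return mat_mul_shards(invert(sub), [survivors[i] for i in idx])
```

For example, calling `encode(data, 3)` on six equal-length shards yields nine shards, and `decode` returns the original six from any six survivors. Each output byte depends only on the bytes at the same offset in the input shards, so the innermost loop is independent across offsets, which is the kind of data parallelism a GPU implementation can exploit.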