Document Type

Journal Article

Publication Details

Sun, Z., Shen, J. & Yong, J. (2013). A novel approach to data deduplication over the engineering-oriented cloud systems. Integrated Computer Aided Engineering, 20 (1), 45-57.

Abstract

This paper presents a duplication-less storage system over the engineering-oriented cloud computing platforms. Our deduplication storage system, which manages data and duplication over the cloud system, consists of two major components, a front-end deduplication application and a mass storage system as back-end. Hadoop distributed file system (HDFS) is a common distribution file system on the cloud, which is used with Hadoop database (HBase). We use HDFS to build up a mass storage system and employ HBase to build up a fast indexing system. With a deduplication application, a scalable and parallel deduplicated cloud storage system can be effectively built up. We further use VMware to generate a simulated cloud environment. The simulation results demonstrate that our deduplication storage system is sufficiently accurate and efficient for distributed and cooperative data intensive engineering applications

RIS ID

36337

 

Link to publisher version (DOI)

http://dx.doi.org/10.3233/ICA-120418