University of Wollongong

A novel approach to data deduplication over the engineering-oriented cloud systems

journal contribution
posted on 2024-11-15, 03:54 authored by Zhe Sun, Jun Shen, Jianming Yong
This paper presents a duplication-less storage system for engineering-oriented cloud computing platforms. Our deduplication storage system, which manages data and duplicates over the cloud, consists of two major components: a front-end deduplication application and a back-end mass storage system. The Hadoop Distributed File System (HDFS) is a common distributed file system on the cloud and is used together with the Hadoop database (HBase). We use HDFS to build the mass storage system and HBase to build a fast indexing system. Combined with the deduplication application, these components yield a scalable, parallel, deduplicated cloud storage system. We further use VMware to generate a simulated cloud environment. The simulation results demonstrate that our deduplication storage system is sufficiently accurate and efficient for distributed and cooperative data-intensive engineering applications.
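The division of labour described in the abstract (a front-end application that consults a fast index before writing to mass storage) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DedupStore`, the choice of SHA-256 as the fingerprint, whole-file granularity, and the reference-counting scheme are all assumptions made for illustration, and plain Python dictionaries stand in for HBase (the index) and HDFS (the stored data).

```python
import hashlib


class DedupStore:
    """Toy deduplicating store (hypothetical, for illustration only).

    Dictionaries stand in for the real back ends:
      index -> the HBase-style fingerprint index
      blobs -> the HDFS-style mass storage
    """

    def __init__(self):
        self.index = {}  # fingerprint -> number of logical files referencing it
        self.blobs = {}  # fingerprint -> stored bytes
        self.names = {}  # logical file name -> fingerprint

    def put(self, name, data):
        """Store data under name; return True only if new bytes were written."""
        # Assumption: SHA-256 over the whole file serves as the fingerprint.
        fp = hashlib.sha256(data).hexdigest()
        old = self.names.get(name)
        if old == fp:
            return False  # same name, same content: nothing to do
        if old is not None:
            self._release(old)  # name is being overwritten with new content
        is_new = fp not in self.index
        if is_new:
            self.blobs[fp] = data  # first copy: write to mass storage
            self.index[fp] = 0
        self.index[fp] += 1  # duplicates only add a reference
        self.names[name] = fp
        return is_new

    def get(self, name):
        """Return the bytes stored under a logical file name."""
        return self.blobs[self.names[name]]

    def delete(self, name):
        """Remove a logical file; free the bytes when no references remain."""
        self._release(self.names.pop(name))

    def _release(self, fp):
        self.index[fp] -= 1
        if self.index[fp] == 0:
            del self.index[fp]
            del self.blobs[fp]
```

In this sketch, storing the same bytes under two names writes the data once and records two references, so deleting one name leaves the content available under the other. A real deployment would replace the dictionaries with HBase and HDFS clients and would need to handle concurrent writers, which this example ignores.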

Citation

Sun, Z., Shen, J. & Yong, J. (2013). A novel approach to data deduplication over the engineering-oriented cloud systems. Integrated Computer Aided Engineering, 20 (1), 45-57.

Journal title

Integrated Computer Aided Engineering

Volume

20

Issue

1

Pagination

45-57

Language

English

RIS ID

36337
