High-performance implementation of evolutionary privacy-preserving algorithm for big data using GPU platform
Privacy protection has become a predominant concern in big data analysis. Privacy-Preserving Association Rule Mining (PPARM) transforms the original data into a sanitized version from which sensitive patterns have been removed. Although evolutionary PPARM approaches have been developed, they are inefficient because evaluating the fitness function is very expensive. This paper presents GPU-based Evolutionary Privacy-Preserving (GEPP), an efficient algorithm for improving the performance of PPARM. It relies on two strategies. The first is a parallel indexing machine that generates retrieval index lists by parallelizing the dataset scan across GPU blocks. The second parallelizes the fitness function computation over these index lists on a CPU/GPU platform, with the search mechanism running on the CPU and the fitness function on the GPU. The scalability of GEPP is evaluated in terms of GPU characteristics (e.g., blocks and threads), data characteristics (e.g., the number of transactions and items, and the density ratio), and pattern characteristics (e.g., the ratio of sensitive patterns and the support thresholds). Experimental results on real and synthetic large-scale transaction datasets show that the parallel implementation achieves an average speedup of up to 36.6x over the CPU implementation.
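To make the role of the index lists concrete, the following is a minimal sequential sketch in Python, not the GEPP implementation. It assumes that an index list maps each item to the ids of the transactions containing it, that a candidate solution is a set of transaction ids to delete, and that fitness counts the sensitive patterns that remain frequent. None of these details are stated in the abstract, and the names `build_index`, `support`, and `fitness` are hypothetical.

```python
from collections import defaultdict


def build_index(transactions):
    """Map each item to the set of transaction ids that contain it.

    Assumption: this stands in for the retrieval index lists; a GPU
    version would split the dataset scan across blocks.
    """
    index = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            index[item].add(tid)
    return index


def support(index, pattern, deleted=frozenset()):
    """Count non-deleted transactions containing every item of a non-empty pattern."""
    tids = set.intersection(*(index.get(item, set()) for item in pattern))
    return len(tids - deleted)


def fitness(index, sensitive_patterns, deleted, min_support):
    """Hypothetical fitness: number of sensitive patterns still frequent.

    Each pattern is evaluated independently, which is why this step is a
    natural candidate for one-pattern-per-thread parallelization.
    """
    return sum(
        1 for p in sensitive_patterns
        if support(index, p, deleted) >= min_support
    )


transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b", "c"}]
sensitive = [("a", "b"), ("b", "c")]
index = build_index(transactions)

# Lower is better under this assumed fitness: 0 means all sensitive
# patterns fall below the support threshold after sanitization.
print(fitness(index, sensitive, deleted=frozenset(), min_support=2))
print(fitness(index, sensitive, deleted=frozenset({1}), min_support=2))
```

Because the index is built once and each support query reduces to intersecting a few id lists, repeated fitness evaluations avoid rescanning the dataset, which is the cost the abstract identifies as the bottleneck.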
Open Access Status
This publication is not available as open access