Reviewing the Methods of Estimating the Density Function Based on Masked Data
Data privacy is an issue of increasing importance in big data mining, especially for micro-level data. A popular approach to protecting such data is perturbation. Techniques for recovering the statistical information of the original data from the perturbed data are therefore indispensable in data mining. This paper reviews and examines the existing techniques for estimating (alternatively, reconstructing) the density function of the original data from data perturbed by the additive or multiplicative noise method. Our studies show that the techniques developed for noise-added data cannot replace the techniques for noise-multiplied data, even though the two types of masked data can be converted into each other through data transformation. This conclusion may be of interest to data providers.
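
The conversion between the two types of masked data mentioned above can be sketched as follows. The notation is an assumption made for illustration and is not taken from the paper: X denotes the original value, C the multiplicative noise, \varepsilon the additive noise, and Y the masked value, with the noise independent of X. The logarithm step additionally assumes X > 0 and C > 0.

```latex
\begin{align*}
% Multiplicative masking becomes additive on the log scale (assumes X > 0, C > 0)
Y = X \cdot C
  \quad &\Longrightarrow \quad
  \log Y = \log X + \log C \\
% Additive masking becomes multiplicative on the exponential scale
Y = X + \varepsilon
  \quad &\Longrightarrow \quad
  e^{Y} = e^{X} \cdot e^{\varepsilon} \\
% Change of variables from the density of U = \log X back to the density of X
f_X(x) &= \frac{1}{x}\, f_{U}(\log x), \qquad U = \log X,\; x > 0
\end{align*}
```

Under these assumptions, an additive-noise technique applied to \log Y yields an estimate of the density of \log X rather than of X itself, so the change of variables in the last line is still needed to return to the original scale.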