Privacy Preserving Jaccard Similarity by Cloud-Assisted for Classification
2020, Springer Science+Business Media, LLC, part of Springer Nature. In the current information era, data mining is applied extensively in many fields to discover knowledge from large volumes of data. It is desirable to design a system in which the data owner and the client can exchange sensitive data during the data mining process without leaking information. A privacy-preserving mechanism is an essential enabler for protecting data from being compromised as cyber-attacks grow more sophisticated. An appropriate approach secures the data while knowledge is mined, so that neither the data owner nor the user can learn anything from the other's data. Hence, we aim to preserve data privacy by mining on encrypted data. In data mining, similarity is an important factor: it either measures how alike two data objects are, or is expressed as a distance whose dimensions represent the features of the objects. In this work, we first propose a cloud-assisted, privacy-preserving approach to computing Jaccard similarity for data mining. In particular, our design focuses on k-nearest neighbors classification, a conventional data mining algorithm. Moreover, because the Jaccard similarity is an efficient measure, we present a solution that employs cloud services to carry out the Jaccard computation between the owner's data and the client's queries in a secure manner. We also implemented the proposed scheme on a workstation and observed that the computational efficiency of the proposed approach is well suited to data mining applications.
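For readers unfamiliar with the two building blocks named above, the following is a minimal plaintext sketch of Jaccard similarity, J(A, B) = |A ∩ B| / |A ∪ B|, used as the ranking measure in k-nearest neighbors classification. It performs no encryption and involves no cloud party, so it is not the proposed scheme; it only illustrates the computation that the scheme is meant to protect. The function names `jaccard` and `knn_classify`, the representation of records as sets of features, and the convention for two empty sets are assumptions made for illustration.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two feature collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


def knn_classify(query, records, k=3):
    """Label a query by majority vote among its k most similar records.

    `records` is a list of (feature_set, label) pairs held by the data
    owner; `query` is the client's feature set. Everything is in plaintext.
    """
    # Rank the owner's records by similarity to the query, most similar first.
    ranked = sorted(records, key=lambda r: jaccard(query, r[0]), reverse=True)
    top_labels = [label for _, label in ranked[:k]]
    # Majority vote over the k nearest neighbors.
    return Counter(top_labels).most_common(1)[0][0]


if __name__ == "__main__":
    owner_records = [
        ({"a", "b", "c"}, "x"),
        ({"a", "b"}, "x"),
        ({"d", "e"}, "y"),
        ({"d"}, "y"),
    ]
    print(jaccard({1, 2, 3}, {2, 3, 4}))
    print(knn_classify({"a", "b", "d"}, owner_records, k=3))
```

In this plaintext form the party doing the ranking sees both the owner's records and the client's query, which is exactly the exposure that computing on encrypted data is intended to avoid.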