Fuzzy Keyword Search Over Encrypted Data Using Cloud Computing (2010)

ABSTRACT:

Cloud computing is a technology that uses the internet and central remote servers to maintain data and applications. It allows consumers and businesses to use applications without installing them and to access their personal files from any computer with internet access. Centralizing storage, memory, processing and bandwidth makes computing much more efficient. Perhaps the biggest concerns about cloud computing are security and privacy: if a client can log in from any location to access data and applications, the client's privacy could be compromised. Existing techniques retrieve files from the cloud by searching keywords over the encrypted data, but they support only exact keyword search. In practice, users frequently make typos or use inconsistent formats when typing a search term, so exact-match schemes are poorly suited to the cloud computing environment and reduce system usability. With fuzzy search, files matching the exact keyword are returned along with those matching similar keywords, which solves this problem for cloud users. This paper concentrates on solving the problems of users who search data on the cloud by means of fuzzy keywords.

Existing System:

The straightforward approach, which lists every possible variant of each keyword, apparently provides fuzzy keyword search over the encrypted files while achieving search privacy through secure trapdoors. However, this approach has serious efficiency disadvantages. The simple enumeration method for constructing fuzzy keyword sets introduces large storage complexity, which greatly affects usability.

For example, the following lists the variants produced by a substitution operation on the first character of the keyword CASTLE:

                       {AASTLE, BASTLE, DASTLE, ..., YASTLE, ZASTLE}.
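The storage cost of this enumeration can be illustrated with a minimal Python sketch. It assumes keywords are drawn from the 26 uppercase letters only; the function name `enumerate_variants` is hypothetical and not taken from the project's source code.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: keywords use A-Z only


def enumerate_variants(word: str) -> set[str]:
    """Return every string within edit distance 1 of `word`, fully listed."""
    variants = {word}
    for i in range(len(word) + 1):
        # Insertion: put each letter before position i (i == len(word) appends).
        for ch in ALPHABET:
            variants.add(word[:i] + ch + word[i:])
        if i < len(word):
            # Deletion: drop the character at position i.
            variants.add(word[:i] + word[i + 1:])
            # Substitution: replace the character at position i.
            for ch in ALPHABET:
                variants.add(word[:i] + ch + word[i + 1:])
    return variants


# Substitutions on the first character only, as in the example above.
first_char_subs = sorted(ch + "CASTLE"[1:] for ch in ALPHABET if ch != "C")
print(first_char_subs[:3], "...", first_char_subs[-2:])

# Total number of distinct strings that must be stored for one keyword.
print(len(enumerate_variants("CASTLE")))
```

Under these assumptions a single six-letter keyword already expands to several hundred stored strings at edit distance 1, which is the storage overhead the proposed techniques aim to reduce.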

Proposed System:

            1. Wildcard – Based Technique

            2. Gram - Based Technique

            3. Symbol – Based Trie – traverse Search Scheme

1. Wildcard – Based Technique:

       In the above straightforward approach, all the variants of a keyword have to be listed even if the operations are performed at the same position. Based on this observation, we propose to use a wildcard to denote all edit operations at the same position. The wildcard-based fuzzy set thus covers every variant within the given edit distance using far fewer entries, which solves the storage problem.

For example, for the keyword CASTLE with the pre-set edit distance 1, its wildcard-based fuzzy keyword set can be constructed as

 S_{CASTLE,1} = {CASTLE, *CASTLE, *ASTLE, C*ASTLE, C*STLE, ..., CASTL*E, CASTL*, CASTLE*}.
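The construction for edit distance 1 can be sketched in a few lines of Python. This is an illustrative sketch rather than the project's implementation: the function name `wildcard_fuzzy_set` is hypothetical, and only the edit-distance-1 case is handled.

```python
def wildcard_fuzzy_set(word: str) -> list[str]:
    """Wildcard-based fuzzy set of `word` for edit distance 1 (sketch)."""
    result = [word]
    for i in range(len(word) + 1):
        # A '*' placed before position i stands for any insertion there.
        result.append(word[:i] + "*" + word[i:])
        if i < len(word):
            # A '*' in place of position i stands for any substitution
            # or the deletion of that character.
            result.append(word[:i] + "*" + word[i + 1:])
    return result


castle_set = wildcard_fuzzy_set("CASTLE")
print(castle_set)
print(len(castle_set))
```

For a keyword of length n this produces 2n + 2 entries (14 for CASTLE), compared with hundreds when every variant is enumerated explicitly.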

Edit Distance:

The edit distance between two words is the minimum number of the following primitive operations needed to transform one word into the other:

•         Substitution :  changing one character to another in a word;

•         Deletion :  deleting one character from a word;

•         Insertion :  inserting a single character into a word.
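The three operations above define the standard edit (Levenshtein) distance, which can be computed with dynamic programming. The following Python sketch is a generic textbook version added for illustration; the function name `edit_distance` is an assumption, not a name from the project.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, deletions and insertions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


print(edit_distance("CASTLE", "BASTLE"))   # one substitution
print(edit_distance("CASTLE", "CASLE"))    # one deletion
print(edit_distance("CASTLE", "CASTLES"))  # one insertion
```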

2. Gram – Based Technique:

  Another efficient technique for constructing fuzzy sets is based on grams. A gram of a string is a substring that can be used as a signature for efficient approximate search. While grams have been widely used for constructing inverted lists for approximate string search, we use grams for matching purposes. We propose to utilize the fact that any primitive edit operation affects at most one specific character of the keyword, leaving all the remaining characters untouched. In other words, the relative order of the remaining characters after the primitive operation is always the same as it was before the operation.

For example, the gram-based fuzzy set S_{CASTLE,1} for the keyword CASTLE can be constructed as

              {CASTLE, CSTLE, CATLE, CASLE, CASTE, CASTL, ASTLE}.
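Read this way, the gram-based set consists of the keyword itself plus every string obtained by removing up to d characters while keeping the rest in order. A minimal Python sketch under that reading follows; the function name `gram_fuzzy_set` and the parameter `d` are hypothetical.

```python
from itertools import combinations


def gram_fuzzy_set(word: str, d: int = 1) -> set[str]:
    """Keyword plus all order-preserving subsequences missing up to d characters."""
    result = {word}
    for removed in range(1, min(d, len(word)) + 1):
        keep = len(word) - removed
        # Choose which positions survive; their relative order is preserved.
        for positions in combinations(range(len(word)), keep):
            result.add("".join(word[i] for i in positions))
    return result


print(sorted(gram_fuzzy_set("CASTLE", 1)))
```

For CASTLE with d = 1 the sketch yields the seven entries listed above, one per deleted position plus the keyword itself.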

3. Symbol – Based Trie – traverse Search Scheme:

             To enhance the search efficiency, we now propose a symbol-based trie-traverse search scheme, where a multi-way tree is constructed for storing the fuzzy keyword set over a finite symbol set. The key idea behind this construction is that all trapdoors sharing a common prefix may have common nodes. The root is associated with an empty set and the symbols in a trapdoor can be recovered in a search from the root to the leaf that ends the trapdoor. All fuzzy words in the trie can be found by a depth-first search.
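The trie idea can be sketched as follows. This is an illustrative Python sketch under several assumptions that are not stated in the source: trapdoors are computed with HMAC-SHA256, each symbol is two hexadecimal characters of the trapdoor, and the leaf stores plain file IDs. The names `trapdoor`, `symbols` and `TrapdoorTrie` are hypothetical.

```python
import hashlib
import hmac

SYMBOL_LEN = 2  # assumption: each trie symbol is 2 hex characters


def trapdoor(key: bytes, word: str) -> str:
    """Keyed one-way value standing in for the secure trapdoor of `word`."""
    return hmac.new(key, word.encode(), hashlib.sha256).hexdigest()


def symbols(td: str) -> list[str]:
    """Split a trapdoor into fixed-length symbols."""
    return [td[i:i + SYMBOL_LEN] for i in range(0, len(td), SYMBOL_LEN)]


class TrapdoorTrie:
    def __init__(self) -> None:
        self.root: dict = {}  # the root carries no symbol

    def insert(self, td: str, file_ids: set[str]) -> None:
        node = self.root
        for sym in symbols(td):
            # Trapdoors sharing a prefix reuse the same nodes.
            node = node.setdefault(sym, {})
        node.setdefault("ids", set()).update(file_ids)

    def search(self, td: str) -> set[str]:
        node = self.root
        for sym in symbols(td):
            node = node.get(sym)
            if node is None:
                return set()  # path breaks: trapdoor not in the index
        return set(node.get("ids", ()))


key = b"demo-secret-key"  # hypothetical key shared by owner and users
trie = TrapdoorTrie()
for variant in ["CASTLE", "*ASTLE", "CAS*LE"]:
    trie.insert(trapdoor(key, variant), {"F1", "F3"})

print(trie.search(trapdoor(key, "CAS*LE")))
print(trie.search(trapdoor(key, "TOWER")))
```

A lookup walks at most one node per symbol, so its cost depends on the trapdoor length rather than on the number of stored fuzzy keywords.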

The scheme also extends naturally from the single-user setting to a multi-user setting, where a data owner stores a file collection on the cloud server and allows an arbitrary group of users to search over it.

MAIN MODULES:

SYSTEM MODEL:

                              We consider a cloud data system consisting of a data owner, data users and a cloud server. Given a collection of n encrypted data files C = (F1, F2, ..., Fn) stored in the cloud server and a predefined set of distinct keywords W = {w1, w2, ..., wp}, the cloud server provides the search service for authorized users over the encrypted data C. We assume that authorization between the data owner and the users is appropriately done. An authorized user types in a request to selectively retrieve data files of his/her interest. The cloud server is responsible for mapping the search request to a set of data files, where each file is indexed by a file ID and linked to a set of keywords. The fuzzy keyword search scheme returns the search results according to the following rules: 1) if the user's search input exactly matches a pre-set keyword, the server is expected to return the files containing that keyword; 2) if there are typos and/or format inconsistencies in the search input, the server returns the closest possible results based on pre-specified similarity semantics (here, edit distance).
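The two search rules can be sketched end to end in Python. This is a simplified illustration, not the project's implementation: it assumes HMAC-SHA256 trapdoors, wildcard-based fuzzy sets with edit distance 1, and plain (unencrypted) file IDs in the index. All function names are hypothetical.

```python
import hashlib
import hmac


def wildcard_fuzzy_set(word: str) -> list[str]:
    """Wildcard-based fuzzy set for edit distance 1."""
    result = [word]
    for i in range(len(word) + 1):
        result.append(word[:i] + "*" + word[i:])
        if i < len(word):
            result.append(word[:i] + "*" + word[i + 1:])
    return result


def trapdoor(key: bytes, word: str) -> str:
    return hmac.new(key, word.encode(), hashlib.sha256).hexdigest()


def build_index(key: bytes, keyword_to_files: dict[str, list[str]]) -> list:
    """Data owner: store only trapdoors, never the keywords themselves."""
    index = []
    for word, files in keyword_to_files.items():
        fuzzy = {trapdoor(key, v) for v in wildcard_fuzzy_set(word)}
        index.append((trapdoor(key, word), fuzzy, set(files)))
    return index


def make_query(key: bytes, text: str) -> tuple[str, set[str]]:
    """Authorized user: exact trapdoor plus trapdoors of the fuzzy set."""
    return trapdoor(key, text), {trapdoor(key, v) for v in wildcard_fuzzy_set(text)}


def server_search(index: list, exact_td: str, fuzzy_tds: set[str]) -> set[str]:
    # Rule 1: the input exactly matches a pre-set keyword.
    for kw_td, _, files in index:
        if kw_td == exact_td:
            return set(files)
    # Rule 2: otherwise return files of keywords whose fuzzy sets overlap.
    result: set[str] = set()
    for _, kw_fuzzy, files in index:
        if kw_fuzzy & fuzzy_tds:
            result |= files
    return result


key = b"demo-secret-key"  # hypothetical key shared by owner and users
index = build_index(key, {"CASTLE": ["F1", "F2"], "CLOUD": ["F3"]})
print(server_search(index, *make_query(key, "CASTLE")))  # exact match
print(server_search(index, *make_query(key, "CASLE")))   # typo: missing T
print(server_search(index, *make_query(key, "TOWER")))   # no close keyword
```

The server in this sketch compares only keyed hash values, so it can match a mistyped query to a stored keyword without seeing either in plaintext.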

THREAT MODEL:

                            We consider a semi-trusted server. Even though the data files are encrypted, the cloud server may try to derive other sensitive information from users' search requests while performing keyword-based search over C. Thus, the search should be conducted in a secure manner that allows data files to be securely retrieved while revealing as little information as possible to the cloud server. In this paper, when designing the fuzzy keyword search scheme, we follow the security definition used in traditional searchable encryption.

More specifically, it is required that nothing should be leaked from the remotely stored files and index beyond the outcome and the pattern of search queries.

DESIGN GOALS:

                           In this work, we address the problem of supporting efficient yet privacy-preserving fuzzy keyword search services over encrypted cloud data. Specifically, we have the following goals: i) to explore new mechanisms for constructing storage-efficient fuzzy keyword sets; ii) to design an efficient and effective fuzzy search scheme based on the constructed fuzzy keyword sets; iii) to validate the security of the proposed scheme.

Hardware Requirements:

•         System                        : Pentium IV 2.4 GHz.

•         Hard Disk                    : 40 GB.

•         RAM                             : 512 MB.

Software Requirements:

•         Operating system        : Windows XP.

•         Coding Language       : DOT NET

•         Data Base                     : SQL Server 2005
