Exact/Partial Cipher-text Retrieval on the Encrypted Cloud Database using SCKHA Algorithm

Searchable encryption techniques attempt to solve the challenge of querying data stored on untrusted cloud servers while preserving data confidentiality. This paper proposes a simple, powerful, and practical Searchable Symmetric Encryption (SSE) scheme which is capable of executing SQL (Structured Query Language) queries over encrypted cloud databases. Our scheme improves the efficiency of the search over encrypted data by using the SCKHA (Symmetric Cipher based on Key Hashing Algorithm) algorithm instead of the Bloom filter technique, which causes false positives. We build a complicated index structure for the encrypted data by combining the encryption of each data unit under the Advanced Encryption Standard (AES) with the SCKHA encryption of the same data to ensure data security. We also propose a complete framework for implementing our construction and validate its practicality for exact and partial search over encrypted SQL databases. Furthermore, we performed experiments on the Adult data set and analyzed the results in terms of computation time for index generation, trapdoor generation, and average search. The experiments showed that trapdoor generation and keyword search in our scheme are faster than in other schemes. We also analyzed the security of our scheme and evaluated its performance experimentally and theoretically. Finally, we provide features for evaluating single, conjunctive, and disjunctive queries.


Introduction
The fast development of computer technology and web applications has increased the demand for data access and storage capabilities (Wan and Deng, 2016). However, as the demand increases, the cost of storage on cloud servers increases, search efficiency decreases, and data privacy has become a research focus (Yang et al., 2019). To guarantee data privacy for organizations, plaintext must first be encrypted before it is outsourced to the cloud. Encrypting data is a standard method to ensure the confidentiality of data stored at honest-but-curious cloud storage. However, various organizations (such as hospitals, banks, or corporations) usually outsource their data to relational databases, and encrypting the plain data impedes executing SQL directly on it. Some encryption schemes deprive users of functionalities over the encrypted data, such as searching. In this case, users are required to download the entire data from the cloud server, decrypt it, and then execute a local search. This approach leads to performance issues which eliminate the benefits of outsourcing data. Therefore, cloud servers should be able to execute search operations on their outsourced databases while the actual databases remain encrypted. SQL queries on encrypted data can be performed in two ways: the first executes directly over the encrypted database, and the second uses an index. The problem of performing calculations directly on encrypted data arose long before cloud computing appeared on the application scene. Rivest et al. (Ronald et al., 1978) introduced the idea of fully homomorphic encryption (FHE) to address this problem. FHE ensures data confidentiality because it computes on ciphertexts and does not reveal any information about the stored data. On the other hand, it suffers from performance overhead.
So, it is necessary to try other methods to avoid this problem. Curtmola et al. (2006) built a secure inverted index for the whole database using both a Bloom filter and a pseudo-random function to improve query performance; however, it only supports search on a single keyword. This scheme was the first to outline the security target for SSE schemes.

Work contributions
To query data securely and effectively, this paper suggests a novel searchable encryption technique that supports SQL queries. The scheme constructs a complicated encrypted index for the sensitive columns of a relational database table to improve search quality. Moreover, the effectiveness of our scheme is verified by experimental and theoretical analysis. Our contributions are as follows:
(1) We propose a new searchable encryption technique that supports SQL queries, based on the SCKHA algorithm (Souror et al., 2021, 2022), which preserves the search functionality. It reduces the enormous communication costs for trapdoor transformation.
(2) To resist keyword-guessing attacks from malicious servers, we build the encrypted search section using the encryption key of the data owner. As a result, the adversary cannot find the trapdoor keywords by enumerating the keyword space.
(3) Our symmetric searchable encryption scheme supports partial search in addition to exact search over the encrypted database.
(4) A ciphertext index structure is constructed by combining the SCKHA algorithm and AES to help retrieve data efficiently and securely.

Organization of the paper
The rest of the paper is structured as follows: Section 2 reviews the existing relevant work on searchable encryption schemes. Section 3 briefly reviews the preliminaries and notations used in our paper. Section 4 presents some features of SQL databases and keyword extraction methods. Section 5 depicts the practical system model of our SSE scheme, including a discussion of the SCKHA scheme and its use in building the format of the encrypted data stored on the cloud. Implementation and performance evaluation are shown in Section 6. The performance comparison and security analysis are described in Section 7. Section 8 summarizes the paper and gives some directions for future work.

Related work
Nowadays, storing data on a server in plaintext format no longer meets the requirements of privacy preservation; thus, encrypting the data has attracted increasing attention. However, encrypting private data creates several query problems, which searchable encryption schemes are designed to address. Recently, searchable encryption models have been widely studied (Kamara et al., 2012; Tao et al., 2022; Dhruti, 2023; Hahn and Kerschbaum, 2014). However, these schemes focus on retrieving data from text documents and do not support SQL queries over outsourced encrypted data in a database. Currently, several studies focus on database-based searchable encryption schemes. These schemes can be classified into three categories: direct operations on encrypted data (without indexes), direct indexes, and inverted indexes.
The first SSE scheme using direct operations was suggested by Song et al. (2000). They treated a document as a list of words and encrypted it with a special type of stream cipher. Consequently, the server has to scan each document sequentially, word by word. Raluca et al. (2011) proposed an onion encryption scheme (CryptDB) that joins encryption methods by encrypting each data attribute with many nested encryption algorithms. Using a combination of multiple encryptions degrades the efficiency of this mechanism. In addition, data encrypted using CryptDB can be leaked, aiding attackers in becoming familiar with the user interface and ultimately enabling them to alter user data. To solve this problem, Liu et al. (2018) proposed a secure fully homomorphic order-preserving encryption (FHOPE) framework. In the FHOPE scheme, cloud servers can execute complex SQL queries with various operators on encrypted data without frequent encryption.
Operating directly on encrypted data decreases efficiency. Therefore, encrypted indexes were introduced so that records satisfying the query conditions can be found quickly. The first keyword ciphertext query mechanism based on creating an index for each data table was proposed by Goh (2003). It also introduced a security notion named IND-CKA (indistinguishability against adaptive chosen keyword attack) to secure the index. This scheme selects the matching data in the table and does not reveal unwanted information. Curtmola et al. (2011) create an index based on a Bloom filter and a pseudo-random function over a whole database to improve query efficiency. Although the query keywords are sent to the server in encrypted form, statistical attacks on search patterns may still occur because of the small number of keywords in the dictionary and because the trapdoor generation algorithm uses deterministic techniques such as pseudo-random permutations and pseudo-random functions. Karras et al. (2016) introduced a self-adaptive encrypted index for range queries on a column of a data table. However, there is no firm security analysis of the self-balanced binary tree structure of the self-adaptive scheme. Monir et al. (Azraoui et al., 2018) extended the searchable encryption (SE) index from text to SQL databases. This scheme supports both boolean and range queries with O(1) time complexity. A range query needs the range of values to be determined in advance. It is a simple multi-keyword query scheme but has low efficiency with non-standard queries. Jiangnan et al. (Li et al., 2019) proposed a simple SSE scheme to preserve the privacy of the data generated in a smart grid. It incurs high complexity, with a small leakage of information that is acceptable in practical smart grid applications. Andola et al. (2022) introduced a secure SSE scheme which depends on hash-based indexing. It reduces the computational load on the users and the cloud server. Results showed that it is appropriate for handling big data sets while enabling dynamic updates and effective ranking of the files that correspond to the specified multiple keywords. Hoang et al. (2023) presented the DIQ-SSE (Double-Indexes-Query-SSE) scheme, which can query any substring of encrypted data in outsourced databases. It depends on a double-index query and is performed in a proxy model. Najafi et al. (2021) proposed the rFSMSE (randomized Full Secure SE) scheme. It supports multi-keyword search in SSE and leaks information neither in the index nor in the trapdoor. It offers full security and validates the search results. Li et al. (2019) proposed an SSE scheme applied to smart grid data. It depends on pseudo-random functions to build the index. It is characterized by its simplicity and ease of update. However, it suffers from false positives and from low security, because the data returned by the search operation includes some information about the plaintext keyword. Xiong et al. (Xiong et al., 2022) therefore proposed to eliminate the false-positive and security problems in (Li et al., 2019). The authors improve the efficiency of the search operation by first narrowing the search scope and then performing a second search using a Bloom filter. However, the scheme in (Xiong et al., 2022) is not time efficient enough and cannot support partial search on the encrypted data. In this paper, we propose an SSE scheme that is more time efficient, more secure, free of false positives, and flexible enough to support both exact and partial search on the encrypted data. Table 1 compares various existing index-based SSE schemes. Recent research also addresses dynamic SSE (Yaru Liu et al., 2022; Zheli et al., 2020; Li et al., 2022), fuzzy keyword searchable encryption (Liu et al., 2020), and forward-secure searchable encryption (Liu et al., 2021; Cui et al., 2022; Zeng et al., 2022), which ensures that recently updated files cannot be linked with previously executed searches. This prevents the server from inferring the keywords.

Notations
We suppose the following scenario: data owner D outsources a massive set of two-dimensional tables $T_i = \{t_1, t_2, \ldots, t_n\}$ of confidential data to a database server in an encrypted format $C = \{c_1, c_2, \ldots, c_m\}$, where m represents the number of two-dimensional tables. $R = \{r_{i1}, r_{i2}, \ldots, r_{ij}\}$ represents the collection of records in the two-dimensional table $T_i$. The keyword list extracted from $T_i$ is denoted by $W = \{w_1, w_2, \ldots, w_k\}$, where k is the number of keywords. For each $w_i \in W$, $A(w_i)$ denotes the set of records that contain keyword $w_i$. Also, we define N as the total number of keywords in a record and n as the number of keywords in a record that the data owner wants to search, where $n \le N$. The data owner generates an encrypted search part $I_{SCKHAC1}$ based on the SCKHA algorithm, which allows searching for keywords directly in the ciphertext c without compromising data confidentiality or query privacy.
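To make the notation concrete, the following minimal Python sketch builds the keyword list W and the mapping A(w) for a toy table. The record values and the choice of treating each attribute value as one keyword are illustrative assumptions, not part of the scheme's specification.

```python
from collections import defaultdict

def build_keyword_map(records):
    """Return A: keyword -> set of record ids whose fields contain it."""
    A = defaultdict(set)
    for record_id, fields in records.items():
        for value in fields:
            A[str(value)].add(record_id)
    return dict(A)

# Toy table T_i: record id -> attribute values (hypothetical data).
records = {
    1: ["39", "Male", "Never-married"],
    2: ["50", "Male", "Married"],
    3: ["39", "Female", "Married"],
}

A = build_keyword_map(records)
W = sorted(A)  # keyword list W = {w_1, ..., w_k}
print(W)
print(A["Married"])
```

Here k is `len(W)` and N is the number of fields per record; in the scheme itself the server never sees this plaintext mapping and only works on the encrypted search part.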

System model
Our system model, shown in Fig. 1, involves two main parties:
(1) Cloud server: it stores the owners' outsourced encrypted data and responds to the requests sent to it. However, it may attempt to reveal the encrypted data of the data owner.
(2) Data owner (client): the party that owns the plaintext data, collects newly generated data, and stores it securely on the cloud in encrypted form. It first encrypts the plaintext with a common encryption algorithm such as AES and then applies the SCKHA mechanism to the same data, which serves as the search criterion. Finally, it uploads the final encrypted format to the cloud server. Although the cloud server provides the services the client requests, it is considered honest-but-curious: it provides normal cloud service but tries to monitor the contents of the data.

Querying database by SQL
We consider the following select operation in SQL database queries: SELECT columns FROM table WHERE conditions. This query returns one or more records of a two-dimensional table (or view) that meet the conditions. Authenticated users can perform only the following three kinds of queries:
(1) Simple query: it contains only one condition (predicate).
(2) Conjunctive query: it contains many query conditions (predicates) combined with the AND operator, such as: SELECT columns FROM table WHERE column1 ∧ column2 ∧ ··· ∧ columnx
(3) Disjunctive query: it contains many query conditions (predicates) combined with the OR operator, such as: SELECT columns FROM table WHERE column1 ∨ column2 ∨ ··· ∨ columnx
The SQL query is then converted to an SE query using the same operation as the connection query.

Table 1. Comparison of existing index-based SSE schemes (partial).
Scheme | Index type | … | Notes
… | … | … | - Statistical attacks on search patterns occur due to the limited number of keywords in the dictionary.
Karras et al. (Karras et al., 2016) | Encrypted index | Yes | - Introduces a self-adaptive encrypted index for range queries on a column. - There is no firm security analysis of the self-balanced binary tree structure of the self-adaptive scheme.
Monir et al. (Azraoui et al., 2018) | Encrypted index | No | - Uses a simple method for multi-keyword queries. - Low efficiency with non-standard queries.
Li et al. (Li et al., 2019) | Direct index | Yes | - Achieves high space complexity. - Search results returned to the user contain plaintext-related information, which reduces the security of the scheme.

Practical system model for SSE scheme

Using SCKHA algorithm for searching over encrypted data
The SCKHA algorithm (Souror et al., 2021, 2022) is a stream cipher scheme that relies on concealing the symmetric key to overcome the issue of distributing the encryption key across channels. It resists brute-force attacks by requiring more resources to test each possible key. The cipher format of this algorithm consists of two digits ($C_f = C_1 C_2$) for every plaintext character. This allows us to efficiently perform an exact search or a partial-match search over the encrypted database. According to Algorithm 2 in (Souror et al., 2022), the second part of the ciphertext ($C_2$) determines both the key hash side (left or right) and its index, which are used in the decryption process. Eliminating the second part ($C_2$) from the ciphertext ($C_f$) and keeping only the first part ($C_1$) therefore prevents decryption of the ciphertext. The format of $C_1$ is calculated as described in equation (1), where P refers to the plaintext, K(L, R) is the key hash side (left or right), and [idx] refers to the calculated position (index) in the key hash side. This construction is exploited as a trapdoor function which supports searching on the encrypted data. We also apply a random function generator to generate random strings, which are embedded in the first part ($C_1$) at a specific position. Therefore, the data owner generates an encrypted index part $I_{SCKHAC1}$ (embedded with random strings) for the data stored in the relational database table and generates a trapdoor $T_{SCKHAC1}(w)$ for the required words to send to the cloud server.
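The reason a per-character search part permits both exact and partial matching can be illustrated with a small sketch. SCKHA's actual $C_1$ computation is defined in Souror et al. and is not reproduced here; as a loudly labeled stand-in, the code below maps each character to a truncated HMAC-SHA256 token. The names `char_tokens`, `exact_match`, `partial_match`, and the constant `TOKEN_HEX` are hypothetical.

```python
import hashlib
import hmac

TOKEN_HEX = 4  # hex digits kept per character (illustrative choice)

def char_tokens(key: bytes, text: str) -> list[str]:
    """Map each character to a short keyed token (stand-in for SCKHA's C1)."""
    return [
        hmac.new(key, ch.encode("utf-8"), hashlib.sha256).hexdigest()[:TOKEN_HEX]
        for ch in text
    ]

def exact_match(index_tokens, trapdoor_tokens) -> bool:
    """Exact search: the whole token sequence must be identical."""
    return index_tokens == trapdoor_tokens

def partial_match(index_tokens, trapdoor_tokens) -> bool:
    """Partial search: the trapdoor tokens occur as a contiguous run."""
    n, m = len(index_tokens), len(trapdoor_tokens)
    return any(index_tokens[i:i + m] == trapdoor_tokens for i in range(n - m + 1))

key = b"demo-key"                       # assumed owner secret
index = char_tokens(key, "Married")     # search part stored with the record
print(exact_match(index, char_tokens(key, "Married")))
print(partial_match(index, char_tokens(key, "arr")))
```

A deterministic per-character token like this stand-in reveals repeated characters, so it illustrates the matching logic only; the scheme described here additionally embeds random strings in the stored search part.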

Data encryption and outsourced data format
The cloud database is hosted and maintained for an organization by a third-party cloud service provider (CSP). Therefore, the stored data must be encrypted in a specific format about which the CSP knows nothing, and the sensitive data must be stored in an encrypted database. We suppose that the table $T_{a,b}$ has a rows (records) and b columns (attributes). The encrypted data format $T^{e}_{a,b}$ is represented as shown in equation (2). In our model, the client uses a popular security scheme such as AES to encrypt the first part of the data, $T^{e}_{a,b}(1)$, as in equation (3) to ensure confidentiality. However, the user needs a specific mechanism to query the database without exposing the plaintext or the secret keys. That is why the second part, $T^{e}_{a,b}(2)$, shown in equation (4), is required. It acts as a trapdoor function when a user submits a query to the cloud database. In this model, we use the SCKHA mechanism in the second part to search in the encrypted database. The server can then perform a search for either an exact string pattern or a partial string.
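Equations (2) to (4) are not reproduced in this text. Based on the description of the stored pair in the System construction section, one plausible reading is sketched below. The primed equation numbers, the symbol $\mathrm{AES}_{K}$, and writing the search part as a function of $T_{a,b}$ are notational assumptions, not the authors' exact formulas.

```latex
\begin{align}
T^{e}_{a,b} &= \bigl(T^{e}_{a,b}(1),\; T^{e}_{a,b}(2)\bigr) \tag{2$'$}\\
T^{e}_{a,b}(1) &= \mathrm{AES}_{K}\bigl(T_{a,b}\bigr) \tag{3$'$}\\
T^{e}_{a,b}(2) &= I_{SCKHAC1}\bigl(T_{a,b}\bigr) \tag{4$'$}
\end{align}
```

Under this reading, the first component is what the client eventually decrypts, and the second component is the only part the server inspects during a search.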

System construction
Our SSE scheme consists of six polynomial-time algorithms (KeySetup, BuildEncryptedIndex, Enc, Trapdoor, Search, Dec) such that:
(1) KeySetup (MK): run by the data owner; it takes the master key MK as a security parameter and generates both parts of the master key ($K_{Left}$, $K_{Right}$), as shown in Algorithm 1.
(2) BuildEncryptedIndex ($K_{Left}$, $K_{Right}$, R): run by the data owner; it takes both parts of the master key ($K_{Left}$, $K_{Right}$) and a record R as input and outputs the encrypted search part $I_{SCKHAC1}$ for each record R, as shown in Algorithm 2.
(3) Enc (T, K): the data owner encrypts the sensitive column data to be stored in the cloud database with the symmetric key K to ensure data privacy, as shown in Algorithm 3.
(4) Trapdoor ($K_{Left}$, $K_{Right}$, w): run by the data owner; it takes both parts of the master key ($K_{Left}$, $K_{Right}$) and a keyword w as input and outputs the trapdoor $T_w$ for that keyword.
(5) Search ($T_w$, $I_{SCKHAC1}$): run by the server; it eliminates the embedded random strings and then takes the trapdoor $T_w$ and the second part $I_{SCKHAC1}$ as input, outputting 1 if $w \in R$ (the query condition is satisfied) and 0 otherwise.
(6) Dec (EncTable, K): the data owner decrypts, with the symmetric key K, the encrypted records retrieved from the cloud database that satisfy the query requirements.
In our SSE scheme, the $I_{SCKHAC1}$ part of the data is stored in the cloud database along with the encrypted value. So, for every plaintext value $T_{a,b}$, the cloud database stores ($T^{e}_{a,b}(1)$, $T^{e}_{a,b}(2)$), where $T^{e}_{a,b}(1)$ is the ciphertext of the original plaintext value and $T^{e}_{a,b}(2)$ is its $I_{SCKHAC1}$ part. The data is stored on the cloud server in this format to hinder any cryptanalysis that an attacker may attempt in order to recover the plaintext data. To compute $T^{e}_{a,b}(1)$, AES-GCM (Advanced Encryption Standard in Galois/Counter Mode) is used with the username of the user provided as additional data and a randomly chosen IV (Initialization Vector). AES-GCM is well suited to protecting packets of data because it has low latency and a minimal operation overhead. AES and Blowfish are among the most popular symmetric encryption algorithms. Kubadia et al. (2019) compared the performance of the AES and Blowfish algorithms and found that AES has a larger block size than Blowfish, making it more secure against birthday attacks.
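The AES-GCM step described above can be sketched as follows, assuming the third-party Python package `cryptography` is available (the import is guarded so the module still loads without it). The helper names `encrypt_cell` and `decrypt_cell`, the 12-byte IV, and the choice to prepend the IV to the ciphertext are illustrative assumptions rather than details given in the paper.

```python
import os

try:
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
except ImportError:  # optional third-party dependency not installed
    AESGCM = None

IV_LEN = 12  # bytes; a common IV length for GCM (assumed here)

def encrypt_cell(key: bytes, plaintext: str, username: str) -> bytes:
    """Encrypt one data unit; the username is bound as additional data."""
    iv = os.urandom(IV_LEN)  # fresh random IV for every encryption
    ciphertext = AESGCM(key).encrypt(iv, plaintext.encode(), username.encode())
    return iv + ciphertext   # store the IV next to the ciphertext

def decrypt_cell(key: bytes, blob: bytes, username: str) -> str:
    """Decrypt one data unit; fails if the username or data was altered."""
    iv, ciphertext = blob[:IV_LEN], blob[IV_LEN:]
    return AESGCM(key).decrypt(iv, ciphertext, username.encode()).decode()

if AESGCM is not None:
    key = AESGCM.generate_key(bit_length=256)
    blob = encrypt_cell(key, "Bachelors", "alice")
    print(decrypt_cell(key, blob, "alice"))
```

Because the IV is random, encrypting the same value twice yields different ciphertexts, and decryption with a different username raises an authentication error instead of returning data.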
During the search process, the data owner generates a trapdoor $T_w$ for the word w and sends $T_w$ to the cloud server. For each record R, the cloud server runs the Search ($T_w$, $I_{SCKHAC1}$) function and checks whether R contains w. Finally, the server returns the set of records that contain w to the client. Our model requires only that the client hold both private master keys $K_{Left}$, $K_{Right}$ to generate a trapdoor $T_w$ for each word w. Therefore, the server cannot guess any information related to the encrypted data (Fig. 2).
Suppose that the table has some sensitive fields such as Id, age, sex, and marital status. The following steps complete the encrypted search:
(1) Data owner builds the complicated index: the data owner runs the KeySetup function shown in Algorithm 1 to get the k-bit master key MK and keeps it secure. Subsequently, for each record $R_i$ in the cloud database server, the data owner encrypts each data unit with a common encryption algorithm using the Encryptcolumns (COLs, SK) function shown in Algorithm 3 to obtain the first part, as described in (3). Then, the data owner runs the BuildEncryptedIndex ($K_{Left}$, $K_{Right}$, R) function shown in Algorithm 2 to get the second part, as given in (4), for the record $R_i$. Finally, the data owner combines these two parts into $T^{e}_{a,b}$ as in (2) and saves the encrypted data format in a CSV file.
(2) Data owner uploads the encrypted data to the cloud: the final encrypted format $T^{e}_{a,b}$ of the plaintext data unit $T_{a,b}$ is stored in a CSV file and uploaded to the cloud server.
(3) Cloud server retrieval: to search for a keyword w, the client computes a trapdoor $T_w = f_{MK}(SCKHAC1(w))$ and sends $T_w$ to the cloud server. When the cloud server receives the trapdoor $T_w$, it runs the Search ($T_w$, $I_{SCKHAC1}$) function after eliminating the random strings in the encrypted index, checking for each record whether $T_w$ occurs in it. If so, the cloud server returns the set of encrypted records $R_i$ that match the query to the client, and the client decrypts the corresponding encrypted values.
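The retrieval step above, in which the server strips the embedded random strings before comparing against the trapdoor, can be sketched as follows. The search parts are treated as opaque hex strings, and the fixed offset `PAD_POS`, the length `PAD_LEN`, and the function names are hypothetical; the paper does not state where or how long the embedded strings are.

```python
import secrets

PAD_POS = 4  # hypothetical fixed offset where the random string is inserted
PAD_LEN = 6  # hypothetical length of the random string (hex characters)

def embed_random(search_part: str) -> str:
    """Owner side: insert a fresh random hex string at PAD_POS."""
    pos = min(PAD_POS, len(search_part))
    pad = secrets.token_hex(PAD_LEN // 2)
    return search_part[:pos] + pad + search_part[pos:]

def strip_random(stored: str) -> str:
    """Server side: remove the embedded random string again."""
    pos = min(PAD_POS, len(stored) - PAD_LEN)
    return stored[:pos] + stored[pos + PAD_LEN:]

def search(rows, trapdoor: str, partial: bool = False):
    """Return ids of rows whose stripped search part matches the trapdoor."""
    hits = []
    for row_id, stored in rows:
        part = strip_random(stored)
        # Token-boundary alignment for partial matches is omitted for brevity.
        matched = (trapdoor in part) if partial else (trapdoor == part)
        if matched:
            hits.append(row_id)
    return hits

rows = [
    (1, embed_random("a1b2c3d4")),
    (2, embed_random("ffee0011")),
    (3, embed_random("a1b2c3d4")),
]
print(search(rows, "a1b2c3d4"))
print(search(rows, "ee00", partial=True))
```

Rows 1 and 3 hold the same keyword but are stored as different strings, while the server can still match both once the random strings are removed.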

Implementation and performance evaluation
Our SSE scheme is implemented according to the diagram shown in Fig. 1. The system runs on Windows 11 Home (64-bit); the hardware configuration is an Intel(R) Core(TM) i7-8550U (1.80 GHz, 1.99 GHz) processor with 16 GB of memory and a fast Ethernet network (140Gbps). Our remote database was hosted on Clever Cloud as the cloud storage platform and populated with the Adult (Dua and Graff, 2020) schema. This data set was extracted from the 1994 census database by Barry Becker. All functions of this prototype are implemented in Python 3.10. The implementation lets us analyze the time taken by each stage, from generating the complicated index to keyword searching. We do not analyze the performance of the KeySetup, Enc, and Dec stages, since they are mostly interchangeable with other schemes, and focus on the remaining stages, starting from the complicated index construction stage. We measure the computation time of the exact search in our scheme and compare it with the schemes of Xiong (Xiong et al., 2022) and Li (Li et al., 2019) using three parameters: complicated index construction time, trapdoor computation time, and average keyword search computation time.
(1) Complicated Index Construction Time: This stage comprises index creation. After the data owner generates the index file, it is sent to the cloud server. We measure the computation time of index table generation by running the code on a total of 20,000 records extracted from a data set of 32,000 records. The data set includes different data types such as strings, dates, and unsigned integers. We used ten attributes, including identity information, age, gender, marital status, and so on, as shown in Table 2. We stored the encrypted data format of these attributes in a CSV file and uploaded it to the cloud database server. The data is encrypted by AES-GCM with the username of the user provided as additional data and a randomly chosen IV. We stored all the generated value formats in a column for every attribute. The computation time of index generation is depicted in Fig. 3. The computation time of index generation grows linearly with the number of records. Our scheme takes more time than the Li and Xiong schemes because the structure of the encrypted index is more complicated than in both schemes. However, our scheme takes less time than the Xiong and Li schemes in both the trapdoor generation and keyword search stages, as described below.
(2) Trapdoor Computation Time: As mentioned previously, the trapdoor is used as a search query and is created by the data owner for a specific keyword. The generated trapdoor is sent to the server, where it facilitates the search for the relevant records. The trapdoor generation stage is executed for different keywords. The computation cost of trapdoor generation is plotted in Fig. 4. We measured the trapdoor generation times for 5 keywords. Our scheme takes less time than the Li and Xiong schemes because the keyword search is performed on the second part of the encrypted index.

(3) Average Keyword Search Computation Time: Once the encrypted index file is uploaded to the cloud server and the trapdoor has been generated and sent to it, the next step is to apply an exact search for the related records. Fig. 5 shows the results of executing the exact keyword search stage against the trapdoor generated for the keyword. The computation time of the keyword search grows linearly with the number of records. The results show that the search in our scheme is faster than in the Xiong and Li schemes. Finally, Fig. 6 shows the results of executing the partial keyword search stage against the trapdoor generated for the keyword. The computation time of the partial keyword search also grows linearly with the number of records.
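Measurements of this kind can be reproduced in outline with a small timing harness. The sketch below is not the authors' benchmarking code; `time_stage` and the placeholder stage `linear_scan` are hypothetical names, and the stage simply performs one pass over synthetic records.

```python
import time

def time_stage(stage, sizes, repeats=3):
    """Return {size: best wall-clock seconds} for stage(size)."""
    results = {}
    for size in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            stage(size)
            best = min(best, time.perf_counter() - start)
        results[size] = best
    return results

def linear_scan(num_records):
    """Placeholder stage: one pass over num_records synthetic records."""
    records = [f"rec{i}" for i in range(num_records)]
    return sum(1 for r in records if r == "rec0")

timings = time_stage(linear_scan, [5_000, 10_000, 20_000])
for size, seconds in timings.items():
    print(size, f"{seconds:.6f}s")
```

Taking the best of several repeats reduces noise from other processes; replacing `linear_scan` with the index, trapdoor, or search stage gives the per-stage curves.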
In addition, we considered three scenarios for searching across records: single keyword queries, conjunctive keyword queries, and disjunctive keyword queries. We ran single keyword searches in the cloud database to evaluate the search efficiency, as shown in Table 3. We also ran conjunctive and disjunctive searches with two different keywords in the cloud database to evaluate the search efficiency, as shown in Tables 4 and 5, respectively. The experiments show that the average search time grows linearly with the number of records retrieved from the cloud database. The corresponding line graphs for these tables are shown in Figs. 7–9, respectively.
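The three query scenarios differ only in how per-condition matches are combined. A minimal sketch, in which a plain string comparison stands in for the Search function and all column names and token values are hypothetical:

```python
def matches(record: dict, column: str, trapdoor: str) -> bool:
    """Stand-in for Search(Tw, I): compare the stored search part to Tw."""
    return record.get(column) == trapdoor

def single(records, cond):
    """Simple query: one (column, trapdoor) condition."""
    column, trapdoor = cond
    return [rid for rid, rec in records.items() if matches(rec, column, trapdoor)]

def conjunctive(records, conds):
    """Conjunctive query: every condition must hold (AND)."""
    return [rid for rid, rec in records.items()
            if all(matches(rec, c, t) for c, t in conds)]

def disjunctive(records, conds):
    """Disjunctive query: at least one condition must hold (OR)."""
    return [rid for rid, rec in records.items()
            if any(matches(rec, c, t) for c, t in conds)]

# Opaque tokens stand in for the encrypted search parts.
records = {
    1: {"sex": "t_male", "marital": "t_married"},
    2: {"sex": "t_female", "marital": "t_married"},
    3: {"sex": "t_male", "marital": "t_single"},
}
conds = [("sex", "t_male"), ("marital", "t_married")]
print(single(records, conds[0]))    # [1, 3]
print(conjunctive(records, conds))  # [1]
print(disjunctive(records, conds))  # [1, 2, 3]
```

Each variant makes a single pass over the records, which is consistent with the linear growth reported for the search times.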

Performance comparison and security analysis
This model aims to protect the privacy of the stored data: the adversary cannot recover any information about the plain record in the table from the blind index, while the ability to search and to retrieve the number of desired keywords in a record is preserved. The format of the stored encrypted data hinders the attacker from using cryptanalysis to recover the plaintext. Our scheme makes stored records with the same keywords generate different encrypted formats. Generally, the methodology of our scheme increases the space efficiency of the SSE model with no leakage of information. It removes the false positives caused by using a Bloom filter. The cloud server cannot obtain any information about the plaintext because the structure of the encrypted index protects the plaintext from being revealed. The following points address security issues concerning the cloud server and the attacker:
(1) When executing the data owner's commands, the cloud server is assumed to be honest-but-curious. Honest means that it is trusted while executing SQL operations; curious means that it may monitor and analyze the encrypted data or even the encrypted index.
(2) During the search process, the cloud server first removes the embedded random characters. Removing the random characters does not enable the cloud server to learn any information about the plaintext.
(3) Removing the second part ($C_2$) from the SCKHA ciphertext prevents the cloud server from discovering the key hash side and its index, which it would need to reveal the data.
(4) The overall format of the stored ciphertext prevents the cloud server or the attacker from detecting which scheme was used to create the encrypted index. It is a combination of two algorithms: AES and the first part of the SCKHA ciphertext.
(5) A deterministic encryption attack exploits the fact that the same keywords produce the same ciphertext under the same master key. Embedding random characters in the stored encrypted index generates different encrypted indexes for the same keywords, which prevents the attacker from detecting similar encrypted data.
(6) There is no relationship between the trapdoor and the indexed records.
(7) $T^{e}_{a,b}(2)$ is used to look for particular values in the database or to match specific columns without disclosing to anyone (even the CSP) which specific database cells contain the same values.
(8) It is computationally infeasible for the attacker to break the pseudo-random function (AES).
In addition, both trapdoors and indexes may be exploited by the cloud server because they are permanently exposed to it. Trapdoors and indexes should therefore be randomized to ensure privacy, so that the cloud server cannot recognize the link between these trapdoors and indexes from their corresponding ciphertexts. In other words, the cloud server is unable to determine whether two workers share the same keyword or interests. Additionally, the ciphertexts of the same plaintext must differ across different task-matching operations. We present a security model for both Index Unlinkability (IND-UN) and Trapdoor Indistinguishability (TD-IND) as follows:

Index unlinkability (IND-UN)
Index unlinkability is a security notion that captures the privacy of SSE schemes. It ensures that the adversary cannot link the indexes of two submitted queries to reveal the plaintext keyword. We assume that the adversary Adv1 is a malicious CSP and that the challenger Ch is a trusted sender. Under IND-UN, the adversary is prevented from determining whether two indexes have similar plaintext keywords. In what follows, we write $x \xleftarrow{R} X$ to denote that x is sampled uniformly at random from the set X, and n denotes the number of keywords.
Theorem 1. If an adversary Adv1 can break IND-UN with non-negligible advantage, then another adversary Adv2 can break the second part of the overall format with non-negligible advantage.
Proof:
(a) Setup: Challenger Ch obtains the private key MK.
(b) Query: Challenger Ch selects two keywords p_0, p_1 ∈ P, where P refers to the plaintext in the dataset.
(c) Challenge: Challenger Ch sets b = 1 if the same keyword exists in the two pieces of data; otherwise, b = 0. Challenger Ch then calculates the indexes of p_0 and p_1. For each keyword w_i (1 ≤ i ≤ n) in p_0, p_1, a value X_i is computed and inserted into a list of length n. A new value is then inserted into the list to obtain a list of length n + 1, which serves as the index. In this way, challenger Ch calculates the indexes I_{p_0}, I_{p_1} for p_0, p_1 and sends them to the adversary Adv1.
(d) Adversary response: The adversary makes a guess, setting b' = 1 if it believes the same keyword exists in the two indexes and b' = 0 otherwise, and outputs b' ∈ {0, 1}.
The advantage of the adversary Adv1 is defined as |Pr[b = b'] - 1/2|. Here, the adversary Adv2 simulates adversary Adv1 to break the first part and the second part. Because Adv2 is a polynomial-time attacker, it cannot break the AES part with the pseudo-random function or the SCKHA part. Therefore, the advantage is negligible, which means that our scheme satisfies index unlinkability.
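The game above can be sketched in code. The following is a minimal illustration under stated assumptions, not the construction analyzed in this paper: HMAC-SHA256 stands in for the keyed primitives (AES with a pseudo-random function, and SCKHA), the extra list element is assumed to be a fresh random nonce, the bit b is sampled at random by the challenger, and the names `build_index`, `naive_adversary`, and `ind_un_game` are hypothetical. It estimates |Pr[b = b'] - 1/2| for an adversary that guesses b' = 1 whenever two indexes share an entry.

```python
import hashlib
import hmac
import secrets


def build_index(mk, keywords):
    """Return a randomized index of length n + 1 for n keywords.

    Stand-in construction: HMAC-SHA256 replaces the paper's primitives,
    and the (n + 1)-th element is assumed to be a fresh nonce.
    """
    nonce = secrets.token_bytes(16)  # fresh randomness for every index
    entries = [
        hmac.new(mk, nonce + w.encode(), hashlib.sha256).digest()
        for w in keywords
    ]
    return entries + [nonce]


def naive_adversary(index0, index1):
    """Guess b' = 1 if any keyword entry is identical in both indexes."""
    return 1 if set(index0[:-1]) & set(index1[:-1]) else 0


def ind_un_game(trials=2000):
    """Estimate |Pr[b = b'] - 1/2| for the naive adversary."""
    mk = secrets.token_bytes(32)  # Setup: the challenger obtains MK
    wins = 0
    for _ in range(trials):
        b = secrets.randbelow(2)  # b = 1: the two plaintexts share a keyword
        p0 = ["alice", "nurse"]
        p1 = ["bob", "nurse"] if b == 1 else ["bob", "clerk"]
        guess = naive_adversary(build_index(mk, p0), build_index(mk, p1))
        wins += int(guess == b)
    return abs(wins / trials - 0.5)
```

Because each index is built with its own nonce, equal keywords produce different entries, so this particular adversary never sees a match and its estimated advantage stays close to zero. This only illustrates the role of randomization; it says nothing about stronger adversaries or about the actual SCKHA format.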

Trapdoor indistinguishability (TD-IND)
Trapdoor indistinguishability is a security notion for the privacy of SSE schemes. It ensures that the adversary cannot link the trapdoors of two submitted queries to reveal the plaintext keyword. We suppose that the adversary Adv1 is a malicious CSP and that the challenger Ch is a trusted sender. Under TD-IND, the adversary is prevented from determining whether two trapdoors correspond to the same plaintext keywords w_0, w_1. In what follows, x ←_R X denotes that x is chosen uniformly at random from the set X, and n denotes the number of keywords.
Theorem 2. If an adversary Adv1 can break TD-IND with non-negligible advantage, then another adversary Adv2 can break the first and second parts of the overall format with non-negligible advantage.
Proof:
(a) Setup: Challenger Ch obtains the private key MK.
(b) Query: Challenger Ch selects two keywords w_0, w_1 ∈ P, where P refers to the plaintext in the dataset.
(c) Challenge: Challenger Ch chooses b ←_R {0, 1} and computes the trapdoor T_{w_b} of the keyword w_b. After obtaining the trapdoor T_{w_b}, the challenger sends it to the adversary Adv1.
(d) Adversary response: The adversary makes a guess and outputs b' ∈ {0, 1}.
The advantage of the adversary Adv1 is defined as |Pr[b = b'] - 1/2|. In this case, adversary Adv2 emulates adversary Adv1 to compromise the SCKHA format, which is embedded with a random string generated by a pseudo-random function. Because the adversary is a polynomial-time attacker, it cannot break the SCKHA format or the pseudo-random function. Since the advantage is negligible, our scheme satisfies trapdoor indistinguishability.
The search complexity of our scheme is O(n), where n is the number of queried keywords, which is assumed to be a small constant in practical applications. The schemes of Li (Li et al., 2019) and Xiong (Xiong et al., 2022) are similar in that they use a direct index structure, whereas our scheme builds a combined index structure. Li's and Xiong's schemes support exact search but not partial search, whereas our scheme supports both. Table 6 gives a comparison between our scheme and other SSE schemes.
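To make the O(n) search cost and the exact/partial distinction concrete, the following is a minimal sketch under stated assumptions. It is not the SCKHA-based index of this paper: HMAC-SHA256 is used as a stand-in token function, partial matching is assumed to be implemented by indexing substrings of at least `min_len` characters, randomization is omitted for brevity, and the names `token`, `index_record`, and `search` are hypothetical. Each record is checked with n set lookups, one per trapdoor, combined conjunctively or disjunctively.

```python
import hashlib
import hmac


def token(key, text, partial=False):
    """Deterministic keyword token (HMAC-SHA256 stand-in, not SCKHA).

    Exact and partial tokens are domain-separated so that an exact
    query never matches a substring entry.
    """
    label = b"partial:" if partial else b"exact:"
    return hmac.new(key, label + text.lower().encode(), hashlib.sha256).digest()


def index_record(key, values, min_len=3):
    """Build the token set of one record from its attribute values."""
    tokens = set()
    for value in values:
        v = value.lower()
        tokens.add(token(key, v))  # full value, for exact search
        for i in range(len(v)):
            for j in range(i + min_len, len(v) + 1):
                # every substring of length >= min_len, for partial search
                tokens.add(token(key, v[i:j], partial=True))
    return tokens


def search(index, trapdoors, mode="and"):
    """Return ids of records matching all ("and") or any ("or") trapdoors."""
    combine = all if mode == "and" else any
    return sorted(
        rid
        for rid, tokens in index.items()
        if combine(t in tokens for t in trapdoors)  # n lookups per record
    )
```

With an index over a few illustrative records, a conjunctive query passes two exact trapdoors with `mode="and"`, a disjunctive query passes the same trapdoors with `mode="or"`, and a partial query passes a single trapdoor created with `partial=True`. Indexing all substrings trades storage for simplicity; it is chosen here only to keep the sketch short.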

Conclusion and future work
In this work, we presented an SE scheme and ciphertext retrieval framework for cloud databases based on the SCKHA mechanism. We exploited SCKHA's ciphertext format as a trapdoor function for searching over encrypted data. Furthermore, the structure of the SCKHA scheme provides secure retrieval of data without requiring a secure channel. Our scheme supports both exact and partial matching when searching over encrypted data. We presented a model that embeds the search algorithm into a cloud-based MySQL server, and we evaluated this framework using a census database. Experiments showed that trapdoor generation and keyword search in our scheme are faster than in the Xiong and Li schemes. In addition, we analyzed the security of our scheme and evaluated its performance experimentally and theoretically. The experimental results also show that our scheme is more secure and efficient for both partial and exact search over an encrypted database. We considered three scenarios: single keyword queries, conjunctive keyword queries, and disjunctive keyword queries. Our model considers only a single-client scenario. In practice, however, the data owner must grant access permission to many users. We plan to address this problem with Attribute-Based Encryption in future work. We also plan to support other query types, such as range queries, by using both Order-Preserving Encryption (OPE) and Homomorphic Encryption (HE).

Funding statement
No financial support was received.

Conflict of interest
There are no conflicts of interest.
Credit authorship contribution statement