
PAVS: A New Privacy-Preserving Data Aggregation Scheme for Vehicle Sensing Systems.

Chang Xu¹,², Rongxing Lu³, Huaxiong Wang⁴, Liehuang Zhu⁵, Cheng Huang⁶.

Abstract

Air pollution has become one of the most pressing environmental issues in recent years. According to a World Health Organization (WHO) report, air pollution has led to the deaths of millions of people worldwide. Accordingly, expensive and complex air-monitoring instruments have been exploited to measure air pollution. Comparatively, a vehicle sensing system (VSS), as it can be effectively used for many purposes and can bring huge financial benefits in reducing high maintenance and repair costs, has received considerable attention. However, the privacy issues of VSS including vehicles' location privacy have not been well addressed. Therefore, in this paper, we propose a new privacy-preserving data aggregation scheme, called PAVS, for VSS. Specifically, PAVS combines privacy-preserving classification and privacy-preserving statistics on both the mean E(·) and variance Var(·), which makes VSS more promising, as, with minimal privacy leakage, more vehicles are willing to participate in sensing. Detailed analysis shows that the proposed PAVS can achieve the properties of privacy preservation, data accuracy and scalability. In addition, the performance evaluations via extensive simulations also demonstrate its efficiency.


Keywords: data aggregation; privacy-preserving aggregation; privacy-preserving data statistics; vehicle sensing

Year: 2017 PMID: 28273795 PMCID: PMC5375786 DOI: 10.3390/s17030500

Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576

1. Introduction

Air pollution has become a major environmental risk factor for ill health and death. Epidemiological studies have shown that long-term exposure to PM 2.5 can cause heart disease, stroke, and lung cancer, among other conditions [1]. To monitor air pollution, a series of solutions have been proposed [2,3,4]. However, traditional monitoring equipment is usually stationary, complex, and expensive due to the high cost of construction and maintenance. In contrast, vehicle sensing systems (VSS) have attracted more attention, since vehicles can be equipped with various kinds of sensors that collect and measure the concentrations of a range of pollutants [5]. Specifically, the sensing data are first collected by vehicle sensors [6], transferred to roadside units (RSUs) by vehicle wireless transmitters via vehicular ad hoc networks (VANET) [7,8], and then relayed to remote servers by RSUs. In recent years, VSS has been regarded as a new tool for monitoring gas concentration. Lee et al. [9] pointed out that VSS can be used to collect data when criminals spread poisonous chemicals in flight. Hu et al. [10] proposed exploiting VSS to monitor carbon dioxide. Specifically, the vehicles can be taxis or buses that collect carbon dioxide concentrations and periodically report their locations and the measured concentrations. In addition, VSS can also be used for traffic monitoring. According to [11], average speed or traffic density should be collected by departments of transportation in the USA for traffic monitoring purposes. Although traditional technologies can help collect these data, they suffer from high maintenance and repair costs. Designing a VSS raises numerous problems, e.g., how to increase the sensing coverage. Accordingly, several solutions have been proposed to enhance sensing coverage and reduce detection time in vehicular sensor networks [12,13,14,15,16].
Moreover, a series of aggregation schemes have been proposed [17,18,19,20]. However, these aggregation schemes are only used to reduce the overhead of transmitting sensing data. In particular, none of the aforementioned studies considered how to hide the real identities and location information of vehicles. Therefore, how to achieve privacy preservation [21,22,23,24,25,26] becomes one of the most critical problems for VSS. In VSS, after the sensing data are analyzed, the statistical data, e.g., the mean E(·) and variance Var(·), will probably be published [10]. In this case, we find that there exists an attack (we call it a sensing data link attack), in which attackers may learn a vehicle’s previous location information by linking the data collected by vehicles with the published statistical data. This kind of “sensing data link attack” may breach the location privacy [27] of vehicles, since the location history of a vehicle may reveal the driver’s home, workplace, and the entertainment venues that he or she usually visits [28,29,30]. Moreover, leakage of privacy may produce negative effects [31]. One possible way to resist the sensing data link attack is to encrypt the data and transmit ciphertexts to RSUs. However, this causes problems in the aggregation of encrypted data, e.g., how to classify the ciphertexts on the RSU side according to where the data are collected, and how to efficiently compute statistical data from aggregation results on the service provider side. To address the above challenges, in this paper, we propose a new privacy-preserving data aggregation scheme for VSS, called PAVS. To the best of our knowledge, it is the first work to address this “sensing data link attack” and present a privacy-preserving data aggregation scheme to compute both the mean E(·) and variance Var(·) of sensing data for VSS.
Specifically, the main contributions of this paper are fourfold:
First, we propose new privacy-preserving data classification and privacy-preserving aggregation algorithms, so that service providers can efficiently compute the mean E and variance Var from aggregation results. In addition, the proposed PAVS captures data accuracy, i.e., the E and Var computed from each aggregation result map to a specific area and time period.
Second, the proposed PAVS is privacy-preserving. Specifically, it can resist the sensing data link attack. After executing PAVS, RSUs cannot obtain any valuable information about vehicles, including vehicles’ previous location information and real identities.
Third, the PAVS scheme achieves scalability. If a service provider holds the aggregation results of several areas, it can further compute the statistical data of a larger area that consists of these areas by performing aggregation operations, without re-executing the whole PAVS scheme.
Fourth, to demonstrate the utility and validate the efficiency of the proposed PAVS, we theoretically analyze its performance in terms of computational cost, communication cost and storage cost. Additionally, we develop a Java simulator to measure the computational cost on the vehicle side, the RSU side and the service provider side. The experimental results show that the proposed PAVS is efficient on all three sides.
The rest of the paper is organized as follows. In Section 2, we formalize the system model and the security model and identify the design goal. In Section 3, we introduce bilinear pairing, related complexity assumptions, and properties of group as preliminaries. The proposed PAVS scheme is described in Section 4, followed by the security analysis in Section 5 and the performance evaluation in Section 6. The related work is given in Section 7, and we conclude this work in Section 8.

2. Models and Design Goal

In this section, we formalize the system model and the security model and identify the design goal.

2.1. System Model

In VSS, the sensing data are collected by vehicles, transmitted to RSUs, and then transferred to the service providers [6]. In our system model, the service provider further deals with the data and publishes the results of statistical analysis in public. Our model consists of four kinds of entities: trusted authority, service provider, RSUs, and vehicles (as shown in Figure 1).

Figure 1

System model.

Trusted Authority (TA): TA’s duty is to manage and distribute key materials to service providers, RSUs, and vehicles in the system.
Service Provider (SP): SP processes each aggregation result received from an RSU and obtains E and Var for each area.
RSUs: Each RSU serves as a message aggregator in the system. An RSU aggregates the messages sent from vehicles and forwards the aggregation results to SP. Before executing aggregation operations, RSUs will first classify the messages according to where and when the sensing data are collected.
Vehicles: Each vehicle is equipped with sensor devices. Vehicles can then collect data in different areas and transfer messages to RSUs in batch.

2.2. Security Model

In our security model, TA and SP are fully trusted. RSUs are honest but curious: on the one hand, they follow the designated protocol specification; on the other hand, they may try to disclose vehicles’ private information. Specifically, RSUs can obtain all the messages transferred in the protocol. After RSUs obtain all the messages, they may try to decrypt the ciphertexts to get the sensing data and launch sensing data link attacks by linking the messages sent by vehicles with the statistical results. We will show that PAVS can resist the sensing data link attack by introducing two levels of privacy: basic privacy and full privacy. Specifically, we will prove that PAVS holds full privacy to demonstrate that RSUs cannot link the messages sent by vehicles with the statistical results published by SP. Note that collusion between RSUs and SP is beyond the scope of this paper.
Basic privacy: When a run of the protocol is completed, RSUs cannot obtain vehicles’ real identity information by communicating with vehicles.
Full privacy: When a run of the protocol is completed, RSUs cannot obtain vehicles’ real identity information or any other information by communicating with vehicles.

2.3. Design Goal

Under the aforementioned system model and security model, our design goal is to propose an efficient privacy-preserving data aggregation scheme for VSS, so that SP can obtain richer information from each aggregation result without leaking vehicles’ privacy. In particular, the following four objectives should be achieved:
Privacy preservation. The private information of vehicles, including previous location information and the real identities of vehicles, should be protected.
Accuracy. The mean E(·) and variance Var(·) computed from each aggregation result should map to a specific area and time period. Additionally, aggregation results should be generated by real RSUs, and all the sensing data should be collected by registered vehicles.
Scalability. If SP already holds the aggregation results for some small areas, E(·) and Var(·) for a larger area that consists of these small areas should be efficiently computed without re-executing the whole scheme.
Efficiency. The computation on the vehicle side, the RSU side and the SP side should be efficient.

3. Preliminaries

In this section, we will introduce bilinear pairing, related complexity assumptions, and properties of group that will serve as the basis of our scheme.

3.1. Bilinear Pairing and Complexity Assumptions

Let and be two multiplicative groups of order q for some large prime q, and g be a generator of . A bilinear map , which satisfies the following properties: Bilinearity: for all . Non-degeneracy: . Computability: can be computed efficiently. A bilinear parameter generator is a probability algorithm that takes a security parameter κ as input and outputs a 5-tuple , where q is a κ-bit prime number, and are two groups with the same order q, is a generator, and is an admissible bilinear map. Let
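For reference, the standard formulation of an admissible bilinear pairing and of a bilinear parameter generator can be written as follows. The symbols \mathbb{G}, \mathbb{G}_T, e and \mathcal{G}en are assumed notation chosen for this sketch and may differ from the authors' own symbols.

```latex
\begin{align*}
  &e : \mathbb{G} \times \mathbb{G} \to \mathbb{G}_T,
     \qquad |\mathbb{G}| = |\mathbb{G}_T| = q,\quad g \text{ a generator of } \mathbb{G} \\
  &\text{Bilinearity: } e(g_1^{a}, g_2^{b}) = e(g_1, g_2)^{ab}
     \quad \text{for all } g_1, g_2 \in \mathbb{G},\ a, b \in \mathbb{Z}_q^{*} \\
  &\text{Non-degeneracy: } e(g, g) \neq 1_{\mathbb{G}_T} \\
  &\text{Computability: } e(g_1, g_2) \text{ is efficiently computable for all } g_1, g_2 \in \mathbb{G} \\
  &\mathcal{G}en(\kappa) \to (q, g, \mathbb{G}, \mathbb{G}_T, e)
\end{align*}
```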

3.2. Properties of Group

Given the security parameter λ, we choose a safe prime , where and is also a prime. Then, we can calculate the Euler’s totient function as . That is, the order of is . Let According to Fermat’s Little Theorem, we have . Thus, for some integer k, the equality holds. Furthermore, we obtain Let . When , we obtain Thus, we get the following properties of group : For any , we have ; and for any , the equality holds.
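The two group properties can be checked numerically on a toy instance. The sketch below assumes the usual reading of the construction: a safe prime p = 2p' + 1, the group of units modulo p², whose order is φ(p²) = p(p − 1) = 2pp', and the identities x^(2pp') ≡ 1 and (1 + p)^m ≡ 1 + mp (mod p²). The function name and the tiny parameter are illustrative only; real parameters are hundreds of bits long.

```python
from math import gcd


def check_group_properties(p_prime: int):
    """Exhaustively check two identities in the unit group modulo p^2.

    Assumes p = 2 * p_prime + 1 is a safe prime (not verified here).
    Returns (group order found, property 1 holds, property 2 holds).
    """
    p = 2 * p_prime + 1
    n = p * p
    order = 2 * p * p_prime  # phi(p^2) = p * (p - 1) = 2 * p * p'
    assert order == p * (p - 1)

    units = [x for x in range(1, n) if gcd(x, n) == 1]

    # Property 1: every unit raised to the group order is 1 mod p^2.
    prop1 = all(pow(x, order, n) == 1 for x in units)

    # Property 2: (1 + p)^m = 1 + m*p mod p^2, so m can be read off linearly.
    prop2 = all(pow(1 + p, m, n) == (1 + m * p) % n for m in range(p))

    return len(units), prop1, prop2


if __name__ == "__main__":
    # Toy safe prime: p' = 11, p = 23, p^2 = 529.
    print(check_group_properties(11))
```

The second identity is what makes exponents of 1 + p easy to recover, in contrast to exponents of a general group element.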

4. Proposed PAVS Scheme

In this section, we present our PAVS scheme, which mainly consists of the following parts: System Initialization, Data Collection at the vehicle, Data Aggregation at RSU, and Statistical Analysis at SP.

4.1. Overview

In the System Initialization phase, TA will mainly execute the Parameter Generation algorithm to generate public parameters and the Key Generation algorithm to generate key materials for vehicles, RSUs and SP. In the Data Collection phase, the vehicles will encrypt the sensing data by performing a Data Encryption algorithm and sign the ciphertexts by running a Message Signing algorithm. After that, the vehicles will send the messages to RSUs. In the Data Aggregation phase, RSUs will classify the messages according to where and when the sensing data are collected, and aggregate the data that are collected in the same area and the same time period. Then, RSUs send the aggregation results to SP. In the Statistical Analysis phase, SP will decrypt the aggregation results and obtain the mean E(·) and variance Var(·) for each area. In the vehicle sensing system, the sensing data are first collected by vehicle sensors, transferred to RSUs by vehicle wireless transmitters via VANET, and then relayed to remote servers by RSUs. Because the data may not all arrive at the aggregating RSU at the same time, time stamps are included in PAVS. Thus, RSUs classify the messages according to the time stamps and the area where the data are sensed. That is, only the data with the same time stamp and collected in the same area will be aggregated together. Finally, SP computes the statistical data, i.e., E(·) and Var(·), from each aggregation result, so that each pair maps to a specific area and time stamp.

4.2. System Initialization

This phase mainly comprises the Parameter Generation (PG) algorithm, the Key Generation (KG) algorithm, and the List Generation (LG) algorithm.
Parameter Generation (PG): On input security parameter λ, TA publishes system parameters where is a safe prime, is a large prime; is a generator of ; and is the output of the bilinear parameter generator. and are all cryptographic hash functions.
Key Generation (KG): On input system parameters, TA generates its secret key its master private key area key and public parameter where and . Then, the following steps are executed: TA computes private key for each RSU where is the label of an RSU, and is the location of . TA generates pseudo-identity for vehicle , where is the real identity of is randomly chosen in AES is the symmetric encryption algorithm, and is used to generate the symmetric encryption key. After TA authenticates ’s real identity, TA generates ’s private key computes ’s public key and authority key . TA transfers , and to to and sends , {} and {} to SP.
List Generation (LG): TA generates the vehicles’ public key list (as shown in Table 1), the area list (as shown in Table 2), the RSU private key list (as shown in Table 3), the random value lists (as shown in Table 4, R-value list-1, and Table 5, R-value list-2), and the vehicle authority key list (as shown in Table 6, A-key list). The vehicles’ Public key list, Area list, and R-value list-2 are public, the A-key list is kept secret by TA and SP, and the RSU private key list and R-value list-1 are kept secret by TA.

Table 1

Public key list.

PID	PK
PID1	g^s1
PID2	g^s2
PID3	g^s3
...	...
PIDβ	g^sβ

Table 2

Area list.

Areas
Area1
Area2
Area3
...
Areat

Table 3

Private key list.

RSUs	Private key
R1	H1(L1||R1)^s
R2	H1(L2||R2)^s
R3	H1(L3||R3)^s
...	...
Rα	H1(Lα||Rα)^s

Table 4

R-value list-1.

PID	R1
PID1	r1
PID2	r2
PID3	r3
...	...
PIDβ	rβ

Table 5

R-value list-2.

PID	R2
PID1	g^r1
PID2	g^r2
PID3	g^r3
...	...
PIDβ	g^rβ

Table 6

A-key list.

PID	A-key
PID1	g^(s1·r1)
PID2	g^(s2·r2)
PID3	g^(s3·r3)
...	...
PIDβ	g^(sβ·rβ)

The communications between TA and each vehicle, between TA and SP, between TA and each RSU are all via private and authenticated channels. TA’s secret key

4.3. Data Collection at Vehicle

After the vehicle collects sensing data, it executes the following Data Encryption (DE), Message Generation (MG), and Message Signing (MS) algorithms.
Data Encryption (DE): Assume that collects in , respectively, during the same time period. Let , where t is the number of areas and β is the number of the registered vehicles in the system. In order that SP can compute E(·) and Var(·) of the sensing data of , encrypts as follows: where is the time stamp. Note that the data may not all arrive at the aggregating RSU at the same time; therefore, time stamps are included in PAVS. Thus, RSUs classify the messages according to the time stamps and the area where the data are sensed. That is, only the data with the same time stamp and collected in the same area will be aggregated together. The unit of the time stamp is set by TA. In practice, the unit can be an hour or half an hour. For simplicity, the time stamp is denoted as . That is, the subscript of T is denoted as i. In fact, the subscript can be set as any variable, since the time stamp is not related to the identities of the vehicles.
Message Generation (MG): The messages sent from vehicle should include the PID of , so that RSUs can recover the public key from the public key list to verify the signature generated by . generates the message where In addition, is randomly chosen in and is the area where is collected. Here, includes two messages and which are used by RSUs to classify the ciphertexts.
Message Signing (MS): computes the signature of where [32]. After that, sends to RSU .

4.4. Data Aggregation at RSUs

After RSU receives the messages verifies by executing the following Message Verification (MV) algorithm. Then, classifies ..., ..., according to the areas by running the Data Classification (DC) algorithm. Note that = ( ),..., = ( ). Finally, executes the Data Aggregation (DA) algorithm.
Message Verification (MV): Firstly, will check if are all registered in the Public key list; for any if it is not listed in the Public key list, will not use to generate aggregation results. Assume are included in the Public key list. further checks if holds where . If so, then is a valid message. Here, we say a message is valid if it is generated by a legitimate vehicle.
Data Classification (DC): Assume are valid messages. When wants to check if and are collected in the same area and during the same period, where and , will firstly recover and to check if holds. If is satisfied, will verify if the following equality holds If so, we have . That is, and are collected in the same area. We can see that the scheme achieves data accuracy, since each aggregation result maps to a specific area and time period. Specifically, (1) in the Data Classification phase, RSUs classify the data according to where and when the data were collected, which means that only data collected in the same area and during the same time period will be aggregated together; (2) the aggregation results are signed by RSUs and then sent to SP. That is, the aggregation results are generated by real RSUs rather than impersonated RSUs; and (3) all the sensing data are generated by registered vehicles. will verify if is valid by checking whether is included in the Public key list (as shown in Table 1) and verifying by using the public key corresponding to . If is not on the list or the signature is not valid, will not be used any more.
Data Aggregation (DA): Assume that are collected in the same area, , and during the same period, , where . RSU aggregates by computing and then we obtain Let generates an ID-based signature [33] on by using its private key where a is randomly chosen from and .
Afterwards, sends to SP. is included in where the format of is . Thus, SP can recover from in the following Statistical Analysis phase by using and . Since only the sensing data collected in the same area will be aggregated, SP can conclude that all sensing data are collected in.
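The classification step can be pictured with a plaintext stand-in. In the scheme, whether two ciphertexts share an area and time period is decided by a pairing equation; in the sketch below that check is replaced by a hypothetical predicate `same_class` on visible (area, time stamp) tags, and the helper name `classify` is likewise illustrative. The point it shows is that a new message only needs to be compared with one representative of each existing class, since messages already known to be in the same class need not be re-tested against each other.

```python
from typing import Callable, List, Tuple, TypeVar

M = TypeVar("M")


def classify(
    messages: List[M], same_class: Callable[[M, M], bool]
) -> Tuple[List[List[M]], int]:
    """Group messages into classes using only a pairwise equality test.

    Returns the classes and the number of equality tests performed.
    In the real scheme each test would cost pairing operations; here it
    is an ordinary predicate (an assumption made for illustration).
    """
    classes: List[List[M]] = []
    tests = 0
    for msg in messages:
        for group in classes:
            tests += 1
            # Compare against a single representative of the class.
            if same_class(group[0], msg):
                group.append(msg)
                break
        else:
            # No existing class matched: open a new one.
            classes.append([msg])
    return classes, tests


if __name__ == "__main__":
    # (area, time stamp, reading); tags are visible only in this toy model.
    msgs = [("A1", "T1", 5), ("A2", "T1", 7), ("A1", "T1", 6), ("A1", "T2", 1)]
    groups, n_tests = classify(msgs, lambda a, b: a[:2] == b[:2])
    print(groups, n_tests)
```

With c classes and n messages, the loop performs at most n·c tests rather than one test per pair of messages.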

4.5. Statistical Analysis at SP

After SP receives the messages SP will verify if is valid by performing the Data Verification (DV) algorithm. If is valid, SP will execute the Area Recovery (RA) algorithm to recover and run the Data Decryption (DD) algorithm to decrypt and compute E(·) and Var(·) in .
Data Verification (DV): After SP receives SP verifies if the following equality holds If so, is a valid signature of . SP concludes that is generated by .
Area Recovery (RA): SP extracts from . According to for any SP verifies if the following equality is satisfied If, for some the equality holds, then SP concludes that all the data aggregated in are collected in .
Data Decryption (DD): SP computes statistical data E(·) and Var(·) by executing the following steps: SP recovers according to . SP computes and SP uses Pollard’s method to recover and calculates This is feasible because is within a small plaintext space . Therefore, we obtain SP computes . According to and SP computes and of variable M for where and Thus, SP gets and of variable M for at time period . Similarly, SP can compute and in other areas.
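A minimal sketch of the final statistical step, assuming that decryption leaves SP with three quantities for an area and time period: the number of aggregated readings k, the sum of the readings, and the sum of their squares. This is an assumption about what is recovered, and the function and parameter names are hypothetical. The sketch uses the population variance Var(M) = E(M²) − E(M)² and exact rational arithmetic.

```python
from fractions import Fraction
from typing import Tuple


def mean_and_variance(k: int, sum_m: int, sum_m_sq: int) -> Tuple[Fraction, Fraction]:
    """Compute E(M) and Var(M) from aggregated sums.

    k        : number of readings that were aggregated
    sum_m    : sum of the readings
    sum_m_sq : sum of the squared readings
    """
    if k <= 0:
        raise ValueError("at least one reading is required")
    mean = Fraction(sum_m, k)
    variance = Fraction(sum_m_sq, k) - mean * mean
    return mean, variance


if __name__ == "__main__":
    readings = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sensing values
    k = len(readings)
    s1 = sum(readings)
    s2 = sum(m * m for m in readings)
    print(mean_and_variance(k, s1, s2))
```

Because only sums are needed, SP never has to see an individual reading to obtain both statistics.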

5. Security Analysis

Following aforementioned security requirements, our analysis will focus on how the proposed PAVS scheme can achieve the vehicles’ privacy-preserving property. Assume vehicle collects sensing data in different areas, submits ciphertexts to RSU and classifies and aggregates the messages and sends them to SP. We will show that the proposed PAVS scheme can resist sensing data link attack by showing that it achieves full privacy, which means that will not get any valuable information from vehicles. In order to prove the proposed scheme achieves full privacy property, we explore the game sequence [34,35] to show that cannot distinguish the messages generated by vehicle from random strings, where , and The game sequence is explored to prove that the scheme is secure. This is because game sequence is a useful tool in taming the complexity of security proofs that might otherwise become complicated as to be nearly impossible to verify [35]. In our security proof, the attack games are played between an RSU and a challenger. Both and the challenger are probabilistic processes. In the proof, Game 0 and Game 1 are constructed, where Game 0 is the original attack game. If cannot distinguish Game 0 and Game 1, we can conclude that it cannot distinguish the messages generated by a vehicle from random strings. The challenger generates private keys for n vehicles so that it can act as real vehicles. If submits to the challenger, the challenger will choose randomly as sensing data, answer ’s query by normally executing the scheme, and return the messages generated by to . At some point, submits (where is not queried before. If has been queried, may verify if and are generated by real vehicles though executing Data Classification algorithm; however, still cannot get any valuable information). The challenger generates two messages and to , where Here, are all random strings, and are generated by performing the scheme normally. 
The challenger selects a random uniformly, and then sends to . will return 0 if it thinks that the whole message is generated by a real vehicle . Otherwise, returns 1. We say can win Game0 with advantage , where . When submits to the challenger, the challenger will choose randomly as sensing data, answer ’s query by normally executing the scheme, and return the messages generated by to . At some time point, assume queries on (where or is not queried before). The challenger chooses two messages and to , where Here, are all random strings. The challenger selects a random uniformly and then sends to . will return 0 if it thinks that the messages include some information generated by real vehicle . Otherwise, will return 1. We say can win Game1 with advantage , where . If the advantage with which wins Game0 and Game1 is both negligible, we can conclude that cannot get any valuable information. We conclude that cannot distinguish the message generated by registered vehicles with random strings. Firstly, the advantage with which wins Game 0 is negligible. That is, cannot distinguish () from ). Let . Then, can be denoted as for some unknown . Similarly, can be denoted as for some unknown . Since h, , , are all random elements, cannot distinguish () from . Secondly, the advantage with which wins Game 1 is negligible. Assume that the challenge is to break a DBDH problem instance, i.e., to distinguish and given , , where , and is a random element in . The challenger sets ’s public key as , and as . In Game1, is treated as a random oracle [36]. The output of is set as . Specifically, the challenger will generate as follows: returns a bit and guesses that is generated by vehicle . If can distinguish a valid ciphertext from a random string with a non-negligible advantage ε, then the challenger can break the DBDH assumption with non-negligible advantage. Therefore, we can conclude that cannot get any valuable information from the messages generated by registered vehicles. 
Thus, the proposed PAVS scheme captures full privacy, and can resist a sensing data link attack.

6. Performance Evaluation

In this section, we evaluate the performance of the proposed PAVS scheme in terms of computational cost, communication cost, and storage cost. In order to ease the presentation, we give the corresponding notations in Table 7.

Table 7

Notations for storage and communication cost analysis.

Notation	Definition
nv	Number of vehicles
na	Number of areas
nr	Number of RSUs
ndv	Number of collected data of vehicle V
ncr	Number of collected data received by RSU R
nds	Number of aggregation results received by SP
Sid	Bit size of pseudo identity for vehicle
Srsu	Bit size of label for RSU
St	Bit size of time stamp
Sa	Bit size of area name
Sq	Bit size of an element in Zq*
Sp2	Bit size of an element in Zp2*

6.1. Theoretical Analysis

According to the proposed PAVS scheme, the computational cost, communication cost and storage cost at vehicle side, RSU side, and SP side will be analyzed in this section.

6.1.1. Computational Cost

In the proposed PAVS scheme, a vehicle needs to encrypt each piece of sensing data. Additionally, the vehicle will generate a signature. Since the vehicle can choose a generic signature to sign the messages, the performance of the signature is not analyzed here. For vehicle to encrypt each piece of sensing data, it needs to perform a pair operation, five exponentiations, and three multiplication operations. If RSU receives encrypted sensing data collected from different areas, it will classify the data. Assume and are collected in the same area. For any message , in order to verify if is collected in the same area with and , if has been verified, it is not necessary to verify if holds. Therefore, will execute at most pair operations to achieve data classification. For k ciphertexts, aggregates them by executing multiplication operations. Assume SP receives aggregation results and all the sensing data are collected in areas. SP will execute at most pair operations to recover the areas. SP executes k multiplication operations and two exponentiation operations to compute . In addition, SP needs to use Pollard’s method to recover and performs one multiplication operation to calculate .
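Recovering an exponent that is known to lie in a small range is the step attributed to Pollard’s method. The sketch below substitutes baby-step giant-step, a different algorithm with a comparable square-root running time that is deterministic and easier to verify at the price of square-root memory. The function name, the toy modulus and the bound are illustrative assumptions, not parameters of the scheme; `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import isqrt
from typing import Optional


def small_dlog(g: int, h: int, modulus: int, bound: int) -> Optional[int]:
    """Return the smallest x in [0, bound) with g**x == h (mod modulus), or None.

    Baby-step giant-step: about sqrt(bound) multiplications and table entries.
    g must be invertible modulo `modulus`.
    """
    m = isqrt(bound) + 1

    # Baby steps: remember the smallest j for each value g^j, 0 <= j < m.
    baby = {}
    cur = 1
    for j in range(m):
        baby.setdefault(cur, j)
        cur = cur * g % modulus

    # Giant steps: multiply h by g^(-m) until a baby-step value is hit.
    factor = pow(g, -m, modulus)
    gamma = h % modulus
    for i in range(m):
        j = baby.get(gamma)
        if j is not None and i * m + j < bound:
            return i * m + j
        gamma = gamma * factor % modulus
    return None


if __name__ == "__main__":
    # Toy instance: modulus 23^2, base 1 + 23, exponent 17.
    n, g = 23 * 23, 24
    h = pow(g, 17, n)
    print(small_dlog(g, h, n, 23))
```

The running time grows with the square root of the plaintext range, which is why the aggregated value must stay within a small plaintext space.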

6.1.2. Communication Cost

In the system initialization phase, TA will send long-term secrets to vehicles, RSUs and SP. After vehicles encrypt the sensing data, they will transfer the ciphertexts and a signature to RSUs. After that, RSUs will send messages to SP. The corresponding communication cost is listed in Table 8.

Table 8

Communication cost.

Entity	Communication Cost
TA	(nr+nv)Sg+2nvSid+(2nv+1)Sq
Vehicle	Sr+ndrSid+St+2Sη+2Sq
RSU	ndr(2Sg+Sη)+St+Sid+Sg

6.1.3. Storage Cost

The storage cost is related to the phase of system initialization. The storage overhead in TA is nrSg + (3 + 2nv)Sq + 2nvSg + naSa + nvSid, where nrSg + (3 + 2nv)Sq is the cost to store long-term secrets for vehicles, RSUs, SP and TA itself, and 2nvSg + naSa + nvSid is the cost to store the public lists. The storage overhead at a vehicle is 2Sq + Sid, at an RSU is Srsu + Sg, and at SP is nv(Sid + Sg) + Sq, as shown in Table 9.

Table 9

Storage cost.

Entity	Storage Cost
TA	nrSg+(3+2nv)Sq + 2nvSg + naSa + nvSid
Vehicle	2Sq + Sid
RSU	Srsu + Sg
SP	nv(Sid + Sg) + Sq
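The entries of Table 9 can be evaluated directly for a given deployment. The sketch below transcribes the four formulas; the function name and all numeric sizes in the usage example are hypothetical values chosen only to show the arithmetic, not measurements from the paper.

```python
from typing import Dict


def storage_cost_bits(
    n_v: int, n_r: int, n_a: int,
    S_g: int, S_q: int, S_id: int, S_a: int, S_rsu: int,
) -> Dict[str, int]:
    """Evaluate the storage formulas of Table 9 (results in bits).

    n_v, n_r, n_a : numbers of vehicles, RSUs and areas
    S_g, S_q      : bit sizes of a group element and of an element of Zq*
    S_id, S_a     : bit sizes of a pseudo identity and of an area name
    S_rsu         : bit size of an RSU label
    """
    return {
        "TA": n_r * S_g + (3 + 2 * n_v) * S_q + 2 * n_v * S_g + n_a * S_a + n_v * S_id,
        "Vehicle": 2 * S_q + S_id,
        "RSU": S_rsu + S_g,
        "SP": n_v * (S_id + S_g) + S_q,
    }


if __name__ == "__main__":
    # Hypothetical sizes, for illustration only.
    costs = storage_cost_bits(n_v=50, n_r=5, n_a=10,
                              S_g=512, S_q=160, S_id=128, S_a=32, S_rsu=32)
    for entity, bits in costs.items():
        print(f"{entity}: {bits} bits ({bits / 8:.0f} bytes)")
```

Only the TA and SP costs grow with the number of vehicles; the per-vehicle and per-RSU costs are constant.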

6.2. Experimental Simulation

6.2.1. Implementation and Experimental Settings

The performance of PAVS is independent from the security parameters and the number of hash functions. Accordingly, Table 10 shows the parameter settings. The experiment is run on a test machine with Intel(R) Core(TM) I5-4200u 1.6 GHz four-core processor, 8 GB RAM, and a Windows 8 platform based on a Java Pairing-Based library [37].

Table 10

Parameter settings.

Parameter	|q|	|p′|	|p|	H	H1	H2
Setting	160 bit	512 bit	513 bit	160 bit	160 bit	513 bit

6.2.2. Computational Costs on the Vehicle Side

For a vehicle , it needs to encrypt the sensing data, generate and sign . Thus, in the experiments, the computational costs of are simulated by the total runtime including encryption, signature generation, and message generation algorithms on the vehicle side. On the vehicle side, the amount of sensing data varies from 10 to 100. The change tendency of the computational cost on the vehicle side is shown in Figure 2. We can see that the computational cost is 1.235 s if is 10, and 11.141 s when equals 100. Therefore, the algorithms for vehicles are efficient enough.

Figure 2

Computational costs on the vehicle side.

6.2.3. RSU’s Computational Cost

On the RSU side, the RSUs need to verify if the received messages are valid, classify the messages and aggregate the messages. Therefore, the computational cost of an RSU is measured by the total runtime of the Message Verification, Data Classification, and Data Aggregation algorithms. According to the proposed PAVS, the performance of data classification depends not only on the number of messages received by RSUs, but also on the number of areas that the vehicles pass through. That is, with different numbers of vehicles and areas, the computational cost of an RSU will be different. Thus, we set the number of vehicles as {5, 10,..., 50} and the number of areas the vehicles pass through as {1, 2,..., 10}. As shown in Figure 3, although increasing the number of vehicles and the number of areas increases the computational cost of the RSU, the maximum running time is less than 48 s. Therefore, PAVS is efficient on the RSU side, since this computation does not need to run in real time.

Figure 3

Computational costs on the RSU side.

6.2.4. SP’s Computational Cost

On the SP side, SP will verify if is valid, recover , and decrypt . Therefore, the computational cost on the SP side is measured by the total runtime of the Data Verification algorithm, the Area Recovery algorithm, and the Data Decryption algorithm. On the SP side, the number of vehicles and the number of areas that the vehicles pass through are still the two core parameters. Accordingly, the number of vehicles is varied from 5 to 50 and the number of areas from 1 to 10 to measure the computational overhead in different situations. The results are shown in Figure 4. Although the cost grows with both parameters, the running time for SP to obtain E(·) and Var(·) stays below 36 s, which is also acceptable.

Figure 4

Computational costs on the SP side.

6.3. Scalability

Assume SP has received the aggregation results of , respectively. If SP wants to get the statistical data E(·) and Var(·) of a larger area which includes some areas of , SP can still compute the new E(·) and Var(·) without re-executing the whole scheme. For instance, SP can get E(·) and Var(·) of a larger area which consists of and (as shown in Figure 5), as long as SP further aggregates and then executes Step 2, Step 3 and Step 4 of the Data Decryption algorithm.
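The reason a larger area needs no fresh protocol run can be seen on a plaintext analogue. The sketch assumes that each area's aggregation result determines a count, a sum and a sum of squares; for disjoint sets of readings these add componentwise, so the combined mean and variance follow from the merged triple. In the scheme itself the merge happens on aggregation results before decryption; the `Aggregate` type and helper names here are hypothetical.

```python
from fractions import Fraction
from typing import Iterable, NamedTuple, Tuple


class Aggregate(NamedTuple):
    count: int      # number of readings
    total: int      # sum of readings
    total_sq: int   # sum of squared readings


def from_readings(readings: Iterable[int]) -> Aggregate:
    values = list(readings)
    return Aggregate(len(values), sum(values), sum(v * v for v in values))


def merge(a: Aggregate, b: Aggregate) -> Aggregate:
    # Componentwise addition; valid when the two areas share no readings.
    return Aggregate(a.count + b.count, a.total + b.total, a.total_sq + b.total_sq)


def stats(agg: Aggregate) -> Tuple[Fraction, Fraction]:
    mean = Fraction(agg.total, agg.count)
    return mean, Fraction(agg.total_sq, agg.count) - mean * mean


if __name__ == "__main__":
    area_1 = from_readings([2, 4, 4, 4])   # made-up values
    area_2 = from_readings([5, 5, 7, 9])
    print(stats(merge(area_1, area_2)))    # statistics of the combined area
```

Note that the combined variance cannot be obtained from the two per-area variances alone without the counts and means, which is why the merge operates on the sums.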

Figure 5

Area combining.

7. Related Works

In this section, we review existing work on VSS, since we propose a privacy-preserving data aggregation scheme for VSS. In Ref. [10], Hu et al. constructed a VSS to monitor the concentration of carbon dioxide (CO2) gas. The VSS can collect CO2 concentration over a large field. Then, the collected data are reported to a remote server. The authors monitored the CO2 concentration in Hsin-Chu city, Taiwan, and the data are displayed on a Google map. However, the authors did not consider security issues in their scheme. In Ref. [38], the authors proposed deploying mobile agents to collect sensor data from specific road segments. The mobile agent moves among vehicles and communicates with neighbouring vehicles via wireless broadcast, which may not reach all the vehicles in the given segment. To solve this problem, they proposed an agent-based data collection scheme that can help achieve a data collection rate close to 100%. Similarly, in order to enhance sensing coverage, Masutani [39] proposed a route control method. The simulation experiment shows that the sensing coverage can be enhanced significantly without increasing the number of sensing vehicles. Unlike Refs. [38,39], Zhang et al. [40] studied the problem of maximizing coverage quality under a budget constraint. They proposed a new algorithm that selects a subset of mobile users to maximize the coverage quality. The results of the simulation experiments showed that their algorithm achieved better performance than a random selection scheme. Freschi et al. proposed a data aggregation method [41] to monitor the roughness of road surfaces. In addition, a series of data aggregation schemes [17,18,19,20] have been proposed. However, security issues are not considered in these studies. In Ref. [42], the proposed scheme achieves authentication and integrity of aggregated data by aggregating the data and the message authentication codes.
In order to tolerate duplicate messages, they also presented a probabilistic data aggregation scheme. However, privacy preservation is not considered in [42]. Wu et al. proposed a hybrid routing scheme for urban hybrid networks [43]. They first presented a location-based crowd sensing framework. Then, they constructed a routing switch mechanism that uses ad hoc solutions and RSU resources to guarantee the quality of data dissemination. In Ref. [44], the authors proposed a broadcast protocol that can support dense and sparse traffic regimes. Lee et al. [9] proposed MobEyes to support urban monitoring. In MobEyes, vehicle-local processing capabilities are used to extract features, and mobile agents move and collect summaries from mobile nodes. If the agents identify data of interest, they will contact the involved vehicles. In Ref. [45], Lee et al. further described MobEyes. They introduced the analytic model for MobEyes performance, the effects of concurrent execution of multiple harvesting agents, the valuation network overhead, and so on. However, privacy issues are likewise not addressed in their work.

8. Conclusions

In this paper, we have proposed PAVS, an efficient privacy-preserving data aggregation scheme for VSS. Unlike existing schemes, PAVS can compute statistical data from aggregated encrypted data. To realize PAVS, we have designed concrete privacy-preserving data classification and privacy-preserving aggregation algorithms. Detailed analysis shows that it can resist the sensing data link attack and achieve data accuracy and scalability. The efficiency of PAVS has been evaluated with theoretical analysis and experiments. Through extensive performance evaluations, we have demonstrated that the proposed PAVS scheme is efficient on the SP, RSU, and vehicle sides.
