Muhammad Imran, Helmut Hlavacs, Inam Ul Haq, Bilal Jan, Fakhri Alam Khan, Awais Ahmad.
Abstract
Cloud computing is a recent trend in IT that moves computing and data away from desktop and hand-held devices into large-scale processing hubs and data centers, respectively. It has been proposed as an effective solution for data outsourcing and on-demand computing to control the rising cost of IT setup and management in enterprises. However, with Cloud platforms, users' data is moved into remotely located storage, so users lose control over their data. This unique feature of the Cloud raises many security and privacy challenges that need to be clearly understood and resolved. One important concern is to provide proof of data integrity, i.e., the correctness of the user's data stored in Cloud storage. The data in Clouds is not physically accessible to the users. Therefore, a mechanism is required through which users can check whether the integrity of their valuable data is maintained or compromised. For this purpose, methods such as mirroring, checksumming, and third party auditors, amongst others, have been proposed. However, these methods either use extra storage space by maintaining multiple copies of data or require the presence of a third party verifier. In this paper, we address the problem of proving data integrity in Cloud computing by proposing a scheme through which users are able to check the integrity of their data stored in Clouds. In addition, users can track violations of data integrity if they occur. For this purpose, we utilize a relatively new concept in Cloud computing called "Data Provenance". Our scheme is capable of reducing the need for third party services, additional hardware support, and the replication of data items on the client side for integrity checking.
Year: 2017 PMID: 28545151 PMCID: PMC5435237 DOI: 10.1371/journal.pone.0177576
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Limitations and compatibility issues of existing integrity schemes in Clouds.
| Technique | Limitation | Cloud Compatibility |
|---|---|---|
| Mirroring Technique | Efficiency problem in terms of storage space and computations | Not tested in Cloud |
| RAID Parity | Needs specialized hardware | Not tested in Cloud |
| Checksumming | Computation overhead | Not tested in Cloud |
| Data Integrity Proof in Cloud Storage | Dependency on client computations | Tested in Cloud |
| Ensuring Data Integrity in Cloud Storage | Dependency on outside third party | Tested in Cloud |
Fig 1. Black box architecture of provenance-based data integrity verification.
Fig 2. Collection of provenance data from different layers of abstraction in Clouds.
Various attributes with their descriptions and corresponding layers of the Cloud environment.
| Layer | Attribute | Description |
|---|---|---|
| Server (Cloud) | Item-ID | Unique ID of the data item stored in the Cloud storage |
| Server (Cloud) | OwnerID | Unique ID of the data item owner |
| Server (Cloud) | ItemAccessCounter | A counter that records how many times a particular item was accessed |
| Middleware | CreationDateTime | Date and time of arrival for storage |
| Middleware | UpdateDateTime | Date and time of the last change of the data item and hash value |
| Client | InitHashValue | Initial hash value computed at arrival for storage |
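The attributes in the table above can be read as the fields of a single provenance record per data item. The following is a minimal sketch of that idea, not the authors' implementation: the field names mirror the table, but the `ProvenanceRecord` class, the helper functions, and the choice of SHA-256 as the hash are assumptions made only for illustration. Legitimate updates (which would also change `UpdateDateTime` and the stored hash) are not modeled.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Hash helper. SHA-256 is an assumption; the source does not name a hash."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class ProvenanceRecord:
    """Hypothetical record whose fields mirror the attribute table."""
    item_id: str                  # Item-ID (server layer)
    owner_id: str                 # OwnerID (server layer)
    item_access_counter: int      # ItemAccessCounter (server layer)
    creation_date_time: datetime  # CreationDateTime (middleware layer)
    update_date_time: datetime    # UpdateDateTime (middleware layer)
    init_hash_value: str          # InitHashValue (client layer)


def create_record(item_id: str, owner_id: str, data: bytes) -> ProvenanceRecord:
    """Build the record when a data item first arrives for storage."""
    now = datetime.now(timezone.utc)
    return ProvenanceRecord(item_id, owner_id, 0, now, now, sha256_hex(data))


def check_integrity(record: ProvenanceRecord, current_data: bytes) -> bool:
    """Return True if the stored bytes still match the initial hash value."""
    return sha256_hex(current_data) == record.init_hash_value
```

Under these assumptions, a user would call `create_record` once at upload time and later pass the bytes fetched from Cloud storage to `check_integrity`; a `False` result indicates that the item no longer matches the hash recorded on arrival.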
Fig 3. Various components of integrity service (integrity tracker phase).
Fig 4. Architecture of DataStore application with different layers of abstraction.
Fig 5. Visualization of integrity leaks to the end user.
System details for evaluation.
| Resource | Operating System | Memory (MB) | Eucalyptus Component | Disk Size (GB) | CPU Architecture | CPU Cores | Network (Mb/s) |
|---|---|---|---|---|---|---|---|
| Machine 1 (Server) | Ubuntu 10.04 | 2048 | Walrus | 80 | x86_64 Intel(R) Core(TM) 2 | 2 (2.33 GHz) | 100 |
| Machine 2 (Server) | CentOS 6.4 | 4192 | Cluster, Node | 250 | x86_64 Intel(R) Core(TM) 2 | 4 (2.83 GHz) | 100 |
| Machine 3 (Client) | Ubuntu 12.04 | 2048 | Amazon SDK | 80 | x86_64 Intel(R) Core(TM) 2 | 2 (2.13 GHz) | 100 |
Fig 6. Results of the calculated time (minutes:seconds format) with and without provenance for Eucalyptus Walrus.
Fig 7. Cost of provenance collection in terms of elapsed time (in seconds) for different numbers of objects.
Fig 8. Cost of provenance storage (disk space in kilobytes) for different numbers of objects.
Comparison of existing integrity schemes in Cloud.
| Data Integrity Scheme | Data Integrity Method | Advantages | Limitations |
|---|---|---|---|
| Provable Data Possession | Key generation based on file comparison | Strong data integrity verification; reduced network traffic | High computational cost on the server end for computing the hash value of each file; used mainly in single-server settings; depends on the client |
| Proof of Retrievability | Cryptographic techniques such as sentinel-based values | Less storage overhead on the server side because only sentinel values are stored | Computational overhead for pre-processing the sentinel values |
| HAIL | Data redundancy across multiple servers using principles of RAID | The proof is independent of the size of data items, e.g., files of different sizes | Storage overhead because of replicating data items; works only for static files; thin clients cannot adopt such a scheme |
| MR-PDP | Multiple replicas using redundancy techniques | Provides on-demand replicas | Storage overhead because of replication; computation required on both the client and server side |
| Third party auditor | Key generation and MAC-based scheme used by the third party | The TPA can apply multiple integrity schemes for checking data integrity; users can choose different TPAs based on their preferences | Retrieval of data blocks from the third party for checking data integrity (privacy issues); higher cost due to the involvement of third party auditors |
Comparative analysis of proposed scheme with existing data integrity schemes.
| Evaluation Parameter | Proposed scheme | Provable data possession | Proof of Retrievability | HAIL | MR-PDP | Third party auditors |
|---|---|---|---|---|---|---|
| Dependency on third party | No | No | No | No | No | Yes |
| Computation cost | Negligible because of coarse-grained provenance | High | High | Low | Low | Depends on the scheme |
| Storage cost | Negligible because of link-based provenance | Low to Medium | Low to Medium | High | High | Depends on the scheme |
| Basic Scheme | Utilizing historical data (provenance) | Key generation algorithms | Cryptographic techniques | Data redundancy | Data replicas across multiple servers | Depends on the scheme |
| Integrity Checks | File level | File level | File level | File and block level | File and block level | Mainly on file level |
| Dependency on client | No | Yes | Yes | No | No | Yes |