Abstract
The advent of next-generation sequencing (NGS) brings with it a need to manage large volumes of patient data in a manner that is compliant with both privacy laws and long-term archival needs. Outside the realm of genomics, the broader medical community also needs to store data and, although the volume (radiology aside) may be smaller than that of NGS, the concepts discussed herein are similarly relevant. The relation of so-called "privacy principles" to data protection and cryptographic techniques is explored with regard to the archival and backup storage of health data in Australia, and an example implementation of secure management of genomic archives is proposed in light of this relation. Readers are presented with sufficient detail to have informed discussions with experts in these fields when implementing laboratory data protocols.
Keywords: Cryptography; genomics; privacy; security; storage
Year: 2016 PMID: 26955504 PMCID: PMC4763811 DOI: 10.4103/2153-3539.175793
Source DB: PubMed Journal: J Pathol Inform
The field of cryptography extends beyond the scope of what many readers may suspect. A selection of cryptographic domains and their respective focuses is outlined.
Cryptographic mitigations as they apply to the requirements of NSW Health Privacy Principle 5, which is similar to Australian Privacy Principle 11. Note that, as the authentication mechanisms described herein are based on those employed in fingerprinting, the use of authentication alone suffices to meet both requirements.
Two very similar genetic regions, differing by only a single nucleotide, have vastly different fingerprints when generated by the SHA-512 algorithm (only the first 16 bytes are shown, in hexadecimal notation). This is due to the avalanche effect[36]. The strict avalanche criterion[37] is met when changing a single bit of the input data gives each output bit a 50% probability of changing, independently of all other changes in the output.
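The avalanche effect described in this caption can be reproduced with a short sketch. The two sequences below are hypothetical stand-ins, not the regions used in the original figure, and the helper names `fingerprint` and `differing_bits` are illustrative assumptions; only Python's standard `hashlib` module is used.

```python
import hashlib


def fingerprint(sequence: str) -> bytes:
    """Return the SHA-512 digest (64 bytes) of a nucleotide string."""
    return hashlib.sha512(sequence.encode("ascii")).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Hypothetical sequences that differ by a single nucleotide (last base A -> C).
region_a = "GATTACAGATTACAGATTACA"
region_b = "GATTACAGATTACAGATTACC"

fp_a = fingerprint(region_a)
fp_b = fingerprint(region_b)

# Show only the first 16 bytes in hexadecimal, as in the caption.
print(fp_a[:16].hex())
print(fp_b[:16].hex())

changed = differing_bits(fp_a, fp_b)
print(f"{changed} of {len(fp_a) * 8} output bits differ")
```

If the strict avalanche criterion holds, the count of differing bits should be close to half of the 512 output bits, although the exact figure varies with the input.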
Figure 1: An example implementation detailing how a genomics laboratory may store data. The implementation is elucidated in the text, and the figure should not be interpreted in isolation. (a) A public and private key pair is generated, and the private key is protected; in the absence of a hardware security module, hard-copy media and physical protections can be used. The public key may be shared with anyone, even an adversary. (b) Data from an NGS run are encrypted with a unique key. (c) A fingerprint is generated for the encrypted data, using a different key to that which was used for encryption. (d) Both the encryption and fingerprint keys are kept secret by placing them in a “digital envelope” using the public key that was generated in the first step. The envelope can only be opened with the private key, and knowledge of the public key is insufficient to derive its private counterpart. (e) The encrypted NGS data, their fingerprint, and the envelope can be stored with a vendor on the Certified Cloud Services List.[29] This forms a “trapdoor-like” protocol whereby encryption of data is easy, but decryption requires physical access to a private key which is protected to at least the same extent as laboratory equipment.
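Step (c) of the figure, a fingerprint computed over the encrypted data with a key distinct from the encryption key, might be sketched as follows. This assumes HMAC-SHA512 as the keyed fingerprint, which the caption does not specify. The encryption of step (b) and the envelope of step (d) are not shown because they require a cipher outside the Python standard library, so random bytes stand in for the ciphertext. The names `keyed_fingerprint`, `verify`, and the key sizes are hypothetical.

```python
import hashlib
import hmac
import secrets


def keyed_fingerprint(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA512 tag over data (an assumed choice of algorithm)."""
    return hmac.new(key, data, hashlib.sha512).digest()


def verify(key: bytes, data: bytes, expected: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(keyed_fingerprint(key, data), expected)


# Two independent random keys: one for the cipher in step (b), which is
# not used in this sketch, and a different one for the fingerprint in step (c).
encryption_key = secrets.token_bytes(32)
fingerprint_key = secrets.token_bytes(64)

# Placeholder for the encrypted NGS run; real ciphertext would come from step (b).
encrypted_run = secrets.token_bytes(1024)

# Step (c): fingerprint the encrypted data, not the plaintext.
tag = keyed_fingerprint(fingerprint_key, encrypted_run)

# Flipping a single bit of the stored data makes verification fail.
tampered = bytearray(encrypted_run)
tampered[0] ^= 0x01

print(verify(fingerprint_key, encrypted_run, tag))
print(verify(fingerprint_key, bytes(tampered), tag))
```

Because the fingerprint is keyed, a storage vendor holding only the encrypted data and the tag cannot produce a valid tag for altered data without the fingerprint key, which remains sealed in the envelope.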