Abstract
This study investigates the role and functionality of special nucleotide sequences ("DNA signatures") used to detect the presence of an organism and to distinguish it from all others. After highlighting vulnerabilities of the prevalent DNA signature paradigm for identifying agricultural genetically modified (GM) organisms, it is argued that these so-called signatures are not signatures at all when compared to traditional (handwritten) signatures and their generalizations in the modern (digital) world. It is suggested that a recent contamination event involving an unauthorized GM Bacillus subtilis strain in Europe (Paracchini et al., 2017) could have been, or could in the same way become, the consequence of exploiting gaps in prevailing DNA signatures. Moreover, a recent study (Mueller, 2019) proposes that such DNA signatures may be intentionally exploited to support the counterfeiting or even weaponization of GM organisms (GMOs). These concerns mandate a re-conceptualization of how DNA signatures need to be realized. After identifying central issues of the new vulnerabilities and overlaying them with the practical challenges that bio-cyber hackers would face, recommendations are made for how DNA signatures may be enhanced. To overcome the core problem of signature transferability in bioengineered media, the identifier needs to remain secret during the entire verification process. On the other hand, the goal of DNA signatures is to enable public verifiability, which leads to a paradoxical dilemma. It is shown that this can be addressed with ideas that underlie special cryptographic signatures, in particular those of "zero-knowledge" and "invisibility." This means more than mere signature hiding: it relies on a knowledge-based proof and differentiation of a secret (here, as assigned to specific clones), which can be realized without explicit demonstration of that secret.
A re-conceptualization of these principles can be used in the form of a combined (digital and physical) method to establish confidentiality and prevent impersonation of the manufacturer. As a result, this helps mitigate the circulation of possibly hazardous GMO counterfeits and also addresses the situation in which attackers try to blame producers for deliberately implanting illicit adulterations hidden within authorized GMOs.
Keywords: DNA signatures; GMO counterfeiting; bio-cryptanalysis; bio-cyber hacker; cryptographic applications; cyberbiosecurity; insecure channel; knowledge-based methods
Year: 2019 PMID: 31440503 PMCID: PMC6693310 DOI: 10.3389/fbioe.2019.00189
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Cryptographic concepts and goals. In the cyber-domain, many of those can be addressed via digital signatures.
| Goal | Description | Means of achieving it |
| --- | --- | --- |
| Confidentiality | This is a service used to keep the content of information from all but those authorized to have it. Secrecy is a term synonymous with confidentiality and privacy. There are numerous approaches for providing confidentiality, ranging from physical protection (e.g., a box with a lock, a sealed envelope, or a wall-safe) to mathematical algorithms which render data unintelligible | Digital signatures, access control, hardware protection |
| Data integrity | This is a service which addresses the alteration of data. To ensure data integrity, one must have the ability to detect arbitrary errors, as well as manipulation by unauthorized parties. Data manipulation includes such things as insertion, deletion, and substitution | Hashing, message-authentication protocols, digital signatures |
| Authentication | This is a service related to identification. This function applies to both entities (e.g., a person, a credit card, an information-carrying product - including one that is biomanufactured) and information (in particular, the source of information, including its origin, date of origin, data content, time produced, etc) | Digital signatures, passwords, authentication protocols, challenge and response |
| Availability | This is a guarantee of reliable access (to information, computers, specific components or systems, etc.) by authorized people | Updates, backups, firewalls, proxy servers, physical protection |
| Non-repudiation | This is a service which prevents an entity from denying previous commitments or actions. When disputes arise due to an entity denying that certain actions were taken, a means to resolve the situation is necessary | Digital signatures, public-key schemes, trapdoor functions, commitment schemes |
It is suggested that these conceptions can help identify key functionalities of biologic signatures as well.
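The data-integrity and authentication rows above can be made concrete with a minimal sketch based on a keyed hash (HMAC) from the Python standard library. The key, the example sequence, and the function names `make_tag` and `is_authentic` are illustrative assumptions, not part of the source.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that binds the message to a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_authentic(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare in constant time; any alteration changes the tag."""
    return hmac.compare_digest(make_tag(key, message), received_tag)


# Hypothetical values, chosen only for illustration.
shared_key = b"example-shared-secret"
sequence = b"ATGGCCATTGTAATGGGCCGCTGA"
tag = make_tag(shared_key, sequence)

print(is_authentic(shared_key, sequence, tag))                      # unaltered message
print(is_authentic(shared_key, sequence.replace(b"TGA", b"TAA"), tag))  # substituted bases
```

A keyed hash of this kind detects insertion, deletion, and substitution, but because both parties share the same key it does not provide non-repudiation; that is where public-key signatures come in.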
Principles and features of digital signatures as counterparts of traditional signatures (and with the intent toward their generalization to DNA signatures).
| Aspect | Description |
| --- | --- |
| How they work | “Public-key” signatures rely on specific secrets: the keys used to generate a signature. Signatures are generated by applying a mathematical formula or algorithm that scrambles the information into a string of digits |
| Who can produce a valid signature? | Only the holder of the private (secret) key (the signer) can produce such an “electronic autograph” |
| Who can verify a signature? | In the public-key setting, the signature can be verified by anyone |
| They provide authenticity and enable supply chain security | For messages distributed through a non-secure channel, a properly implemented digital signature gives the receiver reason to believe the message was sent by the claimed sender |
| They provide data integrity and ensure anti-counterfeiting | Any change in the message after signature will invalidate that signature, which ensures the integrity of the signed data (“the message”) against tampering or corrupting during transmission |
| They are binding | Once it is published, a signature cannot be altered or repudiated |
| What can be signed? | As with anything in the cyber-realm, the message is an alphanumeric string, including anything that can be represented as such (genomic information, producer information, processes used, etc) |
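The properties listed above (only the key holder signs, anyone verifies, any change invalidates the signature) can be illustrated with a toy "hash-then-sign" RSA sketch. This is a teaching example under stated assumptions: the primes are two small Mersenne primes picked for readability, there is no padding, and the names `sign` and `verify` are hypothetical. It is far too weak for real use and is not the scheme proposed in the paper.

```python
import hashlib

# Toy parameters (assumption: small Mersenne primes, insecure by design).
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public pair (N, E) can run this check."""
    return pow(signature, E, N) == digest_as_int(message)


record = b"producer=ExampleCorp;construct=ATGGCCATTGTAATGGGCCGCTGA"
signature = sign(record)

print(verify(record, signature))                 # original message
print(verify(record + b";edited", signature))    # altered message
```

The message here is an alphanumeric string combining producer and sequence information, in line with the last table row; the field names in `record` are invented for the example.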
Examples of “insecure channels” in the field of cyberbiosecurity.
| Channel | “The message” | What a secure(-able) channel means | How it is secured, or why this is difficult |
| --- | --- | --- | --- |
| DNA replication: The process of passing on a parental piece of DNA to offspring | The specific DNA sequence | The DNA sequence is the same before and after replication | Numerous cellular repair mechanisms turn the potentially insecure/noisy channel into one that is secured |
| Artificial plasmids. These are carefully designed to lead to a specific trait. Specifics of the expressed phenotype are coded in the artificial sequences | The artificial DNA cassette | The sequence information of the artificial construct is the same, regardless of the lab or environment in which it is used. To be “secure-able” means that this information can be traced back to its original/intended sequence | Sequencing the plasmid reveals its complete and detailed sequence information. While this is costly and technically demanding, it shows whether the channel (the sequence information encoded by the plasmid) matches the expected sequence [as e.g., can be verified by secured databases (see Peccoud et al.,
| Raw data, health related information, medical databases (storage of man-made information, as opposed to sequence information in living organisms) | The digital information about medical insights, health records, etc | The digital data remain unaltered (same information regardless of when and by whom it is read out), accessible only to legitimate authorities, and whenever needed | Once the information is in place, this essentially is a cyber-problem and can therefore benefit from existing cyber-related tools |
| Artificial DNA sequences, DNA as information storage | The message is the information to be stored in form of artificial DNA bases | As above | Need to filter out alterations due to DNA processing. Can benefit from alignment-based methods such as distance-measures (e.g., Federhen et al., |
| Expression of a transgene via a GMO. Targeted phenotypic trait and expression levels | Two readings are possible: (a) “the message” is the specific transgene, and the channel to be protected is the transgene only; (b) “the message” is the entire genome, and the channel to be protected is the entire organism | The transgene achieves its targeted phenotypic expression, relative to its trait, expression level, and in the context of its intended (molecular, biologic, cellular) environment | The phenotypic expression can be influenced by illicit genetic modifications outside of the transgene. If integrity is verified with respect to the transgene only, such covertly introduced modifications are not detected: they lie outside the specific channel. To obtain a secure-able channel, it must be the case that (1) the entire genome can be sequenced, and (2) the sequencing information obtained in different contexts and circumstances always leads to the exact same sequence (possibly including predictable differences within a certain range or distance) |
| Modern gene-edited plants and crops (see e.g., Grohmann et al., | Unclear what the message is. This is because the intended effect is based on a range of expression levels via specific biochemical pathways, which are dependent on their context and environment (here, environment is meant across the full spectrum, from molecular to gross) | The intended outcome is a spectrum of traits, depending on the specific context and environment. Here, secure-able would mean the same spectrum of phenotypic expression, as informed by different, discrete conditions in a clearly causative way | It seems much more difficult to secure a channel like this, where there is no tangible fixed, physical message that can be identified as the key information to be protected |
The key feature of insecure channels can often be formulated in terms of existing cryptographic primitives. For instance, all channels involve attributes that aim at leaving some information unaltered (integrity). Insecure channels in the cyber domain build on the salient feature that they can in fact be “secured.” In the context of integrity this means that the original intended information can be recaptured. In cryptography, what needs to be secured is typically called “the message.” It is important to note that this term has nothing to do with the everyday usage of the word. Here, it describes the defining characteristics of the insecure channel. When “the message” is identified for biological media, it turns out that many of the insecure channels are in fact “insecureable”.
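For channels where the message is a fixed sequence, the idea of recapturing the intended information "within a certain range or distance" can be sketched as a tolerance check on edit distance. The function names, the example sequences, and the threshold are assumptions made for illustration; the source does not prescribe a particular distance measure or cutoff.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of substitutions, insertions, and deletions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion from a
                               current[j - 1] + 1,       # insertion into a
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def within_tolerance(read: str, reference: str, max_distance: int) -> bool:
    """Treat the channel as intact if the read stays close enough to the reference."""
    return edit_distance(read, reference) <= max_distance


reference = "ATGGCCATTGTAATGGGCCGC"
read_ok = "ATGGCCATTGTTATGGGCCGC"       # one substitution, e.g. a sequencing error
read_bad = "ATGGCCGGGGGGATGCCCCGC"      # many differences

print(within_tolerance(read_ok, reference, max_distance=2))
print(within_tolerance(read_bad, reference, max_distance=2))
```

Such a check only makes sense where a fixed reference exists, which is exactly what is missing in the gene-editing row of the table.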
Figure 1. Major shortcomings of DNA signatures compared to traditional and cryptographic signatures. Traditionally, a number of security properties were obtained by sending a message concealed from outside manipulations, in the form of sealed envelopes with signatures. This approach helped to ensure integrity (content of the message), authenticity (sender and receiver), and confidentiality (the content is kept from access and alterations by unauthorized third parties). Similar features can be obtained with cryptographic signatures, by applying a mathematical algorithm (“signing”) to some fixed piece of information (“the message”). Importantly, any alteration to “the message” would not only be detected, but would invalidate the signature. The task of signing biologic entities is significantly more complex. This figure summarizes the critical vulnerabilities identified in the text (see section 3.2).
Figure 2. Herein, unrecognized risks involving counterfeiting attacks are identified that rely on the intentional misuse of prevailing DNA signatures (section 3). Although no such GMO counterfeit is confirmed in circulation, a recent B2 contamination event in Europe (Paracchini et al., 2017) demonstrates that these risks need to be taken seriously. Depending on the type of risk, different strategies need to be pursued. Steps toward realizing these goals are described in sections 4, 5.
Figure 3. The types of attacks involving GM plants as considered by Mueller (2019) (central part of the figure), roughly ordered from bottom to top relative to their risk potential. Their impact is also hierarchical, with risks at the lower levels inherited at higher levels. Herein, the focus is on the degree to which confidentiality and authenticity are violated (see section 4.2).
Figure 4. Herein, improvements of DNA signatures are obtained by utilizing cryptographic tricks that have proven useful for special cryptographic applications such as identification protocols and enhanced signatures (Menezes et al., 1996; Camenisch and Michels, 2000; Ateniese, 2004; El Aimani, 2009; Xia, 2013). At the core are (mathematical) interactive proof systems to demonstrate the (in)validity of a certain statement such as, “This is my personal PIN.” The significance of Zero-Knowledge (ZK) proofs lies in the fact that such systems can convince a verifier of the correctness of the statement without the involved parties needing to expose any details, such as specifics of the PIN itself. ZK protocols can be overlaid with a feature that ensures authenticity of the originator of the statement or signature. When combined, this gives a powerful method to verify signatures while at the same time preventing their transferability or misuse by unauthorized parties.
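The idea of proving knowledge of a secret without revealing it can be sketched with the classic Schnorr identification protocol, a standard textbook example of an interactive proof of knowledge (zero-knowledge against an honest verifier). It is used here only as an illustration and is not the enhanced signature scheme of the paper; the group parameters are toy values and all function names are assumptions.

```python
import secrets

# Toy group (assumption: tiny numbers for readability, insecure by design).
# G = 2 generates a subgroup of prime order Q = 11 modulo P = 23.
P, Q, G = 23, 11, 2


def public_key(secret_x: int) -> int:
    """Public value y = G^x mod P; x itself is never sent."""
    return pow(G, secret_x, P)


def commitment(nonce_r: int) -> int:
    """Prover's first message t = G^r mod P for a fresh random nonce r."""
    return pow(G, nonce_r, P)


def response(nonce_r: int, challenge_c: int, secret_x: int) -> int:
    """Prover's answer s = r + c*x mod Q; the random r masks x."""
    return (nonce_r + challenge_c * secret_x) % Q


def accepts(y: int, t: int, challenge_c: int, s: int) -> bool:
    """Verifier's check G^s == t * y^c mod P, using public values only."""
    return pow(G, s, P) == (t * pow(y, challenge_c, P)) % P


def run_round(secret_x: int, y: int) -> bool:
    """One honest round: commit, receive a random challenge, respond, verify."""
    r = secrets.randbelow(Q)
    t = commitment(r)
    c = secrets.randbelow(Q)
    return accepts(y, t, c, response(r, c, secret_x))


x = 7                      # the prover's secret (the "PIN" of the caption)
y = public_key(x)
print(all(run_round(x, y) for _ in range(20)))
```

With these toy parameters a prover who does not know `x` can still pass a single round by luck (for instance when the challenge is 0), which is why real deployments use large groups or repeat the round.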
Figure 5. Summary of the proposed method to enhance DNA signatures. Signatures are represented and verified in two ways. One is digital and based on specific cryptographic signatures (section 5.1), utilizing enhanced Zero-Knowledge (ZK) proofs of knowledge via a cryptographic “invisibility” property (Figure 4). The second part ties the actual (physical) GMO to the digital part and adds a physical “invisibility” feature. Consequently, it is possible to (1) demonstrate genuineness of a legitimate signature (this can be done both physically and digitally), (2) prevent counterfeiters from selling manipulated GMOs, and (3) allow authentic producers to demonstrate that a falsely attributed (fabricated) GMO is not theirs. This step may require WGS and can only be performed by a TTP or competent enforcement authorities who can verify the secret assignment into “valid” or “dummy”.
Figure 6. The digital part of the enhanced DNA signature method utilizes special cryptographic signatures (sections 5.1, 5.3) whereby signature verification is accomplished via a protocol rather than by verifying the presence or absence of a certain sequence. This gives a high degree of security and can only be achieved by legitimate producers (or their proxies) who know the underlying secret used for computing these cryptographic signatures. Attackers are not able to mimic this process and therefore cannot distribute counterfeits of GMOs by trying to masquerade them as the original product (Figure 2).
Figure 7. Various types of DNA signatures as considered herein, from bottom to top with increasing levels of security. 1. Represents the existing DNA signature paradigm (e.g., Levine, 2004); 2. and 3. are described in Mueller et al. (2016), and 4. (section 5) is an extension of the cryptographic invisibility feature which is central to the underlying cryptographic part in Mueller et al. (2016). (Sign, Signature; Adv, Advantage; Disadv, Disadvantage; Confirm, Confirmation).
Summary of the proposed method to enhance DNA signatures, relative to the two main goals of disputing a falsely alleged GMO or confirming a genuine one (see Figure 2).
| Outcome | Meaning | How it is addressed |
| --- | --- | --- |
| True positive | An authentic GMO can be verified as such. Signature verification protocol returns “ok” | The manufacturer/proxy can successfully run the cryptographic confirmation protocol. The existence of the hashes of all the signature transgenes within the GMO is publicly verifiable (e.g., via hybridization) |
| False positive | The protocol falsely identifies/approves an unauthorized/adulterated GMO (danger of distributing a counterfeit) | Not possible, due to the cryptographic part of the protocol (a necessary requirement to bring GMOs into circulation): by virtue of the ZK property, attackers cannot impersonate the true manufacturer and hence cannot sell a counterfeit. The digital part is linked to the physical part via signature hashes |
| True negative | An unauthorized GMO can be confirmed as such. It is important that this is done via the physical part of the protocol, as the digital part only gives information | Physical denial part. A GMO is not authentic if (a) the complete set of signature hashes is not present within the genome (publicly verifiable via PCR, etc.), or (b) verification by competent authorities (who have access to the secret of which clones are valid/dummy) shows a counterfeit, according to the following steps: (1) identify genetic adulterations (may require WGS); (2) amplify all valid clones; (3) if the illicit genetic alteration is found on a dummy clone, the GMO is a counterfeit |
| False negative | A genuine GMO is identified as inauthentic | Not possible, due to (1) the correctness/completeness of the digital part (an honest prover can successfully run the protocol), and (2) the physical signature components, as long as these remain stably integrated within the genome |
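The physical denial logic in the "True negative" row can be sketched as a small decision function. This is a loose interpretation under stated assumptions: signature hashes are treated as opaque labels, the secret valid/dummy assignment is a plain dictionary, and the names `physical_check`, `clone_status`, and the returned verdict strings are all hypothetical. The source does not say what follows when an alteration lies on a valid clone, so that case is returned as "inconclusive" here.

```python
from typing import Dict, Optional, Set


def physical_check(expected_hashes: Set[str],
                   detected_hashes: Set[str],
                   clone_status: Dict[str, str],
                   adulterated_clone: Optional[str]) -> str:
    """Return a verdict for the physical part of the verification.

    expected_hashes   -- signature hashes that must all be present (public step)
    detected_hashes   -- hashes actually found in the sample, e.g. via PCR
    clone_status      -- secret map from clone id to "valid" or "dummy",
                         known only to competent authorities
    adulterated_clone -- clone carrying an illicit alteration, or None if
                         sequencing found no alteration
    """
    # Public step: the complete set of signature hashes must be present.
    if not expected_hashes <= detected_hashes:
        return "not authentic"
    # Authority step: locate any alteration relative to the secret assignment.
    if adulterated_clone is None:
        return "no adulteration found"
    if clone_status.get(adulterated_clone) == "dummy":
        return "counterfeit"
    # Assumption: the source does not specify this case.
    return "inconclusive"


expected = {"h1", "h2", "h3"}
status = {"clone_a": "valid", "clone_b": "dummy"}

print(physical_check(expected, {"h1", "h2"}, status, None))
print(physical_check(expected, expected, status, "clone_b"))
```

Keeping `clone_status` out of public hands is what allows the producer, through a trusted third party, to dispute a falsely attributed GMO without revealing which clones are valid.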