
Multiplexed Nucleic Acid Programmable Protein Arrays.

Xiaobo Yu¹, Lusheng Song², Brianne Petritis², Xiaofang Bian², Haoyu Wang², Jennifer Viloria², Jin Park², Hoang Bui², Han Li², Jie Wang², Lei Liu¹, Liuhui Yang¹, Hu Duan¹, David N McMurray³, Jacqueline M Achkar⁴, Mitch Magee², Ji Qiu², Joshua LaBaer².

Abstract

Rationale: Cell-free protein microarrays display naturally-folded proteins based on just-in-time in situ synthesis, and have made important contributions to basic and translational research. However, the risk of spot-to-spot cross-talk from protein diffusion during expression has limited the feature density of these arrays.
Methods: In this work, we developed the Multiplexed Nucleic Acid Programmable Protein Array (M-NAPPA), which significantly increases the number of displayed proteins by multiplexing as many as five different gene plasmids within a printed spot.
Results: Even when proteins of different sizes were displayed within the same feature, they were readily detected using protein-specific antibodies. Protein-protein interactions and serological antibody assays using human viral proteome microarrays demonstrated that comparable hits were detected by M-NAPPA and non-multiplexed NAPPA arrays. An ultra-high density proteome microarray displaying > 16k proteins on a single microscope slide was produced by combining M-NAPPA with a photolithography-based silicon nano-well platform. Finally, four new tuberculosis-related antigens in guinea pigs vaccinated with Bacillus Calmette-Guerin (BCG) were identified with M-NAPPA and validated with ELISA.
Conclusion: All data demonstrate that multiplexing features on a protein microarray offer a cost-effective fabrication approach and have the potential to facilitate high throughput translational research.


Keywords: Antibody; Biomarker; Cell-free protein microarray; Protein-protein interaction; Proteomics

Substances:
Biomarkers

Year: 2017 PMID: 29109798 PMCID: PMC5667425 DOI: 10.7150/thno.20151

Source DB: PubMed Journal: Theranostics ISSN: 1838-7640 Impact factor: 11.556

Introduction

Protein microarrays display individual proteins at high density on a chemically-modified slide that can be tested simultaneously with high sensitivity, high specificity, and low reagent consumption. They have been widely applied in basic and translational research, such as protein interaction studies, immune profiling, vaccine development, biomarker discovery and clinical diagnostics, etc. 1-12. For example, Zhang et al. used a human protein microarray to better understand how arsenic, which is used in chemotherapy, disrupts cancer signaling pathways and, further, to identify potential targets of novel therapeutic treatments. Of the 16,368 proteins that were screened, 360 arsenic binding proteins were identified, which may be novel targets for cancer treatment 7. Anderson et al. used protein microarrays to discover a 28-autoantibody biomarker signature of early stage breast cancer with a sensitivity and specificity of 80.8% and 61.6%, respectively 13. By combining those autoantibodies with several protein biomarkers, Provista Diagnostics developed the first protein-based blood test for early breast cancer detection called Videssa® Breast 14. Ayoglu et al. screened sera from multiple sclerosis (MS) patients using protein microarrays containing 11,520 purified protein fragments and then validated those results using bead-based arrays 15. The arrays indicated that Anoctamin 2 autoantibodies and the MS-associated HLA complex DRB1*15 allele were strongly associated. Additional experiments showed that Anoctamin 2 aggregates near and inside lesions within human MS brain tissue 15. Protein microarrays can be classified into two different types, purified or cell-free, based on whether the proteins are produced in vivo or in vitro, respectively 16. 
Purifying proteins is labor-intensive, requires method optimization and multiple manipulations, exhibits highly variable yields of different proteins, and may not result in naturally-folded or functional mammalian products due to expression in non-mammalian systems (e.g., E. coli, yeast). Cell-free protein microarrays overcome these challenges by depositing RNA or DNA on the slide surface and rapidly expressing them just before an experiment (~2 h) through the use of various cell-free expression systems (e.g., lysate from wheat germ, insect cells, rabbit reticulocyte and human cells). Compared to purified protein microarrays, cell-free protein microarrays are more likely to produce naturally-folded mammalian proteins due to the decreased sample manipulation and use of enhanced cell extracts with native chaperone proteins. Moreover, the use of nucleic acids vastly simplifies the production of custom arrays since any protein can be produced as long as the gene-of-interest is synthesized; for example, arrays can be produced that represent a specific proteome or signaling pathway 17-20. A primary disadvantage of planar-based cell-free protein microarrays is the diffusion of mRNA or expressed proteins during in vitro transcription and translation (IVTT), which can then be captured by neighboring features (i.e., cross-talk). Thus, the closer the features are to each other, the higher the cross-talk 21-24. Planar-based cell-free protein microarrays include the protein in situ array (PISA) 22, DNA array to protein array (DAPA) 25, nucleic acid programmable protein array (NAPPA)18, 26-28, and in situ puromycin-capture array 29. DAPA, NAPPA and puromycin-capture arrays employ a probe (e.g., Ni-NTA or anti-tag antibody) on a microarray surface that captures the expressed recombinant proteins in situ during IVTT. 
Of the cell-free approaches, NAPPA has achieved the highest density, with ~2,300 plasmids per slide (where the distance between neighboring spots is 625 µm) and cross-talk of less than 2%. However, cross-talk is increased when the feature spacing is reduced to 375 µm 21. With ~2,300 plasmids per slide, five NAPPA slides are needed to screen a proteome-scale array with over 10,000 genes 18, 30. Therefore, an increase in spot density would reduce the amount of labor, time, reagents, and cost needed for large-scale proteome analyses like target discovery and validation experiments. To address this issue, Angenendt et al. printed cDNA and expressed the proteins in nanowells using piezoelectric dispensers 31. Takulapalli et al. demonstrated the fabrication of high-density cell-free protein arrays by combining photolithographically-etched silicon nanowells (n=8,000/slide), NAPPA, and a piezo-inkjet printer 21. Here we utilized a different strategy to produce high density arrays that does not require any specialized equipment or substrates. We developed the Multiplexed Nucleic Acid Programmable Protein Array (M-NAPPA) method by combining as many as five different DNA plasmids within one spot, which increases the number of displayed proteins per microarray by five-fold. We first demonstrate that multiplexed proteins are displayed on M-NAPPA using protein-specific antibodies. Second, we compare the ability of M-NAPPA with non-multiplexed NAPPA to detect different protein-protein interactions and the serological antibody reactivity against 646 viral proteins. Next, we show the feasibility of M-NAPPA in performing high throughput screening for immune-dominant tuberculosis (TB) antigens through the use of an ultra-high density M-NAPPA TB proteome array containing four subarrays with 4,045 TB open reading frames (ORFs) on one slide.
Using M-NAPPA TB protein microarrays, four new immune-dominant antigens in the sera of BCG-vaccinated guinea pigs were identified, which were then validated using ELISA. Finally, we propose a high throughput target discovery and verification pipeline based on the M-NAPPA approach.

Materials and Methods

Sera samples

All sera samples were collected with written informed consent and with the approval of the Institutional Review Boards (IRB) at University of Florida (Gainesville, FL), Arizona State University (Tempe, AZ) and Albert Einstein College of Medicine (Bronx, NY). Detailed sample information was provided in our previous work 17, 20. The sera from guinea pig TB models vaccinated with BCG were kindly provided by Dr. David N. McMurray from Texas A&M College of Medicine 32. All experiments using clinical sera samples were executed according to the Declaration of Helsinki.

Generation of mathematical model

A mathematical model was built based on a two-step analysis process. The first round of screening would use multiplexed plasmids with the primary objective of identifying potential protein “hits.” The second round would be non-multiplexed, in which each protein from a multiplexed “hit” in the first round would be printed separately, with the primary goals of validating and identifying specific individual hits. The total number of printed spots (N) needed for the combined two-round screening of 10k proteins was determined by the number of plasmids printed per spot and the anticipated hit rate (i.e., the percentage of displayed proteins that will be identified as significant in the study). The probability p of an individual protein being a true hit can be estimated from previous studies of a similar nature (e.g., antibody biomarkers). The following equations assume that the hit status of each protein follows a Bernoulli distribution with probability p and that its corresponding plasmid is randomly multiplexed, where the number of different plasmids per spot is k and the number of proteins screened is P (here, P = 10,000). The number of spots needed in the first round is $N_1 = P/k$. The probability that a multiplexed spot would be a hit (i.e., containing at least one immune-dominant antigen) is $1 - (1 - p)^k$. The number of spots needed in the second round is $N_2 = k \, N_1 \, [1 - (1 - p)^k] = P \, [1 - (1 - p)^k]$, so that $N = N_1 + N_2$. The optimal level of multiplexing of k different plasmids per spot results in the smallest N.
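The two-round model described above can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes independent hits with probability p, that every plasmid in a hit spot is re-printed individually in the second round, and it treats spot counts as expected values rather than integers. The function names are hypothetical.

```python
def total_spots(n_proteins: int, k: int, p: float) -> float:
    """Expected number of printed spots over both screening rounds."""
    first_round = n_proteins / k                  # multiplexed spots, k plasmids each
    p_spot_hit = 1.0 - (1.0 - p) ** k             # at least one true hit among k plasmids
    second_round = first_round * p_spot_hit * k   # each plasmid of a hit spot printed alone
    return first_round + second_round


def optimal_k(n_proteins: int, p: float, k_max: int = 20) -> int:
    """Multiplexing level (1..k_max) that minimizes the total spot count."""
    return min(range(1, k_max + 1), key=lambda k: total_spots(n_proteins, k, p))


if __name__ == "__main__":
    # Assumed inputs: 10,000 proteins and a 5% hit rate, as discussed in the text.
    for k in range(1, 9):
        print(k, round(total_spots(10_000, k, 0.05)))
    print("optimal k at a 5% hit rate:", optimal_k(10_000, 0.05))
```

Under these assumptions the minimum for a 5% hit rate falls at k = 5, in line with the five-plex design; other hit rates shift the optimum, and the sketch ignores practical constraints such as array space and cost per slide.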

M-NAPPA preparation

All human and viral ORF plasmids were obtained from DNASU (https://dnasu.org/), and transferred into a T7-based mammalian expression vector, pANT7-cGST, as previously described 18, 30, 33. Purified DNA plasmids were prepared by our automated DNA factory robot as previously described 18, 30, 33, and were normalized to 1,200 ng/μL, such that multiplexed plasmids contributed equally to the final concentration. In other words, a plasmid in a five-multiplexed spot would represent 240 ng/μL. Five (5) different plasmids, each containing a different gene-of-interest, were mixed with a master printing mixture containing BSA (Sigma), BS3 cross-linker (Thermo Fisher Scientific, IL) and polyclonal α-GST antibody (Thermo Fisher Scientific, IL) 26, and subsequently incubated at 4 °C for 2 h. M-NAPPA and NAPPA were printed by the NAPPA Protein Array Core (http://nappaproteinarray.org/) according to published protocols 18, 30, 33. The quality of printed plasmid DNA on M-NAPPA and NAPPA was determined using PicoGreen DNA staining 26.

Detection of protein expression on M-NAPPA

Each M-NAPPA microarray was blocked with Superblock solution (Pierce, Rockford, IL) for 1 h at 23 °C, briefly washed with water, centrifuged at 1,000 rpm for 3 min to dry, and covered with a hybridization chamber (Grace BioLabs, OR). The array was then incubated with 160 μL of human in vitro transcription & translation (IVTT) solution containing human HeLa cell lysate, accessory proteins, reaction mixture, and nuclease-free water (Thermo Fisher Scientific, IL) for 1.5 h at 30 °C and 0.5 h at 15 °C to express the GST-tagged proteins-of-interest. The GST-tagged proteins were displayed on the slide surface via the polyclonal α-GST antibody that was included in the printing mixture. Then, the resulting protein microarray was incubated with 5% (w/v) milk in 1xPBS with 0.2% (v/v) Tween-20 (PBST) for 1 h at 23 °C, followed by three brief washes with PBST. The protein-specific antibodies were diluted with 5% milk-PBST at 1:50 or 1:100, and incubated with the protein microarray for 16 h at 4 °C followed by a 1 h incubation at 23 °C with an Alexa Fluor 555 labeled secondary antibody (Jackson ImmunoResearch Laboratories, PA). After washing three times with PBST, the M-NAPPA slides were briefly rinsed with water and dried by centrifugation (2,000 rpm, 2 min). The arrays were scanned by a Tecan scanner (Männedorf, Switzerland).

Detection of protein-protein interactions on M-NAPPA

After proteins were expressed on M-NAPPA, the resulting protein arrays were blocked with blocking buffer (1×PBS, 1% Tween 20 and 1% BSA, pH 7.4) for 1 h at 4 °C. In parallel, the query proteins (e.g., Rb1, Jun, Fos, LidA) fused to a HaloTag were produced by incubating 90 ng of DNA in 180 μL human cell-free expression system (Thermo Fisher Scientific, IL) for 2 h at 30 °C. To screen protein-protein interactions, the protein array was incubated with unpurified Rb1-Halo protein in human HeLa lysate for 16 h at 4 °C, and then washed with cold washing buffer (PBS, 5 mM MgCl2, 0.5% Tween 20, 1% BSA and 0.5% DTT, pH 7.4 at 4 °C) three times to remove unbound molecules. The arrays were consecutively incubated with a chicken anti-Halo tag antibody (GeneTel, WI) and Alexa Fluor 555 goat anti-chicken secondary antibody (Jackson ImmunoResearch Laboratories, PA) for 2 h at 4 °C. Arrays were washed and dried with brief centrifugation at 1,000 rpm for 1 min, and scanned as described above. The protein binding signal was quantified using Array-Pro Analyzer (Media Cybernetics) software as previously reported 20, 33.

Data visualization

For a given experiment, a tab-separated file with the interaction information was generated and loaded into the Cytoscape software 34 with an attribute file that contained signal intensities of features on M-NAPPA and NAPPA. In Figures 4 and 5, proteins within a multiplexed M-NAPPA feature and its five corresponding non-multiplexed proteins on NAPPA were displayed as connecting large and small nodes, respectively, with color gradients depicting signal intensities.
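As an illustration of the kind of input files described above, the sketch below writes a tab-separated edge table (multiplexed feature to each of its deconvoluted proteins) and a node attribute table holding signal intensities. The column names, identifiers, and function name are hypothetical; the source does not specify the actual file layout used with Cytoscape.

```python
import csv
import io


def build_cytoscape_tables(spots, mnappa_signal, nappa_signal):
    """Return (edge_tsv, node_tsv) as tab-separated strings.

    spots:         {multiplexed_spot_id: [protein, ...]}
    mnappa_signal: {multiplexed_spot_id: intensity on M-NAPPA}
    nappa_signal:  {protein: intensity on non-multiplexed NAPPA}
    """
    edges = io.StringIO()
    nodes = io.StringIO()
    edge_writer = csv.writer(edges, delimiter="\t", lineterminator="\n")
    node_writer = csv.writer(nodes, delimiter="\t", lineterminator="\n")
    edge_writer.writerow(["source", "target"])           # hypothetical header
    node_writer.writerow(["node", "platform", "signal"])  # hypothetical header
    for spot_id, proteins in spots.items():
        # Large central node: the multiplexed M-NAPPA feature.
        node_writer.writerow([spot_id, "M-NAPPA", mnappa_signal[spot_id]])
        for protein in proteins:
            # Small connecting node: one deconvoluted NAPPA feature.
            edge_writer.writerow([spot_id, protein])
            node_writer.writerow([protein, "NAPPA", nappa_signal[protein]])
    return edges.getvalue(), nodes.getvalue()
```

The two strings can be saved to disk and imported as a network and a node table, with the signal column mapped to a color gradient.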

Figure 4

Detection of protein-protein interactions on M-NAPPA. (A) Representative images of Rb1-HaloTag binding to its known protein targets on M-NAPPA and NAPPA, which were detected using a chicken anti-HaloTag antibody and Alexa555 goat anti-chicken antibody. Rb1's protein partners are indicated with a red arrow. False-colored images across a rainbow scale correspond to the relative level of Rb1 binding signal, where low and high Rb1 binding levels are represented by blue and red, respectively; (B) Multiplexed features on M-NAPPA (large circle) and the deconvoluted features on NAPPA (smaller connecting circles) that bound to Rb1-HaloTag. The blue scale bar corresponds to the relative level of Rb1 binding signal within the spot while the white-to-red color scale corresponds to the level of Rb1 binding signal to diffused target protein outside of the spot (i.e., “ring”).

Figure 5

Detection of serological antibodies on viral M-NAPPA arrays. (A) Representative images of viral antibody detection on M-NAPPA and NAPPA; (B) Comparison of anti-viral antibodies binding to their displayed protein antigens on M-NAPPA (large circle of five multiplexed genes) and NAPPA (small circle of deconvoluted genes). The blue scale bar corresponds to the relative level of antibody binding signal.

Detection of serological antibodies on M-NAPPA

After proteins were expressed on M-NAPPA, the arrays were blocked with 5% milk-PBST for 1 h and then incubated with sera at 1:300 dilution in 5% milk-PBST for 16 h at 4 °C. After washing three times with PBST, the resulting arrays were incubated with Alexa Fluor 555 labeled anti-human IgG antibody (Jackson ImmunoResearch Laboratories, PA) for 1 h at 23 °C. The slides were washed with PBST, briefly rinsed with water, and dried by centrifugation (2,000 rpm, 2 min). The fluorescent scanning was performed using a Tecan scanner (Männedorf, Switzerland). Antibody binding was quantified by fluorescence signal intensity using Array-Pro Analyzer (Media Cybernetics) software as previously reported 20, 33.

Results

Conception of M-NAPPA

Protein microarrays have been used in functional protein and antibody biomarker studies to screen for target(s)-of-interest, which are generally rare in the tested protein population (Figure ). For example, the median hit rates (± standard deviation, SD) of studies employing protein microarrays in the past five years for screening protein function and autoantibody biomarkers (Table ) were 0.49% ± 1.23% and 1.02% ± 4.46%, respectively (Figure ). Since false positives are not uncommon during initial screens, all initial candidates require an independent verification step performed using different samples 5, 7, 8, 19, 35-37. Considering that a two-step approach for target discovery and verification often uses hundreds to thousands of samples, the cost of such studies using full-scale arrays can be prohibitive. To decrease the cost of high throughput screening experiments, we hypothesized that the plasmid cDNA encoding for different proteins could be multiplexed (by combining M different plasmids) within each feature to create a high-density array, M-NAPPA (Figure ). This multiplexed array could be implemented during the initial functional screen, testing entire proteomes (P proteins) using only a fraction of the features (P/M). Multiplexed hits identified during the screening step could then be de-convoluted in the subsequent verification step using the standard, non-multiplexed NAPPA array where each feature displays only one protein (i.e., M=1). The objectives of the second step would be to identify which proteins were responsible for the positive multiplexed signal and to verify whether the hits were real. This approach exploits the high flexibility of cell-free microarrays, in which arrays can be customized by simply re-arraying individual plasmids encoding for the multiplexed features-of-interest. The schematic illustration of how M-NAPPA arrays are processed is shown in Figure .
Using a standard pin-based arrayer, each spot on M-NAPPA contains plasmids encoding for different proteins-of-interest with the same fusion tag. The genes are then transcribed and translated into recombinant proteins in two hours using a cell-free expression system, and captured to the slide surface in situ via a fusion tag antibody. The optimal number of proteins to multiplex depends upon several factors, including hit frequency, cost, array space, and number of proteins. As the frequency of hits in the screen increases, more proteins will need to be tested as individual features during the verification step. Taken to the extreme, if one protein per multiplexed feature were a hit (hit rate = 1/M), all multiplexed features would require deconvolution, making the multiplexing approach impractical. However, such a high hit rate is not reflected by data collected by numerous studies; for example, the hit rate was < 5% in most of our previous NAPPA-based screening studies with 10k human genes (Figure ). We generated a mathematical model (Materials and Methods) to find the optimal M that would take into consideration array space and the cost of screening and verifying hits using our 10k protein human collection at different hit rates. In Figure , the x-axis represents the number of genes per spot (M) while the y-axis represents the number of spots or proteins that are needed for the two-step screening process. Notably, when the hit rate is < 5% for 10k proteins, a relatively small number of spots would be needed for the entire study (screening + verification) with 5 proteins multiplexed per feature in the initial screen, thus representing a good compromise between the number of initial features screened and the subsequent number of features that would be needed for deconvolution and verification.

Comparison of protein display by NAPPA and M-NAPPA

To assess the difference in transcriptional/translational efficiency as well as display competition between large and small proteins within one feature, we multiplexed proteins of varying molecular weights (MW; 20 - 124 kDa) covering 80% of the size range in our human protein collection (Figure ) on M-NAPPA. As indicated in Figure , we prepared NAPPA and M-NAPPA slides in parallel where NAPPA had only one plasmid per spot and M-NAPPA multiplexed five plasmids per spot. After IVTT, the protein arrays were probed with eight antibodies that bound targets ranging in size from 20 to 106 kDa (Methods and Figure ). These antibodies were specific to IA-2 (106 kDa), GAD2 (65 kDa), Clusterin (52 kDa), p53 (44 kDa), Fos (40 kDa), PP2A (36 kDa), SFN (28 kDa) and BCL2L2 (20 kDa). We compared the protein display between the two array types using groups of five proteins with either similar (Figure ) or varied molecular weights (Figure ), and then calculated the signal ratio of M-NAPPA to NAPPA. In both cases, all of the antibodies readily detected their corresponding antigens. For the spots with similarly-sized proteins (36 kDa to 85 kDa), the signal ratio of M-NAPPA to NAPPA was 0.78±0.44. For the spots containing five proteins covering a wide range of molecular weight, from 29 kDa to 106 kDa (Figure ), the binding signal ratio of M-NAPPA to NAPPA was 1.03±0.75 (Figure ). Thus, multiplexing proteins of similar size did not confer any advantage over random multiplexing. To further demonstrate that there were no biases in the expression of different proteins produced from mixed plasmids, five-plasmid mixtures containing various combinations of seven different genes (Abl1, IA-2, GAD2, Jun, RhoU, BCL2L2 and MT3033) were co-expressed in IVTT solution and analyzed via western blot. Despite a wide range of protein sizes, all proteins were expressed at similar amounts in their relevant combinations (Figure ). 
These data indicate that, although each plasmid in M-NAPPA is present at one-fifth the amount present in standard NAPPA, there was no significant difference in protein display levels between the arrays (p-value = 0.36, paired sample t-test). In addition, background signals that resulted from non-specific antibody binding were comparable between the platforms, demonstrating that multiplexing does not result in an accumulation of background signal that could contribute to the identification of false positives (Figure ). Therefore, we randomly mixed different gene plasmids in the following M-NAPPA studies. To demonstrate that the signal intensity for M-NAPPA was reproducible, we also tested the spot-to-spot, zone-to-zone and slide-to-slide variations by printing 80 gene plasmids on different locations across the M-NAPPA and NAPPA slides. Protein display was then examined with anti-GST antibody staining (Materials and Methods). The coefficients of variation (CVs) for spot-to-spot, zone-to-zone and slide-to-slide were 3.64±3.27%, 7.57±3.41% and 7.27±4.00% for M-NAPPA, respectively, and 7.63±10.58%, 12.13±7.56% and 13.25±9.42% for NAPPA, respectively (Table ).
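The two summary statistics used in this comparison, the M-NAPPA-to-NAPPA signal ratio and the coefficient of variation, can be sketched as follows. This is an illustrative calculation only; it assumes the sample standard deviation is used and that ratios are computed per antibody before averaging, neither of which is stated in the text. The function names are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values) * 100.0


def summarize_signal_ratio(mnappa_signals, nappa_signals):
    """Mean and SD of per-antibody M-NAPPA / NAPPA signal ratios."""
    ratios = [m / n for m, n in zip(mnappa_signals, nappa_signals)]
    return mean(ratios), stdev(ratios)
```

For replicate intensities of 90, 100 and 110, `coefficient_of_variation` gives 10%; a mean ratio near 1 with a wide SD, as reported above, indicates comparable average display with antibody-to-antibody variation.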

Performing functional assays using M-NAPPA arrays

Fabrication of viral M-NAPPA protein microarrays

We purified 646 viral ORF plasmids from ~23 viruses, normalized their concentrations to 1,200 ng/µL, and printed viral NAPPA and M-NAPPA arrays in duplicate 20. Analyses of the deposited DNA and displayed protein levels indicate that most viral DNA plasmids were successfully printed, expressed, and captured onto the microarrays in a reproducible manner (Figure ). For example, plasmid DNA deposition across technical replicates of NAPPA and M-NAPPA had correlations (R) of 0.95 and 0.96, respectively. The protein display correlations (R) across technical replicates of NAPPA and M-NAPPA were 0.90 and 0.93, respectively (Figure ). “Non-spots” containing printing buffer alone without plasmid DNA were used as a negative control. On the NAPPA and M-NAPPA viral arrays, 94% and 93% of the spots, respectively, produced signal that was at least two SDs above the average signal intensity of these “non-spots” (Figure ). Together with Figure , the results indicate that the majority of viral proteins can be displayed on M-NAPPA arrays. In addition, we compared the S/B (signal to background) ratios between direct fluorescence and tyramide signal amplification (TSA) using a fluorophore-linked or HRP-conjugated anti-p53 antibody, respectively. Using the signal from “non-spots” as background, we found that the S/B ratio of fluorescence detection using an antibody with a directly-conjugated fluorophore, the Dylight649 rabbit anti-mouse IgG, was higher (S/B ratio = 431±38) than the TSA method with the HRP-labeled goat anti-mouse IgG (S/B ratio = 323±18). Thus, directly-conjugated fluorescent secondary antibodies were used for the following assays (Figure ).
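The display criterion used above (signal at least two SDs above the mean of the “non-spots”) can be written as a short calculation. This sketch assumes the sample standard deviation and a greater-than-or-equal comparison; the function name and inputs are hypothetical.

```python
from statistics import mean, stdev


def fraction_displayed(spot_signals, nonspot_signals, n_sd=2.0):
    """Fraction of spots whose signal is at least n_sd SDs above the non-spot mean."""
    threshold = mean(nonspot_signals) + n_sd * stdev(nonspot_signals)
    displayed = sum(1 for signal in spot_signals if signal >= threshold)
    return displayed / len(spot_signals)
```

With non-spot intensities of 90, 100 and 110 the threshold is 120, so spots reading 119, 120, 500 and 50 would give a displayed fraction of 0.5. Raising `n_sd` to 10 reproduces the stricter cut-off mentioned later for the nano-well arrays.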

Performing protein-protein interaction assays using M-NAPPA

To determine whether NAPPA and M-NAPPA detect similar protein-protein interactions, both arrays were programmed to display proteins that are known to interact with the tumor suppressor protein Rb1. The arrays were then probed with a Rb1 query protein fused to HaloTag, and interactions were detected using an anti-HaloTag antibody. The Rb1-HaloTag query protein bound to several targets (red arrow) on NAPPA and M-NAPPA arrays; the query also bound to diffused targets outside of each spot, which appear as a “ring” around each feature (Figure ). In Figure , we used a flower pattern diagram to depict the multiplexed proteins on M-NAPPA (large central circle) and the deconvoluted individual proteins on NAPPA (five small connecting circles) (Materials and Methods). The blue gradient within the spot indicates target binding to the Rb1 query protein, whereas reactivity to the “ring” 20, 33 is indicated by a red circle around the spot. Using custom defined criteria, where the target-to-“non-spot” signal ratio is ≥ 2 and the ring score is ≥ 3, we found that 5 and 6 hits were identified on NAPPA and M-NAPPA, respectively, out of the 30 possible candidate target proteins (Table ). Five of the 6 hits on M-NAPPA were E1A, HPV11-E7, HPV16-E7, HPV18-E7 and HPV33-E7, which agrees with previous studies 38-40. The sixth hit on M-NAPPA was not detected with NAPPA, thus suggesting that the hit may be a false positive. To further examine the utility of M-NAPPA to test protein-protein interactions, additional interactions were analyzed with 35 displayed proteins on NAPPA and M-NAPPA using HaloTagged-Jun, -Fos, and -LidA queries. Jun, Fos and LidA bound to their expected interaction partners (i.e., Fos, Jun and three Rab family proteins, respectively) on both M-NAPPA and NAPPA arrays (Figure ). 
Regarding the protein interactions that were identified, the spot-to-spot and zone-to-zone CVs were 5.65±2.69% and 5.75±3.86% for NAPPA, respectively, and 2.55±2.56% and 3.11±3.46% for M-NAPPA, respectively (Table ). These results indicate that M-NAPPA can be used for preliminary high throughput (HT) screening of novel protein-protein interactions. The screen can then be followed by a verification step using deconvoluted spots via NAPPA to identify the specific proteins that are involved.
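The hit criteria stated above (target-to-“non-spot” signal ratio ≥ 2 and ring score ≥ 3) and the follow-up deconvolution step can be sketched as a simple filter. How the ring score is assigned is not described here, so it is treated as a given input; all names and data structures are hypothetical.

```python
def call_ppi_hits(features, nonspot_signal, min_ratio=2.0, min_ring_score=3):
    """Return feature names passing both the signal-ratio and ring-score cut-offs.

    features: {feature_name: (signal_intensity, ring_score)}
    """
    hits = []
    for name, (signal, ring_score) in features.items():
        if signal / nonspot_signal >= min_ratio and ring_score >= min_ring_score:
            hits.append(name)
    return hits


def proteins_to_deconvolute(hit_spots, spot_contents):
    """List the individual proteins to re-array on non-multiplexed NAPPA.

    spot_contents: {multiplexed_spot_name: [protein, ...]}
    """
    return [protein for spot in hit_spots for protein in spot_contents[spot]]
```

A multiplexed feature that passes the filter is expanded into its constituent plasmids, each of which is then printed as its own NAPPA feature for verification.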

Identification of serological antibodies using M-NAPPA

To test whether M-NAPPA can be used to detect proteomic serological response, we screened ten serum samples from patients with type 1 diabetes that had been previously characterized using NAPPA arrays 20. A dozen hits were observed with M-NAPPA and NAPPA (Figure ). Forty-nine of the 53 antigens (92.5%) identified by NAPPA were also detected by M-NAPPA. Four antigens, however, were detected with only one platform (i.e., two with NAPPA, two with M-NAPPA). These uncommon discrepancies may be due to variations in surface chemistry, plasmid concentration, printing or array processing.

High throughput identification of immune-dominant antigens using M-NAPPA tuberculosis proteome microarrays

Since the multiplex concept to increase feature density was successful in detecting protein-protein interactions and serological antibody responses on planar microarrays, we wanted to determine whether M-NAPPA could also be applied to a nano-well microarray platform. We previously increased feature number by printing plasmids into photolithography-etched silicon nano-wells to create a high-density NAPPA (HD-NAPPA) platform 41. HD-NAPPA can have as many as 10k features per slide, and has successfully detected antiviral antibodies in autoimmune diseases with 761 different proteins displayed on the array in quadruplicate. These tiny wells hold only 1,200 pL and use only 0.12 ng of plasmid DNA. We applied the multiplex concept to HD-NAPPA using a mixture of plasmids encoding for IA-2, GAD2 and p53 proteins. We then detected their expression and display using specific antibodies; all of these proteins were readily detectable when printed as a three-plexed mixture (Figure ). We then multiplexed 4,045 tuberculosis (TB) ORFs 32 onto HD-NAPPA microarrays as four separate subarrays using three gene plasmids per well (M=3), resulting in an M-HD-NAPPA microarray displaying > 16k proteins on a single slide. This lower multiplicity was based on the mathematical model (Figure ) that took into account that the high number of conserved proteins in endemic, non-pathogenic mycobacterial species results in a higher hit rate (~10% 37). Over 95% of the spots generated a signal that was at least 10 SDs above the background, which indicates that the vast majority of proteins were well-expressed and displayed (Figure ), with a correlation of R = 0.90 across technical replicates (Figure ). Antibody reactivity from TB patient sera was observed with M-HD-NAPPA (Figure ). The technical reproducibility of these immune-dominant antigens across different M-NAPPA arrays using the same sera was very high, with a correlation of R = 0.98 (Figure ).
All immune-dominant antigens identified with M-HD-NAPPA screening were then deconvoluted in the verification step using single protein NAPPA (Figure ) and validated with RAPID-ELISA as previously described 19 (Figure ). We screened the sera from guinea pigs immunized with Bacillus Calmette-Guerin (BCG), a TB vaccine, using M-HD-NAPPA TB proteome microarrays. The aim of this experiment was to identify potential protective antibodies induced with BCG. The representative fluorescence images are shown in Figure . Compared to the control mock sera pool using PBS buffer (n=5), four features on M-NAPPA arrays showed increased signals with the BCG samples (n=4) (Figure ). To deconvolute and validate those targets, we repeated the serological assay for those candidate proteins, along with two non-responsive control proteins (Rv2077A and Rv2682c), using RAPID-ELISA and the individual sera from the guinea pigs. The antibody levels of four antigens (Rv3405c, Rv1078, Rv2853 and Rv0928) in BCG-vaccinated guinea pigs were significantly higher than that of the PBS control with a p-value <0.01 (Figure ). According to the Tuberculist database (http://tuberculist.epfl.ch/), these proteins are involved in regulation, cell wall and cell processes, and are considered to be in the proline-glutamic acid / proline-protein-glutamic acid (PE/PPE) protein families (Figure ). A primary advantage of cell-free protein microarrays is that the arrays have a long shelf life. We compared the protein expression of M-NAPPA TB arrays immediately after printing and then again after 6 months of storage at room temperature in a nitrogen atmosphere. A GST-tagged protein, detected with an anti-GST antibody, was considered to be displayed if it had a signal that was two SDs above the signal of the “non-spots.” Over 99% of the proteins were displayed on new M-NAPPA arrays; this number, as well as the anti-GST signal intensity, did not change even after 6 months of storage (Figure ).
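The feature count of the M-HD-NAPPA TB array described in this section (4,045 ORFs, three plasmids per well, four subarrays) follows from simple arithmetic, sketched below. The function name is hypothetical, and the sketch assumes wells are filled as completely as possible, with the last well of each subarray possibly holding fewer than three plasmids.

```python
import math


def mhd_nappa_layout(n_orfs, plasmids_per_well, n_subarrays):
    """Well and displayed-protein counts for a multiplexed nano-well slide."""
    wells_per_subarray = math.ceil(n_orfs / plasmids_per_well)
    return {
        "wells_per_subarray": wells_per_subarray,
        "total_wells": wells_per_subarray * n_subarrays,
        "proteins_displayed": n_orfs * n_subarrays,  # each ORF appears once per subarray
    }


if __name__ == "__main__":
    print(mhd_nappa_layout(4045, 3, 4))
```

For the values in the text this gives 1,349 wells per subarray, 5,396 wells in total, and 16,180 displayed proteins, consistent with the "> 16k proteins on a single slide" figure and well within the ~10k features available per HD-NAPPA slide.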

Discussion

NAPPA has been widely applied in protein-protein interactions, post-translational modifications (PTMs), antibody epitope mapping and discovery of (auto) antibody biomarkers for a variety of human diseases, including markers that are currently being used in the clinic for the detection of breast cancer 13, 14, 18, 20, 32, 36, 42-44. Due to mRNA and protein diffusion during IVTT, the number of features per planar microscope slide has been limited to ~2,300 to minimize cross-talk to neighboring spots. The feature density limit has thus required that multiple slides be used to study large proteomes. Here, we developed a new strategy, M-NAPPA, that increases the number of proteins that can be tested per slide multiple-fold. By combining five different plasmids within one feature, >10k proteins can be printed on one microscope slide for HT, low cost analyses when compared to studies using one-plasmid-per-feature arrays. The multiplexed hits that are identified with M-NAPPA can then be deconvoluted during the subsequent verification step (Figure ). First, we constructed a mathematical model to determine the optimal level of multiplexing, which considers the number of proteins, cost, array size, and hit rate to predict the number of arrays that would be needed for a two-step screening and verification study. A survey of HT unbiased target screening studies that used protein microarrays, both in the literature and our own results using NAPPA, revealed that hits are rare (typically <5%) (Figure ). For 10k proteins and a hit rate of 5%, the mathematical model indicated that multiplexing 5 proteins per spot (Figure ) would provide a good balance of maximizing the number of features, minimizing the number of arrays, and yielding the minimum overall workload when compared to using non-multiplexed arrays for both the screening and verification steps.
Second, we demonstrated that there was nothing inherent to printing plasmids as a five-plasmid mixture that prevented their expressed proteins from routine detection regardless of protein size. However, display levels of large proteins (≥ 65 kDa) were decreased by 60±0.1% for GAD2 and 63±11% for IA-2 (Figure ). This may be because the plasmids were mixed in equal amounts by mass, a requirement imposed by the printing chemistry; this would result in a lower molarity of large plasmids (i.e., those encoding large proteins) relative to small plasmids in the printing mixture. Another possible reason is that larger proteins are produced more slowly than smaller proteins due to their longer mRNA sequences. Third, we showed that M-NAPPA can be used in protein-protein interaction and serological screening studies. The results from M-NAPPA agreed strongly with those observed with non-multiplexed NAPPA (Figure -5, Table and S3, Figure ). These data indicate that M-NAPPA presents a labor- and cost-effective strategy to initially screen for hits. Fourth, we further increased the feature density by applying this method to our previously published nano-well platform 21. With M-HD-NAPPA, the entire TB proteome containing 4,045 genes was successfully printed on a nano-well array in quadruplicate 20 (Figure 6). This is the highest-density nano-well protein microarray to date and increases the previously demonstrated content by more than five-fold 20. Our data indicate that the multiplexing strategy has great potential value for use with different microarray platforms (Figure and Figure ) 45.
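The molarity argument for large plasmids can be checked with simple arithmetic. The sketch below assumes double-stranded plasmids whose molar amount is proportional to mass divided by length in base pairs; the plasmid lengths are hypothetical and the function is illustrative only.

```python
def molar_fractions_equal_mass(plasmid_lengths_bp):
    """Molar fraction of each plasmid when equal masses are mixed.

    For double-stranded DNA, moles are proportional to mass / length,
    so with equal masses each plasmid's moles scale as 1 / length.
    """
    inverse_lengths = [1.0 / n for n in plasmid_lengths_bp]
    total = sum(inverse_lengths)
    return [x / total for x in inverse_lengths]


# Hypothetical five-plasmid mix: four 6 kb plasmids and one 8 kb plasmid.
lengths = [6000, 6000, 6000, 6000, 8000]
fractions = molar_fractions_equal_mass(lengths)

for length, frac in zip(lengths, fractions):
    print(f"{length} bp: {frac:.1%} of plasmid molecules")
```

In this hypothetical mix the longer plasmid contributes fewer molecules than each shorter plasmid, so fewer templates are available for expressing the larger protein. Whether this accounts for the reduced display observed here is not established by the sketch.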

Figure 6

M-NAPPA TB proteome microarray fabrication, protein display, and role in detecting immune-dominant antigens. (A) Representative image of protein display on M-NAPPA TB proteome microarrays; (B) Distribution of protein display across four M-NAPPA TB proteome microarrays using an antibody specific to the capturing fusion tag; (C) Correlation of protein display across different TB proteome microarrays; (D) Representative image of serological antibody detection on an M-NAPPA TB proteome microarray; (E) Distribution of serum antibody binding signals on an M-NAPPA TB proteome microarray; (F) Correlation of serological antibody detection using M-NAPPA TB proteome microarrays; (G) Deconvolution and verification of TB antibody candidates from M-NAPPA using NAPPA protein microarrays; (H) Validation of a reactive serological antibody on M-NAPPA and NAPPA using ELISA. (A, D, G) False-colored images across a rainbow scale where low and high binding are represented by blue and red, respectively.

Finally, we evaluated the reproducibility of M-NAPPA arrays for protein array preparation and protein-protein interactions. We found that M-NAPPA can be reproducibly fabricated with spot-to-spot, zone-to-zone and slide-to-slide CVs that are similar to those obtained with NAPPA (Table ). The spot-to-spot and zone-to-zone CVs for protein-protein interactions were also similar between the two array platforms (Table ). While the correlations within and between different M-NAPPA slides were good (i.e., R = 0.93 for both) (Figure and Figure ), with some size adaptation, the reproducibility could eventually be further improved with the use of automation equipment like the HS 4800 Pro Hybridization Station (Tecan Trading AG; Männedorf, Switzerland). In some ways, M-NAPPA resembles “natural protein” microarrays that print unpurified or partially fractionated proteins from lysates of human cells, tissues or body fluids, but in a much more controlled manner. Each feature of a natural protein microarray typically represents a mixture of unknown proteins. Thus, responsive hits on natural protein arrays require a challenging and time-consuming process to determine the identity of the protein responsible for the response. This may require further purification, identification by mass spectrometry and additional response testing of recombinant proteins 46, 47. In the case of M-NAPPA, the identities of the proteins in each mix are known in advance and the plasmids encoding each protein are available for secondary testing. M-NAPPA would be useful in unbiased HT screening studies, such as protein-protein interactions, protein-DNA interactions, and the discovery of drug-binding targets as well as (auto)antibody biomarkers for a variety of human diseases. However, it should be noted that there are situations in which using a non-multiplexed array format would be more appropriate.
For example, NAPPA should be used when investigating protein functions or when the number of proteins to be screened is low. Additional attributes of M-NAPPA should be considered as well. Large, multiplexed proteins (≥ 65 kDa) on M-NAPPA are displayed at a lower level (37-40%) than their non-multiplexed counterparts (Figure ). This issue could be resolved by increasing the plasmid DNA concentration before printing or by reducing the multiplicity per spot. Alternatively, since large proteins represent a small fraction of the proteome, a hybrid array could be employed that contains multiplexed spots with plasmids encoding proteins of low to moderate MW and non-multiplexed spots for large proteins (≥ 65 kDa). In addition, PTMs that occur during cell-free protein expression may affect the protein display or activity on M-NAPPA arrays 42. We have observed that the human expression system can phosphorylate some proteins (data not shown); other types of PTMs (e.g., glycosylation, acetylation) by the expression system are not well known or reported. In our studies, PTMs did not appear to affect protein expression, protein-protein interactions, or the identification of serological antigens on M-NAPPA when compared to NAPPA (Figure , Figures ).
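The hybrid array suggested above can be sketched as a simple assignment of proteins to spots. This is an illustration under stated assumptions rather than a published layout procedure: proteins at or above the 65 kDa cutoff are printed alone, the rest are grouped five per spot in list order, and the protein names and molecular weights are hypothetical.

```python
def hybrid_layout(proteins, multiplex=5, large_cutoff_kda=65.0):
    """Assign proteins to spots for a hybrid array.

    proteins: list of (name, molecular_weight_kda) tuples.
    Returns a list of spots, each a list of protein names. Large proteins
    get non-multiplexed spots; the others share multiplexed spots.
    """
    large = [name for name, mw in proteins if mw >= large_cutoff_kda]
    small = [name for name, mw in proteins if mw < large_cutoff_kda]

    spots = [[name] for name in large]  # one large protein per spot
    for i in range(0, len(small), multiplex):
        spots.append(small[i:i + multiplex])  # up to `multiplex` per spot
    return spots


# Hypothetical input: ten 30 kDa proteins and two large proteins.
proteins = [(f"P{i}", 30.0) for i in range(1, 11)]
proteins += [("P11", 65.0), ("P12", 110.0)]

spots = hybrid_layout(proteins)
for index, spot in enumerate(spots, start=1):
    print(index, spot)
```

A practical layout would likely also balance plasmid sizes within each multiplexed spot, which this sketch does not attempt.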

Conclusion

We developed a method that multiplexes five different proteins within the same feature, called M-NAPPA, which significantly increases array density while decreasing experimental time and cost. Although we used this approach with NAPPA and HD-NAPPA, the same concept could be applied toward other microarray technologies or platforms. Our results show that M-NAPPA identified hits in protein interaction and serum screening studies, thus highlighting its potential to be employed in high throughput proteomics studies. Supplementary Table S1-S5 and Supplementary Figures S1-S12.

44 in total

1. Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors: Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal: Genome Res Date: 2003-11 Impact factor: 9.043

2. Anoctamin 2 identified as an autoimmune target in multiple sclerosis.

Authors: Burcu Ayoglu; Nicholas Mitsios; Ingrid Kockum; Mohsen Khademi; Arash Zandian; Ronald Sjöberg; Björn Forsström; Johan Bredenberg; Izaura Lima Bomfim; Erik Holmgren; Hans Grönlund; André Ortlieb Guerreiro-Cacais; Nada Abdelmagid; Mathias Uhlén; Tim Waterboer; Lars Alfredsson; Jan Mulder; Jochen M Schwenk; Tomas Olsson; Peter Nilsson
Journal: Proc Natl Acad Sci U S A Date: 2016-02-09 Impact factor: 11.205

3. Protein microarray signature of autoantibody biomarkers for the early detection of breast cancer.

Authors: Karen S Anderson; Sahar Sibani; Garrick Wallstrom; Ji Qiu; Eliseo A Mendoza; Jacob Raphael; Eugenie Hainsworth; Wagner R Montor; Jessica Wong; Jin G Park; Naa Lokko; Tanya Logvinenko; Niroshan Ramachandran; Andrew K Godwin; Jeffrey Marks; Paul Engstrom; Joshua Labaer
Journal: J Proteome Res Date: 2010-11-23 Impact factor: 4.466

4. Plasma Autoantibodies Associated with Basal-like Breast Cancers.

Authors: Jie Wang; Jonine D Figueroa; Garrick Wallstrom; Kristi Barker; Jin G Park; Gokhan Demirkan; Jolanta Lissowska; Karen S Anderson; Ji Qiu; Joshua LaBaer
Journal: Cancer Epidemiol Biomarkers Prev Date: 2015-06-12 Impact factor: 4.254

5. Dynamic antibody responses to the Mycobacterium tuberculosis proteome.

Authors: Shajo Kunnath-Velayudhan; Hugh Salamon; Hui-Yun Wang; Amy L Davidow; Douglas M Molina; Vu T Huynh; Daniela M Cirillo; Gerd Michel; Elizabeth A Talbot; Mark D Perkins; Philip L Felgner; Xiaowu Liang; Maria L Gennaro
Journal: Proc Natl Acad Sci U S A Date: 2010-07-28 Impact factor: 11.205

6. Mycobacterium tuberculosis proteome microarray for global studies of protein function and immunogenicity.

Authors: Jiaoyu Deng; Lijun Bi; Lin Zhou; Shu-juan Guo; Joy Fleming; He-wei Jiang; Ying Zhou; Jia Gu; Qiu Zhong; Zong-xiu Wang; Zhonghui Liu; Rui-ping Deng; Jing Gao; Tao Chen; Wenjuan Li; Jing-fang Wang; Xude Wang; Haicheng Li; Feng Ge; Guofeng Zhu; Hai-nan Zhang; Jing Gu; Fan-lin Wu; Zhiping Zhang; Dianbing Wang; Haiying Hang; Yang Li; Li Cheng; Xiang He; Sheng-ce Tao; Xian-en Zhang
Journal: Cell Rep Date: 2014-12-11 Impact factor: 9.423

7. Systematic identification of arsenic-binding proteins reveals that hexokinase-2 is inhibited by arsenic.

Authors: Hai-Nan Zhang; Lina Yang; Jian-Ya Ling; Daniel M Czajkowsky; Jing-Fang Wang; Xiao-Wei Zhang; Yi-Ming Zhou; Feng Ge; Ming-Kun Yang; Qian Xiong; Shu-Juan Guo; Huang-Ying Le; Song-Fang Wu; Wei Yan; Bingya Liu; Heng Zhu; Zhu Chen; Sheng-Ce Tao
Journal: Proc Natl Acad Sci U S A Date: 2015-11-23 Impact factor: 11.205

8. Self-assembling protein microarrays.

Authors: Niroshan Ramachandran; Eugenie Hainsworth; Bhupinder Bhullar; Samuel Eisenstein; Benjamin Rosen; Albert Y Lau; Johannes C Walter; Joshua LaBaer
Journal: Science Date: 2004-07-02 Impact factor: 47.728

9. The identification of phosphoglycerate kinase-1 and histone H4 autoantibodies in pancreatic cancer patient serum using a natural protein microarray.

Authors: Tasneem H Patwa; Chen Li; Laila M Poisson; Hye-Yeung Kim; Manoj Pal; Debashis Ghosh; Diane M Simeone; David M Lubman
Journal: Electrophoresis Date: 2009-06 Impact factor: 3.535

Review 10. Advancing translational research with next-generation protein microarrays.

Authors: Xiaobo Yu; Brianne Petritis; Joshua LaBaer
Journal: Proteomics Date: 2016-03-31 Impact factor: 3.984
