Daniel Mariyappa1, Douglas B Rusch2, Shunhua Han3, Arthur Luhur1, Danielle Overton1, David F B Miller2, Casey M Bergman3,4, Andrew C Zelhof1.
Abstract
Drosophila cell lines are used by researchers to investigate various cell biological phenomena. It is crucial to exercise good cell culture practice: poor handling can lead to both inter- and intra-species cross-contamination, and prolonged culturing can introduce large- and small-scale genomic changes. These factors make it imperative to develop methods for authenticating Drosophila cell lines to ensure reproducibility. Mammalian cell line authentication relies on short tandem repeat (STR) profiling; however, the relatively low STR mutation rate in Drosophila melanogaster at the individual level is likely to limit the value of this technique. In contrast, transposable elements (TEs) are highly polymorphic among individual flies and abundant in Drosophila cell lines. Therefore, we investigated the utility of TE insertions as markers to discriminate Drosophila cell lines derived from the same or different donor genotypes, divergent sub-lines of the same cell line, and other insect cell lines. We developed a PCR-based next-generation sequencing protocol to cluster cell lines based on the genome-wide distribution of a limited number of diagnostic TE families. We determined the distribution of five TE families in S2R+, S2-DRSC, S2-DGRC, Kc167, ML-DmBG3-c2, mbn2, CME W1 Cl.8+, and ovarian somatic sheath Drosophila cell lines. Two independent downstream analyses of the next-generation sequencing data yielded similar clustering of these cell lines. Double-blind testing of the protocol reliably identified various Drosophila cell lines. In addition, our data indicate minimal changes in the genome-wide distribution of these five TE families when cells are passaged at least 50 times. The protocol can accurately identify and distinguish the numerous Drosophila cell lines available to the research community, thereby aiding reproducible Drosophila cell culture research.
Keywords: Drosophila; authentication; cell lines; transposable element
Year: 2022 PMID: 34849844 PMCID: PMC9210319 DOI: 10.1093/g3journal/jkab403
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.542
Table 1. Summary of transposable element (TE) insertions detected by gTED
| Cell line | Tissue source | DGRC stock number | Cellosaurus ID | Number of TE insertions mean (±SD) |
|---|---|---|---|---|
| S2R+ | Embryo | 150 | CVCL_Z831 | 1009 (±30.4) |
| S2 DGRC | Embryo | 6 | CVCL_TZ72 | 704 (±3.2) |
| mbn2 | Larval circulatory system | 147 | CVCL_Z706 | 633 (±6.4) |
| S2 DRSC | Embryo | 181 | CVCL_Z992 | 530 (±14.8) |
| Kc167 | Embryo | 1 | CVCL_Z834 | 516 (±18.3) |
| OSS | Adult ovary | 190 | CVCL_1B46 | 404 (±8.5) |
| CME-W1-Cl.8+ | Larval wing disc | 151 | CVCL_Z790 | 309 (±11.1) |
| ML-DmBG3-c2 | Larval CNS | 68 | CVCL_Z728 | 227 (±4.7) |
The total number of TE insertions that were detected in each of the listed cell lines is presented as a mean (n = 3) of the samples analyzed. CNS, central nervous system; SD, standard deviation.
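The mean (±SD) entries in the table summarize triplicate counts. A minimal sketch of that arithmetic follows. The replicate counts in the usage note are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the source does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_insertions(counts):
    """Format replicate TE insertion counts as 'mean (±SD)'.

    counts: per-replicate insertion counts (hypothetical input format).
    Uses the sample standard deviation; this choice is an assumption.
    """
    return f"{mean(counts):.0f} (±{stdev(counts):.1f})"
```

For example, `summarize_insertions([100, 110, 120])` formats three made-up replicate counts in the same style as the table column.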
Table 2. List of blinded samples processed
| Sample label | Source | Identification with gTED pipeline | Confirmation |
|---|---|---|---|
| DRSC_Blinded_1-3 | DRSC | Kc167 | Kc167 |
| DRSC_Blinded_4-6 | DRSC | No ID | |
| DRSC_Blinded_7-9 | DRSC | No ID | |
| DRSC_Blinded_10-12 | DRSC | Kc167 | Kc167 |
| DRSC_Blinded_13-15 | DRSC | S2R+ | S2R+ |
| DRSC_Blinded_16-18 | DRSC | S2 | S2 |
| SGLab_Blinded_1-3 | Gorski Lab | mbn2 | mbn2 |
| SGLab_Blinded_4-6 | Gorski Lab | S2 | S2 |
| DGRC_Blinded_A | Internal | No ID | 1182-4H |
| DGRC_Blinded_B | Internal | No ID | Ras[V12]; wts[RNAi] |
| DGRC_Blinded_C | Internal | No ID | delta l(3)mbt-OSC |
Blinded samples were either donated by external sources (Drosophila RNAi Screening Center and Dr S. Gorski) or generated internally. Identifications were made by processing each sample through the genomic TE distribution (gTED) pipeline, followed by computational analysis. No ID: the genomic TE signature of the sample did not match any of the analyzed lines closely enough to give a positive identification. A. a: cell line derived from Aedes aegypti; A. g: cell line derived from Anopheles gambiae.
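The identification logic described in the table note (match a blinded sample to a known line, or report "No ID") can be sketched as a best-match lookup. This is an illustration only, not the authors' pipeline: the Jaccard similarity, the `min_similarity` cutoff, and the representation of insertion sites as set members are all assumptions.

```python
def jaccard_similarity(a, b):
    """Shared insertion sites divided by all sites seen in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify(sample_sites, references, min_similarity=0.5):
    """Return the best-matching reference line, or "No ID".

    sample_sites: set of TE insertion sites found in the blinded sample.
    references: dict mapping cell line name -> set of insertion sites.
    min_similarity: illustrative cutoff, not a value from the source.
    """
    best_name, best_score = "No ID", 0.0
    for name, sites in references.items():
        score = jaccard_similarity(sample_sites, sites)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_similarity else "No ID"
```

Under these assumptions, a sample with few or no detected insertions (as reported for the mosquito-derived samples) shares nothing with any reference and is returned as "No ID".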
Figure 1. Protocol used for generating libraries to establish genomic transposable element distribution signatures. (A) Fragmented genomic DNA (gDNA; light brown lines) from the Nextera libraries containing TEs (green bar) and flanking gDNA was amplified with the randomly oriented i5 (blue arrow) and i7 (black arrow) primers. (B) Reactions A and B involved amplification with the i5 primer, oriented in either direction with respect to the TE, in combination with the TE-specific Reverse (dark brown arrow) or Forward (dark gray arrow) primer, respectively. (C) The Nest PCR reactions amplified from within the products of Reactions A and B using the i5 primer and either the TE-specific Nest Reverse (light brown arrow) or the TE-specific Nest Forward (light gray arrow) primer, respectively. Read 2 anchors were added to both Nest PCR primers. (D) The final amplification step was performed with the i5 primer and the Read 2 anchor with the i7 index primer (black box). Reads from the genomic sequence flanking the TE are designated Read 1; reads internal to the TE are designated Read 2.
Figure 2. Read mapping strategy used to generate genomic transposable element distribution signatures. Read 1 (R1) reads from demultiplexed fragments were used to identify the transposon junctions (green) from the set of all R1 reads. The schematic represents R1 reads at junctions on either end (5′ or 3′) of a TE. The number of reads that specifically identify a junction is relatively small compared to the total number of reads near the junction. Variation in sequencing depth and subtle differences in the insert sizes produced by the Nextera library could cause junctions to be missed if only explicit junction calls are used. To avoid these issues, after the junctions have been identified, a 300-bp region of genomic sequence flanking the transposon is used to quantify the number of R1 reads (red) associated with that junction.
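The window-based quantification in the Figure 2 caption can be sketched as a count of R1 read starts near each called junction. This is a minimal sketch under stated assumptions: the input data structures are hypothetical, and counting within 300 bp on either side of the junction is an assumption, since the caption does not say which side or strand the window covers.

```python
from bisect import bisect_left, bisect_right

FLANK_BP = 300  # window size stated in the Figure 2 caption


def count_flanking_reads(junctions, read_starts, flank_bp=FLANK_BP):
    """Count R1 reads starting within flank_bp of each junction.

    junctions: list of (chrom, position) tuples (hypothetical format).
    read_starts: dict mapping chrom -> list of R1 read start positions.
    Returns a dict mapping each junction to its read count.
    """
    # Sort once per chromosome so each junction needs two binary searches.
    sorted_starts = {c: sorted(p) for c, p in read_starts.items()}
    counts = {}
    for chrom, pos in junctions:
        starts = sorted_starts.get(chrom, [])
        lo = bisect_left(starts, pos - flank_bp)
        hi = bisect_right(starts, pos + flank_bp)
        counts[(chrom, pos)] = hi - lo
    return counts
```

Counting all reads in the window, rather than only those that span the junction, is what makes the signal less sensitive to sequencing depth and insert-size variation, as the caption explains.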
Figure 3. Clustering of cell lines based on genomic transposable element distribution. The clustering was derived by processing the NGS data as described in the Materials and methods. The triplicates for each cell line are indicated with 1–3 following the cell line name.
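The clustering method itself is given in the Materials and methods, which are not reproduced here. As a stand-in illustration of how TE insertion sets could be clustered, the sketch below uses Jaccard distance with naive average-linkage agglomeration. Both choices, and the input format, are assumptions rather than the authors' analysis.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the fraction of insertion sites shared by two samples."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(samples):
    """Naive agglomerative clustering; returns merges in the order made.

    samples: dict mapping sample name -> set of insertion sites
    (hypothetical format). Each merge is (cluster_1, cluster_2, distance),
    where clusters are tuples of sample names.
    """
    clusters = [(name,) for name in sorted(samples)]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Average pairwise distance between members of the two clusters.
            total = sum(jaccard_distance(samples[x], samples[y])
                        for x in c1 for y in c2)
            dist = total / (len(c1) * len(c2))
            if best is None or dist < best[0]:
                best = (dist, c1, c2)
        dist, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
        merges.append((c1, c2, dist))
    return merges
```

With this kind of approach, replicates of one line, which share most insertion sites, would be expected to merge first, and unrelated lines would join only at large distances.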
Figure 4. Cell line authentication of double-blind samples using genomic transposable element distribution signatures. Triplicate samples of external blinded cell lines from the lab of Dr S. Gorski (shaded yellow) and the Drosophila RNAi Screening Center (shaded green), along with internal blinded samples (shaded brown) and internal control samples (shaded red), were processed with the gTED protocol (Figure 2B) and clustered as described in the Materials and methods along with the previously processed known samples. The cell lines that the blinded samples cluster with are indicated with black lines. Internal blinded samples cluster as a separate group. Samples DRSC_Blinded_4-9, with very few or no TEs detected, were from mosquito cell lines (Table 2).
Figure 5. Genomic transposable element distribution signatures for S2R+ cells do not cluster by passage number. (A) Schematic outlining the protocol to acquire samples between passages 1 and 50 of S2R+ cells for assessment by the gTED protocol. (B) Clustering of all the passage samples based on TE predictions. The triplicate samples of each passage are shaded in a single color.