Carla Jo Logan-Young1, John Z Yu2, Surender K Verma1, Richard G Percy2, Alan E Pepper1.
Abstract
PREMISE OF THE STUDY: Single-nucleotide polymorphism (SNP) marker discovery in plants with complex allotetraploid genomes is often confounded by the presence of homeologous loci (along with paralogous and orthologous loci). Here we present a strategy to filter for SNPs representing orthologous loci. METHODS AND ...
Keywords: Gossypium; genotyping by sequencing; interspecific; intraspecific; next-generation sequencing; polyploid; single-nucleotide polymorphisms
Year: 2015 PMID: 25798340 PMCID: PMC4356317 DOI: 10.3732/apps.1400077
Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936
Fig. 1. Predicted marker type categories from the sstacks algorithm for four common genetic scenarios (out of many) that give rise to apparent GBS polymorphisms between two allotetraploids. Red lines indicate sequences that can be clearly assigned to the AT subgenome, and blue lines indicate those that can be assigned to the DT subgenome. Gray lines indicate regions of high sequence similarity between homeologs or paralogs (e.g., no differences outside of the SNP of interest). Marker type predictions are based on the assumption that there is adequate sequence coverage to accurately score all alleles at all relevant loci.
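As a rough illustration of the dual homozygous (aa/bb) category, the sketch below classifies a shared locus from the sets of alleles observed in two accessions. The function name and every label other than aa/bb are hypothetical. It does not reproduce the sstacks implementation, and it assumes, as the caption does, that coverage is sufficient to observe every allele.

```python
def classify_locus(alleles_a, alleles_b):
    """Classify a shared stack from the alleles seen in two accessions.

    Hypothetical helper for illustration only; not the sstacks algorithm.
    """
    a, b = set(alleles_a), set(alleles_b)
    if not a or not b:
        # The locus was not recovered in one of the accessions.
        return "missing"
    if len(a) == 1 and len(b) == 1:
        # Each accession shows a single allele: aa/bb if they differ.
        return "aa/bb" if a != b else "monomorphic"
    # More than one allele in at least one accession, e.g. when
    # homeologous or paralogous sequences collapse into one stack.
    return "other"


print(classify_locus(["A"], ["G"]))
print(classify_locus(["A", "G"], ["G"]))
```

Under these assumptions only the first call returns aa/bb. The second call falls into the catch-all category because one accession shows two alleles.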
Seed sources, taxonomy, and preliminary GBS statistics for a set of diploid (A1-27, D5-1) and allotetraploid cottons.
| Year | Scientific name | Name or designation | PI no. | Origin | BsrGI reads | BsrGI stacks | HinP1I reads | HinP1I stacks |
| 2003 | G. herbaceum | A1-27 | PI 408785 | Peru | 1,678,012 | 12,638 | 1,743,678 | 43,883 |
| 1989 | G. raimondii | D5-1 | PI 530898 | Ecuador | 949,301 | 4215 | 15,323,076 | 8932 |
| 1984 | G. barbadense | K-56 | PI 274514 | Sinchao Chico, Piura, Peru | 3,995,793 | 16,668 | 3,440,581 | 17,332 |
| 2005 | G. hirsutum | TM-1 | PI 607172 | College Station, Texas, USA | 2,752,301 | 22,862 | 2,609,330 | 10,792 |
| 2002 | G. barbadense | Pima 3-79 | | Sacaton, Arizona, USA | 2,756,143 | 23,077 | 1,816,093 | 9059 |
| 2005 | G. hirsutum | TX-231 | PI 163725 | Zacapa, Zacapa, Guatemala | 318,042 | 7895 | 592,566 | 1739 |
Note: PI no. = Plant Introduction number, National Plant Germplasm System.
BLASTN results (significance value <1e-6) using total stacks from Gossypium hirsutum cv. TM-1 or selected aa/bb markers across all taxa as subject.
| Database | BsrGI TM-1 | BsrGI aa/bb | HinP1I TM-1 | HinP1I aa/bb |
| JGI CDS | 1059 (4.60%) | 109 (3.10%) | 3822 (35.40%) | 343 (38.20%) |
| JGI transcript | 2018 (8.80%) | 245 (7.10%) | 4943 (45.80%) | 448 (49.90%) |
| BGI CDS | 1941 (8.50%) | 341 (9.80%) | 3767 (34.90%) | 325 (36.20%) |
| Brassicaceae repeats | 102 (0.45%) | 0 (0.00%) | 170 (1.50%) | 5 (0.56%) |
| Plastid | 6 (0.03%) | 1 (0.03%) | 171 (1.50%) | 6 (0.67%) |
| Mitochondrial | 17 (0.07%) | 8 (0.23%) | 843 (7.81%) | 4 (0.45%) |
| Total | 22,862 | 3474 | 10,792 | 897 |
Note: BGI = Beijing Genomics Institute; CDS = coding DNA sequences; JGI = Joint Genome Institute.
Sequences were searched against the databases indicated: JGI CDS and JGI transcript from G. raimondii (Paterson et al., 2012); BGI CDS from G. raimondii (Wang et al., 2012); Brassicaceae repeats from TIGR Brassicaceae Repeat Database ver2_0_0 (Ouyang and Buell, 2004); plastid from G. hirsutum plastid genome (Lee et al., 2006); and mitochondrial from G. hirsutum mitochondrial genome (Liu et al., 2013).
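The percentages in the BLASTN table appear to be hit counts divided by the column totals. A minimal sketch of that arithmetic, using the JGI CDS counts and totals from the first TM-1 and aa/bb columns, is given here; the function and variable names are illustrative only.

```python
def percent_hits(hits, total):
    """Return the share of stacks with a BLASTN hit, in percent."""
    return 100.0 * hits / total


# Counts copied from the first TM-1 and aa/bb columns of the table.
totals = {"TM-1": 22862, "aa/bb": 3474}
jgi_cds_hits = {"TM-1": 1059, "aa/bb": 109}

for label, total in totals.items():
    pct = percent_hits(jgi_cds_hits[label], total)
    print(f"{label}: {pct:.2f}%")
```

The unrounded values (about 4.63% and 3.14%) are close to the tabulated 4.60% and 3.10%, which suggests the published figures were rounded more coarsely.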
Numbers of BsrGI shared stacks (loci) and dual homozygous (aa/bb) marker loci across a set of intraspecific and interspecific combinations of Gossypium taxa.
| Pairwise combination | Shared stacks | aa/bb Markers |
| A1/D5 | 413 | 216 |
| Pima 3-79/A1 | 3329 | 1538 |
| Pima 3-79/D5 | 2123 | 1057 |
| Pima 3-79/K-56 | 10,623 | 859 |
| Pima 3-79/TM-1 | 12,408 | 2040 |
| Pima 3-79/TX-231 | 5322 | 1183 |
| TM-1/A1 | 3172 | 1550 |
| TM-1/D5 | 2003 | 1041 |
| TM-1/K-56 | 8910 | 2171 |
| TM-1/TX-231 | 5492 | 575 |
| TX-231/A1 | 2031 | 959 |
| TX-231/D5 | 1328 | 690 |
| TX-231/K-56 | 4836 | 1512 |
| K-56/A1 | 3421 | 1616 |
| K-56/D5 | 2072 | 1066 |
Numbers of HinP1I shared stacks (loci) and dual homozygous (aa/bb) marker loci across a set of intraspecific and interspecific combinations of Gossypium taxa.
| Pairwise combination | Shared stacks | aa/bb Markers |
| A1/D5 | 1201 | 502 |
| Pima 3-79/A1 | 2987 | 862 |
| Pima 3-79/D5 | 1387 | 351 |
| Pima 3-79/K-56 | 931 | 62 |
| Pima 3-79/TM-1 | 4921 | 740 |
| Pima 3-79/TX-231 | 856 | 167 |
| TM-1/A1 | 3198 | 899 |
| TM-1/D5 | 1528 | 323 |
| TM-1/K-56 | 906 | 182 |
| TM-1/TX-231 | 961 | 111 |
| TX-231/A1 | 740 | 256 |
| TX-231/D5 | 444 | 119 |
| TX-231/K-56 | 342 | 79 |
| K-56/A1 | 770 | 241 |
| K-56/D5 | 429 | 120 |
Fig. 2. BsrGI and HinP1I GBS polymorphism in tetraploid Gossypium spp. The proportion of highly informative (aa/bb) markers relative to total shared loci (stacks) in intraspecific and interspecific pairwise comparisons is shown. 3-79 = G. barbadense cv. Pima 3-79; K-56 = G. barbadense accession K-56; TM-1 = G. hirsutum cv. TM-1; TX-231 = G. hirsutum accession TX-231.
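The proportion described in this caption can be recomputed from the two pairwise tables as aa/bb markers divided by shared stacks. The sketch below does so for three tetraploid combinations; the data structure and function name are illustrative, and only the counts come from the tables.

```python
# (shared stacks, aa/bb markers) copied from the BsrGI and HinP1I tables.
pairs = {
    "Pima 3-79/K-56": {"BsrGI": (10623, 859), "HinP1I": (931, 62)},
    "Pima 3-79/TM-1": {"BsrGI": (12408, 2040), "HinP1I": (4921, 740)},
    "TM-1/TX-231": {"BsrGI": (5492, 575), "HinP1I": (961, 111)},
}


def informative_fraction(shared, markers):
    """Fraction of shared stacks that are aa/bb markers."""
    return markers / shared


for pair, by_enzyme in pairs.items():
    for enzyme, (shared, markers) in by_enzyme.items():
        frac = informative_fraction(shared, markers)
        print(f"{pair} {enzyme}: {frac:.3f}")
```

For both enzymes, the interspecific Pima 3-79/TM-1 comparison gives a higher fraction than the two intraspecific comparisons listed here.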
Fig. 3. Representative examples of the five categories of sequence alignments observed in TM-1 vs. Pima 3-79 polymorphic markers with aa/bb marker type assignment from Stacks. Nucleotides on a black background indicate the site of the key Pima 3-79 polymorphism relative to the TM-1 reference sequence. Nucleotides on a gray background indicate additional mismatches relative to the TM-1 reference sequence. The top two lines in each category indicate the TM-1 and Pima 3-79 fragment sequences, respectively. The prefix B indicates BsrGI markers, and H indicates HinP1I. Additional lines in the alignment represent fragments from diploid genomes along with chromosomal assignments. BGI_A = Gossypium arboreum (Li et al., 2014); JGI_D = G. raimondii (Paterson et al., 2012); BGI_D = G. raimondii (Wang et al., 2012); scaf = scaffold.
Categorization of marker alignments of aa/bb markers polymorphic between Gossypium hirsutum TM-1 and G. barbadense Pima 3-79. Alignments included TM-1 and Pima 3-79 alleles, along with any BLAST hits (significance value <1e-6) to sequenced A- and D-genome diploid species (Paterson et al., 2012; Wang et al., 2012; Liu et al., 2013). The five categories are described in the text and illustrated in Fig. 3.
| Category | BsrGI | HinP1I |
| Total fragments aligned | 1413 | 549 |
| Category I | 1183 | 381 |
| Category II | 24 | 59 |
| Category III | 30 | 0 |
| Category IV | 99 | 16 |
| Category V | 77 | 93 |
| Category V without BLAST hits to diploids | 7 | 8 |
| Fragments assigned to a subgenome | 1312 | 397 |
| Fragments assigned to AT | 841 | 234 |
| Fragments assigned to DT | 471 | 163 |
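The counts in this table can be checked for internal consistency: the five categories should sum to the total fragments aligned, and the AT and DT assignments should sum to the fragments assigned to a subgenome. The sketch below runs that check and reports the AT share for each data column. Labelling the two columns as BsrGI and HinP1I is an assumption based on the marker counts in the pairwise tables, and the helper name is hypothetical.

```python
# Counts copied from the two data columns of the table above.
# The enzyme labels are an assumption (first column BsrGI, second HinP1I).
columns = {
    "BsrGI": {"total": 1413, "categories": [1183, 24, 30, 99, 77],
              "assigned": 1312, "AT": 841, "DT": 471},
    "HinP1I": {"total": 549, "categories": [381, 59, 0, 16, 93],
               "assigned": 397, "AT": 234, "DT": 163},
}


def at_share(at, dt):
    """Fraction of subgenome-assigned fragments placed in AT."""
    return at / (at + dt)


for name, c in columns.items():
    categories_ok = sum(c["categories"]) == c["total"]
    subgenomes_ok = c["AT"] + c["DT"] == c["assigned"]
    share = at_share(c["AT"], c["DT"])
    print(f"{name}: categories sum ok={categories_ok}, "
          f"subgenomes sum ok={subgenomes_ok}, AT share={share:.3f}")
```

Both checks pass for both columns, and in each case more fragments are assigned to AT than to DT.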
Cleaved amplified polymorphic sequence validation of 22 aa/bb markers that are polymorphic between Gossypium hirsutum TM-1 and G. barbadense Pima 3-79.
| Locus | Primer sequences (5′–3′) | Enzyme | Predicted cut | Cut TM-1 | Cut 3-79 |
| Bsr1195 | F: CGTACACAAAGTATTTAGAGAATATAA | Pima 3-79 | X | ||
| R: CAAAAAGGTACGTTCCATGAAAAG | |||||
| Bsr1616 | F: CGTACACATGGTGAACACTTAGTAC | TM-1 | (Multiple amplicons) | ||
| R: GTAGACAAGAGAGCTACGAGATAAAC | |||||
| Bsr3721 | F: CACGTCCTAGGACACGGGCTAT | Pima 3-79 | X | ||
| R: GTGTGACCGTGTGTGGCACACTA | |||||
| Bsr5368 | F: CGTACAATTAGGTGTTTCGCTCTTAG | TM-1 | X | ||
| R: AGCTCTAGTATCATAACTACAGTTAGC | |||||
| Bsr7080 | F: CGTACATGGAACTTTTTAAGGAGGC | TM-1 | X | ||
| R: ACATTTAATGCAAGTGCATGTAT | |||||
| Bsr7402 | F: CGTACAAGACTCACCCACAAGT | TM-1 | X | ||
| R: GGCTTGATGCTGGGATTATATACAC | |||||
| Bsr9628 | F: CGTACAATAGAGTTACAATAAACTCG | Pima 3-79 | X | ||
| R: GTTTTTGCCGAACTTTATTCATAACA | |||||
| Bsr12910 | F: CGTACAGTCAACCGCCTTAAAAATTTA | TM-1 | X | ||
| R: CTTTTACGGTGTTTTTGTTTTGACATC | |||||
| Bsr13288 | F: CATCAGCATAAGGAACACGTGGCAC | Pima 3-79 | X | ||
| R: TTGACGGAATAACCAGACAAGAACA | |||||
| Bsr14160 | F: CGTACATGAGTACTAAAGAGATTGG | TM-1 | X | ||
| R: GATATCTTTAATAGGGGGTGCAAC | |||||
| Bsr17257 | F: CAAAGACCTCCCCCACCTACTTC | TM-1 | X | ||
| R: TCAGCACCCTGTGGTACCTCAAG | |||||
| Bsr17701 | F: CAACAACCTGCCTCACCTGCTTC | TM-1 | X | ||
| R: TTAGCACCTTATGGCATCTCAGGA | |||||
| Bsr18072 | F: CGTACAAGAACCTCCCCCACC | TM-1 | X | X | |
| R: CAGCACCCTGTGGCATCTCTG | |||||
| Bsr18083 | F: CGTACAAACCTGAGATTTCAGGTC | Pima 3-79 | X | ||
| R: CCCTGATATGTATTGGTCGGGC | |||||
| Bsr18484 | F: CGTACATTAACCCGGTTCAGGTG | Pima 3-79 | X | ||
| R: ACTGGATCCATTAGTTAGAATCGGG | |||||
| Bsr18818 | F: CGTACAGTTATAAGAGAAATTCCAC | TM-1 | X | ||
| R: CTCTTCAACCCCTTGTTTTGTGATC | |||||
| Bsr20063 | F: CGTACATGATAAGGACAAGAGTATT | Pima 3-79 | X | ||
| R: CAGTTTTGTCCGGTACGGTCTGGCA | |||||
| Bsr20113 | F: CGTACAACAATCATACAAGGAAT | TM-1 | X | ||
| R: GTCTCTAGACCCGTTCCTTCATG | |||||
| Bsr20829 | F: CGTACAACTCAAGTGTACCACT | Pima 3-79 | X | ||
| R: TTCCTGTTGAATTTATCTGAAATATC | |||||
| Hin2726 | F: CGCATGCATGTTAGCAAGCAGTG | Pima 3-79 | X | ||
| R: CGTGATTCGACGAAAACCAATC | |||||
| Hin3799 | F: CCAGTTCTATCATGGCAAGATTCC | TM-1 | X | ||
| R: GGAAGTTTCAACGAGAGAGTTGAAAG | |||||
| Hin9147 | F: CAGCCCACCACTTTTCCTTACC | TM-1 | X | ||
| R: TGTGCAGAATTGAGGGTTGCCT |
Predicted cut site is based on an alignment of GBS fragment sequences.
Bsr1616 yielded multiple PCR amplicons, none of which matched the expected size.