Lorena B Parra-González, Gabriela A Aravena-Abarzúa, Cristell S Navarro-Navarro, Joshua Udall, Jeff Maughan, Louis M Peterson, Haroldo E Salvo-Garrido, Iván J Maureira-Butler.
Abstract
BACKGROUND: Yellow lupin (Lupinus luteus L.) is a minor legume crop characterized by its high seed protein content. Although it is grown in several temperate countries, its orphan-crop status has limited the generation of genomic tools to aid breeding efforts to improve yield and nutritional quality. In this study, we report the construction of 454 expressed sequence tag (EST) libraries, comparative studies between L. luteus and model legume species, the development of a comprehensive set of EST-simple sequence repeat (SSR) markers, and the validation of their utility in diversity studies and their transferability to related species.
Year: 2012 PMID: 22920992 PMCID: PMC3472298 DOI: 10.1186/1471-2164-13-425
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Characteristics of 50 EST-SSR primers developed in L. luteus. Shown for each primer pair are the library specificity, repeat motif, forward and reverse sequences, allele size range (bp), number of alleles, amplification in other lupin species, and annotation.
| Primer | Library | Repeat motif | Forward primer | Reverse primer | Allele size range (bp) | No. of alleles | Amplification in other lupin species | Annotation |
|---|---|---|---|---|---|---|---|---|
| l1l2itg33000 | L1 | (ACA)7 | CACGTCAGTCCTTGCACCTA | GCACAGCAACAACAACACAA | 129-132 | 2 | | |
| l1l2itg51784 | L1 | (TA)8 | CATCCTTCAAAAACCATTTCAA | AATGTTGATGAACGCGTGTG | 274-280 | 3 | | |
| l1l2itg52347 | L1 | (AT)8 | CTCATGTTTCTTGGGTGGAAA | CAATCATGTCTAAACCGGGAA | 209-215 | 4 | | |
| l1l2itg50343 | L1 | (AT)10 | ATATTAGCGGCCATGCTGTT | TGTTCATGTTGGTTGCAAGA | 235-239 | 3 | | |
| l1l2itg20858 | L1 | (AAC)12 | ACCCCACTTCTCCCAACTCT | TCCATGAATGAAATGGGGTT | 229-238 | 3 | | Pollen-specific protein SF3 |
| l1l2itg20038 | L1 | (TA)9 | TTCAGAAACAAAGGGGTTGC | TCCAGAAATTCTTCTACATCCCA | 179-183 | 3 | | |
| l1l2itg52625 | L1 | (TCA)12 | CTGGTCTTCTGTCGACTCCA | GACCAAGAAGTCAAGCTCGG | 109-124 | 4 | | |
| l1l2itg37631 | L1 | (CT)12 | TAAAGTGCCACCAACAAGCA | TTGTGTTGGTTGTGTGTAGAGAGA | 133-155 | 6 | | |
| l1l2itg27097 | L1 | (AAT)7 | TTCAACTACCGGTTGAACCAC | GCCCAGAATTAGGGTGCTTT | 206-209 | 2 | | |
| l1l2itg22424 | L1 | (GAA)7 | AAACGACCAACCGCATAAAG | GATGCGTGAAACTGCAAAGA | 240-249 | 3 | | N-acetylglutamate synthase |
| l1l2itg29703 | L1 | (GA)8 | ACCTTTGCGCCAAGATACAC | ATTGTGACGGTTTCACTCCC | 213-219 | 4 | | |
| l1l2itg28437 | L1 | (TA)9 | GGGCACATTTGACTCTTTCG | TCCGTGCAATGTCAATATCAA | 260-268 | 4 | | |
| l1l2itg36804 | L1 | (ATA)12 | CACATGAGAAGCAGCAATGAA | ATGCGGTGGAGTGGAAGTAA | 254-260 | 2 | | |
| l1l2itg21177 | L1, L2 | (CAT)8 | CCTTGAGGCCAATAAATGGA | TTAAGGAAGCTAGGGCCACA | 217-226 | 3 | | Delta-8 sphingolipid desaturase |
| l1l2itg39645 | L1 | (ATT)10 | AATCATGGCCTTTTTGCTTG | CGTCTTGCTCTGGTTCTTCC | 148-169 | 5 | | |
| l1l2itg35309 | L1 | (TA)8 | TTCATGGCAAGAAAAACATCT | AATCATCCATGCCATTTAACA | 271-281 | 4 | | |
| l1l2itg56943 | L1 | (GA)8 | GAGGCCCAAAAACAGAAACA | CCATTTGCGTTCGGTTCTAT | 270-272 | 2 | | |
| l1l2itg31693 | L1 | (TAT)8 | AGGGGCAAAGCTCAAAGACT | CATTCACATTTTATCCTCATTGACTC | 196-217 | 4 | | |
| l1l2itg10347 | L1 | (AT)8 | TGTGGTAAATGCAGGCTCAG | ATGCAACGGGAACCATAGTC | 184-186 | 2 | | |
| l1l2itg14618 | L1 | (CAT)7 | TTCCTCATCTCCCACACCTC | AGCTTCTGCTTGTAATCGGC | 237-252 | 4 | | |
| l1l2itg20466 | L1, L2 | (TA)9 | GTAATCATTCATGTATAATTGTAACACTC | CAATTCATTATCTGTATTATTACCCC | 180-186 | 3 | | Cytochrome B561 |
| l1l2itg53474 | L1, L2 | (GA)10 | CTGAAGTGAGGTTCGGGAAG | TCAATCACACATGCTTGTTCC | 230-234 | 3 | | Cullin-1 |
| l1l2itg51894 | L1 | (AT)10 | TGACTTTGATTGTTTAGCTTACAGG | TGAATGTCAAATGCAATATTAAGGA | 247-263 | 3 | | |
| l1l2itg24819 | L1 | (AT)8 | CATTCATTCTCTAATCTTTTGTGTCA | TAAAGCTTGTCTCTTGCCCG | 219-244 | 5 | | |
| l1l2itg55310 | L1 | (TA)9 | ACCAAAAGGGTGGGTGAAAT | CCTAACATTTGAACATATTTAAAACAA | 277-283 | 4 | | |
| l1l2itg14694 | L1, L2 | (TA)8 | AAGTAGGAAGATCGAATATGAACG | GGGAAAATATCGAGGTTTTCATC | 268-278 | 3 | | RNA-binding protein |
| l1l2itg35641 | L1 | (AT)8 | AGTTGCAATTCAACAACGCA | CATGCTCTATGGCAAGTGCT | 247-251 | 3 | | |
| l1l2itg38340 | L1 | (TAT)7 | AGCTCCACTTTTAGAATTGCG | TCTATTGTTACATGCACATTATCCC | 164-173 | 4 | | |
| l1l2itg26293 | L2 | (TCCGAA)15 | CCTGCAGTGGTAGAACCTGG | GAAGCAAGGTCCACAGAAGG | 123-183 | 6 | | 18S ribosomal RNA gene |
| l1l2itg42878 | L2 | (CATTCC)11 | CAACTCTTGTTTGCAGACCG | GCTACCCTTTCGGGACTAGC | 217-235 | 4 | | |
| l1l2itg13749 | L2 | (TTCCGC)8 | TTTTTACTCGACTCGCTCCC | CCAGTCGATTTAGCAGTCGC | 207-261 | 7 | | |
| l1l2itg32760 | L1, L2 | (CGGAAT)14 | TCATAATGAATTAAATTAACCCCC | TCCCTGACTCTGTCTTTGGG | 146-284 | 14 | | |
| l1l2itg00675 | L2 | (TCT)8(TCG)5 | AGAGAGATCCTCTTTGACGCC | GTGGTTAGCGAGAACCATCG | 187-199 | 4 | | BSD domain-containing protein |
| l1l2itg45631 | L2 | (ATC)10 | AAACCGAATTGTGGATCAGC | GGGGACTCTGGAAAATCAGG | 146-155 | 3 | | Alphavirus core protein family |
| l1l2itg20349 | L2 | (AAC)7 | ACTAAGGGAAAGGGATTCGG | CCAGGCAAGAACAAAAGAGG | 186-189 | 2 | | LPA2 (low PSII accumulation 2) |
| l1l2itg41827 | L2 | (TTG)7 | TTGAGTCATATCACCATAGCGG | CAACCACAAATGGAAAACCC | 242-245 | 2 | | Lipase class 3 family protein |
| l1l2itg47916 | L2 | (TCT)9 | GGTGGGTGAAAATGAAATGG | TAACCAAAATGGTTCGTCGG | 241-247 | 2 | | |
| l1l2itg42002 | L2 | (AAC)8 | CTTGCAGGGTCTTCTTACAGC | GGGGTTGTTTTTGGTGTCC | 243-246 | 2 | | |
| l1l2itg54849 | L2 | (ACA)7 | TTCTCCAATGATGAAATGCC | TTCACGGCTAAATACCAAGC | 177-183 | 2 | | Microtubule-associated protein |
| l1l2itg13638 | L2 | (TGT)9 | CCATGGTCATCATTAACCCC | CGAGTCGAGTTCGTTTACCC | 188-200 | 5 | | F-box family protein |
| l1l2itg26640 | L2 | (AG)7 | GGTCTGTTGGAGAAGGCTACC | CCACCAATGGGTAGACATACG | 203-209 | 3 | | Small nuclear ribonucleoprotein |
| l1l2itg29887 | L2 | (GCT)10 | CCCATCTGAAAGACTTACGGC | TCCCTTTTCATCCAGAGAGG | 243-249 | 2 | | Ser/Thr-protein kinase AFC2 |
| l1l2itg50945 | L2 | (CCA)6(ACA)7 | CCAGAACAAGGAGAAGGTTCC | TTCTTCTTCCTCGCAGGC | 198-204 | 3 | | Zinc finger, Transcription factor |
| l1l2itg44905 | L2 | (CTT)9 | AAATCACAGAGCCAAGGAGG | TCAGCTTATTTTGTTTCCAAGC | 356-362 | 3 | | Transcription factor |
| l1l2itg09113 | L2 | (AT)8 | CATGACCCAATCTCAAACCC | GCATCTGGATCTGCTTAATTGG | 341-343 | 2 | | |
| l1l2itg03938 | L2 | (CCGATT)9 | CATGTGGGAAGACCAGAAGC | ACTACGCGCTGCTAATGTCC | 212-290 | 7 | | Polygalacturonase |
| l1l2itg32421 | L2 | (AATCGG)8 | AGAGAAGTAGGCATGGTGGC | GATCGGCCTATTCACTCAGC | 221-293 | 5 | | |
| l1l2itg29217 | L1, L2 | (AT)7 | ACACTCTCAAGGAAAAGGGC | CCATTTAACCGATAATGCTTGG | 340-344 | 2 | | Lactoylglutathione lyase |
| l1l2itg27515 | L2 | (TTC)17 | CATGCGTCCAATCTATCACC | AGTGGGAAACAAGGAAGTGG | 182-221 | 8 | | PPR-containing protein |
| l1l2itg41211 | L2 | (GAA)11 | TCCTCCTGCTTCAGAACG | AAATCCACGTCATCAATCCG | 209-230 | 6 | | |
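The repeat-motif notation used in the primer table, such as (ACA)7 or the compound motif (TCT)8(TCG)5, gives each repeat unit followed by its copy number. As a minimal sketch, the notation can be parsed to recover the repeat tract length in base pairs. The helper names `parse_motif` and `repeat_tract_length` are hypothetical and are not part of the study's pipeline.

```python
import re

# One "(UNIT)COUNT" block, e.g. "(TCT)8"; compound motifs contain several blocks.
MOTIF_BLOCK = re.compile(r"\(([ACGT]+)\)(\d+)")


def parse_motif(motif: str) -> list[tuple[str, int]]:
    """Split a motif string into (repeat unit, copy number) pairs."""
    blocks = [(unit, int(count)) for unit, count in MOTIF_BLOCK.findall(motif)]
    if not blocks:
        raise ValueError(f"not a recognised repeat motif: {motif!r}")
    return blocks


def repeat_tract_length(motif: str) -> int:
    """Total length (bp) of the repeat tract described by the motif string."""
    return sum(len(unit) * count for unit, count in parse_motif(motif))


print(parse_motif("(TCT)8(TCG)5"))          # [('TCT', 8), ('TCG', 5)]
print(repeat_tract_length("(ACA)7"))        # 21
print(repeat_tract_length("(TCT)8(TCG)5"))  # 39
```

The tract length covers only the repeat itself; the amplicon sizes in the table also include the flanking sequence between the two primers.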
cDNA 454 assembly statistics of the L1, L2 and L1L2 libraries
| Statistic | L1 | L2 | L1L2 |
|---|---|---|---|
| Number of sequenced bases | 205,618,165 | 530,678,975 | 736,297,140 |
| Number of reads | 755,206 | 1,468,202 | 2,213,408 |
| Number of reads assembled | 604,869 | 1,345,892 | 1,964,517 |
| Read average length (bp) | 276 | 361 | 332 |
| Number of contigs | 26,975 | 43,674 | 71,655 |
| Contig average size (bp) | 589 | 986 | 901 |
| Number of isotigs | 21,235 | 35,191 | 55,309 |
| Isotig average size (bp) | 589 | 986 | 901 |
| Number of isogroups | 15,295 | 24,653 | 36,886 |
| Isogroup average size (bp) | 589 | 989 | 905 |
| Average number of reads by contig | 22.4 | 30.8 | 27.4 |
| %GC | 30.7 | 39.9 | 37.5 |
| Annotated sequences | | | 32,862 |
| GBrowse mapped sequences | 25,400 | | |
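The share of reads that entered each assembly is not listed in the table, but it follows directly from the "Number of reads" and "Number of reads assembled" rows. The sketch below is an illustrative derivation from the tabulated counts, not a figure reported by the authors. It assumes the three value columns are L1, L2 and L1L2 in that order, as in the table title.

```python
# Counts copied from the assembly statistics table (columns assumed: L1, L2, L1L2).
reads = {"L1": 755_206, "L2": 1_468_202, "L1L2": 2_213_408}
assembled = {"L1": 604_869, "L2": 1_345_892, "L1L2": 1_964_517}


def percent_assembled(library: str) -> float:
    """Percentage of sequenced reads that were incorporated into the assembly."""
    return 100 * assembled[library] / reads[library]


for library in reads:
    print(f"{library}: {percent_assembled(library):.1f}% of reads assembled")
```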
Figure 1 GO term annotations for L1L2. Isotigs were grouped under three categories: (a) molecular function, (b) biological processes, and (c) cellular components. Numbers in parentheses indicate the number of positive matches for each function.
Figure 2 Venn diagram summarizing the distribution of tBLASTx matches between L. luteus and four model species. Numbers following the model species correspond to the size of the respective database. Numbers within the Venn diagram indicate the number of sequences sharing similarity using tBLASTx. Numbers in parentheses indicate the percentage of matches relative to the total number of L. luteus sequences.
Figure 3 Microsyntenic DNA fragments mapped on the Medicago genome using a GBrowse platform. (a) L. luteus microsyntenic region 13 on M. truncatula chromosome 1; (b) L. luteus microsyntenic region 5 on M. truncatula chromosome 1; (c) L. luteus microsyntenic region 11 on M. truncatula chromosome 2.
Features of EST-SSRs identified in the assembled L1L2 library
| Feature | Value |
|---|---|
| Total number of examined sequences | 55,309 |
| Estimated transcriptome screened (kbp) | 49,841 |
| Number of sequences containing SSRs | 2,572 |
| Number of identified SSRs | 2,774 |
| Number of EST-SSRs in coding regions | 1,435 |
| Number of sequences containing more than one SSR | 147 |
| Number of SSRs present in compound formation | 195 |
| Frequency of SSRs in transcriptome | 1/18 kbp |
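The reported frequency of one SSR per 18 kbp can be reproduced from two other rows of the table: the estimated transcriptome screened (49,841 kbp) divided by the number of identified SSRs (2,774). The sketch below shows that arithmetic. The two percentages it also prints are illustrative derivations from the tabulated counts, not values stated in the source.

```python
# Values copied from the EST-SSR features table.
TRANSCRIPTOME_KBP = 49_841   # estimated transcriptome screened (kbp)
N_SSRS = 2_774               # identified SSRs
N_SEQS = 55_309              # examined sequences (isotigs)
N_SEQS_WITH_SSR = 2_572      # sequences containing at least one SSR
N_SSRS_CODING = 1_435        # EST-SSRs located in coding regions

# Average spacing between SSRs, in kbp per SSR.
kbp_per_ssr = TRANSCRIPTOME_KBP / N_SSRS

# Derived shares (not reported in the table itself).
pct_seqs_with_ssr = 100 * N_SEQS_WITH_SSR / N_SEQS
pct_ssrs_coding = 100 * N_SSRS_CODING / N_SSRS

print(f"1 SSR per {kbp_per_ssr:.1f} kbp")  # 1 SSR per 18.0 kbp
print(f"{pct_seqs_with_ssr:.2f}% of sequences contain an SSR")
print(f"{pct_ssrs_coding:.1f}% of SSRs lie in coding regions")
```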
Distribution of repeat types and number of repeats within the L1L2 library
| Di-nucleotide | | | 363 | 204 | 120 | 72 | 91 | 851 (30.7) |
| Tri-nucleotide | | 826 | 369 | 131 | 69 | 25 | 57 | 1477 (53.2) |
| Tetra-nucleotide | | 43 | 9 | 3 | 1 | 2 | 8 | 66 (2.4) |
| Penta-nucleotide | 129 | 46 | 6 | 3 | 9 | 6 | 12 | 209 (7.5) |
| Hexa-nucleotide | 105 | 26 | 11 | 3 | 9 | 5 | 13 | 171 (6.2) |
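The final column of the table gives the total count for each repeat type, with its percentage share in parentheses. As a quick consistency check, the sketch below recomputes those percentages from the totals and confirms that they sum to the 2,774 SSRs identified in the L1L2 library.

```python
# Totals per repeat type, copied from the last column of the table.
totals = {
    "Di-nucleotide": 851,
    "Tri-nucleotide": 1477,
    "Tetra-nucleotide": 66,
    "Penta-nucleotide": 209,
    "Hexa-nucleotide": 171,
}

grand_total = sum(totals.values())

# Percentage share of each repeat type, rounded to one decimal as in the table.
percentages = {
    repeat_type: round(100 * count / grand_total, 1)
    for repeat_type, count in totals.items()
}

print(grand_total)  # 2774
for repeat_type, pct in percentages.items():
    print(f"{repeat_type}: {totals[repeat_type]} ({pct})")
```

Tri-nucleotide repeats account for just over half of all SSRs, which matches the 53.2% given in the table.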
Figure 4 Alignment of , and containing several repeat motifs. (a) isotig03739 with GA and AGA motifs; (b) isotig16318 with a TAA motif; and (c) isotig21236 with a GAA motif.
Figure 5 Neighbour-joining tree relating the 64 accessions included in the diversity study. Numbers above branches correspond to bootstrap values. Accessions are identified by the letter L followed by a number. Abbreviations next to accessions identify the country of origin based on seed bank or breeding histories (RUS: Russia, ISRL: Israel, HUNG: Hungary, CHIL: Chile, GER: Germany, SPN: Spain, PORT: Portugal, MORO: Morocco, POL: Poland, BYS: Belarus, UKR: Ukraine). The scale is in distance units.