João Vitor Maldonado dos Santos, Babu Valliyodan, Trupti Joshi, Saad M Khan, Yang Liu, Juexin Wang, Tri D Vuong, Marcelo Fernandes de Oliveira, Francismar Corrêa Marcelino-Guimarães, Dong Xu, Henry T Nguyen, Ricardo Vilela Abdelnoor.
Abstract
BACKGROUND: Soybean [Glycine max (L.) Merrill] is one of the most important legumes cultivated worldwide, and Brazil is one of the main producers of this crop. Since the sequencing of its reference genome, interest in structural and allelic variations of cultivated and wild soybean germplasm has grown. To investigate the genetics of the Brazilian soybean germplasm, we selected soybean cultivars based on the year of commercialization, geographical region and maturity group and resequenced their genomes.
Year: 2016; PMID: 26872939; PMCID: PMC4752768; DOI: 10.1186/s12864-016-2431-x
Source DB: PubMed; Journal: BMC Genomics; ISSN: 1471-2164; Impact factor: 3.969
Fig. 1 Summary of the major modifications caused by SNPs and InDels. a SNPs (blue) and InDels (red) distributed among the 20 soybean chromosomes. b Numbers of transition/transversion mutations: pyrimidine/purine (blue), purine/pyrimidine (red), pyrimidine/pyrimidine (green) and purine/purine (purple). c Percentage of SNPs per region in the soybean genome. d Percentage of InDels per region in the soybean genome
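The four substitution classes in panel b follow from the chemical class of the reference and alternate bases: purine/purine and pyrimidine/pyrimidine changes are transitions, while purine/pyrimidine and pyrimidine/purine changes are transversions. A minimal sketch of that classification, assuming SNPs are available as (reference, alternate) base pairs; the function names are illustrative and are not taken from the paper:

```python
from collections import Counter

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def base_class(base: str) -> str:
    """Return 'purine' or 'pyrimidine' for a single DNA base."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"not an unambiguous DNA base: {base!r}")


def substitution_class(ref: str, alt: str) -> str:
    """Label a SNP as '<ref class>/<alt class>', e.g. 'purine/pyrimidine'."""
    if ref.upper() == alt.upper():
        raise ValueError("reference and alternate bases are identical")
    return f"{base_class(ref)}/{base_class(alt)}"


def is_transition(ref: str, alt: str) -> bool:
    """A transition keeps the chemical class; a transversion changes it."""
    return ref.upper() != alt.upper() and base_class(ref) == base_class(alt)


def count_classes(snps) -> Counter:
    """Tally substitution classes over an iterable of (ref, alt) pairs."""
    return Counter(substitution_class(ref, alt) for ref, alt in snps)
```

Calling `count_classes` on the (ref, alt) pairs of a variant file would give the four tallies of the kind plotted in panel b.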
Total SNPs and InDels for each Brazilian soybean cultivar
| Cultivar | SNPs: non-coding region | SNPs: coding region | SNPs: none | SNPs: total | InDels: non-coding region | InDels: coding region | InDels: none | InDels: total |
|---|---|---|---|---|---|---|---|---|
|  | 848,752 | 5,231 | 99,327 | 953,310 | 203,207 | 25,458 | 1,390 | 230,055 |
|  | 1,267,827 | 8,302 | 158,063 | 1,434,192 | 300,160 | 38,513 | 2,088 | 340,761 |
|  | 1,166,390 | 7,772 | 148,274 | 1,322,436 | 275,225 | 35,666 | 1,779 | 312,670 |
|  | 778,152 | 6,204 | 100,434 | 884,790 | 196,677 | 25,615 | 1,584 | 223,876 |
|  | 1,078,635 | 7,169 | 128,689 | 1,214,493 | 252,198 | 32,059 | 1,731 | 285,988 |
|  | 1,017,791 | 5,595 | 115,310 | 1,138,696 | 222,364 | 27,243 | 1,354 | 250,961 |
|  | 1,318,067 | 8,113 | 162,389 | 1,488,569 | 311,172 | 39,236 | 1,909 | 352,317 |
|  | 1,321,233 | 7,824 | 158,653 | 1,487,710 | 296,309 | 38,087 | 1,812 | 336,208 |
|  | 839,016 | 5,276 | 102,261 | 946,553 | 202,591 | 26,176 | 1,377 | 230,144 |
|  | 1,373,660 | 8,511 | 159,418 | 1,541,589 | 299,360 | 37,447 | 1,905 | 338,712 |
|  | 1,296,919 | 7,775 | 152,967 | 1,457,661 | 304,194 | 38,021 | 1,898 | 344,113 |
|  | 1,273,571 | 7,868 | 157,019 | 1,438,458 | 301,255 | 38,665 | 1,891 | 341,811 |
|  | 1,326,229 | 8,574 | 151,052 | 1,485,855 | 299,477 | 36,717 | 2,006 | 338,200 |
|  | 1,376,297 | 8,346 | 162,189 | 1,546,832 | 314,856 | 39,732 | 1,909 | 356,497 |
|  | 1,305,772 | 8,447 | 150,706 | 1,464,925 | 298,825 | 37,151 | 2,023 | 337,999 |
|  | 1,338,601 | 7,955 | 159,887 | 1,506,443 | 317,096 | 39,952 | 1,952 | 359,000 |
|  | 1,414,796 | 9,372 | 165,725 | 1,589,893 | 327,783 | 40,606 | 2,221 | 370,610 |
|  | 1,091,441 | 7,882 | 136,232 | 1,235,555 | 264,083 | 33,767 | 1,807 | 299,657 |
|  | 1,208,853 | 7,825 | 144,216 | 1,360,894 | 281,240 | 35,661 | 1,758 | 318,659 |
|  | 1,241,667 | 8,250 | 153,768 | 1,403,685 | 291,494 | 37,394 | 1,761 | 330,649 |
|  | 1,341,883 | 8,165 | 160,115 | 1,510,163 | 312,875 | 39,400 | 1,922 | 354,197 |
|  | 1,279,510 | 7,546 | 150,883 | 1,437,939 | 295,688 | 37,126 | 1,791 | 334,605 |
|  | 1,162,970 | 8,576 | 141,328 | 1,312,874 | 275,267 | 34,213 | 1,974 | 311,454 |
|  | 949,130 | 6,184 | 108,680 | 1,063,994 | 222,730 | 27,813 | 1,398 | 251,941 |
|  | 1,341,733 | 8,376 | 156,262 | 1,506,371 | 301,238 | 37,248 | 1,942 | 340,428 |
|  | 1,168,303 | 8,209 | 139,875 | 1,316,387 | 283,149 | 35,058 | 1,873 | 320,080 |
|  | 1,485,334 | 9,409 | 178,350 | 1,673,093 | 340,609 | 42,878 | 2,177 | 385,664 |
|  | 1,008,968 | 6,154 | 107,398 | 1,122,520 | 231,321 | 27,871 | 1,396 | 260,588 |
Non-coding region: allelic variations up to 5 kb upstream or downstream of genes, and modifications in intergenic regions; Coding region: modifications in UTRs, exons, introns, and splice sites; None: no description available for the region
Fig. 2 Twenty-four SNPs identified in the E1-E3 loci and in the regulatory region of the E4 gene. Upstream: SNPs detected up to 5 kb upstream of the coding region; Non-synonymous: SNP variants that change a codon so that it encodes a different amino acid; Intron: SNPs detected inside an intron; 3’ UTR: SNPs found in the 3’ UTR; 5’ UTR: SNPs found in the 5’ UTR; Splice Site Region: sequence variants in which a change has occurred within the region of the splice site, either within 1-3 bases of the exon or 3-8 bases of the intron
Fig. 3 Population structure analysis of 28 Brazilian soybean cultivars. a Neighbor-joining phylogenetic tree generated for the 28 Brazilian soybean accessions. b Principal Component Analysis (PCA) of the 28 Brazilian soybean cultivars. c Bayesian clustering (FastStructure, K = 3) for the 28 Brazilian soybean cultivars
Summary of regions under positive selection, with FST and θπ values
| Chromosome | Start | End | Number of SNPs | θπ (oldest cultivars) | θπ (newest cultivars) | FST |
|---|---|---|---|---|---|---|
| 07 | 40,100,001 | 40,110,000 | 41 | 0.00219 | 0.00000 | 0.7071 |
| 07 | 40,110,001 | 40,120,000 | 12 | 0.00064 | 0.00000 | 0.7071 |
| 07 | 40,140,001 | 40,150,000 | 26 | 0.00139 | 0.00000 | 0.7071 |
| 07 | 40,150,001 | 40,160,000 | 31 | 0.00165 | 0.00013 | 0.7071 |
| 07 | 40,160,001 | 40,170,000 | 36 | 0.00192 | 0.00006 | 0.7071 |
| 07 | 40,630,001 | 40,640,000 | 21 | 0.00112 | 0.00000 | 0.7071 |
| 15 | 2,950,001 | 2,960,000 | 35 | 0.00187 | 0.00014 | 0.7071 |
| 15 | 2,960,001 | 2,970,000 | 16 | 0.00085 | 0.00000 | 0.7071 |
| 17 | 3,010,000 | 3,020,000 | 17 | 0.00060 | 0.00000 | 0.8695 |
| 17 | 3,030,001 | 3,040,000 | 23 | 0.00082 | 0.00000 | 0.8194 |
| 17 | 3,040,001 | 3,050,000 | 41 | 0.00146 | 0.00002 | 0.8695 |
| 17 | 3,050,001 | 3,060,000 | 13 | 0.00046 | 0.00000 | 0.7486 |
| 17 | 5,560,001 | 5,570,000 | 76 | 0.00279 | 0.00000 | 0.8620 |
| 17 | 5,570,001 | 5,580,000 | 31 | 0.00110 | 0.00000 | 0.8695 |
| 17 | 5,580,001 | 5,590,000 | 26 | 0.00092 | 0.00000 | 0.8695 |
| 17 | 5,610,001 | 5,620,000 | 22 | 0.00078 | 0.00000 | 0.8275 |
| 17 | 5,620,001 | 5,630,000 | 34 | 0.00121 | 0.00000 | 0.8695 |
| 17 | 5,660,001 | 5,670,000 | 39 | 0.00140 | 0.00000 | 0.8677 |
| 17 | 5,670,001 | 5,680,000 | 26 | 0.00092 | 0.00000 | 0.8695 |
| 17 | 5,680,001 | 5,690,000 | 35 | 0.00128 | 0.00000 | 0.8383 |
| 17 | 5,710,001 | 5,720,000 | 28 | 0.00100 | 0.00003 | 0.8695 |
| 17 | 5,730,001 | 5,740,000 | 20 | 0.00070 | 0.00000 | 0.8321 |
| 17 | 5,740,001 | 5,750,000 | 45 | 0.00160 | 0.00004 | 0.8695 |
| 17 | 5,750,001 | 5,760,000 | 26 | 0.00094 | 0.00000 | 0.8572 |
| 17 | 5,760,001 | 5,770,000 | 74 | 0.00263 | 0.00000 | 0.8695 |
| 17 | 5,770,001 | 5,780,000 | 24 | 0.00088 | 0.00001 | 0.8636 |
| 17 | 5,780,001 | 5,790,000 | 39 | 0.00139 | 0.00000 | 0.8676 |
| 17 | 5,790,001 | 5,800,000 | 25 | 0.00089 | 0.00000 | 0.8695 |
| 17 | 5,800,001 | 5,810,000 | 63 | 0.00224 | 0.00000 | 0.8671 |
| 17 | 5,810,001 | 5,820,000 | 50 | 0.00178 | 0.00000 | 0.8695 |
| 17 | 5,820,001 | 5,830,000 | 48 | 0.00171 | 0.00000 | 0.8695 |
| 17 | 5,830,001 | 5,840,000 | 48 | 0.00171 | 0.00003 | 0.8679 |
| 17 | 5,840,001 | 5,850,000 | 27 | 0.00096 | 0.00000 | 0.8695 |
| 17 | 5,850,001 | 5,860,000 | 24 | 0.00085 | 0.00007 | 0.8695 |
| 17 | 5,860,001 | 5,870,000 | 69 | 0.00249 | 0.00010 | 0.8664 |
| 17 | 5,870,001 | 5,880,000 | 32 | 0.00114 | 0.00000 | 0.8695 |
| 17 | 5,880,001 | 5,890,000 | 66 | 0.00238 | 0.00000 | 0.8663 |
| 17 | 5,890,001 | 5,900,000 | 76 | 0.00270 | 0.00003 | 0.8447 |
| 17 | 5,900,001 | 5,910,000 | 58 | 0.00206 | 0.00000 | 0.8695 |
| 17 | 5,910,001 | 5,920,000 | 14 | 0.00050 | 0.00007 | 0.8050 |
| 18 | 2,190,001 | 2,200,000 | 107 | 0.00571 | 0.00010 | 0.7032 |
FST: population fixation index coefficient; θπ: nucleotide diversity; oldest cultivars: Brazilian soybean cultivars released before 1980; newest cultivars: Brazilian soybean cultivars released after 2000
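To make the two statistics in the table concrete, the sketch below computes a per-window nucleotide diversity (θπ) and a fixation index between two groups from alternate-allele counts. It is only an illustration under stated assumptions: this section does not say which estimators or software the authors used, so a textbook sample-size-corrected heterozygosity and a simple Nei-style ratio FST = (H_T − H_S) / H_T are used as stand-ins, and all function and parameter names are hypothetical.

```python
from typing import Sequence


def site_pi(alt_count: int, n: int) -> float:
    """Per-site diversity for a biallelic SNP among n sampled chromosomes,
    with the usual n / (n - 1) small-sample correction."""
    if n < 2:
        return 0.0
    p = alt_count / n
    return 2.0 * p * (1.0 - p) * n / (n - 1)


def window_pi(alt_counts: Sequence[int], n: int, window_bp: int) -> float:
    """Average per-base diversity over a window of window_bp bases.
    Sites not listed in alt_counts are treated as monomorphic."""
    return sum(site_pi(c, n) for c in alt_counts) / window_bp


def window_fst(alt_a: Sequence[int], n_a: int,
               alt_b: Sequence[int], n_b: int) -> float:
    """Nei-style FST = (H_T - H_S) / H_T, summed over the SNPs of a window
    (assumed estimator, not necessarily the one used in the paper)."""
    num = 0.0
    den = 0.0
    for ca, cb in zip(alt_a, alt_b):
        pa, pb = ca / n_a, cb / n_b
        h_s = (2 * pa * (1 - pa) + 2 * pb * (1 - pb)) / 2  # mean within-group
        p_bar = (pa + pb) / 2
        h_t = 2 * p_bar * (1 - p_bar)                      # pooled
        num += h_t - h_s
        den += h_t
    return num / den if den > 0 else 0.0
```

With 10 kb windows like those in the table, `window_bp` would be 10,000; a window where one group is fixed for one allele and the other group is variable gives θπ near zero in the fixed group and a high FST, which is the pattern the table reports for the newest cultivars.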
Fig. 4 Two regions on chromosome 17 under positive selection, at 3.01-3.09 Mb (a) and 5.53-5.92 Mb (b). The red line shows the nucleotide diversity of the newest cultivars and the blue line that of the oldest cultivars. The black line shows the FST values between the oldest and newest cultivars
Fig. 5 Copy number variations detected on chromosome 16 for the oldest and newest Brazilian cultivars. The x-axis represents the genomic position and the y-axis the log-ratio of the read counts. The red dots are the copy number calls for each segment
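The log-ratio on the y-axis can be illustrated with a short sketch. It assumes a base-2 logarithm, library-size normalization and a small pseudocount, none of which is specified in the caption; the function and parameter names are hypothetical and do not reflect the tool used by the authors.

```python
import math


def log2_ratio(test_count: float, ref_count: float,
               test_total: float, ref_total: float,
               pseudocount: float = 0.5) -> float:
    """Log2 ratio of depth-normalized read counts for one genomic window.

    test_count / ref_count: reads in the window for the two samples.
    test_total / ref_total: total mapped reads per sample (normalization).
    pseudocount: assumed small constant that avoids log(0) in empty windows.
    """
    test_norm = (test_count + pseudocount) / test_total
    ref_norm = (ref_count + pseudocount) / ref_total
    return math.log2(test_norm / ref_norm)
```

Under these assumptions a value near 0 indicates equal copy number, positive values suggest a gain in the test sample and negative values a loss.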
Number of unique SNPs, InDels and CNVs for each Brazilian soybean cultivar
| Name | SNPs | InDels: deletion | InDels: insertion | InDels: total | CNVs: deletion | CNVs: insertion | CNVs: total |
|---|---|---|---|---|---|---|---|
|  | 3,586 | 471 | 462 | 933 | 11 | 27 | 38 |
|  | 7,036 | 881 | 796 | 1,677 | 4 | 7 | 11 |
|  | 3,653 | 482 | 388 | 870 | 35 | 18 | 53 |
|  | 62,279 | 4,224 | 4,127 | 8,351 | 100 | 63 | 163 |
|  | 3,731 | 588 | 541 | 1,129 | 22 | 4 | 26 |
|  | 10,778 | 1,130 | 946 | 2,076 | 10 | 53 | 63 |
|  | 5,328 | 775 | 654 | 1,429 | 8 | 43 | 51 |
|  | 20,388 | 1,768 | 1,489 | 3,257 | 21 | 2 | 23 |
|  | 74,314 | 7,651 | 7,438 | 15,089 | 23 | 9 | 32 |
|  | 318 | 81 | 57 | 138 | 4 | 6 | 10 |
|  | 3,116 | 391 | 350 | 744 | 12 | 3 | 15 |
|  | 10,662 | 1,069 | 927 | 1,996 | 6 | 9 | 15 |
|  | 31,811 | 3,237 | 2,791 | 6,028 | 23 | 5 | 28 |
|  | 344 | 101 | 58 | 159 | 5 | 1 | 6 |
|  | 11,050 | 1,277 | 1,098 | 2,375 | 18 | 9 | 27 |
|  | 1,486 | 200 | 174 | 376 | 3 | 2 | 5 |
|  | 42,826 | 4,287 | 3,785 | 8,071 | 32 | 25 | 57 |
|  | 1,882 | 253 | 234 | 487 | 15 | 17 | 32 |
|  | 12,590 | 1,487 | 1,210 | 2,697 | 8 | 10 | 18 |
|  | 36,447 | 3,920 | 3,685 | 7,605 | 20 | 10 | 30 |
|  | 458 | 102 | 76 | 178 | 3 | 3 | 6 |
|  | 41,325 | 2,973 | 2,637 | 5,610 | 25 | 8 | 33 |
|  | 8,918 | 1,195 | 1,110 | 2,305 | 37 | 103 | 140 |
|  | 22,691 | 2,504 | 2,121 | 4,625 | 29 | 19 | 48 |
|  | 18,590 | 1,538 | 1,342 | 2,880 | 32 | 30 | 62 |
|  | 6,835 | 626 | 466 | 1,094 | 11 | 5 | 16 |
|  | 96,105 | 8,324 | 7,602 | 15,926 | 48 | 9 | 57 |
|  | 3,215 | 423 | 400 | 823 | 6 | 22 | 28 |