
Interacted QTL mapping in partial NCII design provides evidences for breeding by design.

Su Hong Bu1, Xinwang Zhao2, Can Yi1, Jia Wen1, Jinxing Tu2, Yuan Ming Zhang3.

Abstract

The utilization of heterosis in rice, maize and rapeseed has revolutionized crop production. Although elite hybrid cultivars are mainly derived from F1 crosses between two groups of parents, a scheme known as the NCII mating design, little is known about how interaction effects influence quantitative trait performance in such a population. To bridge genetic analysis with hybrid breeding, we integrated an interacted QTL mapping approach with breeding by design in a partial NCII mating design. All potential main and interacted effects were included in one full model. When the number of effects was very large, bulked segregant analysis was used to test which effects were associated with the trait. All the selected effects were then shrunk by empirical Bayesian estimation so that significant effects could be identified. A series of Monte Carlo simulations was performed to validate the new method. Furthermore, all the significant effects were used to calculate the genotypic values of all the missing F1 hybrids, and the F1 phenotypic or genotypic values were used to predict elite parents and parental combinations. Finally, the new method was used to dissect the genetic basis of oil content in 441 rapeseed parents and 284 F1 hybrids. As a result, 8 main-effect QTL and 37 interacted QTL were found and used to predict 10 elite restorer lines, 10 elite sterile lines and 10 elite parental crosses. Similar results across methods and in previous studies, together with a high correlation coefficient (0.76) between the predicted and observed phenotypes, support the proposed method.

Year:  2015        PMID: 25822501      PMCID: PMC4379165          DOI: 10.1371/journal.pone.0121034

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Genetic mating designs play an important role in crop genetics and breeding. In hybrid breeding, such designs have been widely used to evaluate the general combining ability of parents and the specific combining ability of the F1 between two parents, and many elite hybrid cultivars have been bred and used in crop production as a result. In classical quantitative genetics, the mating design is one of the main components and provides much information about second-order genetic parameters for quantitative traits. However, only the collective effects of all the polygenes are estimated. The introduction of molecular markers has facilitated the mapping of quantitative trait loci (QTL) in numerous species, and substantial progress has been achieved in bi-parental segregation populations but not in mating designs. North Carolina (NC) design II is considered one of the most powerful mating designs for combining ability and heterosis analyses, and its application has expanded to many crops. Therefore, there is a critical need for in-depth study of the methodology for mapping QTL in this design. During the past several decades, many approaches have been developed to detect QTL for quantitative traits in bi-parental segregation populations, for example, single marker analysis [1], two-marker analysis [2], interval mapping [3], composite interval mapping [4,5], multiple interval mapping [6], Bayesian methods [7,8], and Bayesian-based likelihood approaches [9-11]. However, a bi-parental segregation population is rarely used alone in commercial breeding, so the results from these single-cross experiments have a limited role in breeding practice [12]. To overcome this shortcoming, crop cultivar populations have been used to conduct genome-wide association studies (GWAS) [13,14], and the main and interacted effects of the detected QTL have been used to carry out breeding by design [14-16]. 
However, this approach is useful only for inbred line breeding and not for hybrid breeding, because only additive and additive-by-additive (aa) effects [14], and not dominance-related effects [17-21], were estimated in the above GWAS. To date, many mating designs have been proposed, such as the four-way cross [22], triple testcross (TTC) [23], diallel design [24,25], and NC mating design [26]. Current studies on the topic focus mainly on three aspects. The first is classical genetic analysis [27-29], for example, combining ability analysis [30]. These analyses dissected the genetic foundation of quantitative traits and provided useful information for crop breeding. However, the position and effect of individual QTL remain unclear. QTL mapping addresses this question and is the second main area of work on mating designs. In the four-way cross, He et al. [12] extended the main-effect QTL mapping of Xu [31] to interacted QTL mapping. In the TTC, Z1, Z2 and Z3 were used to detect augmented additive, augmented dominant and dominance-by-additive (da) interaction effects, respectively, in Kusterer et al. [32] and Melchinger et al. [33], and to obtain unbiased estimates of all the main and epistatic effects in He et al. [34]. In the NCIII, pair mean Z1 and pair difference Z2 were used to detect the augmented additive effect and augmented dominant effect in Melchinger et al. [35], and epistasis in Garcia et al. [36] and He et al. [37]. In addition, Rebaï & Goffinet [38] and Lenarcic et al. [39] developed a general regression-based method and a Bayesian approach, respectively, for QTL detection in the diallel design; Li et al. [40] and Wang et al. [41] proposed an analysis of variance approach for the detection of main and interacted QTL of quantitative and endosperm traits in NCIII and TTC, respectively; and Reif et al. [42] used TTC with near-isogenic lines as the base population to detect epistasis. 
However, relatively little is known about the NCII mating design. Recently, different base populations along with several testers were used to dissect the genetic foundation of heterosis through QTL mapping of GCA and SCA, such as BC1F8 [43], RIL [44] and introgression lines [45,46]. More recently, a comparison across different base populations was conducted as well [47]. However, these studies are based on a main-effect QTL model. The third aspect is the adoption of mating designs in crop breeding, for example, the four-way cross in maize breeding, and the NCII mating design in rice and maize hybrid breeding. However, genetic analysis and crop breeding have often been performed separately. In hybrid breeding, one group of parents is crossed with another group of parents, namely the NCII mating design, in order to select elite hybrid crosses. In breeding practice, however, only partial (or unbalanced) sets of crosses are often made. In this study, partial F1 hybrids along with their parents, a partial NCII mating design, were used to conduct genetic analysis. All the main and interacted effects were included in one full model, and two-way interaction effects between each pair of QTL were treated as the interaction terms. If the number of effects in the model was more than 10 times the sample size, two groups of extreme individuals were used to test which effects were related to the trait. All the selected effects were then shrunk by empirical Bayesian estimation so that significant effects could be identified. A series of Monte Carlo simulations was performed to validate the new method. The validated approach was used to dissect the genetic basis of oil content in rapeseed in a partial NCII mating design. Based on this information, novel parents and cross combinations could be predicted.

Results

Effect of QTL heritability on mapping QTL

In the first simulation experiment, the effect of QTL heritability on QTL mapping in the NCII population was evaluated by setting the QTL heritability to 0.02, 0.05 and 0.08. Note that the number of effects in the full model is 84,050, which is about 116 times the sample size. In this situation, the high throughput QTL-effect screening approach described in this study was adopted. The results are shown in Fig. 1 and Supporting Information S1 Table. A general trend was found: the power of QTL detection increases as QTL heritability increases. Relatively small estimates of the average and standard deviation of absolute bias between estimated and true effects, as well as of the false positive rate (FPR), were observed in all three situations. At the same QTL heritability, the power is higher for main-effect QTL than for interacted QTL. In addition, the lowest power was observed for the dominant-by-dominant (dd) interaction in all situations. This may be due to the low proportion of heterozygous genotypes in the mapping population.
Fig 1

Effect of QTL heritability on mapping QTL in the NCII.

Power of QTL detection (a); false positive rate (b); and average (c) and standard deviation (d) of absolute bias between estimated and true effects.


Effect of sample size on mapping QTL

In the second simulation experiment, we explored the effect of sample size on QTL mapping by setting the sample size to 400, 500 and 600. Note that the proportion of paternal lines, maternal lines and their hybrids was set at 1:2:2, as specified in the simulation design (for example, 80 restorer lines + 160 sterile lines + 160 F1 hybrids for a sample size of 400). The other settings were the same as those in the first simulation experiment. The results are shown in Fig. 2 and Supporting Information S2 Table. A general trend was again found: the power of QTL detection increases as sample size increases. Relatively small estimates of the average and standard deviation of absolute bias between estimated and true effects, as well as of the FPR, were found in all three situations.
Fig 2

Effect of sample size on mapping QTL in the NCII.

Power of QTL detection (a); false positive rate (b); and average (c) and standard deviation (d) of absolute bias between estimated and true effects.


Effect of population structure on mapping QTL

In the third simulation experiment, we investigated the effect of population structure on QTL mapping. The results are shown in Fig. 3 and Supporting Information S3 Table. If only the parents were used as the mapping population, only additive and aa effects could be detected. This is reasonable, because only homozygous genotypes are present in such a population. If only the F1 hybrids were used as the mapping population, additive and dominant effects could be identified with satisfactory power. Although all kinds of interacted effects could be detected, their powers were not high, with the highest power for the aa effect and the lowest for the dd effect. Meanwhile, the average and standard deviation of absolute bias between estimated and true effects were larger for dominance-related effects than for additive and aa effects. If parents and F1 hybrids were mixed in equal proportion, the powers for detecting aa, additive-by-dominant (ad) and da effects were significantly higher than those obtained with the F1 hybrids alone.
Fig 3

Effect of population structure on mapping QTL in the NCII.

Power of QTL detection (a); false positive rate (b); and average (c) and standard deviation (d) of absolute bias between estimated and true effects.


Mapping QTL for oil content in Brassica napus

A Brassica napus breeding dataset from Professor Jinxing Tu at Huazhong Agricultural University was used for further demonstration. The dataset was collected from a partial NCII breeding design that contained 298 sterile lines, 143 restorer lines and their 284 F1 hybrids. The phenotype analyzed was oil content. A total of 205 markers were used in the analysis, so the total number of effects included in the genetic model is 84,050. Therefore, the high throughput QTL-effect screening approach described in this study was adopted. In the first step, the 72 parental lines or F1 hybrids with the highest oil content and the 72 with the lowest oil content were selected from the mapping population. Using these two groups of individuals, a χ2 test of independence was carried out to test whether the marker under consideration was associated with the trait. In the second step, all the markers associated with the trait were included in the genetic model for the full mapping population, so that association mapping could be conducted. All the results are listed in Table 1.
Table 1

Mapping QTL for rapeseed oil content in partial NCII mating design.

| QTL | Type | Marker (locus 1) | Marker (locus 2) | Allelic frequency (locus 1) | Allelic frequency (locus 2) | BSA χ2 | BSA P-value | LOD | Effect | r2 (%) | Similar results with other methods (EBLASSO, GEMMA, regression-based, CV) | Similar results in previous studies |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | a | CB10597C | | 0.54 | | 17.46 | 0.0002 | 3.79 | 0.2619 | 1.19 | | Ra2E04~S13M08-1-70 [56]; e8m17-309~Ra3H09 [51] |
| 2 | a | Bo3b | | 0.40 | | 15.52 | 0.0004 | 3.13 | −0.2680 | 1.10 | | |
| 3 | a | xy2b | | 0.73 | | 13.40 | 0.0012 | 11.39 | 0.5597 | 3.99 | | |
| 4 | a | BnGMS488A | | 0.86 | | 11.01 | 0.0041 | 9.27 | 0.5831 | 2.77 | | |
| 5 | a | BRAS078A | | 0.05 | | 8.82 | 0.0122 | 14.04 | −1.2047 | 4.12 | | BRAS078A [50]; e8m17-309~Ra3H09 [51]; AgCan20a~Bras026 [52] |
| 6 | a | Ol10-C01 | | 0.84 | | 7.72 | 0.0211 | 7.60 | −0.4916 | 2.22 | | HMR612a~HMR612b [53]; E7M5c~sN0240A [50] |
| 7 | a | Bo2d | | 0.64 | | 6.68 | 0.0354 | 4.00 | −0.2852 | 1.28 | | |
| 8 | d | CB10597C | | 0.54 | | 4.43 | 0.0352 | 3.10 | −0.5672 | 1.07 | | |
| 9 | aa | CN64c | SA63a | 0.26 | 0.32 | 18.42 | 0.0001 | 6.28 | −0.3402 | 1.77 | | HMR300c~MR133.2 [53] |
| 10 | aa | BRAS063a | Na10-C06A | 0.29 | 0.62 | 12.68 | 0.0018 | 2.23 | −0.2095 | 0.68 | * | Bras063a [52]; EM16/EM17c~sN12353b [54]; GIFLP106 [57]; Na10-C06 [51] |
| 11 | aa | Bo3a | CB10288B | 0.61 | 0.43 | 12.43 | 0.0020 | 7.25 | 0.3888 | 2.10 | * | HMR403b~MR229 [53] |
| 12 | aa | BnGMS175A | Ra2-G08A | 0.59 | 0.25 | 12.09 | 0.0024 | 4.88 | −0.3124 | 1.42 | | |
| 13 | ad | CB10431A | CB10234A | 0.82 | 0.71 | 11.86 | 0.0027 | 3.48 | 0.6691 | 1.22 | | IGF9014c~pw179b [56]; CB10431 [52]; CB10234 [52]; CB10234~ZAAS763 [50] |
| 14 | aa | 20-1c | Ra3-E05D | 0.34 | 0.17 | 11.49 | 0.0032 | 3.57 | 0.2579 | 1.01 | * * | ODD20/GB2-238~EM1/PM4-400 [55] |
| 15 | aa | BRAS014B | Na14-H11B | 0.46 | 0.85 | 11.18 | 0.0037 | 4.14 | 0.2708 | 1.10 | | IGF9014c~pw179b [56]; PM88/PM34-484~SA7/PM56-466 [55]; Bras115~W09.CD1 [52]; Ol10F04~B0SF1574 [57]; Na14H11 [57]; CN32a~E7M5f [50] |
| 16 | da | CB10139B | BnGMS3B | 0.39 | 0.83 | 10.80 | 0.0045 | 4.20 | 0.7397 | 1.01 | | SF19775 [57]; PM88/PM34-378~CB10139-176 [55] |
| 17 | aa | CB10229A | Ra3-E05A | 0.63 | 0.50 | 10.70 | 0.0048 | 6.79 | −0.3869 | 2.15 | | snap1200~GIFzip47a [57]; SA89~Na12C01a [51] |
| 18 | aa | CB10036B | CB10343A | 0.25 | 0.75 | 12.29 | 0.0021 | 3.73 | 0.2794 | 1.18 | * | IGF5154c~pX141eE [56]; AgCan9~Bras102b [52]; EM2/ME14a~Ra2-A01 [54]; CZ1b705119~CZ1b684396 [52] |
| 19 | aa | CN75b | CB10373B | 0.67 | 0.51 | 12.28 | 0.0022 | 5.46 | 0.3431 | 1.61 | | HMR438a~HMR310 [53] |
| 20 | aa | CN1b | Ra2-E12B | 0.67 | 0.64 | 10.83 | 0.0045 | 6.46 | −0.3794 | 1.87 | * | EM1/BG1-405~SA7/PM63-298 [55] |
| 21 | aa | CN64d | 20-1b | 0.58 | 0.46 | 10.65 | 0.0049 | 3.62 | 0.2597 | 1.00 | * | |
| 22 | aa | Ol12-D05B | BnGMS385B | 0.60 | 0.82 | 10.55 | 0.0051 | 3.49 | −0.3293 | 1.07 | * | e18m6-189~CB10028 [51]; E32M48.255~CB10179 [52]; niab013~SF25867 [57] |
| 23 | aa | MR119d | Na12-A02B | 0.46 | 0.56 | 10.16 | 0.0062 | 5.16 | −0.3079 | 1.44 | | ME16/EM17c~sN12353b [54]; Na12-A02 [51] |
| 24 | da | E5a | Na14-H11A | 0.46 | 0.74 | 9.19 | 0.0101 | 2.56 | 0.8083 | 0.76 | * | Na14-H11 [57]; MR216a~MR144 [52] |
| 25 | ad | xy2b | CB10343B | 0.40 | 0.50 | 9.19 | 0.0101 | 7.35 | 1.7274 | 1.83 | | |
| 26 | aa | MR097 | Ol12-F02B | 0.81 | 0.65 | 9.03 | 0.0110 | 5.66 | 0.3232 | 1.65 | | E46M64g~Bras089 [52]; B087P06-1~SA89 [51]; Ol12-F02A~Mr216b [52]; niab028~FITO516c [51] |
| 27 | aa | CB10493C | BnGMS340B | 0.15 | 0.46 | 8.78 | 0.0124 | 3.05 | 0.2502 | 0.96 | * | sORG49a [50] |
| 28 | da | CN64d | Ol11-C02A | 0.33 | 0.80 | 8.50 | 0.0143 | 3.05 | −0.6749 | 0.98 | * * | IGF9014c~E2HM32-320 [56] |
| 29 | aa | 32_1a | Na10-C06B | 0.82 | 0.54 | 11.58 | 0.0031 | 4.61 | 0.2997 | 1.37 | * | ODD20/PM16-97~ME2/PM45-384 [55]; e4m5-260~CB10632 [51]; ZAAS815b~ZAAS893 [50] |
| 30 | dd | Ra2E12 | CB10065B | 0.22 | 0.90 | 9.60 | 0.0019 | 3.00 | 1.1079 | 0.78 | | Ra2E12 [56]; FITO131 [51]; E38M621~AgCan50 [52]; CB10530a~EM9/ME37a [54]; CB10065 [52]; BoGMS1025~SF17359 [57] |
| 31 | aa | BnGMS352B | Ra3-E05C | 0.87 | 0.47 | 9.18 | 0.0102 | 2.63 | −0.2047 | 0.67 | * | e18m6-189~e18m5-374 [51]; ODD20/GB2-238~EM1/PM4-400 [55] |
| 32 | da | CB10045A | Ra3-E05B | 0.22 | 0.15 | 9.01 | 0.0111 | 4.09 | −0.8761 | 1.39 | | E2M3/g~EM11/Me23a [54]; E42M50.55~E41M50.206 [52]; sORG49a [50] |
| 33 | dd | Bo2d | BnGMS175A | 0.62 | 0.43 | 9.00 | 0.0027 | 2.13 | −0.5142 | 0.52 | *, * * | |
| 34 | aa | Bo3a | CB10065A | 0.68 | 0.05 | 12.45 | 0.0020 | 2.65 | 0.2238 | 0.77 | * * | CB10065 [52]; BoGMS1025~BrSF50-42 [57]; Na12G11~SN12508 [52]; R04.1840~R06.1360 [52] |
| 35 | aa | BRMS-036c | CB10364B | 0.13 | 0.96 | 11.40 | 0.0033 | 3.35 | −0.2488 | 0.95 | * | BRMS036 [50]; CB10364** [54]; H004I05-1~BnGMS312 [51] |
| 36 | ad | CB10493C | BnGMS3B | 0.94 | 0.61 | 9.60 | 0.0082 | 2.41 | 0.6348 | 0.96 | * | IGF2021e~S10M03-1-360 [56]; e10m22-313~CB10028 [51] |
| 37 | da | CN46b | CB10597C | 0.29 | 0.46 | 8.00 | 0.0183 | 2.89 | 0.7535 | 0.81 | | Ra2E04~S13M08-1-70 [56]; e8m17-309~Ra3H09 [51] |
| 38 | aa | BnGMS103B | CB10277B | 0.48 | 0.62 | 8.84 | 0.0120 | 3.51 | −0.2578 | 1.12 | *, * | CB10277 [54] |
| 39 | dd | Ra2-E12A | Ol11-C02A | 0.85 | 0.14 | 8.47 | 0.0036 | 3.35 | −0.9321 | 1.09 | | RA2E12 [56]; E38M621~Na12B05 [52]; IGF9014c~E2HM32-320 [56] |
| 40 | aa | MD21a | Ol12-D05A | 0.69 | 0.22 | 8.11 | 0.0173 | 3.62 | −0.4304 | 0.96 | | BrBAC138~GIFLP106 [57]; E32M48.255~SN11670 [52]; BRAS031a~E2M3e [50] |
| 41 | aa | BRAS063a | CB10139B | 0.73 | 0.67 | 11.89 | 0.0026 | 4.58 | −0.2801 | 1.20 | * | |
| 42 | aa | CN64d | CN59a | 0.08 | 0.22 | 8.66 | 0.0132 | 2.55 | −0.2134 | 0.68 | | |
| 43 | aa | xy2a | 20-1c | 0.25 | 0.86 | 12.32 | 0.0021 | 2.30 | 0.2178 | 0.71 | * | |
| 44 | aa | CN63b | Na12-A02C | 0.27 | 0.68 | 8.47 | 0.0145 | 2.68 | −0.2385 | 0.76 | * | Na12-A02 [51] |
| 45 | da | 20-1b | CB10026C | 0.65 | 0.80 | 12.22 | 0.0022 | 2.15 | 0.5195 | 0.74 | | CN32a~E7M5f [50]; I20.760~D20.760 [52]; sN3761b~sR6293b [56] |

a: additive; d: dominant; aa: additive-by-additive; ad: additive-by-dominant; da: dominant-by-additive; dd: dominant-by-dominant; r2: the proportion of total phenotypic variance explained by a single QTL.

EBLASSO: Fast empirical Bayesian LASSO [11];GEMMA: genome-wide efficient mixed-model association study [48]; Regression-based: Regression-based association study [49]; CV: cross validation;

√: same QTL was detected by other methods;

*: locus linked to the detected locus was identified by other methods.

Eight main-effect QTL and 37 interacted QTL were found to be associated with oil content, explaining 17.74% and 42.27% of the phenotypic variance, respectively. Across these QTL, the proportion of phenotypic variance explained by a single QTL ranged from 0.52% to 4.12%, the LOD score ranged from 2.13 to 14.04, and most QTL were additive (7 QTL) or aa (25 QTL). The small number of dominance-related QTL might be due to the low proportion (only 17.99%) of heterozygous genotypes in the mapping population. A correlation coefficient of 0.76 between the estimated genotypic values and the observations supports the proposed approach. To further confirm these results, three other approaches, namely EBLASSO [11], genome-wide efficient mixed-model association study [48] and regression-based association study [49], were used to re-analyze the dataset. All the results are shown in Table 1. Among the main-effect QTL, three were identified by all four methods and seven were detected by at least three approaches. Among the interacted QTL, one was detected by all four methods and eight were identified by at least two approaches; 14 were partially confirmed, because one same locus and two linked loci associated with these interacted QTL were found. More importantly, some similar results were observed in previous studies (Table 1). 
One marker linked to a main-effect QTL and 16 markers linked to interacted QTL in this study were the same as those in previous studies, and three markers linked to main-effect QTL and 32 markers linked to interacted QTL were close to those in previous studies [50-57]. For example, the additive QTL around marker BRAS078A was consistent with that in Zhao et al. [50], Wang et al. [51] and Delourme et al. [52]. Using the above mapping results, the genotypic values of all the missing F1 hybrids could be predicted. These predicted values were then used to calculate general combining ability (GCA) and specific combining ability (SCA). Based on these estimates, novel parents and elite F1 hybrids could be predicted (Table 2). Note that the novel restorer line R092 could produce the elite hybrid crosses B1341 × R092, B0984 × R092, B0641 × R092, B0857 × R092 and B0066 × R092.
Table 2

Elite restorer and sterile lines and hybrid combinations.

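| Elite restorer line ID | GCA | Elite sterile line ID | GCA | Elite hybrid combination ID | BV | SCA |
|---|---|---|---|---|---|---|
| R092 | 3.71 | B0393 | 1.67 | R092×B1341 | 49.32 | 1.69 |
| R587 | 2.03 | B1053 | 1.66 | R465×B0680 | 48.52 | 3.32 |
| R552 | 1.73 | B1341 | 1.65 | R092×B0984 | 48.30 | 0.80 |
| R110 | 1.71 | B416 | 1.62 | R552×B338 | 48.20 | 2.63 |
| R002 | 1.61 | B338 | 1.57 | R092×B0641 | 48.16 | 1.99 |
| R0446 | 1.50 | B0680 | 1.54 | R552×B1358 | 48.03 | 3.35 |
| R516 | 1.40 | B1308 | 1.53 | R092×B0857 | 47.99 | 2.00 |
| R465 | 1.38 | B0984 | 1.52 | R002×B0685 | 47.88 | 3.40 |
| R627 | 1.34 | B0552 | 1.50 | R092×B0066 | 47.88 | 2.14 |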
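
As a rough consistency check on Table 2, one can assume that BV is the predicted value of a hybrid and that it decomposes in the usual combining-ability way into an overall mean, the two parental GCA values and the SCA. This decomposition is an assumption here, since it is not written out in the text. Under it, the overall mean μ implied by each row in which both parents also appear in the GCA columns should be the same:

$$
\begin{aligned}
\mathrm{BV}_{rs} &= \mu + \mathrm{GCA}_r + \mathrm{GCA}_s + \mathrm{SCA}_{rs} \\
\text{R092} \times \text{B1341}: \quad \mu &= 49.32 - 3.71 - 1.65 - 1.69 = 42.27 \\
\text{R465} \times \text{B0680}: \quad \mu &= 48.52 - 1.38 - 1.54 - 3.32 = 42.28 \\
\text{R092} \times \text{B0984}: \quad \mu &= 48.30 - 3.71 - 1.52 - 0.80 = 42.27 \\
\text{R552} \times \text{B338}: \quad \mu &= 48.20 - 1.73 - 1.57 - 2.63 = 42.27
\end{aligned}
$$

The four rows agree to within rounding, which is consistent with the assumed decomposition.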

Discussion

The current study has several advantages. First, an interacted QTL mapping approach for quantitative traits in a partial NCII mating design is proposed. Most previous genetic analyses in such designs were combining ability analyses under a polygenic system, and little was known about detecting individual QTL. Second, the interaction genetic analysis in this study is integrated with crop breeding. This overcomes the shortcoming that genetic analysis in a bi-parental segregation population has a limited role in breeding practice [12]. Although some similar studies have been reported [13-15,58], most were aimed at inbred line breeding. For the prediction of hybrid performance, using multiple parents and their progenies as the genetic population is a good strategy. However, Schrag et al. [30,59] adopted single marker analysis and Windhausen et al. [60] used progenies of F2-derived lines, so the present study represents a new development. Third, the estimation of a large number of parameters in an oversaturated genetic model is addressed. In this study, the number of effects in the genetic model is about 116 times the sample size. To solve this issue, bulked segregant analysis (BSA) together with the empirical Bayesian method was used to estimate all the parameters. This approach was shown to be feasible in both the Monte Carlo simulation studies and the real data analysis. Although a high proportion of phenotypic variation was contributed by interaction terms in the real data analysis, main and interacted QTL could be clearly identified across various approaches and various groups, and the QTL detection power was higher for main-effect QTL than for interacted QTL in the simulation experiments. Such a high proportion was also observed in Mackay [61]. 
Meanwhile, 14 of the 37 interacted QTL were frequently identified in a series of cross validation experiments, and these were similar to the interacted QTL shared across the four approaches in this study (Table 1), although prediction accuracy in cross validation still needs to be addressed. To improve the accuracy, genome-wide prediction may be considered in future work. Finally, although the new method was designed for the partial NCII mating design, it might be extended to other mating designs, e.g., unbalanced or balanced factorial crosses, the NCI design, diallel crosses and recurrent breeding populations. In a recurrent breeding population, the parents carry heterozygous genotypes, so the F1 hybrid might be a mixture of multiple genotypes. In this situation, the family-average idea of the widely used F2:3 design [62] is applicable. Combining ability analysis has been found to be an effective method in crop breeding. When the NCII mating design is carried out completely, it is easy to calculate general combining ability (GCA) and specific combining ability (SCA), so novel parents and elite hybrid crosses can be easily predicted and the results can be used to direct breeding practice. In breeding practice, however, a partial NCII mating design (unbalanced factorial crosses) is frequently used. In this case, how to estimate combining ability remains an open question, although a mixed linear model approach for phenotypic values was adopted in Schrag et al. [30,59]. In this study, we proposed one method to deal with this issue: the information from the interacted QTL mapping was used to estimate the genotypic values of all the missing F1 hybrids, so that elite parental combinations could be found. Furthermore, GCA could be estimated and favorable parents could be predicted [63,64]. 
If all the effects with non-zero estimates are used to predict the genotypic values of all the missing F1 hybrids, the approach is similar to genome-wide selection [65-68]. The above genetic analysis was for a single trait. In crop breeding, multiple traits are usually improved simultaneously, in which case the favorable alleles of all the detected QTL for multiple traits need to be pyramided. For example, oil content, thioglycoside, erucic acid, yield per plant and thousand kernel weight were considered simultaneously in the real data analysis. Of the 268 QTL detected for the five traits (data not shown), B1348 and R484 were found to carry 199 and 204 favorable alleles, respectively. These two lines (one sterile line and one restorer line) might be considered in crop breeding practice. If the number of effects in a genetic model is much larger than the sample size, it is difficult to estimate these effects. Several methods are currently available, for example, Bayesian shrinkage estimation [7], LASSO [69,70], penalized maximum likelihood [9], empirical Bayesian [10], EBLASSO [11] and empirical Bayesian elastic net [71], but they are widely used in bi-parental segregation populations and not in genetic mating populations. In this study, one biological approach, named BSA or DNA pooling, was used to choose the effects that are associated with the trait of interest. Although BSA was proposed for bi-parental segregation populations [72], it has also been useful in large scale association studies [73,74]. Here, BSA together with the test of independence excluded most effects from the full genetic model, so that the reduced model was estimable and main and interacted QTL could be clearly identified. The approach in this study is therefore an alternative way of estimating parameters in an oversaturated genetic model. 
In quantitative genetics, epistasis refers to any statistical interaction between genotypes at two (or more) loci [61]. In our study, the aa, ad, da and dd interactions are statistical terms rather than epistasis as defined in Cockerham's model [75], although some mathematical relationships exist between them [76]. For the F2 population, Kao & Zeng [77] gave a classic example of mapping epistasis, in which orthogonality between the main effects and epistasis was emphasized because it may be important for statistical clarification and interpretation. Note that this orthogonality depends on the allelic frequencies in the studied population [61]. In this study, however, only statistical interaction was considered.

Materials and Methods

Mapping population and trait evaluation

In a full NCII mating design, all the sterile lines (298) in Brassica napus would need to be crossed to all the restorer lines (143). However, this is infeasible in breeding practice. Here only 284 F1 hybrids were produced at Huazhong Agricultural University (Wuhan, China). In other words, each of the 143 restorer lines was crossed to a pair of sterile lines to produce the 284 hybrids. Therefore, the mapping population was a partial NCII mating design (unbalanced factorial crosses), including 298 sterile lines, 143 restorer lines and their 284 F1 hybrids. The seed oil content of each parent and F1 hybrid was measured by near infrared reflectance spectroscopy; for technical detail the reader is referred to the original study of Tillmann [78].

SSR markers

A total of 205 SSR primer pairs were examined to screen for polymorphisms among all the 441 parents, and the genotypes of all the F1 hybrids were deduced from their parents. SSR primers were from Chen et al. [79] and Delourme et al. [52], and primer sequences were obtained from http://www.brassica.info/ssr/SSRinfo.htm (prefixed by Ra, Ol, Na, BN, MB, BRMS- and MR) and http://www.ukcrop.net/perl/ace/search/BrassicaDB [80]. Primer pairs prefixed "BRAS" and "CB" were from the electronic supplementary material of Piquemal et al. [81], and those prefixed "s" were obtained from Agriculture and Agri-Food Canada (http://www.brassica.agr.gc.ca/index_e.shtml). The PCR experiment was described in Chen et al. [79].

Genetic model

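Let $y_i$ be the phenotypic observation of the $i$th accession (parent or F1) in the above partial NCII design. The genetic model for $y_i$ is expressed as

$$y_i = \mu + \sum_{j=1}^{m} \left( x_{ij} a_j + z_{ij} d_j \right) + \sum_{j=1}^{m-1} \sum_{k=j+1}^{m} \left[ x_{ij} x_{ik} (aa)_{jk} + x_{ij} z_{ik} (ad)_{jk} + z_{ij} x_{ik} (da)_{jk} + z_{ij} z_{ik} (dd)_{jk} \right] + \varepsilon_i \qquad (1)$$

where $\mu$ is the population mean; $m$ is the number of putative QTL; $a_j$ and $d_j$ are the additive and dominant effects of the $j$th QTL ($j = 1, \cdots, m$), respectively; $x_{ij}$ and $z_{ij}$ are the dummy variables of the $i$th individual for $a_j$ and $d_j$, respectively; $(aa)_{jk}$, $(ad)_{jk}$, $(da)_{jk}$ and $(dd)_{jk}$ are the aa, ad, da and dd interaction effects between the $j$th and $k$th QTL ($j = 1, \cdots, m-1$; $k > j$), respectively; and $\varepsilon_i$ is the residual error with an $N(0, \sigma^2)$ distribution. For the sake of clarity of notation, we redefine the design matrix and the regression coefficients as follows. Let $Y = (y_1, y_2, \cdots, y_n)^T$, $\beta = \mu$ and $X = (1, 1, \cdots, 1)^T$, where $n$ is the sample size; $\gamma$ is the vector of main and interacted effects, and $Z$ is the matrix of dummy variables for $\gamma$. The above model is now rewritten as

$$Y = X\beta + Z\gamma + \varepsilon \qquad (2)$$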
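
To make the size of the full model concrete, the following minimal sketch builds the dummy-variable matrix for all main and pairwise interaction effects. It assumes a particular genotype coding that is not stated in the text (x = 1, 0, −1 for the two homozygotes and the heterozygote, z = 1 for heterozygotes and 0 otherwise) and that interaction variables are products of the corresponding main-effect variables; the function names `build_design` and `n_effects` are hypothetical.

```python
import numpy as np

def build_design(x, z):
    """Stack main-effect and pairwise interaction columns.

    x, z: (n, m) arrays of additive and dominance dummy variables
    (coding is an assumption of this sketch, see the lead-in).
    Column order: a, d, then aa, ad, da, dd for every pair j < k.
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    m = x.shape[1]
    j, k = np.triu_indices(m, k=1)  # all locus pairs with j < k
    blocks = [
        x, z,                      # 2m main effects
        x[:, j] * x[:, k],         # aa
        x[:, j] * z[:, k],         # ad
        z[:, j] * x[:, k],         # da
        z[:, j] * z[:, k],         # dd
    ]
    return np.hstack(blocks)

def n_effects(m):
    """Number of main plus pairwise interaction effects: 2m + 4 * C(m, 2) = 2m^2."""
    return 2 * m + 4 * (m * (m - 1) // 2)

# Toy genotypes: number of copies of one allele (0, 1 or 2) at m = 3 loci.
copies = np.array([[0, 1, 2],
                   [2, 2, 0],
                   [1, 0, 1]])
x = copies - 1                      # assumed additive coding: -1, 0, 1
z = (copies == 1).astype(float)     # assumed dominance coding: 1 for heterozygotes
Z = build_design(x, z)
print(Z.shape, n_effects(205), n_effects(205) / 725)
```

With 205 markers the count is 410 main effects plus 83,640 interaction effects, and dividing by the 725 individuals gives the ratio of about 116 quoted in the text. Under this coding, an inbred parent has z = 0 at every locus, so all of its dominance-related columns are zero.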

Parameter estimation by empirical Bayesian

Several methods are available for estimating the parameters in model (2), e.g., penalized maximum likelihood [9,82], empirical Bayesian [10] and the hierarchical generalized linear model [83,84]. Here we adopt empirical Bayesian; for technical detail the reader is referred to the original study of Xu [10]. The method is briefly described here. The parameters $\beta$ and $\sigma^2$ are always included in the model, and a uniform prior is assigned to each of them: $P(\beta) \propto 1$ and $P(\sigma^2) \propto 1$. We adopt a normal prior for each of the genetic effects ($\gamma_k$) in model (2): $\gamma_k \sim N(0, \sigma_k^2)$. A scaled inverse $\chi^2$ prior distribution is further assigned to $\sigma_k^2$: $\sigma_k^2 \sim \text{Inv-}\chi^2(\tau, \omega)$. Clearly, $Y$ in model (2) follows a multivariate normal distribution with mean $X\beta$ and variance-covariance matrix $V = \sum_k Z_k Z_k^T \sigma_k^2 + I\sigma^2$, where $Z_k$ is the column of $Z$ corresponding to $\gamma_k$. The main steps for parameter estimation are as follows.
Step (0): Let $\xi = (\tau, \omega) = (0, 0)$, and initialize $\beta$, $\sigma^2$, $\gamma_k$ and $\sigma_k^2$ ($k = 1, 2, \cdots, 2m^2$);
Step (1): Using the current values of $\beta$, $\sigma^2$ and $\sigma_k^2$, estimate the posterior moments of $\gamma_k$. This is the E-step;
Step (2): Update $\beta$, $\sigma^2$ and $\sigma_k^2$, with $\beta = (X^T V^{-1} X)^{-1} X^T V^{-1} Y$; the update formulas for $\sigma_k^2$ and $\sigma^2$ are given in Xu [10]. This is the M-step;
Step (3): Repeat the E-step and the M-step until convergence is reached.
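
The E-step and M-step can be sketched in a few lines of NumPy. The sketch below is not the authors' S1 Software; it assumes the commonly cited form of the updates for this kind of empirical Bayesian EM (posterior mean and variance of each effect in the E-step, and $\sigma_k^2 = [E(\gamma_k^2) + \omega]/(\tau + 3)$ in the M-step). The starting values, the small floor on the residual variance and the function name `empirical_bayes` are choices made for this illustration. The exact formulas should be checked against Xu [10].

```python
import numpy as np

def empirical_bayes(y, Z, tau=0.0, omega=0.0, n_iter=200, tol=1e-8):
    """EM-type empirical Bayesian shrinkage for y = X*beta + Z*gamma + e (sketch)."""
    y = np.asarray(y, dtype=float)
    Z = np.asarray(Z, dtype=float)
    n, p = Z.shape
    X = np.ones((n, 1))                      # intercept only, as in model (2)
    beta = np.array([y.mean()])              # assumed starting values
    sigma2 = float(np.var(y))
    s2 = np.ones(p)                          # prior variances sigma_k^2
    gamma = np.zeros(p)
    for _ in range(n_iter):
        V = (Z * s2) @ Z.T + sigma2 * np.eye(n)
        Vinv_r = np.linalg.solve(V, y - X @ beta)
        Vinv_Z = np.linalg.solve(V, Z)
        # E-step: posterior mean and variance of every effect
        gamma_new = s2 * (Z.T @ Vinv_r)
        var_gamma = s2 - s2 ** 2 * np.einsum("ij,ij->j", Z, Vinv_Z)
        e_gamma2 = gamma_new ** 2 + var_gamma
        # M-step: update prior variances, intercept and residual variance
        s2 = (e_gamma2 + omega) / (tau + 3.0)
        Vinv_X = np.linalg.solve(V, X)
        beta = np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)
        r = y - X @ beta
        # the floor keeps sigma2 positive; it is a safeguard added in this sketch
        sigma2 = max(float(r @ (r - Z @ gamma_new)) / n, 1e-8)
        converged = np.max(np.abs(gamma_new - gamma)) < tol
        gamma = gamma_new
        if converged:
            break
    return float(beta[0]), gamma, s2, sigma2
```

In this form, effects with little support have their prior variance driven toward zero over the iterations, so their estimates shrink toward zero, while well-supported effects remain close to their least-squares values.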

Likelihood ratio test

In this study, a two-stage approach was adopted to test the significance of each effect. First, empirical Bayesian estimation was used to select effects in model (2). Then, all the selected effects were tested using the likelihood ratio test. For technical detail the reader is referred to the original studies of Zhang & Xu [9] and Lü et al. [14]. For simplicity, the critical LOD score for declaring a significant effect at the 0.05 level was set at 2.0.
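
For readers less familiar with LOD scores, the sketch below converts a pair of log-likelihoods into a LOD score and, assuming the likelihood ratio statistic for a single effect follows a χ2 distribution with one degree of freedom (an assumption of this sketch, not a statement from the text), into a nominal P-value. The function names are hypothetical.

```python
import math

def lod_score(loglik_full, loglik_reduced):
    """LOD = log10 of the likelihood ratio = (lnL_full - lnL_reduced) / ln(10)."""
    return (loglik_full - loglik_reduced) / math.log(10)

def lod_to_pvalue_1df(lod):
    """Nominal P-value for a LOD score, assuming LR = 2*ln(10)*LOD ~ chi-square(1)."""
    lr = 2.0 * math.log(10) * lod
    # For one degree of freedom, P(chi2 > lr) = erfc(sqrt(lr / 2))
    return math.erfc(math.sqrt(lr / 2.0))
```

Under that assumption, a LOD of 2.0 corresponds to a likelihood ratio statistic of about 9.2 and a nominal per-test P-value of roughly 0.002.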

High throughput QTL-effect screening

A full genetic model should include the potential pairwise interaction effects of all loci. If the number of main and interacted effects is 10 times less than sample size, empirical Bayesian estimation works well. Note, however, that the model becomes saturated quickly as the number of loci increases. For example, in this study the number of SSR marker loci is 205, so the numbers of potential main and interacted effects are 410 (2 × 205) and 83,640 (4 × 205 × 204 / 2), respectively. Therefore, a variable selection technique is usually needed to exclude interactions with negligible effects. The procedure was as follows.

(1) Test of independence between effect and trait. First, individuals with extreme phenotypes, the 10% highest and the 10% lowest, were selected from the mapping population. Second, a contingency table was constructed from the extreme individuals and their marker genotypes. Third, a χ² test of independence was conducted to test whether the targeted marker was associated with the trait. Finally, the 100 main effects or 100 epistatic effects with the smallest P-values were selected to enter the next step. This method is BSA.

(2) Empirical Bayesian estimation and likelihood ratio test. All the main or interacted effects selected in the first step were included in one genetic model and estimated by empirical Bayesian. Among these effects, those with non-zero estimates were retained. A likelihood ratio test was used to determine whether these effects were significantly associated with the trait. The critical LOD value was set at 2.0.

(3) Correction of trait phenotype using the significantly associated effects. The corrected phenotypes for all the individuals were obtained by subtracting Wb from the observed phenotypes, where b was the vector of effects for the significantly associated markers and W was the design matrix for b.

(4) Repeat steps (1) to (3) until no additional significantly associated effects are detected.

(5) Empirical Bayesian analysis for the selected main and epistatic effects. All the significant main and interacted effects were included in one genetic model and estimated by empirical Bayesian.

The software for parameter estimation is available as S1 Software.
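Step (1) of the screening procedure can be sketched as follows. This is a minimal illustration and not the S1 Software: the function names, the genotype coding and the toy data are hypothetical, and the P-value helper only covers tables with two or three genotype classes (1 or 2 degrees of freedom).

```python
import math

def select_extremes(phenotypes, fraction=0.10):
    """Indices of the lowest and highest `fraction` of individuals by phenotype."""
    order = sorted(range(len(phenotypes)), key=lambda i: phenotypes[i])
    k = max(1, int(round(fraction * len(phenotypes))))
    return order[:k], order[-k:]

def chi2_independence(genotypes, low_idx, high_idx):
    """Chi-square statistic and df for the (tail x genotype class) contingency table."""
    classes = sorted({genotypes[i] for i in low_idx + high_idx})
    table = [[sum(1 for i in idx if genotypes[i] == c) for c in classes]
             for idx in (low_idx, high_idx)]
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for r in range(2):
        for c in range(len(classes)):
            expected = row_tot[r] * col_tot[c] / n
            stat += (table[r][c] - expected) ** 2 / expected
    return stat, len(classes) - 1

def chi2_pvalue(stat, df):
    """Closed-form upper-tail probability for df = 1 or 2; df = 0 means no variation."""
    if df == 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df <= 2")

def screen_markers(marker_genotypes, phenotypes, n_keep=100, fraction=0.10):
    """Rank markers by P-value in the extreme tails and keep the n_keep smallest."""
    low, high = select_extremes(phenotypes, fraction)
    ranked = sorted(
        (chi2_pvalue(*chi2_independence(geno, low, high)), name)
        for name, geno in marker_genotypes.items()
    )
    return [name for _, name in ranked[:n_keep]]

# Toy data: 20 individuals; M1 tracks the phenotype, M2 does not.
phenotypes = [float(i) for i in range(20)]
markers = {
    "M1": ["AA"] * 10 + ["aa"] * 10,
    "M2": ["AA" if i % 2 == 0 else "aa" for i in range(20)],
}
print(screen_markers(markers, phenotypes, n_keep=1))
```

The same ranking could be applied to interaction effects by coding each locus pair as a combined genotype class, although how the authors coded such pairs is not described here.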

Hybrid prediction (HP)

In hybrid breeding, elite parents are predicted from general combining ability (GCA) and elite parental combinations from specific combining ability (SCA). In a complete NCII mating design, both the GCA of each parent and the SCA of each parental combination can be calculated. In a partial NCII mating design, however, some F1 hybrids are missing, so GCA and SCA cannot be calculated directly. Using the QTL detected above, the genotypic values of these missing hybrids can be predicted. Once all the F1 values are available, SCA and GCA can be calculated, and elite parents and parental combinations can then be predicted.
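As an illustration of the last step, the sketch below computes GCA and SCA from a complete table of F1 values using the textbook definitions (row or column mean minus grand mean for GCA; cell value minus both marginal means plus grand mean for SCA). It assumes the missing hybrids have already been filled in with predicted values; the function name, the row and column assignment and the toy numbers are hypothetical and are not taken from the study.

```python
def combining_ability(f1):
    """GCA and SCA from a complete table f1[i][j] (row parent i x column parent j)."""
    n_rows, n_cols = len(f1), len(f1[0])
    grand = sum(sum(row) for row in f1) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in f1]
    col_means = [sum(f1[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    gca_rows = [m - grand for m in row_means]   # GCA of each row parent
    gca_cols = [m - grand for m in col_means]   # GCA of each column parent
    sca = [[f1[i][j] - row_means[i] - col_means[j] + grand for j in range(n_cols)]
           for i in range(n_rows)]              # SCA of each parental combination
    return gca_rows, gca_cols, sca

# Toy 2 x 2 table of hybrid values (made-up numbers).
f1_values = [[40.0, 42.0],
             [44.0, 50.0]]
gca_rows, gca_cols, sca = combining_ability(f1_values)
print(gca_rows, gca_cols, sca)
```

Parents would then be ranked by GCA and crosses by SCA or by the F1 value itself, depending on the breeding objective.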

Monte Carlo simulation design

We performed three simulation experiments in this study. In the first simulation experiment, the effect of QTL heritability on the new method was assessed. The QTL size, being the proportion of total phenotypic variance explained by the QTL, was 0.02, 0.05 and 0.08, respectively. In each case, two additive, two dominant, one aa, one ad, one da, and one dd QTL were simulated. The genetic variance of the ith QTL was calculated from its QTL size, with the residual variance σ² = 1. The allelic effects were then derived from the relationship between the genetic variance, the allelic frequencies and the effects. All the QTL were placed at marker positions, which are listed in Table 3. All the genotypes of the 441 parents and 284 F1 hybrids in the partial NCII design were exactly the same as those in the real data analysis in this study. The simulated phenotypic value of each parent or F1 hybrid was the sum of the corresponding QTL genotypic values and a normally distributed residual error. Each simulation run consisted of 100 replicates. For each simulated QTL, we counted the samples in which the LOD statistic surpassed 2.0. The ratio of the number of such samples to the total number of replicates represented the empirical power for this QTL. The false positive rate (FPR) was calculated as the ratio of the number of false positive effects to the total number of zero effects. We also calculated the absolute bias between the estimated and true effects of each QTL in each sample, so that the average and standard deviation of the absolute biases across the 100 replicates could be obtained.

In the second simulation experiment, we evaluated the effect of sample size on the new method by setting the sample size at 400 (80 restorer lines + 160 sterile lines + 160 F1 hybrids), 500 (100 + 200 + 200) and 600 (120 + 240 + 240). All the QTL sizes were 0.05. Other parameters were the same as those in the first simulation experiment.

In the third simulation experiment, we investigated the effect of population structure on the new method by setting the population structure as all the parents (300 restorer lines + 300 sterile lines), parents + F1 (150 restorer lines + 150 sterile lines + 300 F1 hybrids) and all the F1 (600 F1 hybrids). Other parameters were the same as those in the second simulation experiment.
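The summary statistics described for the simulations can be sketched as below. This is a minimal illustration with made-up numbers: the function names are hypothetical, and how false positives were pooled across replicates is not specified in the text, so the FPR helper simply takes the two counts as given.

```python
import statistics

def empirical_power(lod_by_replicate, threshold=2.0):
    """Fraction of replicates in which the simulated QTL's LOD surpasses the threshold."""
    hits = sum(1 for lod in lod_by_replicate if lod > threshold)
    return hits / len(lod_by_replicate)

def false_positive_rate(n_false_positives, n_zero_effects):
    """Ratio of false positive effects to the total number of zero (null) effects."""
    return n_false_positives / n_zero_effects

def absolute_bias_summary(estimates, true_effect):
    """Mean and standard deviation of |estimate - true effect| across replicates."""
    biases = [abs(e - true_effect) for e in estimates]
    return statistics.mean(biases), statistics.stdev(biases)

# Toy inputs (not from the study): four replicates for one QTL.
lods = [2.5, 1.0, 3.2, 2.0]
power = empirical_power(lods)                 # LOD must exceed 2.0, so 2.0 itself is no hit
fpr = false_positive_rate(3, 1000)
mean_bias, sd_bias = absolute_bias_summary([0.9, 1.1, 1.3], true_effect=1.0)
print(power, fpr, round(mean_bias, 4), round(sd_bias, 4))
```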
Table 3

Parameter setups in the Monte Carlo simulation studies.

| Parameter setup | Case 1 | Case 2 | Case 3 |
|---|---|---|---|
| Number of QTL | 8 | same as Case 1 | same as Case 1 |
| QTL type | 2 additive, 2 dominant, 1 additive-by-additive, 1 additive-by-dominant, 1 dominant-by-additive, 1 dominant-by-dominant | same as Case 1 | same as Case 1 |
| QTL position (marker) | CB10597C, Bo3b, Ra2E12, CB10427A; MR049D × BnGMS439A, Ra2-G08A × Ra3-E05C, Bn1b × CB10431A, CB10036A × CB10045A | same as Case 1 | same as Case 1 |
| QTL size (%) | 2, 5, 8 | 5 | 8 |
| Sample size | 725 | 400, 500, 600 | 600 |
| Mapping population | Parents + F1 | Parents + F1 | All the parents, parents + F1, all the F1 hybrids |

Effect of QTL heritability on mapping QTL in NCII mating design.

(DOCX)

Effect of sample size on mapping QTL in NCII mating design.

(DOCX)

Effect of population structure on mapping QTL in NCII mating design.

(DOCX)

Software for mapping interacted QTL in partial NCII design.

First install the Java Runtime Environment (jdk-7u71-windows-x64.exe) in the default installation directory, and then install the MATLAB Runtime (R2014b (8.4) for Windows). Readers may download the first file at http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html and the second file at http://cn.mathworks.com/products/compiler/mcr/index.html. To use another JDK version, change the corresponding content in the run.bat file. The computer running the software should have more than 8 GB of memory. (ZIP)