Literature DB >> 25305041

Functional multi-locus QTL mapping of temporal trends in Scots pine wood traits.

Zitong Li¹, Henrik R Hallingbäck², Sara Abrahamsson³, Anders Fries², Bengt Andersson Gull³, Mikko J Sillanpää⁴, M Rosario García-Gil⁵.

Abstract

Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals.

Entities: Chemical Disease Mutation Species

Keywords: Scots pine; functional QTL mapping; multi-locus model; shrinkage estimation; wood quality traits

Mesh：

Year: 2014 PMID： 25305041 PMCID： PMC4267932 DOI： 10.1534/g3.114.014068

Source DB: PubMed Journal: G3 (Bethesda) ISSN： 2160-1836 Impact factor: 3.154

Wood is one of the most abundant and versatile organic materials on earth. It is mainly composed of secondary cell wall consisting primarily of cellulose, lignin, and glycoprotein. Its great importance for the world economy and society has motivated the establishment of a large number of conifer breeding programs worldwide with the primary aim of increasing wood production and improving its quality (Mullin ; Rosvall 2011). With respect to the genetic improvement of growth and form properties, conifer breeding has achieved notable successes. However, the costs and time required for evaluating wood properties in a large number of individuals have prevented wood quality to be incorporated into most operational breeding programs despite evidence that wood properties such as density, microfibril angle, fiber dimensions, and spiral grain harbor considerable genetic variation and moderate-to-high heritability (Gaspar ; Bouffier ; Gapare ; Fries 2012; Hong ). The assessment of wood properties is further complicated by substantial and systematic variation within trees such as early-latewood and juvenile-mature wood transitions (Zobel and Sprague 1998). Developing affordable molecular tools that could be used in, for example, early screening of tree seedlings is therefore considered a potential solution to assist wood quality improvement operationally. Moreover, access to new instruments for efficient wood characterization offers new opportunities to analyze large numbers of trees for more traits and with higher detail. Quantitative trait loci (QTL) analysis is a technique applied to elucidate the genetic basis of quantitative traits by placing the position of putative genes underlying a complex trait onto a genetic map (Lander and Botstein 1989). QTL identification can be used to develop molecular strategies for early tree selection of economically important tree properties. QTL analyses based on anonymous DNA markers, such as amplified length polymorphisms (AFLP) or random amplified polymorphic DNA (RAPDs), identified major QTL, each explaining from 5% to 27% of the variance for pine wood traits such as density, ring width, microfibril angle, and lignin content in loblolly pine (Pinus taeda L.) (Knott ; Kaya ; Sewell , 2002; Brown ), maritime pine [Pinus pinaster (Ait.)] (Plomion ; Brendel ), radiata pine (Pinus radiata D. Don.) (Emebiri ; Kumar ), and Scots pine (Pinus sylvestris L.) Lerceteau and Szmidt 2000; Lerceteau ). More recently, high-throughput technologies have contributed to the development of a third generation of genic markers such as expressed sequence tag polymorphisms (ESTs) and single nucleotide polymorphisms (SNPs) that may increase the resolution and informativity of QTL mapping considerably. Nonetheless, only a few QTL studies have been published that have utilized ESTs and SNP in pines (Markussen ; Brown ; Pot ). Although these studies incorporated a very limited number of gene-based markers, they have nonetheless become a useful source of candidate genes for wood related traits. Previous QTL studies of wood properties have been limited to measurements at single time points or to trait averages of a heterogeneous wood sample (e.g., stem sections, increment cores). Thus, the dynamic and systematic within-tree variation due to age and seasons was not taken into account, limiting the extent to which results could be generalized. For instance, given wood trait data from increment cores, a cumbersome process of repeated analyses on separate annual rings has been required to assess the stability of a QTL over time (Sewell , Brown , Ukrainetz ). However, multiple observations of wood properties obtained by dendrochronological studies could instead be fitted to mathematical functions or curves dependent on the year of wood formation, on cambial age, or on distance from pith, a procedure called functional QTL mapping (Ma ). The QTL analysis could then be performed on these functions, thus accounting for the systematic within-tree variability and increasing the relevance of detected QTL in comparison with analyses performed on samples specific for a certain tree age or annual ring. Functional QTL analyses have been conducted in Scots pine (Sillanpää ) and Populus (Wu ; Ma , 2004; Lin and Wu 2006) but were focused on the dynamic nature of growth rather than that of wood properties. The functional/longitudinal QTL analysis needs to take both between- and within-individual variation into consideration. This can be implemented by a two-step multilevel approach (Gee ; Heuven and Janss 2010). The phenotypic temporal trend of each individual is in the first step estimated as a curve. The curve parameters are then considered as latent traits for which marker effects are evaluated in a second independent analysis by some common QTL mapping tools. A possible alternative to such a two-step model is the linear mixed effects model (LMM) (Furlotte ; Sikorska ), which fits the temporal trend and estimates marker effects simultaneously in one single, albeit complex, procedure. Both two-step multilevel and LMM models are well-suited to the simultaneous effect estimation of several loci (Zeng 1994; Kao ). Single locus methods, such as classical interval mapping (Lander and Botstein 1989), are simple and computationally efficient, but in cases where the sample size is small, investigated markers are numerous, and significance testing is conservative, single locus effects may suffer from severe overestimation (denoted the “Beavis effect”) (Beavis 1994; Slate 2013). Multiple locus approaches such as stepwise regression (Segura ), LASSO (Tibshirani 1996), and Bayesian methods (Xu 2003; Guan and Stephens 2011) can improve in this matter. For these methods, variable selection and shrinkage estimation play key roles in avoiding oversaturated models and locus effect overestimation. In this study, we considered the LASSO variable selection for the marker effect estimation step of the multilevel approach and a Bayesian spike and slab prior approach for the linear mixed model (henceforth abbreviated as BLMM). To our best knowledge, these advanced multi-locus methods, and especially the extended functional mapping methods, have not been used in any previous QTL study for mapping wood properties. In QTL analyses, a typical procedure is to estimate the effects of the loci first, and then to quantify multi-locus association detection uncertainty or significance by applying some sort of decision rules (e.g., hypothesis testing). Tests designed for LASSO and Bayesian variable selection often require some adjustments with respect to multiplicity (Scott and Berger 2010; Bühlmann ). For the multilevel LASSO (mLASSO) model, one may consider a multiple-split test method (Meinshausen ), a covariance test method (Lockhart ), or stability selection method (Meinshausen and Bühlmann 2010). For the BLMM, which uses indicator variables, a false discovery rate control approach (Ventrucci ) can be pursued. The primary aim of this study was to determine the extent to which QTL with major effects could be detected for a set of important conifer wood properties considered as dynamic functional traits dependent on time (i.e., cambial age). A secondary aim was to examine and compare two different multi-locus functional QTL mapping methods, mLASSO and BLMM on a real wood property data set, and to characterize and compare several decision rules that can be applied within the framework of mLASSO and BLMM. To achieve these two aims, functional QTL mapping was performed on a Scots pine (Pinus sylvestris L.) full-sib family established in a single large plot in northern Sweden using a set of AFLP and SNP molecular markers paired with phenotypic assessment of the wood of multiple annual rings in increment cores or by repeated assessments on standing trees.

Materials and Methods

Field test

The studied field test, named Flurkmark (S23F881485), is located 25 km north of Umeå in northern Sweden (latitude 64°02′N, longitude 20°13′E, altitude 115 m above sea level). The site is a mesic dwarf-shrub type with site index T21 (Hägglund and Lundmark 1982) and the plant material consists of 1000 offspring (F1) of one full-sib family. The parental (P) trees AC3065 and Y3088 were plus-trees originating from north Sweden (latitudes 65°08′N and 64°09′N, respectively) selected for the Swedish Scots pine breeding program based on their vitality, rapid growth, small branch diameters, and wide branch angles. One-yr-old seedlings were planted in 1988 using an approximate spacing of 2 × 2 m. QTL studies have previously been conducted on parts of this material with respect to other traits (Lerceteau and Szmidt 2000; Lerceteau ; Sillanpää ).

Wood sampling and measurements

A subset of 286 trees was sampled for wood cores. To limit the impact of environmental variation, trees selected for sampling were all situated together in a part of the trial that exhibited the highest survival (98% in 2007) and growth. Wood samples were obtained during autumn 2004 using an electric drill combined with a 10-mm-diameter increment borer (Haglöf Sweden, AB). Samples were taken from a height of 0.8 m above the ground. They were subsequently sent to Innventia, where their properties were analyzed; the first part of the evaluations was performed and the data were made ready for the genetic studies to follow. The samples were air-dried, sawed, and polished into vertical 2 × 7 mm. From these strips, radial data profiles of wood traits (Table 1, Supporting Information, File S1) such as wood density (WD), radial fiberwidth (FWr), tangential fiberwidth (FWt), fiberwall thickness (FTh), microfibril angle (MFA), and dynamic modulus of elasticity (MOE) were acquired with the Silviscan instrument (Evans 1994, 2006). In addition to the traits assessed on increment cores, the spiral grain angle (GA) was measured at breast height (1.3 m above the ground) under bark in an extended subset of trees (492) using a nondestructive weight-and-wedge device (Chalmers University of Technology, Sweden) (File S2). Measurements were made in accordance with Hannrup in 2006 and 2007 in the north, south, east, and west directions using the means of each year as measurement traits.

Table 1

List of the wood traits, their abbreviations, measurement unit, and the number of values per tree used in analysis

Traits	Abbreviation	Unit	Values/Tree
Annual ring width	RW	mm	9
Earlywood percentage	EP	%	9
Latewood percentage	LP	%	9
Mean wood density	WD	kg m⁻³	9
Earlywood density	EWD	kg m⁻³	9
Latewood density	LWD	kg m⁻³	9
Mean radial fiber width	FWr	µm	9
Earlywood radial fiber width	EFWr	µm	9
Latewood radial fiber width	LFWr	µm	9
Mean tangential fiber width^a	FWt	µm	9
Mean fiberwall thickness	FTh	µm	9
Earlywood fiberwall thickness	EFTh	µm	9
Latewood fiberwall thickness	LFTh	µm	9
Microfibril angle	MFA	°	3
Modulus of elasticity	MOE	GPa	3
Grain angle^b	GA	°	2

Early and latewood components of FWt were omitted because the variation within annual rings was negligible.

For GA, only the annual rings formed in 2006 and 2007 were studied, but the number of trees assessed (492) was considerably greater than for the other traits (286).

Early and latewood components of FWt were omitted because the variation within annual rings was negligible. For GA, only the annual rings formed in 2006 and 2007 were studied, but the number of trees assessed (492) was considerably greater than for the other traits (286).

Initial processing of wood property data

The first step in the evaluation of the radial profile wood trait data was to identify the interfaces between all annual rings and their parts of earlywood and latewood using methods developed by Innventia based on analysis of radial variations in wood density. Then followed calculations of the widths of each annual ring (RW) and its parts of earlywood and latewood, as well as the proportions of earlywood (EP) and latewood (LP) of each ring. Further, wood trait averages were calculated for each annual ring, for most traits averages were also calculated for their earlywood and latewood (Table 1). Then, the ring locations were manually cross-checked and the data were corrected for some errors. Finally, the annual ring data were also sorted according to the calendar year of their formation. The cambial age variation in annual rings of the same calendar year (reflecting that the trees reached the sampling height at different ages) was small, with a variation range of maximum 4 yr. The effect of cambial age was therefore assumed to be synchronous with the effect of calendar year. The nine annual rings formed during the calendar years 1995–2003 (approximately corresponding to 5–13 rings from pith) were chosen for further study because observations from those rings were reasonably complete (>95%). For MFA and MOE, the measurement resolution was low (1 observation/mm) and only allowed the study of averages of three adjacent annual rings (1995–1997, 1998–2000 and 2001–2003, respectively). In contrast, the resolution for WD, FWr, and FTh (20 observations/mm) was high enough to enable the separation of observations into earlywood and latewood components within each annual ring (Table 1) and to calculate wood trait averages for the components. There are many definitions for designating wood as earlywood and latewood (Mork 1928; Green and Worral 1964). The transition between pronounced earlywood and latewood is, however, gradual and easily influenced by site, weather, and other factors. If the annual ring is divided into two components only, then the averages for both will be influenced by all these factors, which is negative for many types of studies. To deal with this, Innventia introduced designation criteria including a third transitionwood component in between earlywood and latewood (Olsson 2000). Each location i within an annual ring was designated as earlywood or latewood depending on whether the wood density measured at that location (WD) exceeded thresholds related to the span between the minimum wood density (WD) and the maximum wood density (WD) within each individual annual ring:In the current study, the transition wood component was not studied per se due to its unspecific character. The data for all traits were checked with respect to normal distribution. Earlywood and latewood ratio (EP and LP) were found to deviate considerably from normality and were therefore arcsine square root–transformed (Snedecor and Cochran 1980) prior to further analysis.

DNA extraction, marker development, and scoring

Total DNA was extracted from vegetative buds of the studied full-sib individuals. The buds were pealed and freeze-dried before being grinded and DNA was extracted using the CTAB method (Doyle and Doyle 1990). An array of 768 single nucleotide polymorphism (SNP) markers was developed for Illumina Golden Gate assay at the University of Oulu, Department of Biology (S. T. Kujala, T. Knürr, K. Kärkkäinen, D. B. Neale, M. J. Sillanpää, and O. Savolainen, unpublished data), and it is publicly available at the Evoltree website (http://www.evoltree.eu/). During the development process, 237 SNPs were extracted from 56 gene fragments that had been sequenced in individuals of different populations by Palmé , Pyhäjärvi , 2011), and Kujala and Savolainen (2012). These gene fragments were selected mainly for their value as candidates for timing of bud set and cold tolerance. In addition, 531 SNPs from 341 other gene fragments were obtained from the Comparative Resequencing in Pinaceae Project (CRSP) in the laboratory of D. B. Neale, University of California at Davis (http://dendrome.ucdavis.edu/NealeLab/crsp/). The SNP genotyping was performed using GoldenGate Illumina technology at Centre National de Genotypage (CNG) in Evry, France. The automatic allele calling for each locus was accomplished with the GenCall software (Illumina, San Diego, CA). In addition to the SNPs, several amplified length polymorphism markers (AFLP) were developed. The AFLPs were produced according to Vos . The following 15-primer enzyme combinations were used: E-act/M-cctg; E-act/M-cccg; E-act/M-ccgc; E-act/M-ccgg; E-act/M-ccag; E-acg/M-cctg; E-acg/M-cccg; E-acg/M-ccgc; E-acg/M-ccgg; E-acg/M-ccag,; E-aca/M-cctg; E-aca/M-cccg; E-aca/M-ccgc; E-aca/M-ccgg; and E-aca/M-ccag. The amplified fragments were sent to the DNA facility at Iowa State University and run on ABI3100 Genetic Analyzer. The mapping data were analyzed with GeneMarker v1.6 (SoftGenetics, State College, Pennsylvania). A large set of 492 progeny individuals were thus genotyped using 508 AFLP markers and a small subset (91) of these individuals were also genotyped using the previously developed 768-SNP array (File S3 and File S4). The individuals of the smaller subset were all sampled for wood traits by Silviscan. Furthermore, both parents (AC3065 and Y3088) of the field trial progeny were genotyped using both SNP and AFLP markers.

Sorting and mapping of marker data

Marker sorting and mapping were performed with all the available genotype data simultaneously, but with the aim of constructing two sorted genotype datasets, one pure AFLP dataset for the larger subset of individuals (henceforth abbreviated as the A-set) and one mixed SNP+AFLP dataset intended for the smaller subset (abbreviated as the S+A set). Because the studied full-sib family was generated by two noninbred and highly heterozygotic parents, a two-way pseudo-testcross mapping strategy was used (Grattapaglia and Sederoff 1994; Grattapaglia ). Markers for which genotyping scoring success was inadequate (<80%) and poorly genotyped individuals (<70%) were excluded from further study. Even though the inclusion of 1:2:1-segregating SNPs made the construction of a consensus map possible, it was nonetheless reasonable to assume that different sets of QTL segregated within each of the unrelated and heterozygotic parents. Therefore, the marker linkage mapping was performed on maternal and paternal sections separately (see File S5 for more detailed information about the linkage mapping). In summary, 153 AFLP markers genotyped on 455 individuals (the A dataset) and 153 AFLP and 166 SNP markers genotyped on 91 individuals (the S+A dataset) were retained in the analysis after filtering and sorting. Two-hundred fifty-one markers were distributed on 26 maternal and 24 paternal linkage groups (LGs), whereas 68 markers could not be assigned to any linkage group (unmappable).

Multilevel functional QTL detection by LASSO variable selection

Following the multilevel model approaches of Gee , Heuven and Janss (2010), Hurtado , and Sillanpää , we fitted our multiple phenotype measurements within each individual to a simple linear curve dependent on time t:for individuals i = 1,…,n, and repeated measurements k = 1,…,m (m = 9 for an individual with a complete set of measurements). The least square estimates of the intercept μ and the slope μ were considered to be new latent traits aimed at evaluating the overall trait mean across annual rings and the trait rate of change, respectively. According to Heuven and Janss (2010), the intercepts and slopes are less correlated compared with the repeated measurements in the original scale and should have a constant variance, which would reduce the necessity to account for residual dependencies in the model. To make μ correspond to the biologically meaningful mean across annual rings, the time variable t was recoded and centralized to range from −4 to 4 instead of using the original calendar years (1995–2003). Individuals with more than five missing measurements were excluded from the analysis. Examples of the wood property development by time for a subset of traits are presented in Figure 1.

Figure 1

Trajectories of four wood traits by time including (A) wood density, (B) earlywood percentage, (C) radial fiberwidth, and (D) fiberwall thickness. For each trait, individual trajectories are shown in light blue lines, and the mean trajectory is shown in a black line. Next, to describe the association between the latent traits and the molecular markers, the following multiple linear regression models were used to map intercept () and slope () traits separately as:where x is the genotype value of individual i and marker j (j = 1,…,p) coded as 0 and 1 for two possible genotypes, respectively, and are the intercept terms, and are the effects of marker j, and and are the residuals for intercept and slope traits, respectively. For the three wood traits that only have two or three phenotype assessments per tree (GA, MFA, and MOE), we performed separate LASSO analyses for each assessment. Because the ordinary least squares (OLS) method usually cannot provide accurate estimates for a number of markers near to or larger than the number of individuals assessed (the p > n-problem) (Hastie ), we instead applied a popular penalized regression approach named LASSO as a marker (variable) selection method (Tibshirani 1996):where the penalty terms and , (, >0) reduce the number of markers included in the model by shrinking unimportant effects to zero. The tuning parameters or , determine the degree of shrinkage and model sparsity. It was chosen explicitly by the 10-fold cross-validation (CV) model selection procedure (Li and Sillanpää 2012). The LASSO computation was performed by the MATLAB/R package “glmnet” (Friedman ).

Single-step functional QTL detection by Bayesian linear mixed effect model

Instead of relying on a two-step multilevel model, we may alternatively estimate the temporal trend and effects of markers simultaneously in one linear mixed effect model (BLMM). By substituting equations (2) back to equation (1), we obtainwhich is equivalent to the longitudinal random intercept and random slope model (Fahrmeir and Kneib 2011; Furlotte ; Sikorska ; Wang 2012). The fixed intercept and slope parameters and describe the population time trends, and the random effect parameters and describe the individual specific time trends. The individual specific random intercept and slope terms are important because they: describe the deviation of individual trends over the population trend; construct a certain temporal covariance structure among phenotypes; and take the possible heterogeneity in the data caused by missing (environmental) covariates into account (Fahrmeir and Kneib 2011). We formulated a full Bayesian procedure for model inference. The fixed intercept and slope parameters were assigned with noninformative priors: , . The residual variance was assigned with a Jeffreys’ noninformative prior: . The random parameters were assigned with multivariate normal priors with common variance component over all the individuals: . An inverse-Wishart prior was specified for the covariance matrix: with fixed hyperparameters (identity matrix) and . Under such prior settings of random intercept and slope parameters, the marginal variance of y is , and covariance between y and y () is , implying that the random intercept and slope terms describe serial correlation and heterogeneity of variance for the repeated phenotype measurements in the mixed model (Weiss 2005, p. 255). For marker effect parameters, we assigned mixture priors (or so-called spike and slab priors):Such spike and slab priors play a similar role as the l1 penalty in LASSO to perform variable selection (Habier ). The binary indicator variable r (r = 0, 1) determines whether marker j should be included in the model. A Bernoulli prior was assigned to r: . We fixed w to be 0.5, so that the estimates of r were mainly determined by the data. Furthermore, the variance parameter was assigned with inverse-gamma prior: , with fixed hyperparameters a = 0.1, and b = 0.1. For traits including MFA, MOE, and GA with two or three phenotype assessments per tree, we simplified equation (4) by including only tree-specific intercept terms but not tree-specific slope terms:In practice, a Markov Chain Monte Carlo (MCMC)-based method was used for estimating all the parameters defined in our Bayesian model. The detailed information of the MCMC sampling methods is available in the File S5.

Quantification of uncertainty and hypothesis testing

In this study, we also treated the question of how to best quantify QTL uncertainty and significance by hypothesis testing based on LASSO and BLMM, respectively. Empirical studies (Xu 2007; Li and Sillanpää 2012) have shown that LASSO-CV tends to select many small effect markers in addition to the major QTL. The small marker effects may contribute to phenotype prediction but may not represent the foremost QTL candidates. Therefore, some post variable selection QTL inference is needed to judge which markers selected by LASSO can be declared as putative QTL with any confidence. Constructing a test statistic for a marker under the LASSO scheme is, however, not a straightforward task, because LASSO estimates, unlike the least squares method, do not asymptotically follow any standard parameter distribution (Chatterjee and Lahiri 2011). The simplest method attempted was to first select a subset of markers by using LASSO on the whole data set, and then re-estimate the effects of those markers one-by-one with a simple regression model by the OLS method and perform the standard t-test (including the standard Bonferroni correction for multiple testing). This method was used as a baseline for comparisons (p-value is abbreviated as Single-p) and is similar to the tests of many single locus methods (Furlotte ; Slate 2013). Recently, several theoretically more solid approaches have been developed for indirectly quantifying the uncertainties of the markers selected by LASSO. We investigated the following three methods. The first is the multiple-split test (p-value is abbreviated as MST-p) of Meinshausen , which involves dividing the sample at random into two parts with equivalent number of individuals. The first half of the data are used to determine the optimal tuning parameter and to choose a subset of markers by LASSO, whereas the second half of the data are used to t-test those markers. For this study, the sample split procedure was repeated 100 times and the p-values (adjusted by Bonferroni correction) were combined to minimize the effect of random sampling of the data. The second is the covariance test (p-value is abbreviated as COV-p) proposed by Lockhart , which tests the significance of a marker entering the LASSO model and differs from the classical t-test of whether the marker effect equals zero. The COV-p test statistic is constructed directly based on the LASSO solution path, which is a compromise between shrinkage estimation and variable selection. Therefore, the COV-p for those selected markers should be more accurate than the p-values calculated by the t-tests. The third is the stability selection proposed by Meinshausen and Bühlmann (2010), involves drawing a subsample of half the number of individuals available and applying the LASSO on it to select a set of markers repeatedly (1000 times). Then, the stability selection probability (SSP) of each marker being selected was calculated and used to judge the support of QTL. Meinshausen and Bühlmann (2010) suggested a decision rule based on SSPs to control the expected number of false positives. In practice, we calculated MST-p and SSP by our own Matlab codes [please see Bühlmann for more information about these two methods]. The COV-p was calculated by the R package “covTest” (http://cran.r-project.org/web/packages/covTest/index.html). For the BLMM model (4), we took the advantage of the sampling of indicator variables r defined in the prior (5). The empirical marginal posterior distribution of r: is often viewed as a posterior inclusion probability (PIP) (Guan and Stephens 2011) and has a similar interpretation as the previously mentioned SSP (Meinshausen and Bühlmann 2010). A main difference, however, is that the PIP is calculated from the MCMC samples based on the whole data without the need of repeated sub-sampling. However, the probability can be interpreted as an approximation of a local false discovery rate (LFDR) (Efron ; Ventrucci ) for each individual marker j = 1,…,p. We can calculate a global level posterior false discovery rate (BFDR) (Ventrucci ; Heuven and Janss 2010) by combining LFDRs for a group of markers. A BFDR-based decision rule can be derived to judge significant markers to control BFDR under a certain threshold α (α=0.05) so that the multiplicity adjustments are achieved. Specifically, the LFDRs for each marker are sorted in ascending order. The average value of the LFDRs for the first T markers (T = 1,…,p) is defined as a BFDR for these markers. We find the highest possible value of BFDR (the average of the T smallest LFDRs), which is still smaller than the given threshold, and the corresponding markers are defined to be significant. The QTL uncertainty evaluation of the results of our particular analysis was performed at two levels. If a marker attained a value less than 0.2 for any of the test statistics Single-p, MST-p, COV-p, or BFDR, then it was declared as a suggestive QTL. For the suggestive level, an SSP inclusion ratio of at least 0.66 and 0.58 was required for AFLP and SNP+AFLP datasets, respectively, which, according to the formula of Bühlmann , guaranteed the expected number of false selected markers to be less than 2. However, for a marker to be declared a significant QTL, requirements were stricter. For the LASSO analyses, at least two of the test statistics were required to show p-values ≤ 0.05 (Single-p, MST-p or COV-p) or an SSP inclusion ratio of at least 0.83 and 0.66 for AFLP and SNP+AFLP datasets, respectively, which guaranteed the expected number of false selected markers to be less than 1. For the Bayesian analysis, a BFDR value less than 0.05 was required. Declarations of suggestive or significant QTL were made separately for each dataset and analysis method. We calculated the percentage of phenotypic variation explained by both suggestive and significant QTL for intercept and slope traits, respectively, as:respectively. Note that for the BLMM methods, we have and

Results

From what could be observed from general trends of the studied wood material, the percentage ratio of earlywood (EP) decreased by 3.0% per year and latewood increased by 2.1% per year (Figure 1, Table 2). This trend in combination with the mean latewood density (LWD, 755 kg/m3) being much higher than that of earlywood density (EWD, 328 kg/m3) likely caused the overall ring wood density (WD) to increase by age by approximately 14.5 kg/m3·yr, although EWD and LWD did not show such pronounced trends per se. In a fashion similar to wood density, the fiber cell wall thickness (FTh) also increased (0.09 µm/yr) by increasing ring number from pith. The mean grain angle under bark (GA) at the tree age of 17–18 yr was 0.82°, implying a mild left-handed spiral grain.

Table 2

Time-adjusted population means, average trends by increasing tree age, and individual (phenotypic) standard deviations for means and trends of each trait

Traits	Unit	Population Mean,E(μi0)	Annual Ring Mean Ranges	SD(μi0)	Population Trend, E(μi1)	SD(μi1)
RW	mm	3.3	2.9–4.0	0.5	−0.1	0.1
EP^a	%	56.3	38.6–66.6	4.4	−3.0	1.4
LP^a	%	16.9	10.0–30.8	2.7	2.1	1.1
WD	kg m⁻³	448.0	380–551	27.9	14.5	6.1
EWD	kg m⁻³	327.7	307–354	18.1	−0.5	3.7
LWD	kg m⁻³	754.1	612–854	53.2	6.5	11.6
FWr	µm	30.2	29.0–31.6	0.9	∼0.0	0.2
EFWr	µm	32.7	30.8–34.2	0.8	0.3	0.2
LFWr	µm	23.1	20.1–26.4	1.1	0.5	0.3
FWt	µm	25.9	24.7–26.6	0.7	0.2	0.1
FTh	µm	2.2	1.9–2.8	0.2	0.1	<0.1
EFTh	µm	1.7	1.6–1.9	0.1	∼0.0	<0.1
LFTh	µm	3.5	2.9–4.2	0.3	0.1	0.1
MFA	°	21.2	18.9–22.3^b	4.0
MOE	GPa	10.6	9.1–13.4^b	1.9
GA	°	0.82	0.72–0.90	0.98

For the sake of illustration, nontransformed values are given.

The given ranges comprise means of three adjacent annual rings rather than single annual rings.

For the sake of illustration, nontransformed values are given. The given ranges comprise means of three adjacent annual rings rather than single annual rings.

QTL detection for intercepts (means)

In total, 24 suggestive (minimum one p-value < 0.2) and five significant (minimum two p-values < 0.05) QTL were detected across different intercept traits (including single time point traits), datasets, and QTL mapping methods (Figure 2, Figure 3). The fractions of phenotypic intercept trait variation that was explained by QTL (H) ranged from zero to 0.15 (Table 3). Notably, several appreciably strong QTL were observed for the intercept of WD and EWD (three suggestive and one significant QTL) and for FWr and EFWr (three suggestive and two significant QTL) (Figure 2). These QTL explained 0.07–0.15 fractions of the phenotypic variance for the method/dataset combinations where they were observed. For EWD, both BLMM and mLASSO suggested AFLP GGG191 (unmapped) to be significant (Table 4, part A) and suggestive, respectively, in the A-set. Likewise, the SNP 0_11919_01-122 at the maternal LG 14 was indicated to have a significant association with FWr and to be suggestive for EFWr according to the mLASSO analysis performed on the S+A-set (Figure 2). The AFLP AGG142 (unmapped) was, however, only significantly associated with EFWr. The same QTL markers (suggestive or significant) were shared between EWD and WD in two instances and between FWr and EFWr in one instance. Because the QTL effect signs of whole ring and earlywood components were the same, these QTL should contribute to a positive genetic correlation between the whole ring and earlywood components for WD and FWr.

Figure 2

Figure 3

Marker effects (β) for wood grain angle (GA), microfibril angle (MFA), and dynamic modulus of elasticity (MOE), selected by the multilevel LASSO model in A datasets (black) and S+A datasets (gray) and of markers showing LFDR <0.5 for the Bayesian linear mixed effect model in the A datasets (red) and S+A datasets (magenta) are plotted against their estimated locations on maternal (m) and paternal (p) linkage groups (LG). 1 Morgan is approximately the length of LG 1m. Markers in section u were not mappable to any LG. Significant and suggestive QTL are shown as diamonds and squares, respectively, whereas all other selected markers are shown as circles. All markers in considerable linkage (recombination frequency <0.3) with a significant or suggestive QTL are framed in a rectangle. The mLASSO results are illustrated for one time point only (year 2006 for GA and the period 1998–2000 for MFA and MOE).

Table 3

The ratios of phenotypic variance

	H²_QTL for Intercept/Means (μi0)			H²_QTL for Slopes (μi1)
Trait	mLASSO A	BLMM A	mLASSO S+A^b	mLASSO A	BLMM A	mLASSO S+A^b
RW	0	0	0	0	0	0
EP	0	0	0	0.02	0.22	0
LP	0	0	0	<0.01	0	0
WD	<0.01	0.09	0	0.01	0	0
EWD	0.02	0.15	0	0.01	0.51	0
LWD	0	0	0.01	0	0	0.02
FWr	0	0	0.10	<0.01	0	<0.01
EFWr	0	0	0.07	0	0	0.02
LFWr	0.02	0	0	0	0	0
FWt	<0.01	0.01	0.01	<0.01	0	0
FTh	<0.01	0	0	0.03	0	0
EFTh	0.01	0	0	0	0	0
LFTh	0	0	0.01	0	0	0
MFA^a	0–0.01	0.02	0–0.01	—	—	—
MOE^a	0–0.01	0.01	0–0.01	—	—	—
GA^a	0.03	0.08	0.06–0.07	—	—	—

Ratios of phenotypic variance explained by suggestive and significant QTL jointly for the means/intercepts and slopes of all traits using mLASSO and BLMM methods for pure AFLP (A) and SNP+AFLP datasets (S+A). Trait/method/dataset combinations for which significant QTL were found are highlighted in bold.

Ranges are given for multilevel analyses that were performed during separate time points.

The BLMM analysis of the S+A dataset did not detect any QTL and is thus not shown.

Table 4

Description of significant QTL

General QTL info							Multilevel LASSO Statistics				BLMM Statistics
QTL	Marker^a	Trait	Data Set	LG^b	Position (cM)	Alleles^c	Multilevel Effect^d	Single-p^e	COV-p^e	SSP^e	BLMM Effect	BFDR^e
Part A. QTL for trait intercepts and single time points
1.	GGG191^A	EWD	A	u	—	p/a	4.3 kg m⁻³	0.052′	0.235	0.688′	7.7 kg m⁻³	0.040*
1.	GGG191^A	EWD	S+A	u	—	p/a	n.s.	—	–	—	0.5 kg m⁻³	0.651
2.	0_11919_01-122^S	FWr	S+A	14m	11.7	C/T	0.39 µm	0.080′	0.009*	0.664*	0.35 µm	0.429
2.	—	FWr	A	14m	—	—	No AFLPs in the same LG
3.	AGG142^A	EFWr	S+A	u	—	p/a	0.27 µm	0.010*	<0.001*	0.682*	0.10 µm	0.624
3.	AGG142^A	EFWr	A	u	—	p/a	n.s.	—	—	—	0.04 µm	0.690
4.	TCG51^A	GA^f	A	u	—	p/a	0.30 to 0.34°	<0.001*	<0.001*	0.88–0.91*	0.51°	<0.001*
4.	TCG51^A	GA^f	S+A	u	—	p/a	0.05°	1	0.902	0.187	0.07°	0.861
5.	Axs_47_502^S	GA^f	S+A	3m	40.6	A/C	−0.41 to −0.44°	0.002–0.006*	<0.001*	0.76–0.82*	−0.52°	0.227
5.	—	GA^f	A	3m	—	—	No AFLPs in the same LG
Part B. QTL for trait slopes
6.	GCG64^A	EP^g	A	u	—	p/a	0.23 y⁻¹	0.006*	0.006*	0.908*	0.32 y⁻¹	0.145′
6.	GCG64^A	EP^g	S+A	u	—	p/a	n.s.	—	—	—	∼0.00 y⁻¹	0.978
7.	TGG57^A	EWD	A	u	—	p/a	1.0 kg m⁻³ y⁻¹	0.199′	0.215	0.712′	1.6 kg m⁻³ y⁻¹	0.047*
7.	TGG57^A	EWD	S+A	u	—	p/a	n.s.	—	—	—	0.4 kg m⁻³ y⁻¹	0.691
8.	2_10306_01-354^S	LWD	S+A	1p	474.4	A/C	3.2 kg m⁻³ y⁻¹	0.071′	0.033*	0.747*	3.0 kg m⁻³ y⁻¹	0.623
8.	—	LWD	A	1p	—	—	Closest AFLP (AGC141) far away (33.6 cM)
9.	0_18350_01-393^S	FWr	S+A	8p	0.0	A/G	−0.02 µm y⁻¹	0.160′	0.035*	0.674*	−0.02 µm y⁻¹	0.887
9.	—	FWr	A	8p	—	—	No AFLPs in the same LG

Data include the name of the QTL marker, the trait and dataset where it was found, its linkage group (LG) and position within the linkage group, the alleles conferring and not conferring the effect, respectively, QTL effect estimates for multilevel LASSO and Bayesian linear mixed effect model (BLMM), and marker uncertainty quantities for Bonferroni-adjusted single ordinary least squares re-estimated t-test (Single-p), covariance test (COV-p), stability selection (SSP), and Bayesian global false discovery rates (BFDR), respectively. The primary QTL detections are marked in bold.

The marker type is shown in capital superscript after the marker name: A = AFLP; S = SNP.

m = maternal LG; P = paternal LG; u = unmappable.

p/a = presence/absence.

n.s. = not selected by LASSO.

′ = suggestive; * = significant.

In case the QTL was detected for both GA assessments, effect ranges are given.

Effects are given in the transformed scale.

Trait intercept marker effects (β) for whole ring, earlywood and latewood density, whole ring radial, earlywood radial, latewood radial, whole ring tangential fiberwidths, whole ring, earlywood and latewood fiberwall thickness (WD, EWD, LWD, FWr, EFWr, LFWr, FWt, FTh, EFTh, LFTh) selected by the multilevel LASSO model in A datasets (black) and S+A datasets (gray) and of markers showing LFDR <0.5 for the Bayesian linear mixed effect model in the A datasets (red) and S+A datasets (magenta) are plotted against their estimated locations on maternal (m) and paternal (p) linkage groups (LG); 1 Morgan is approximately the length of LG 1m. Markers in section u were not mappable to any LG. Significant and suggestive QTL are shown as diamonds and squares, respectively, whereas all other selected markers are shown as circles. All markers in considerable linkage (recombination frequency <0.3) with a significant or suggestive QTL are highlighted within a rectangle. Marker effects (β) for wood grain angle (GA), microfibril angle (MFA), and dynamic modulus of elasticity (MOE), selected by the multilevel LASSO model in A datasets (black) and S+A datasets (gray) and of markers showing LFDR <0.5 for the Bayesian linear mixed effect model in the A datasets (red) and S+A datasets (magenta) are plotted against their estimated locations on maternal (m) and paternal (p) linkage groups (LG). 1 Morgan is approximately the length of LG 1m. Markers in section u were not mappable to any LG. Significant and suggestive QTL are shown as diamonds and squares, respectively, whereas all other selected markers are shown as circles. All markers in considerable linkage (recombination frequency <0.3) with a significant or suggestive QTL are framed in a rectangle. The mLASSO results are illustrated for one time point only (year 2006 for GA and the period 1998–2000 for MFA and MOE). Ratios of phenotypic variance explained by suggestive and significant QTL jointly for the means/intercepts and slopes of all traits using mLASSO and BLMM methods for pure AFLP (A) and SNP+AFLP datasets (S+A). Trait/method/dataset combinations for which significant QTL were found are highlighted in bold. Ranges are given for multilevel analyses that were performed during separate time points. The BLMM analysis of the S+A dataset did not detect any QTL and is thus not shown. Data include the name of the QTL marker, the trait and dataset where it was found, its linkage group (LG) and position within the linkage group, the alleles conferring and not conferring the effect, respectively, QTL effect estimates for multilevel LASSO and Bayesian linear mixed effect model (BLMM), and marker uncertainty quantities for Bonferroni-adjusted single ordinary least squares re-estimated t-test (Single-p), covariance test (COV-p), stability selection (SSP), and Bayesian global false discovery rates (BFDR), respectively. The primary QTL detections are marked in bold. The marker type is shown in capital superscript after the marker name: A = AFLP; S = SNP. m = maternal LG; P = paternal LG; u = unmappable. p/a = presence/absence. n.s. = not selected by LASSO. ′ = suggestive; * = significant. In case the QTL was detected for both GA assessments, effect ranges are given. Effects are given in the transformed scale. For all other traits assessed with functional mapping (RW, EP, LP, LWD, LFWr, FWt, FTh, EFTh, and LFTh) QTL were fewer (in total eight suggestive) (Figure 2) and were influencing their respective traits to a relatively limited degree (H in the ranges 0.00–0.02) (Table 3). However, all three suggestive QTL markers for fiberwall thickness traits were also suggestive of or significant for QTL (e.g., GGG191, unmapped) for the corresponding wood density traits (Figure 2). The respective QTL effects for FTh and WD had the same sign, thus suggesting a positive genetic relationship. No QTL were observed for the intercepts of annual ring width (RW), earlywood (EP), and latewood percentage (LP) (Table 3). For the traits subjected to single-time-point LASSO analyses (GA, MFA, and MOE), the results of only one time point is shown (Figure 3) because the QTL patterns observed for the other time points were similar. Among these traits, grain angle (GA) exhibited the highest number of QTL (three suggestive and two significant), explaining up to 0.08 fraction of the phenotypic variation (Table 3). The SNP Axs_47_502 (LG 3m) and the AFLP TCG51 (unmapped) were observed to be significant QTL for GA in both the assessed years. The latter QTL exhibited the strongest overall significance values (Table 4, part A) being the only QTL significant with respect to the extremely conservative multiple-split testing method (MST-p = 0.0014 in the year 2006). This is also the reason why MST-p values were not included in Table 4. In contrast, microfibril angle (MFA) and modulus of elasticity (MOE) only exhibited two suggestive QTL each (Figure 3), which were not consistently observed across time points (not shown) and explained, at most, a 0.02 fraction of the phenotypic variation.

QTL detection for slopes

For trait slopes, 11 suggestive and four significant QTL were detected (Figure 4). Interestingly, a substantial portion of the phenotypic variation for earlywood percentage (up to 0.22) (Table 3) was explained by the single QTL-AFLP GCG64 (unmapped) (Table 4, part B) significant in the mLASSO analysis and exhibiting an effect size of 13% in comparison with the trait mean (transformed scale). For the EWD slope, one significant (AFLP TGG57, unmapped) and four suggestive QTL jointly accounted for up to 0.51 fraction of the phenotypic variation.

Figure 4

Trait slope marker effects (γ) for earlywood, latewood percentage ratio, whole ring, earlywood and latewood density, whole ring radial, earlywood radial, whole ring tangential fiberwidths, whole ring, and earlywood fiberwall thickness (EP, LP, WD, EWD, LWD, FWr, EFWr, FWt, FTh, EFTh) selected by the multilevel LASSO model in A datasets (black) and S+A datasets (gray) and of markers showing LFDR <0.5 for the Bayesian linear mixed effect model in the A datasets (red) and S+A datasets (magenta) are plotted against their estimated locations on maternal (m) and paternal (p) linkage groups (LG). 1 Morgan is approximately the length of LG 1m. Markers in section u were not mappable to any LG. Significant and suggestive QTL are shown as diamonds and squares, respectively, whereas all other selected markers are shown as circles. All markers in considerable linkage (recombination frequency <0.3) with a significant or suggestive QTL are framed in a rectangle. For the slopes of LWD and FWr, one significant and one significant plus one suggestive QTL, respectively, were detected (Figure 4), but the H estimates for these associations were 0.02 at most (Table 3). Among those, the SNP 2_10306_01-354 (LG 1p) significantly associated with LWD nonetheless exhibited a large effect of 3.2 kg/m3·yr (49% in comparison to the mean), likely as a result of the considerable phenotypic variation for the LWD slope [see in Table 2]. For all the other traits (LP, WD, EFWr, LFWr, FTh, EFTh, and LFTh), QTL findings for slopes were scarce (six suggestive QTL in total) (Figure 4) and weak (H ≤ 0.03) (Table 3).

Comparing QTL mapping methods and datasets

For the statistically high-powered AFLP dataset, BLMM and mLASSO methods both appeared to detect approximately equal numbers of suggestive or significant QTL per trait (in the range of 0–5) (Figure 2, Figure 3, Figure 4). The same set of A-set markers were also frequently detected as significant QTL according to both modeling methods or were at least observed as significant for one method and suggestive for the other (Table 4). However, given that one or more QTL were detected in the A-set, the estimated proportions of explained phenotypic variances of the BLMM method (Table 3) were usually higher (0.01–0.15 and 0.22-0.51 for intercept and slope traits, respectively) than the corresponding mLASSO H estimates (≤0.03). Moreover, the BLMM effect estimates of the A-set were often higher than the corresponding mLASSO estimates per se (Figure 2, Figure 3, Figure 4). Such inconsistencies can be attributed to the different shrinkage penalties or priors involved in the two methods. LASSO assumes the shrinkage factor λ to be equal across all markers, whereas the Bayesian approach allows each marker to have its own individual indicator variable. Therefore, the BLMM can estimate marker effects in a more adaptive manner, i.e., by shrinking less if the unaltered effect of a marker is already large. In contrast to the observations made on the A-set, BLMM did not even detect a single suggestive QTL in the statistically low-powered SNP+AFLP dataset, whereas the mLASSO method detected similar numbers of suggestive or significant QTL in both the S+A-sets and A-sets (Table 3). To further dissect potential causes of this discrepancy, we conducted additional analyses combining the Bayesian spike and slab approach and BFDR testing with the two-step method. Interestingly, this two-step Bayesian spike and slab model was, similar to mLASSO, able to detect a number of QTL even in the S+A-set (not shown), implying that the previously observed discrepancies reflect differences in the modeling procedure (one-step vs. two-step) rather than statistical approach (LASSO vs. Bayesian). The Bayesian spike and slab model and LASSO should thus have similar power for detecting QTL given that the same model procedure is pursued.

Comparison between decision-making/hypothesis testing methods

To have an approximate comparison between the testing methods, we simply screened all the traits and counted the total number of significant or suggestive QTL for the large A- and small S+A datasets, respectively (for LASSO analyses of single time-point traits, only the counts for GA measured in 2006 and MFA and MOE observed in 1998–2000 were included). For the A-set, the ranking of testing methods based on the greatest total number of detected significant and suggestive QTL fell in the order BFDR > Single-p > SSP > COV-p > MST-p (Table 5). For the S+A-set with fewer numbers of individuals, the ranking order was COV-p > Single-p > SSP > MST-p > BFDR. The results for the BFDR of the extra two-step Bayesian analysis were quite similar to that of the Single-p and SSP tests of the mLASSO analysis (not shown).

Table 5

Counts of significant or suggestive QTL summed over all the traits for AFLP data and SNP+AFLP data

Data Type	Multilevel LASSO				BLMM
Data Type	Single-p	MST-p	COV-p	SSP	BFDR
Counts of significant QTL
AFLP	2	1	4	2	3
SNP+AFLP	2	0	5	4	0
Total number of suggestive or significant QTL
AFLP	17	1	12	14	18
SNP+AFLP	7	0	12	5	0

Discussion

To the knowledge of the authors, this study presents the first functional/longitudinal QTL analysis of a conifer wood properties dataset with repeated phenotype measurements made possible by use of efficient measurement methods. Such analyses, so far, have only been performed for growth trajectories (Sillanpää ). Compared with earlier QTL studies, functional mapping analyses utilize all the longitudinal data for a trait simultaneously and may better account for temporal trends and correlation structures across years. It can thus detect QTL that are stable over time (i.e., the QTL associated with intercept traits) with greater statistical evidence, but it can also identify QTL that interact with time (e.g., the QTL associated with slope traits) (Wu and Lin 2006). Summary statistics on the wood properties of the studied Scots pine full-sib family (Figure 1, Table 2) showed levels and trends well in line with expectations of young Scots pines (compare with Fries 2012) in the cambial age range of approximately 5–13 yr. The tree age of this material (17 yr) is also considered to be appropriate for assessment and selection in breeding (Rosvall 2011), making the results of this study also relevant from a breeding point of view. Given the cambial age range of the increment cores, our wood trait slope parameters should illustrate the rate of the biological transition from juvenile to mature wood by increasing cambial age. Previous research on wood properties has indicated wood trait radial trends of mature trees to be best-fitted by exponential or logistic models (Burdon ). However, the nine investigated annual rings only partly cover the juvenile-to-mature wood transition phase, which in Scots pine may extend over a time period as long as 20 yr (Hannrup ). Moreover, wood properties are also known to be strongly influenced by annual environmental fluctuations such as climate (Feng ), which could have caused the deviations from linearity observed in Figure 1 as readily as increasing cambial age. Therefore, a simple linear function was chosen for the functional mapping of this study despite the minor nonlinearities observed for the trait trajectories. The functional mapping of this study moreover investigated two advanced but closely related multi-locus model approaches: a multilevel LASSO model (mLASSO) and a newly developed Bayesian linear mixed model (BLMM). The use of multi-locus models with shrinkage of QTL marker effects (LASSO and BLMM) could act as a safeguard against the systematic effect overestimation associated with single-locus models paired with conservative multiple significance testing (Beavis 1994; Slate 2013). Compared with the approach of Sillanpää , our new BLMM approach can be more efficient because of the prior conjugacy and its fast implementation in Matlab. The BLMM is also closely connected to many other Bayesian functional mapping approaches such as those of Yang and Xu (2007) and Li and Sillanpää (2013). However, those studies modeled the marker effects as nonlinear curves over time, whereas the marker effects of our study simply constitutes parameters that describe a linear trend per se. Another difference is that they assumed the effects of the same marker on slope and intercept to be correlated in the priors, whereas we assume them to be independent. The principle of the two-step mLASSO of this study is similar to QTL mapping approaches introduced by Gee , Heuven and Janss (2010), and Hurtado , even though they used effect estimation methods other than LASSO.

QTL observed for the intercept of important wood traits

Results from the wood intercept trait analyses, aimed at detecting QTL stable over a specified period of time, largely indicated zero to five QTL per trait explaining a small fraction of the phenotypic variation and with limited effects on the traits. This observation agrees with many published conifer QTL and association mapping studies (Brown ; Pot ; González-Martínez ; Ukrainetz ; Dillon ; Beaulieu ) and, given that only major QTL usually are detected, it suggests wood intercept traits to be largely polygenic. In our study, EWD and FWr exhibited a greater number of appreciably strong QTL stable over time in comparison with the latewood proportion of those traits or to other wood traits (Figure 2, Table 3). However, the three QTL markers significant for EWD, FWr, and EFWr exhibited individual effects in a range of 0.8%–2.3% in relation to the overall population mean of the respective traits. Consequently, selection assisted by markers developed from these QTL would likely not achieve any dramatic improvements if used in isolation. The observations of numerous and stable EWD QTL agree with previous results where separate annual ring QTL analyses were performed on the wood of a full-sib family (Brown ) and a three-generation pedigree (Sewell ) of loblolly pine. Also, Ukrainetz detected several QTL for both EWD and LWD in several separate annual ring QTL analyses on eight full-sib families of Douglas fir [Pseudotsuga menziesii (Mirb) Franco var menziesii]. The observed tendency of WD and FWr to exhibit sets of QTL markers similar to those of EWD and EFWr, respectively, indicates genetic regulation of the earlywood component of the annual ring to be highly influential on properties of the annual ring as a whole. This is not very surprising because earlywood constituted the major component in most annual rings studied (Table 2). Similarly, the observation of three QTL markers suggestively or significantly associated with both fiberwall thickness and wood density traits is consistent with the strong phenotypic and genetic correlations repeatedly observed between wood density and the thickness of the tracheid cell walls (for examples of Scots pine, see Fries 2012; Hong ). The apparent universal applicability of these correlations therefore suggests that the QTL colocalizations we observed for these trait pairs (WD/EWD, FWr/EFWr, and WD/FTh) more likely stem from pleiotropy than from close linkage. Also, for spiral grain angle, which is a trait closely connected to the twist propensity of sawed timber during drying (Dinwoodie 2000; Warensjö and Rune 2004), a number of influential QTL (two significant and three suggestive) were observed (Figure 3). To the knowledge of the authors, no QTL for grain angle have been published for any conifer species. In contrast to the estimated marker effects of the intercept traits, effects of the significant GA QTL (0.30°–0.52°) (Table 4, part A) indicate that an appreciable reduction of GA could be achieved by marker-assisted selection in the material of this study and potentially improve the shape stability of small sawed timber (Hallingbäck ). However, it must still be cautioned that the GA measurements were taken directly beneath the bark at breast height during two consecutive growth seasons (cambial age range of 12–14 yr) and the detected QTL thus may not be stable over a wider range of cambial ages. For other traits observed at few time points such as MFA and MOE, only a limited set of suggestive QTL were observed.

QTL observed for the slopes of important wood traits

The slope of wood traits over cambial ages or the rate of juvenile-to-mature wood transition has, to our knowledge, never been included in any published QTL or association mapping analyses, even though one association mapping study of radiata pine (Dillon ) considered the transition in a different manner. They regarded the juvenile-to-mature wood transition as a discrete phase change whose timing could be predicted using latewood density as an indicator trait (Gapare ), and we have rather dissected the dynamics of the transition process itself. In the results of this study, it was notable that large portions of the phenotypic variation for the slopes of earlywood percentage and earlywood density were explained by a limited number of suggestive and significant QTL (Table 3 and Figure 4) indicating the existence of major effect loci. Also, the effect of the QTL marker significantly associated with the slope of LWD (SNP 2_10306_01-354) (Table 4, part B) was notably large despite the low corresponding H estimates. The results for EP and LWD are particularly intriguing because these traits are well-known to exhibit decreasing and increasing trends, respectively (Figure 1B, Table 2) as a result of the wood maturation process (Zobel and Sprague 1998; Burdon ). Taking the significant EP QTL (GCG64) (Table 4, part B) as an example and assuming conditions similar to those of this study, trees with the band present genotype (heterozygote) are predicted to decrease their EP from 56.3% at cambial age 9 to 42.2% at age 14 (5 yr later), whereas trees with the band absent genotype (homozygote) would decrease their EP even faster (from 56.3% to 40.3%) during the same time period. Admittedly, this calculation example may be oversimplistic and one should remember that QTL observed in a single full-sib family are usually not generalizable to other families, populations, or environments. However, the results of this study nonetheless illustrate potential to regulate the speed of certain aspects of wood maturation by marker-assisted selection.

Comparison between QTL mapping methods and decision rules

As shown in Materials and Methods, the mLASSO model and BLMM model are in principle closely connected approaches. The mLASSO model is a two-step approach because it estimates the temporal trends and marker effects on those trends separately, whereas the BLMM is a one-step approach estimating the temporal trends and marker coefficients simultaneously. Furthermore, in mLASSO, the intercept and slope latent parameters are treated independently. However, in BLMM, intercepts and slopes are jointly analyzed in a bivariate model, so the dependency between them may be more properly taken care of (Piepho ). In general, it is believed that BLMM should be a more precise approach for analyzing this type of longitudinal data, and that the mLASSO model can sometimes be used as an approximation to BLMM (Sikorska ). In our work, mLASSO and BLMM detected similar sets of QTL under conditions of higher statistical power (the A-set with data for at least 250 individuals), whereas under conditions of low statistical power (S+A-set with data for less than 100 individuals regardless of the trait) only the mLASSO was able to detect any QTL (Figure 2, Figure 3, Figure 4 and Table 4). Supported by results of additional two-step Bayesian analyses, it is possible that the one-step BLMM, due to the simultaneous estimation of a higher number of parameters than the mLASSO, provided estimates with larger variances and lost power to detect QTL when sample sizes were insufficient. However, it also cannot be excluded that the small sample size of the S+A-set combined with the approximative behavior of the two-step methods (like mLASSO) could have caused a greater-than-expected number of falsely detected QTL. In any case, the inconsistency between BLMM and mLASSO at low statistical power suggests that QTL observed in the S+A dataset should be interpreted cautiously. Another methodological objective of this study was to evaluate the performance of several hypothesis-testing or uncertainty assessment methods for QTL identification. Regarding the mLASSO framework, the Single-p is very similar to many single locus hypothesis testing methods commonly used, whereas MST-p, COV-p, and SSP are more specifically designed for multi-locus LASSO methods. One common problem regarding these tests is their inconsistency with penalized estimates of marker effects obtained at a specific value of tuning parameter λ in LASSO. The development of more reliable LASSO test statistics is still an ongoing research topic (Lockhart ). Under the BLMM framework, it is possible to derive a BFDR-based decision rule by relying on a single MCMC simulation without any extra effort such as re-sampling. Given the results of this study, one can conclude that MST-p was much more conservative than all other methods. This was expected because MST-p relies on a stricter assumption than the others (Bühlmann ). In future research, simulation studies are needed to perform more intensive comparisons between the BLMM and the mLASSO approaches and between their associated significance testing methods.

Potential links between QTL and candidate genes

Among the nine significant QTL observed in this study, four were linked to SNP markers (Table 4), which would enable speculations about the potential function of the candidate genes from which the SNPs were developed. One should, however, note that none of these QTL SNPs were mapped in the proximity to any of the AFLP markers, which could have verified the QTL in the statistically high-powered A-set. Moreover, because conifer genomes are very large, the genetic linkage within our full-sib family likely extends over numerous genes that, due to the limited statistical power of the S+A-set, all could be considered as alternative loci for the QTL. Consequently, the interpretations made here need to be taken with caution. The significant SNP-QTL for the intercept of FWr (0_11919_01-122) is situated in a gene coding for a F-box protein, an enzyme link to the pathway of lignin precursor biosynthesis (Andersson Gunnerås 2005). This suggests that a putative F-box association to fiber width may be mediated through the control of lignin biosynthesis, an integral part of the secondary cell walls of plants. The SNP-QTL (Axs_47_502) significantly associated with GA is an UDP-D apiose/UDP-D-xylose synthase (AXS) that catalyzes the conversion of UDP-D-glucuronicacid to UDP-D-apiose and UDP-D-xylose. D-xylose is the second most common component of hemicellulose abundant in the cell wall, particularly in the S2 layer, which heavily influences the properties of the fiber cells (Rowell 2005). By theoretically investigating mechanical aspects of fiber cell division and maturation, Schulgasser and Witztum (2007) suggested the S2 microfibril angle (i.e., MFA) to be involved in the development of grain angle. An association between GA and AXS observed in this study is thus conceivable. Among the significant QTL for trait slopes, the SNP-QTL 2_10306_01-354 and 0_18350_01-393 were associated with LWD and FWr, respectively. The SNP associated with LWD is located in a gene coding the NINJA protein, which acts as a transcriptional repressor and whose activity is mediated by a functional TPL-binding EAR repression motif. Both NINJA and TPL proteins function as negative regulators of jasmonate responses and are therefore involved in the regulation of gene expression of proteins related to stress and growth (Wasternack and Hause 2013). The significant association of NINJA protein with WD in this study could be understood considering that wood density is a consequence of cell wall growth. The SNP associated with FWr is located in a gene coding a lipase class 3 family protein. Lipases are involved in the accumulation and storage of lipid triglycerides and steryl esters into an organelle in the cell called the lipid body (LB). Among other functions, LBs are involved in providing building blocks for the enlargement of cell membranes (van der Schoot ), such as the plasma membrane of the cell walls that form the wood. These speculations on the nature of the significant SNPs control of certain traits will certainly require further research advances in the molecular biology of wood development to be confirmed. However, we do not doubt the value of our findings as candidate targets for such studies. Previous studies of QTL and association analysis for wood properties resulted in the identification of multiple candidate genes, some of which have been confirmed across several studies in conifers (Brown ; González-Martínez ; Dillon ; Beaulieu ). The majority of these previously assayed candidate genes were included in our array, but we found no significant trait associations for them. The only exception is the finding made by Kumar that a gene coding for a lipid transfer protein was differentially expressed between Scots pine juvenile and mature wood. This is consistent with the SNP located in the lipid transfer protein coding gene and associated with the slope of FWr in this study. The authors argued about a possible gene role on cell wall deposition but admitted that this interpretation was vague and required further research for its validation.

Conclusions and future research

With respect to the genetic dissection of wood properties, the functional QTL mapping approach appeared promising and produced interesting results. Significant QTL were observed for the wood trait intercepts of EWD, FWr, and EFWr that appeared fairly stable within the studied cambial age range (approximately 5–13 yr). Two highly significant QTL were also detected for GA, which has never been included in wood property QTL studies. Furthermore, significant QTL were detected for the wood trait slopes of EP, EWD, LWD, and FWr, indicating marker associations with the rate of wood maturation. Four of the significant QTL were observed using SNP markers developed from candidate genes, thus making them interesting targets for further linkage or association mapping efforts or for molecular biology studies of the wood development in conifers. In addition to the genetic implications of this study, two functional mapping methods and several methods for quantifying certainty or significance of QTL were compared. The one-step BLMM and two-step mLASSO methods detected similar sets of QTL given that a large number of individuals were studied and the adaptive shrinkage method of BLMM appeared to allow reasonably large effect estimates in case QTL were significant. When the number of individual studies was small, however, the BLMM method did not detect any QTL indicating greater requirements than mLASSO in terms of the number of assessed individuals. Future research in this regard could use simulated data to further elucidate the differences between the two functional mapping approaches.

38 in total

1. Functional mapping of quantitative trait loci underlying the character process: a theoretical framework.

Authors: Chang-Xing Ma; George Casella; Rongling Wu
Journal: Genetics Date: 2002-08 Impact factor: 4.562

2. A likelihood approach for mapping growth trajectories using dominant markers in a phase-unknown full-sib family.

Authors: C-X Ma; M Lin; R C Littell; T Yin; R Wu
Journal: Theor Appl Genet Date: 2003-10-28 Impact factor: 5.699

Review 3. Overview of LASSO-related penalized regression methods for quantitative trait mapping and genomic selection.

Authors: Zitong Li; Mikko J Sillanpää
Journal: Theor Appl Genet Date: 2012-05-24 Impact factor: 5.699

4. Association genetics in Pinus taeda L. I. Wood property traits.

Authors: Santiago C González-Martínez; Nicholas C Wheeler; Elhan Ersoz; C Dana Nelson; David B Neale
Journal: Genetics Date: 2006-11-16 Impact factor: 4.562

5. AFLP: a new technique for DNA fingerprinting.

Authors: P Vos; R Hogers; M Bleeker; M Reijans; T van de Lee; M Hornes; A Frijters; J Pot; J Peleman; M Kuiper
Journal: Nucleic Acids Res Date: 1995-11-11 Impact factor: 16.971

6. Association genetics of wood physical traits in the conifer white spruce and relationships with gene expression.

Authors: Jean Beaulieu; Trevor Doerksen; Brian Boyle; Sébastien Clément; Marie Deslauriers; Stéphanie Beauseigle; Sylvie Blais; Pier-Luc Poulin; Patrick Lenz; Sébastien Caron; Philippe Rigault; Paul Bicho; Jean Bousquet; John Mackay
Journal: Genetics Date: 2011-03-08 Impact factor: 4.562

7. A stage-wise approach for the analysis of multi-environment trials.

Authors: Hans-Peter Piepho; Jens Möhring; Torben Schulz-Streeck; Joseph O Ogutu
Journal: Biom J Date: 2012-09-25 Impact factor: 2.207

8. Regularization Paths for Generalized Linear Models via Coordinate Descent.

Authors: Jerome Friedman; Trevor Hastie; Rob Tibshirani
Journal: J Stat Softw Date: 2010 Impact factor: 6.440

9. Patterns of divergence among conifer ESTs and polymorphism in Pinus sylvestris identify putative selective sweeps.

Authors: Anna E Palmé; Mark Wright; Outi Savolainen
Journal: Mol Biol Evol Date: 2008-09-04 Impact factor: 16.240

10. Segregation and linkage analysis for longitudinal measurements of a quantitative trait.

Authors: Conway Gee; John L Morrison; Duncan C Thomas; W James Gauderman
Journal: BMC Genet Date: 2003-12-31 Impact factor: 2.797

9 in total

1. A robust multiple-locus method for quantitative trait locus analysis of non-normally distributed multiple traits.

Authors: Z Li; J Möttönen; M J Sillanpää
Journal: Heredity (Edinb) Date: 2015-07-15 Impact factor: 3.821

2. Bayesian regularization for a nonstationary Gaussian linear mixed effects model.

Authors: Emrah Gecili; Siva Sivaganesan; Ozgur Asar; John P Clancy; Assem Ziady; Rhonda D Szczesniak
Journal: Stat Med Date: 2021-12-12 Impact factor: 2.373

3. Toward integration of genomic selection with crop modelling: the development of an integrated approach to predicting rice heading dates.

Authors: Akio Onogi; Maya Watanabe; Toshihiro Mochizuki; Takeshi Hayashi; Hiroshi Nakagawa; Toshihiro Hasegawa; Hiroyoshi Iwata
Journal: Theor Appl Genet Date: 2016-01-20 Impact factor: 5.699

4. A Gaussian process model and Bayesian variable selection for mapping function-valued quantitative traits with incomplete phenotypic data.

Authors: Jarno Vanhatalo; Zitong Li; Mikko J Sillanpää
Journal: Bioinformatics Date: 2019-10-01 Impact factor: 6.937

5. Genome-wide association study identified novel candidate loci affecting wood formation in Norway spruce.

Authors: John Baison; Amaryllis Vidalis; Linghua Zhou; Zhi-Qiang Chen; Zitong Li; Mikko J Sillanpää; Carolina Bernhardsson; Douglas Scofield; Nils Forsberg; Thomas Grahn; Lars Olsson; Bo Karlsson; Harry Wu; Pär K Ingvarsson; Sven-Olof Lundqvist; Totte Niittylä; M Rosario García-Gil
Journal: Plant J Date: 2019-07-28 Impact factor: 6.417