Joachim W Bargsten, Jan-Peter Nap, Gabino F Sanchez-Perez, Aalt D J van Dijk.
Abstract
BACKGROUND: Elucidation of genotype-to-phenotype relationships is a major challenge in biology. In plants, it is the basis for molecular breeding. Quantitative Trait Locus (QTL) mapping links variation at the trait level to variation at the genomic level. However, QTL regions typically contain tens to hundreds of genes. To prioritize such candidate genes, we show that potentially causal genes for a trait can be identified based on overrepresentation of biological processes (gene functions) among the candidate genes in the QTL regions of that trait.
Year: 2014 PMID: 25492368 PMCID: PMC4274756 DOI: 10.1186/s12870-014-0330-3
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Figure 1. Prioritizing QTL candidate genes via associating traits to biological processes. (A) Principle of method used: Biological processes (indicated as different colored boxes) are annotated for genes in QTL regions for a trait-of-interest. Using these gene functions, trait-biological process associations are obtained based on enrichment of biological processes among the genes linked to a particular trait, integrating information from multiple QTL regions. Genes annotated with overrepresented biological processes are prioritized. (B) Number of QTL regions connected to traits in the rice QTL compendium used for this analysis. The scale of the horizontal axis in the histogram is clipped at 50, so traits with more than 50 QTL regions associated (~2% of the total) are not included. (C) Number of genes connected to traits in the rice QTL compendium. The scale of the horizontal axis in the histogram is clipped at 5000, so traits with more than 5000 genes associated (~5% of the total) are not included.
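The principle in panel (A) can be sketched in a few lines of Python. This is only an illustrative sketch: the record above does not state which statistical test was used, so a one-sided hypergeometric test is assumed here, no multiple-testing correction is applied, and the function names, the `alpha` threshold, the gene identifiers, and the term labels are all hypothetical toy values.

```python
from math import comb


def enrichment_p(k, genome_size, term_size, sample_size):
    """P(X >= k) under a hypergeometric null (an assumed test, not taken from the paper)."""
    upper = min(term_size, sample_size)
    total = comb(genome_size, sample_size)
    tail = sum(
        comb(term_size, i) * comb(genome_size - term_size, sample_size - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def prioritize(trait_genes, annotations, alpha=0.05):
    """Return (enriched terms, prioritized genes) for one trait.

    trait_genes: genes pooled over all QTL regions of the trait
    annotations: gene -> set of biological process terms, for the whole genome
    """
    genome_size = len(annotations)
    sample = [g for g in trait_genes if g in annotations]
    genome_counts, trait_counts = {}, {}
    for terms in annotations.values():
        for t in terms:
            genome_counts[t] = genome_counts.get(t, 0) + 1
    for g in sample:
        for t in annotations[g]:
            trait_counts[t] = trait_counts.get(t, 0) + 1
    # A term is "overrepresented" if it occurs more often among the trait genes
    # than expected by chance, given its frequency in the genome.
    enriched = {
        t for t, k in trait_counts.items()
        if enrichment_p(k, genome_size, genome_counts[t], len(sample)) < alpha
    }
    # Genes carrying at least one overrepresented term are prioritized.
    prioritized = {g for g in sample if annotations[g] & enriched}
    return enriched, prioritized


# Toy genome of 20 hypothetical genes: 3 carry "flowering", 10 carry "transport".
annotations = {f"g{i}": set() for i in range(20)}
for i in range(3):
    annotations[f"g{i}"].add("flowering")
for i in range(3, 13):
    annotations[f"g{i}"].add("transport")

trait_genes = {"g0", "g1", "g2", "g3", "g13"}
enriched, prioritized = prioritize(trait_genes, annotations)
print(sorted(enriched), sorted(prioritized))
# → ['flowering'] ['g0', 'g1', 'g2']
```

In the toy data all three "flowering" genes fall inside the QTL regions, so the term is overrepresented and its genes are prioritized, while the single "transport" gene is consistent with chance.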
Associations between traits and biological processes
| | |
|---|---|
| | 179 |
| | 1591 |
| | 1767 |
| | 1522 |
| | |
| | 2519 |
| | 153 |
| | 918 |
a. As an intermediate step in candidate gene prioritization, traits and biological processes (BPs) were associated using overrepresentation of biological processes found for genes connected with each trait in the rice Gramene QTL compendium.
b. Only BP terms that were associated with less than 1% of the genes in the genome were used as input terms in our analysis (i.e., a filter on the maximum allowed generality of the biological process was applied).
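The generality filter described in footnote b could look roughly like the following sketch. The function name, the data layout (term mapped to a set of gene identifiers), and the toy numbers are assumptions made for illustration; only the 1% threshold comes from the footnote.

```python
def filter_general_terms(term_to_genes, genome_size, max_fraction=0.01):
    """Keep only BP terms annotated to less than max_fraction of all genes."""
    return {
        term: genes
        for term, genes in term_to_genes.items()
        if len(genes) / genome_size < max_fraction
    }


# Hypothetical toy genome of 1000 genes.
genome_size = 1000
term_to_genes = {
    "specific process": {f"g{i}" for i in range(9)},     # 0.9% of genes: kept
    "broad process": {f"g{i}" for i in range(50)},       # 5% of genes: dropped
    "very broad process": {f"g{i}" for i in range(200)},  # 20% of genes: dropped
}

kept = filter_general_terms(term_to_genes, genome_size)
print(sorted(kept))
# → ['specific process']
```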
Candidate gene prioritization: comparison with QTL fine-mapping
| Trait: fine-mapped gene | #genes | #sel | Biological processes |
|---|---|---|---|
| Leaf size: LOC_Os01g11940 [ | 214 | 21 | regulation of flower development |
| | | | lysine biosynthetic process via diaminopimelate |
| Leaf size: LOC_Os01g11946 [ | 214 | 21 | organic acid catabolic process |
| Number of spikelets per panicle: LOC_Os01g12160 [ | 246 | 8 | systemic acquired resistance |
| Gel consistency: LOC_Os06g04200 [ | 167 | 14 | monosaccharide metabolic process |
| | | | glycolipid biosynthetic process |
| | | | membrane lipid biosynthetic process |
| | | | glucose metabolic process |
| Gelatinization temperature: LOC_Os06g12450 [ | 53 | 3 | monosaccharide metabolic process |
| | | | glycolipid biosynthetic process |
| | | | membrane lipid biosynthetic process |
| Heading date: LOC_Os08g07740 [ | 330 | 13 | positive regulation of RNA metabolic process |
| | | | positive regulation of nucleobase-containing compound metabolic process |
| | | | positive regulation of (macromolecule/cellular) metabolic process |
| Yield, plant height: LOC_Os08g07740 [ | 188 | 8 | positive regulation of macromolecule/cellular/nitrogen compound biosynthetic process |
| | | | positive regulation of gene expression |
| Grain size and quality: LOC_Os08g41940 [ | 300 | 29 | regulation of post-embryonic development |
| Viscosity parameter: LOC_Os08g42410 [ | 120 | 4 | monosaccharide/glucose meta-/catabolic process |
| | | | glycolysis |
| | | | hexose catabolic process |
| | | | alcohol catabolic process |
a. For each trait found in the literature with a fine-mapped candidate gene, QTL traits in our dataset were obtained that were similar or related to the literature trait and for which the fine-mapped gene occurred in one of the QTL regions. Only cases in which the candidate gene was correctly prioritized by our approach are shown, together with the biological processes involved. #genes, number of genes in the input QTL region. #sel, total number of genes prioritized in the QTL region. For a complete overview of the comparison with fine-mapped candidate genes, see Additional file 3: Table S3.
b. LOC_Os08g07740 is the fine-mapped candidate gene for two different traits.
Figure 2. Associations between traits and biological processes. (A) Histogram of number of associations to biological processes (BPs) per trait. (B) Histogram of number of associations to traits per biological process.
Validated causal genes
| Gene | Trait | Biological processes |
|---|---|---|
| LOC_Os01g10840 | plant height | intracellular protein kinase cascade; pattern specification process; xylem and phloem pattern formation; signal transduction by phosphorylation |
| LOC_Os01g58420 | spikelet number | cellular response to ethylene stimulus |
| LOC_Os01g66120 | plant height | positive regulation of macromolecule biosynthetic process/nitrogen compound metabolic process/gene expression |
| LOC_Os02g43790 | spikelet number | cellular response to ethylene stimulus |
| LOC_Os03g03370 | relative water content | microgametogenesis |
| LOC_Os08g06380 | plant height | two-component signal transduction system (phosphorelay); ethylene mediated signaling pathway; cellular response to ethylene stimulus |
| LOC_Os09g26400 | chlorophyll content | NADPH regeneration; nicotinamide nucleotide metabolic process |
| LOC_Os11g08210 | plant height | positive regulation of macromolecule biosynthetic process/nitrogen compound metabolic process/gene expression |
a. Genes prioritized for traits based on overrepresentation of biological processes in the QTL regions for the trait, and for which validation is available from literature results [59].
Figure 3. Transcription factors as potentially causal genes. Specific TF families (horizontal axis) were found associated with specific traits (vertical axis). The heatmap shows the percentage of the associated TFs that belongs to each TF subfamily, for traits with at least ten associated TFs and at least one TF subfamily that constitutes more than 25% of all associated TFs for that trait. Only TF subfamilies that constituted more than 25% of all TFs for at least one trait are shown.
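The selection rules in this caption (at least ten associated TFs per trait, at least one subfamily above 25%, and only subfamilies that exceed 25% for some trait shown as columns) can be expressed as a small sketch. The function name, the input layout, and the toy trait and family assignments are hypothetical; only the two thresholds come from the caption.

```python
from collections import Counter


def heatmap_table(trait_to_tf_families, min_tfs=10, min_pct=25.0):
    """Return trait -> {subfamily: percentage} following the caption's filters."""
    per_trait = {}
    for trait, families in trait_to_tf_families.items():
        if len(families) < min_tfs:
            continue  # too few associated TFs for this trait
        pct = {f: 100 * c / len(families) for f, c in Counter(families).items()}
        if max(pct.values()) > min_pct:
            per_trait[trait] = pct
    # Columns: subfamilies exceeding the threshold for at least one retained trait.
    shown = sorted({f for pct in per_trait.values() for f, p in pct.items() if p > min_pct})
    return {trait: {f: pct.get(f, 0.0) for f in shown} for trait, pct in per_trait.items()}


# Hypothetical toy input: one subfamily label per TF associated with the trait.
toy = {
    "trait A": ["MADS"] * 6 + ["MYB"] * 2 + ["WRKY"] * 2,                # 10 TFs, MADS = 60%
    "trait B": ["bHLH"] * 9,                                              # only 9 TFs: dropped
    "trait C": ["MADS"] * 3 + ["MYB"] * 3 + ["WRKY"] * 3 + ["bHLH"] * 3,  # none above 25%: dropped
}

table = heatmap_table(toy)
print(table)
# → {'trait A': {'MADS': 60.0}}
```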
Figure 4. Analysis of QTL regions for the rice trait days to heading. (A) Overview of prioritization results per QTL region. Each pair of horizontal bars indicates a QTL region; the black bar represents the total number of genes in the region, and the green bar the number of prioritized (potentially causal) genes. Inset: the pie diagram indicates the total number of genes (7113) and the number of those genes selected by the prioritization approach (579). (B) Overview of selected biological processes: REVIGO [68] scatterplot view in which each circle represents a BP; the distance between circles indicates similarity between BPs.
Genes predicted as causal genes for days to heading
| Gene | Annotation |
|---|---|
| LOC_Os01g68620 | signal peptide peptidase-like 2B |
| LOC_Os01g70920 | cullin-1 |
| LOC_Os01g74020 | MYB family transcription factor |
| LOC_Os03g54170 | OsMADS34 - MADS-box family gene with MIKCc type-box |
| LOC_Os03g61570 | expressed protein |
| LOC_Os05g02300 | Core histone H2A/H2B/H3/H4 domain containing protein |
| LOC_Os07g41370 | OsMADS18 - MADS-box family gene with MIKCc type-box |
| LOC_Os07g46180 | PWWP domain containing protein |
| LOC_Os07g08880 | ES43 protein |
| LOC_Os09g39270 | ZOS9-20 - C2H2 zinc finger protein |
| LOC_Os10g40810 | GATA zinc finger domain containing protein |
a. Genes prioritized in QTL regions for the trait days to heading based on their predicted function ‘regulation of flower development’, and present as the single gene annotated with this term in the respective QTL region. Without the latter requirement, a total of 79 genes were prioritized in the QTL regions for this trait based on the BP ‘regulation of flower development’ (Additional file 3: Table S5).
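The extra requirement in this footnote, that a gene be the only one in its QTL region annotated with the term, can be illustrated with the sketch below. The function name, region labels, gene identifiers, and annotations are hypothetical toy values and do not correspond to the genes listed in the table.

```python
def single_annotated_genes(regions, annotations, term):
    """Return genes that are the only gene in their QTL region carrying `term`."""
    selected = []
    for region, genes in regions.items():
        hits = [g for g in genes if term in annotations.get(g, set())]
        if len(hits) == 1:  # exactly one annotated gene in this region
            selected.append(hits[0])
    return selected


TERM = "regulation of flower development"

# Hypothetical toy data: three QTL regions with four genes in total carrying the term.
regions = {
    "region1": ["geneA", "geneB", "geneC"],
    "region2": ["geneD", "geneE"],
    "region3": ["geneF", "geneG"],
}
annotations = {
    "geneA": {TERM},
    "geneD": {TERM},
    "geneE": {TERM},          # region2 has two annotated genes, so neither is selected
    "geneG": {TERM, "other process"},
}

selected = single_annotated_genes(regions, annotations, TERM)
print(selected)
# → ['geneA', 'geneG']
```

Dropping the `len(hits) == 1` condition would return all four annotated genes, which mirrors how relaxing the requirement enlarges the prioritized set.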