Farouk Messad, Isabelle Louveau, Basile Koffi, Hélène Gilbert, Florence Gondret.
Abstract
BACKGROUND: Improving feed efficiency (FE) is a major challenge in pig production. This complex trait is highly variable. Therefore, identifying predictors of FE may be a relevant strategy to reduce phenotyping efforts in breeding and selection programs. The aim of this study was to investigate the suitability of genes expressed in muscle for predicting FE traits in growing pigs. The approach combined different transcriptomics experiments to cover a large range of FE values and identify reliable predictors.
Keywords: Feed efficiency; Machine learning; Muscle; Pig; Prediction; Transcriptome
Year: 2019 PMID: 31419934 PMCID: PMC6697907 DOI: 10.1186/s12864-019-6010-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Descriptive statistics for feed efficiency traits and growth performance
| Variable | Line | n | Mean | SEM | StDev | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| RFI-BV | Low RFI | 31 | −66.5a | 3.6 | 20.1 | −108.7 | −39.5 |
| | High RFI | 40 | 55.9b | 1.7 | 10.9 | 33.5 | 91.6 |
| ADG | Low RFI | 31 | 885.0a | 15.9 | 88.7 | 700 | 1068 |
| | High RFI | 40 | 827.4b | 15.7 | 99.2 | 543 | 1012 |
| FI | Low RFI | 31 | 2288.3a | 34.2 | 190.4 | 1914 | 2658 |
| | High RFI | 40 | 2362.7a | 40.5 | 255.9 | 1725 | 3026 |
| FCR | Low RFI | 31 | 2.60b | 0.03 | 0.18 | 2.25 | 2.91 |
| | High RFI | 40 | 2.87a | 0.03 | 0.21 | 2.46 | 3.28 |
| FCRe | Low RFI | 31 | 25b | 0.3 | 2 | 22 | 29 |
| | High RFI | 40 | 28a | 0.3 | 2 | 24 | 32 |
Abbreviations used: ADG Average daily gain (g/d), FI Feed intake (g/kg), FCR Feed conversion ratio (kg/kg), FCRe Net energy feed conversion ratio (MJ/kg), RFI-BV Breeding value for residual feed intake (g/d). Data were obtained from n = 71 growing pigs of two lines divergently selected for residual feed intake (low/high) and reared under different conditions. ADG, FI and FCR were obtained from referenced publications [21, 22]. The FCRe was newly calculated using the net energy content of the diets provided in the same publications. Genetic RFI values were newly calculated from performance recorded on pig littermates reared on the selection farm (Rouillé, France). For each trait, data from pigs of both lines were compared by ANOVA; a, b: for a given trait, means with different superscript letters differ between the low and high RFI lines (P < 0.05)
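As a quick consistency check on the descriptive statistics above, the standard error of the mean should equal the standard deviation divided by the square root of the group size. The sketch below recomputes SEM from the reported StDev and group sizes for a few rows; the helper name `sem_from_sd` is illustrative and not taken from the source.

```python
import math

# (trait, line) -> (group size, reported SEM, reported StDev), copied from the table above
reported = {
    ("RFI-BV", "Low RFI"): (31, 3.6, 20.1),
    ("RFI-BV", "High RFI"): (40, 1.7, 10.9),
    ("ADG", "Low RFI"): (31, 15.9, 88.7),
    ("ADG", "High RFI"): (40, 15.7, 99.2),
    ("FCR", "Low RFI"): (31, 0.03, 0.18),
    ("FCR", "High RFI"): (40, 0.03, 0.21),
}

def sem_from_sd(sd, n):
    """Standard error of the mean from a sample standard deviation."""
    return sd / math.sqrt(n)

# Recompute SEM for every row and print it next to the reported value
recomputed = {key: sem_from_sd(sd, n) for key, (n, _, sd) in reported.items()}
for key, (n, sem, sd) in reported.items():
    print(key, "reported SEM:", sem, "recomputed:", round(recomputed[key], 2))
```

The recomputed values agree with the reported SEM to the rounding shown in the table, which supports reading the third column of each row as the group size.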
Fig. 1 a Distribution of residual feed intake (RFI). b Distribution of feed conversion ratio (FCR). Barrows and female growing pigs from generations 6 to 8 of a divergent selection for RFI were considered. Pigs from the low or high RFI lines were fed different diets according to referenced publications [21, 22]. Black dots: pigs of the low RFI line; red dots: pigs of the high RFI line
Fig. 2 Plot of the first two principal components summarizing the overall variability in the merged molecular dataset. Pigs are represented on a scatter plot of the first two principal components (PC) of a principal component analysis (PCA) that aggregated the whole transcriptomic data (20,405 annotated expressed probes) from the longissimus muscle across the different studies. The first PC (PC1) accounted for 46.8% of the total transcriptomic variability and discriminated pigs of the low and high RFI selection lines. PC1 can therefore be considered a relevant summary of the main molecular probes that differ between pigs divergently selected for RFI. Black dots: pigs of the low RFI line; red dots: pigs of the high RFI line
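To make the "share of variability explained by PC1" concrete, here is a minimal sketch that estimates it with power iteration on the covariance matrix, using only the standard library. The toy matrix (6 animals, 4 probes) is invented for illustration and is not the study's data; the function name `pc1_explained_variance` is an assumption.

```python
import math
import random

def pc1_explained_variance(rows, n_iter=200):
    """Share of total variance carried by the first principal component.

    rows: list of samples, each a list of feature values.
    Uses power iteration on the sample covariance matrix.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_var = sum(cov[i][i] for i in range(p))
    # Random start vector, then repeated multiplication by cov and renormalization
    init_rng = random.Random(0)
    v = [init_rng.gauss(0, 1) for _ in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v approximates the leading eigenvalue
    lead = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return lead / total_var

# Toy data (not from the study): two "lines" of three animals, shifted apart
rng = random.Random(0)
toy = [[line * 2.0 + rng.gauss(0, 0.5) for _ in range(4)]
       for line in (0, 0, 0, 1, 1, 1)]
share = pc1_explained_variance(toy)
print(f"PC1 explains {share:.1%} of the variance in the toy data")
```

With real expression data, a dedicated PCA routine would normally be used; the sketch only illustrates how a figure such as 46.8% is defined.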
Number of probes and encoded genes identified as VIP for feed efficiency traits
| Trait | Nb annotated probes | Nb unique genes | R2 | RMSE |
|---|---|---|---|---|
| RFI-BV | 384 | 222 | 0.63 | 42.9 |
| | 280 | 161 | 0.64 | 39.6 |
| | 50 | 27 | 0.65 | 39.3 |
| FCR | 421 | 267 | 0.61 | 0.23 |
| | 88 | 52 | 0.70 | 0.22 |
| | 50 | 33 | 0.67 | 0.22 |
| FCRe | 318 | 218 | 0.49 | 2.2 |
| | 50 | 29 | 0.52 | 2.0 |
| | 7 | 6 | 0.52 | 2.0 |
A machine learning procedure (gradient tree boosting, TreeNet) was applied to a microarray dataset (20,405 expressed annotated probes) generated from the longissimus muscle of 71 growing pigs to identify models able to predict residual feed intake (RFI), feed conversion ratio (FCR) and net energy-based feed conversion ratio (FCRe). A randomly selected bootstrap sample of pigs (n = 50) was used for learning, whereas the remaining pigs (n = 21) were used for validation. The first round led to model stabilization with 384 molecular probes as very important variables (VIP) for RFI-BV prediction, 421 probes for FCR prediction and 318 probes for FCRe prediction, out of the 20,405 expressed annotated probes. The second entry for each trait is an iteration of the same procedure using the VIP identified in the first step as the new inputs. This increased the accuracy of the prediction, as evaluated by the root mean square error (RMSE) and the coefficient of determination (R2). The last entry is a further iteration using the VIP identified in the second step as the new inputs, which identified the smallest number of VIP able to predict the target trait with good accuracy. The numbers (Nb) of annotated probes and their corresponding unique genes identified as VIP for the three feed efficiency traits are indicated. Lists of the VIP (probes and their corresponding gene names when applicable) are provided in Additional files 1, 2 and 3
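The iterative procedure described above (fit, keep the important variables, refit on them) can be sketched as follows. This is not TreeNet: it is a plain gradient boosting of one-split regression trees written with the standard library, run on synthetic data, with a simple random 50/21 split rather than a bootstrap. All names (`fit_stump`, `boost`, the toy sizes, the "non-zero importance" rule for keeping a variable) are assumptions made for illustration.

```python
import math
import random

def fit_stump(X, r, features):
    """Best one-split regression stump on residuals r, using only `features`."""
    mean_r = sum(r) / len(r)
    base = sum((v - mean_r) ** 2 for v in r)
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for j in features:
        values = sorted(set(row[j] for row in X))
        for t in values[:-1]:  # split after each distinct value except the largest
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    sse, j, t, ml, mr = best
    return (j, t, ml, mr), base - sse  # stump and its reduction in squared error

def predict(model, row):
    init, lr, stumps = model
    return init + lr * sum(ml if row[j] <= t else mr for j, t, ml, mr in stumps)

def boost(X, y, features, n_rounds=15, lr=0.3):
    """Gradient boosting for squared loss; returns model and per-feature importance."""
    init = sum(y) / len(y)
    pred = [init] * len(y)
    stumps, importance = [], {j: 0.0 for j in features}
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        (j, t, ml, mr), gain = fit_stump(X, residuals, features)
        stumps.append((j, t, ml, mr))
        importance[j] += gain
        pred = [pi + lr * (ml if row[j] <= t else mr) for row, pi in zip(X, pred)]
    return (init, lr, stumps), importance

def rmse(y, p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))

def r2(y, p):
    mean = sum(y) / len(y)
    return 1 - sum((a - b) ** 2 for a, b in zip(y, p)) / sum((a - mean) ** 2 for a in y)

# Synthetic data (not the study's): 71 "pigs", 10 "probes", trait driven by two probes
rng = random.Random(1)
n_pigs, n_probes = 71, 10
X = [[rng.gauss(0, 1) for _ in range(n_probes)] for _ in range(n_pigs)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.3) for row in X]
idx = list(range(n_pigs))
rng.shuffle(idx)
train, valid = idx[:50], idx[50:]  # 50 for learning, 21 for validation
Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
Xva, yva = [X[i] for i in valid], [y[i] for i in valid]

features = list(range(n_probes))
history = []  # (number of inputs, validation R2, validation RMSE) per step
for step in range(3):
    model, importance = boost(Xtr, ytr, features)
    pred = [predict(model, row) for row in Xva]
    history.append((len(features), r2(yva, pred), rmse(yva, pred)))
    # Inputs with non-zero importance become the inputs of the next step
    features = [j for j in features if importance[j] > 0] or features
for n_inputs, r2_val, rmse_val in history:
    print(n_inputs, round(r2_val, 2), round(rmse_val, 2))
```

Each printed row plays the role of one line per trait in the table above: the number of inputs, then R2 and RMSE on the held-out animals.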
The TreeNet boosting procedure was applied to 20,405 annotated probes expressed in the longissimus muscle of 71 pigs to extract very important predictors (VIP) that can be used to predict residual feed intake (RFI) values. A total of 384 molecular probes were identified. Iterative steps reduced the set to 50 molecular probes corresponding to 30 unique encoded genes. These genes are listed in order of importance (score) in prediction. Expression levels of genes indicated in bold face were further measured by qPCR
Main overrepresented biological processes shared by genes selected as predictors of feed efficiency traits
| GO terms | Nb genes | E | P-value | Clustered genes |
|---|---|---|---|---|
| RFI (clustered pathways among 222 VIP) | | | | |
| GO:0051270~regulation of cell motion | 12 | 3.35 | < 0.001 | BCL2, F10, HBEGF, HDAC5, PDGFRB, SERPINE2, TGFBR3 |
| GO:0007167~enzyme linked receptor protein signaling pathway | 14 | 2.36 | < 0.001 | AMHR2, NRP1, |
| GO:0001558~regulation of cell growth | 8 | 1.86 | 0.008 | NRP1, CD44, |
| GO:0009725~response to hormone stimulus | 12 | 1.81 | 0.004 | HDAC5, PLA2G4A, AR, GRB10, CCND2, |
| GO:0019318~hexose metabolic process | 8 | 1.79 | 0.008 | PDK1, TPI1, |
| GO:0007517~muscle organ development | 8 | 1.79 | 0.01 | SRPK3, GATA6, TIPARP, PDGFRB, TGFBR3, HBEGF, ZFPM2, CBY1 |
| GO:0032844~regulation of homeostatic process | 5 | 1.38 | 0.05 | PLA2G4A, CD44, BCL2, RYR2, CA2 |
| GO:0060284~regulation of cell development | 10 | 1.33 | < 0.001 | HDAC5, NRP1, |
| GO:0048705~skeletal system morphogenesis | 7 | 1.31 | 0.004 | TULP3, |
| GO:0006468~protein amino acid phosphorylation | 15 | 1.30 | 0.03 | SRPK3, PDK1, AMHR2, TWF1, TRIO, ADRBK1, CDKL2, |
| GO:0010035~response to inorganic substance | 9 | 1.24 | 0.003 | ACTB, PLA2G4A, SLC1A3, BCL2, UROS, RYR2, MGP, ADRBK1, CA2 |
| GO:0006954~inflammatory response | 8 | 1.20 | 0.08 | HDAC5, CD44, |
| GO:0031667~response to nutrient levels | 8 | 1.10 | 0.08 | PLA2G4A, |
| GO:0006650~glycerophospholipid metabolic process | 5 | 1.08 | 0.08 | PLA2G4A, ABHD5, ADNP, LPCAT2, PIK3R1 |
| GO:0016053~organic acid biosynthetic process | 6 | 1.08 | 0.04 | TPI1, SLC1A3, SCD, ABHD5, UROS, |
| FCR (clustered pathways among 267 VIP) | | | | |
| GO:0007010~cytoskeleton organization | 13 | 2.4 | 0.001 | RND3, ACTC1, |
| GO:0007155~cell adhesion | 20 | 1.85 | 0.011 | TECTA, NRP1, OLR1, GMDS, LGALS4, CNKSR3, NLGN3, CLDN10, CLDN11, CD84, RND3, LAMA4, |
| GO:0060284~regulation of cell development | 8 | 1.78 | 0.038 | NRP1, LYN, ROBO1, |
| GO:0006006~glucose metabolic process | 7 | 1.66 | 0.03 | TPI1, PYGM, |
| GO:0060537~muscle tissue development | 8 | 1.66 | 0.003 | MYF6, ACTC1, GATA6, TIPARP, TTN, CHRNA1, HOMER1, PTEN |
| GO:0005977~glycogen metabolic process | 3 | 1.61 | 0.099 | PYGM, |
| GO:0000902~cell morphogenesis | 10 | 1.50 | 0.095 | |
| GO:0001568~blood vessel development | 9 | 1.42 | 0.034 | CCM2, LAMA4, NRP1, ROBO1, TIPARP, TGFA, DBH, FIGF, PTEN |
| GO:0045321~leukocyte activation | 8 | 1.35 | 0.077 | LYN, |
| GO:0001501~skeletal system development | 10 | 1.29 | 0.055 | GNAQ, |
| GO:0016052~carbohydrate catabolic process | 6 | 1.27 | 0.025 | GPD1L, OVGP1, TPI1, PYGM, PYGL, FUT1 |
| GO:0030163~protein catabolic process | 15 | 1.19 | 0.093 | FEM1C, SOCS3, WWP1, USP9X, SMURF1, UBE2J2, SPOPL, UBE2Q1, USP32, MYCBP2, RNF111 |
| FCRe (clustered pathways among 218 VIP) | | | | |
| GO:0051270~regulation of cell motion | 15 | 1.95 | 0.001 | RET, MSH2, MDGA1, ARID5B, NR4A2, KDR, DSTN, IGSF8, MACF1, |
| GO:0034613~cellular protein localization | 13 | 1.58 | 0.004 | COPA, CLTA, YWHAZ, LTBP2, AP1G1, |
| GO:0006163~purine nucleotide metabolic process | 7 | 1.54 | 0.02 | |
| GO:0001568~blood vessel development | 9 | 1.49 | 0.009 | EPAS1, BAX, CHM, ZFPM2, TNNI3, THBS1, MMP2, KDR, ACVR1 |
| GO:0006732~coenzyme metabolic process | 6 | 1.25 | 0.04 | DLD, ACLY, ALDH1L2, GCLM, MTHFD1L, MOCS1 |
| GO:0042592~homeostatic process | 17 | 1.22 | 0.02 | ENPP1, EPAS1, PTH1R, PRDX3, TNNI3, GCLM, MBP, KDR, RPS19, SLC4A11, RHCG, IL20RB, BAX, DLD, FABP4, IKBKB, CLN6 |
| GO:0003006~reproductive developmental process | 8 | 1.10 | 0.04 | HSPA2, MSH2, BAX, DLD, SF1, DHCR24, KDR, ACVR1 |
| GO:0048514~blood vessel morphogenesis | 7 | 1.08 | 0.04 | EPAS1, BAX, ZFPM2, TNNI3, THBS1, KDR, ACVR1 |
| GO:0043066~negative regulation of apoptosis | 11 | 1.04 | 0.01 | YWHAZ, MSH2, BAX, BTC, NR4A2, PRDX3, IKBKB, THBS1, GCLM, DHCR24, ACVR1 |
Very important genes (VIP) for prediction of feed efficiency traits (RFI: residual feed intake; FCR: feed conversion ratio; FCRe: net energy-based feed conversion ratio). Genes were clustered into functional groups using the DAVID tool. The enrichment score (E > 1) for each cluster and the P-value of the enrichment for the corresponding Gene Ontology (GO) terms are provided. Expression levels of genes indicated in bold font were further measured by qPCR
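For readers unfamiliar with GO over-representation P-values, the sketch below computes a plain hypergeometric upper-tail probability. This is an illustration only: DAVID applies a modified Fisher exact test (the EASE score), which this code does not reproduce, and the background size and number of annotated background genes used in the example are hypothetical. Only the 12 genes and 222 VIP come from the table above.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the GO term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# 12 of the 222 RFI VIP genes carry GO:0051270 (from the table above).
# Hypothetical background: 15,000 genes, 300 of them annotated with the term.
p_example = hypergeom_enrichment_p(k=12, n=222, K=300, N=15000)
print(f"hypergeometric P-value under the assumed background: {p_example:.2e}")
```

The result depends strongly on the assumed background, so it should not be compared with the P-values reported in the table.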
Average expression levels of target genes studied by qPCR
| Gene symbol | VIP for | Low RFI line | High RFI line | P-value |
|---|---|---|---|---|
| ACACB | FCR | 0.92 | 0.68 | |
| AKAP12 | RFI, FCRe | 0.74 | 0.93 | |
| ATP1B1 | FCR, FCRe | 0.78 | 1.00 | |
| BLCAP | RFI, FCR | 0.80 | 1.04 | |
| CD40 | RFI | 1.06 | 1.64 | |
| CSRNP3 | RFI, FCR | 1.33 | 2.49 + 0.17 | |
| EZR | RFI, FCR, FCRe | 0.61 | 0.99 | |
| FKBP5 | RFI, FCR, FCRe | 0.74 | 1.48 | |
| FRAS1 | RFI, FCR | 1.06 | 1.17 | 0.40 |
| FYN | FCRe, FCR | 1.00 | 1.27 | |
| HSD11B1 | FCR | 1.29 | 2.81 | |
| IGF2 | RFI, FCR | 0.99 | 0.89 | 0.93 |
| IL4R | RFI, FCR | 0.94 | 1.22 | |
| MUM1 | RFI | 0.97 | 1.22 | |
| PDZD2 | RFI, FCR | 0.61 | 0.73 | 0.24 |
| PHKB | RFI, FCR | 0.74 | 0.87 | |
| PSEN1 | RFI | 0.79 | 0.96 | |
| RPL6 | FCR, FCRe | 0.90 | 0.91 | 0.90 |
| SERINC3 | RFI, FCR | 0.61 | 0.87 | |
| SERPINA1 | FCR, FCRe | 1.32 | 0.89 | |
| SOCS6 | RFI, FCR | 0.48 | 0.58 | 0.17 |
| TFG | FCR | 1.04 | 1.18 | |
| TMED3 | FCR | 0.79 | 0.76 | 0.76 |
| UGDH | RFI, FCR | 0.87 | 1.14 | |
Abbreviations used: FCR Feed conversion ratio, FCRe Net energy feed conversion ratio, RFI Residual feed intake, VIP Very important variable in prediction. Muscle transcriptomes from pigs (n = 71) of two lines divergently selected for RFI and reared under different conditions were considered. The qPCR technology was used to assess expression levels of target genes that were identified by a gradient tree boosting procedure as very important for prediction (VIP) of RFI, FCR or FCRe individual values. ANOVA was then used to evaluate the differences in expression levels of those genes between the two RFI lines
Bold face highlights significant differences (P < 0.05) between lines
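The between-line comparison described above rests on a one-way ANOVA. A minimal sketch of the F statistic is given below, using made-up expression values for two groups; the function name and the numbers are illustrative assumptions. Turning F into a P-value requires the F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) provides.

```python
def one_way_anova_f(*groups):
    """F statistic of a one-way ANOVA for two or more groups of values."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical relative expression levels of one gene in the two lines
low_rfi = [0.70, 0.75, 0.72, 0.80, 0.73]
high_rfi = [1.40, 1.55, 1.45, 1.50, 1.52]
f_stat = one_way_anova_f(low_rfi, high_rfi)
print(f"F = {f_stat:.1f} with 1 and {len(low_rfi) + len(high_rfi) - 2} degrees of freedom")
```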
Fig. 3 Venn diagrams to identify commonalities between lists of VIP for feed efficiency traits. Predictive models were built from the microarray transcriptomics dataset to identify the most important annotated expressed probes in the longissimus muscle able to predict breeding values of RFI, feed conversion ratio (FCR) and net energy-based feed conversion ratio (FCRe) values. The lists of probes identified as VIP (very important variables in prediction) were then uploaded by their corresponding gene names into the VENNY tool. The Venn diagram was drawn to highlight commonalities between the lists of unique genes identified as VIP for the three traits
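The overlaps shown in a Venn diagram are simple set intersections. The sketch below applies them to the 24 qPCR target genes only, using the "VIP for" column of the qPCR table above; it does not reproduce the full VIP lists, which are in the additional files. Variable names are illustrative.

```python
from collections import defaultdict

# Gene -> traits for which it was a VIP, copied from the "VIP for" column above
vip_for = {
    "ACACB": "FCR", "AKAP12": "RFI,FCRe", "ATP1B1": "FCR,FCRe", "BLCAP": "RFI,FCR",
    "CD40": "RFI", "CSRNP3": "RFI,FCR", "EZR": "RFI,FCR,FCRe",
    "FKBP5": "RFI,FCR,FCRe", "FRAS1": "RFI,FCR", "FYN": "FCRe,FCR",
    "HSD11B1": "FCR", "IGF2": "RFI,FCR", "IL4R": "RFI,FCR", "MUM1": "RFI",
    "PDZD2": "RFI,FCR", "PHKB": "RFI,FCR", "PSEN1": "RFI", "RPL6": "FCR,FCRe",
    "SERINC3": "RFI,FCR", "SERPINA1": "FCR,FCRe", "SOCS6": "RFI,FCR",
    "TFG": "FCR", "TMED3": "FCR", "UGDH": "RFI,FCR",
}

# Invert the mapping: one set of genes per trait
genes_by_trait = defaultdict(set)
for gene, traits in vip_for.items():
    for trait in traits.split(","):
        genes_by_trait[trait].add(gene)

rfi, fcr, fcre = (genes_by_trait[t] for t in ("RFI", "FCR", "FCRe"))
all_three = rfi & fcr & fcre          # centre of the Venn diagram
rfi_fcr_only = (rfi & fcr) - fcre     # shared by RFI and FCR but not FCRe
fcr_fcre_only = (fcr & fcre) - rfi    # shared by FCR and FCRe but not RFI

print("all three traits:", sorted(all_three))
print("RFI and FCR only:", sorted(rfi_fcr_only))
print("FCR and FCRe only:", sorted(fcr_fcre_only))
```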
Lists of muscle genes identified as common predictors for feed efficiency traits
| Traits | Common VIP |
|---|---|
| RFI/FCR | ANKRD1; ANKRD42; ARF3; BCL2; |
| RFI/FCRe | |
| FCR/FCRe | ANKRD1; |
| RFI/FCR/FCRe | ANKRD1; DMTF1; |
Expressed muscle genes identified as very important (VIP) for prediction of residual feed intake (RFI, n = 384 VIP), feed conversion ratio (FCR, n = 421 VIP) or net energy-based feed conversion ratio (FCRe, n = 318 VIP) are indicated. Genes indicated in bold font were further considered for qPCR analysis
Fig. 4 Regression between observed and predicted FCR values. The predictive model was built from the microarray transcriptomics dataset by the TreeNet boosting procedure to identify the most important annotated expressed probes in the longissimus muscle able to predict feed conversion ratio (FCR) in n = 71 pigs of two lines divergently selected for residual feed intake (RFI). Observed FCR values were then plotted against predicted values. The red line represents pigs of the high RFI group; the black line represents pigs of the low RFI group. Prediction was poor for seven pigs (encircled) of the high RFI line
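A regression of predicted on observed values, and a simple way to flag poorly predicted animals, can be sketched as follows. The FCR values and the 0.2 kg/kg cut-off are hypothetical; the source does not state how the encircled pigs were identified.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical observed and predicted FCR values (kg/kg), for illustration only
observed = [2.30, 2.45, 2.60, 2.75, 2.90, 3.05, 3.20]
predicted = [2.40, 2.50, 2.62, 2.70, 2.85, 2.80, 2.75]

slope, intercept = fit_line(observed, predicted)
errors = [p - o for o, p in zip(observed, predicted)]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

# Flag animals whose absolute prediction error exceeds an arbitrary cut-off
cutoff = 0.2
poorly_predicted = [i for i, e in enumerate(errors) if abs(e) > cutoff]
print(f"slope={slope:.2f} intercept={intercept:.2f} RMSE={rmse:.2f}")
print("indices of poorly predicted animals:", poorly_predicted)
```

In this toy example the two animals with the highest observed FCR are under-predicted, which mirrors the kind of pattern the caption describes for some high RFI pigs.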
Top contributing genes to the linear prediction of feed efficiency
| RFI | | FCR | | FCRe | |
|---|---|---|---|---|---|
| 24 VIP¹ | | | | | |
| Subset² | | | | | |
| Gene | P-value | Gene | P-value | Gene | P-value |
| FKBP5 | < 0.001 | FKBP5 | < 0.001 | FKBP5 | < 0.001 |
| SERINC3 | 0.02 | MUM1 | 0.03 | MUM1 | 0.04 |
| IGF2 | 0.03 | AKAP12 | 0.03 | AKAP12 | 0.03 |
| CSRNP3 | 0.03 | FYN | 0.03 | PHKB | 0.08 |
| EZR | 0.09 | TMED3 | 0.08 | SOCS6 | 0.07 |
| RPL6 | 0.08 | PHKB | 0.08 | FYN | 0.08 |
| TFG | 0.02 | TFG | 0.02 | | |
| SOCS6 | 0.07 | TMED3 | 0.09 | | |
| IL4R | 0.10 | IL4R | 0.10 | | |
| FRAS1 | 0.12 | FRAS1 | 0.12 | | |

¹ A total of 24 target genes were used in a linear model for prediction of residual feed intake (RFI), feed conversion ratio (FCR) and energy-based feed conversion ratio (FCRe)

² Stepwise selection was also used to retain the most significant variables in regression models for feed efficiency traits. The P-value associated with the entry of each variable (mRNA level of the gene) in the best model is indicated. All variables with P < 0.15 were considered