A Nutrigenomics Approach Using RNA Sequencing Technology to Study Nutrient-Gene Interactions in Agricultural Animals.

M Shamimul Hasan¹, Jean M Feugang¹, Shengfa F Liao¹.

Abstract

Thorough understanding of animal gene expression driven by dietary nutrients can be regarded as a bottom line of advanced animal nutrition research. Nutrigenomics (including transcriptomics) studies the effects of dietary nutrients on cellular gene expression and, ultimately, phenotypic changes in living organisms. Transcriptomics can be applied to investigate animal tissue transcriptomes at a defined nutritional state, which can provide a holistic view of intracellular RNA expression. As a novel transcriptomics approach, RNA sequencing (RNA-Seq) technology can monitor all gene expressions simultaneously in response to dietary intervention. The principle and history of RNA-Seq are briefly reviewed, and its 3 principal steps are described in this article. Application of RNA-Seq in different areas of animal nutrition research is summarized. Lastly, the application of RNA-Seq in swine science and nutrition is also reviewed. In short, RNA-Seq holds significant potential to be employed for better understanding the nutrient-gene interactions in agricultural animals.

Keywords: RNA sequencing technology; agricultural animal; nutrient–gene interaction; nutrigenomics; transcriptomics

Year: 2019; PMID: 31414073; PMCID: PMC6686084; DOI: 10.1093/cdn/nzz082

Source DB: PubMed; Journal: Curr Dev Nutr; ISSN: 2475-2991

Introduction

The fast-growing human population worldwide demands more food of animal origin, which is especially true in the developing countries with rising living standards. The limited natural resources (e.g., the land area and clean water reserves), however, negatively affect the scale and productivities of animal production, leading to an urgent need for developing novel production strategies, such as molecular-based precision animal agriculture, to improve animal production efficiency (1, 2). As is known, animal life essentially is a set of gene expression processes. Although these processes are genetically preprogrammed, dietary nutrients are the “driving force” for the processes. An ultimate goal of animal nutrition study (a.k.a., animal nutriology, a branch of animal science) is to thoroughly understand how dietary nutrients affect or drive animal genetic programs during their life spans (3). Previous studies showed that a balanced uptake of nutrients is vital for maintenance of animal growth and health, whereas an imbalanced provision of nutrients can cause diseases and compromise animal health and production performance (4). Although it is known that dietary nutrients exert their functions through numerous nutrient-metabolic and cell-signaling pathways, our current knowledge is still not profound enough to unravel the immense complexities in the relations between dietary nutrients and animal genome expression. Molecular animal nutrition is a new branch of nutriology studying animal nutrition at the basic molecular biological level. In other words, it is to study animal biological processes at the gene expression level, or to study nutrient–gene interactions, with the aid of modern molecular biological science and technologies (5, 6). 
Study of the dynamic bidirectional nutrient–gene interactions can ultimately elucidate the basic molecular mechanisms by which dietary nutrients regulate animal gene expression, cellular biochemical responses, and, in turn, the physiological processes and phenotypic expression (7). To further improve animal production efficiency, advanced animal nutrition studies are indispensable. This article has been written to review the current research progress in the field of molecular animal and human nutrition, with an emphasis on the application of RNA sequencing (RNA-Seq) technology, a novel nutrigenomics approach, to study nutrient–gene interactions in agricultural animals. It is predicted that the novel RNA-Seq technology can offer plentiful opportunities for comprehensive investigations in the fields of molecular animal nutrition and associated animal systems biology.

The Nutrigenomics Approach to Study Nutrient–Gene Interactions

Animal cellular responses to the nutritional environment (i.e., the availability or scarcity of nutrients) are tightly linked through a series of biochemical and physiological events, which include nutrient digestion, absorption, intermediary metabolism, storage, and excretion, as well as information metabolism such as gene expression (8, 9). Nutrients and metabolites can exert direct or indirect actions to alter gene expression via up- or downregulating transcription processes (10–13). Figure 1 depicts an example of cellular nutrient–gene interaction mechanisms, which shows the indirect effect of the available nutrient glucose (Glc) on transcriptional expression of several genes in various pathways (e.g., gluconeogenic, glycolytic, and lipogenic).

FIGURE 1

An example of cellular nutrient–gene interaction mechanisms. This diagram shows the regulation of gene expression by the available nutrients, nutrient metabolites, or nonnutrient compounds within a cell. For example, nutrients, such as Glc, become available in the circulatory system through either the digestion of feed ingredients or the metabolic degradation of chemical components of tissues (e.g., liver, adipose, and muscle). Multiple Glc sensing mechanisms coexist: extraorganismal, extracellular, and intracellular (14). The extraorganismal Glc is sensed by oral taste receptors. With the extracellular mechanism, Glc is sensed by GLUT2 or GLUT4. With the intracellular mechanism, Glc is sensed by GCK. GCK further phosphorylates Glc to produce Glc-6-phosphate (Glc-6-P), which acts as a signaling molecule (metabolic messenger) to activate the downstream molecules and regulate the expression of genes (e.g., insulin, glycogen synthase, and glycogen phosphorylase) related to Glc metabolism (15, 16). Pathway [A] shows the role of Glc in insulin gene expression. Although multiple factors are involved in transcription of the insulin gene, Pdx-1 is a crucial one in pancreatic β-cells (17). Glc is transported to pancreatic β-cells by GLUT2 and initiates the signaling to induce Pdx-1 phosphorylation and translocation into the nucleus, where it binds to the insulin gene promoter, resulting in increased insulin transcriptional activation and increased insulin secretion (18, 19). Pathways [B] and [C] show the roles of Glc and insulin in the expression of glycolytic and lipogenic genes in hepatocytes. Insulin first binds to insulin binding receptor, initiating the signaling to recruit GLUT2 or GLUT4. The active GLUT2 or GLUT4 carry Glc into the cell. 
An elevated concentration of Glc in hepatic and adipose cells can indirectly upregulate the expression of genes encoding Glc transporters, glycolytic enzymes (e.g., L-PK), and lipogenic enzymes (e.g., FAS, ACC, and SCD1), while repressing the expression of genes related to the gluconeogenic pathway, such as PEPCK (20). ACC, acetyl-CoA carboxylase; FAS, fatty acid synthase; GCK, glucokinase; Glc, glucose; GLUT, Glc transporter; L-PK, L-type pyruvate kinase; Pdx-1, pancreatic duodenal homeobox factor-1; PEPCK, phosphoenolpyruvate carboxykinase; SCD1, stearoyl-CoA desaturase-1.

The expression of groups of related genes generally leads to the establishment of phenotypic characters, and full investigations of such genes require the use of large-scale studies such as genetic polymorphism, genome-wide association (GWA) (21–23), and single nucleotide polymorphism (SNP) analyses (24); all these applications are suitable for the characterization of global relations between nutrients, genetics, and phenotypes. Downstream of these techniques, the small-scale investigation known as real-time PCR has allowed the identification of specific gene targets (25) to serve as nutrient-related biomarkers. Although the PCR technique is ideal for specific quantitative analyses of known target genes (26), it remains laborious, time-consuming, and limited to a small number of genes. On the other hand, the use of modern genomics science and techniques to study the effects of dietary nutrients on the cellular gene expression, metabolic responses, and, ultimately, phenotypic changes of living organisms is referred to as nutrigenomics, a branch of molecular animal nutrition (9). In contrast, the study of the effects of genetic variations on animal or human responses to different dietary components is referred to as nutrigenetics (27).
For example, phenylketonuria patients carry a phenylalanine hydroxylase mutation that leads to a nonhydroxylation of phenylalanine to tyrosine, resulting in high concentrations of phenylalanine in blood and other tissues (28). In practice, application of nutrigenomics harnesses various “omics” techniques in multiple disciplines, including genomics, epigenomics, transcriptomics, proteomics, and metabolomics, in an independent or integrated manner, to analyze animal cellular and molecular responses to various dietary nutrients, revealing the global influence of nutrients on animal genomes, methylomes/epigenomes, transcriptomes, proteomes, and metabolomes, respectively (9, 29, 30). Without a doubt, the application of those high-throughput omics tools in the field of animal nutrition research can generate “big data” that can greatly enhance our understanding of nutrient regulation of nutrient-metabolic and cell-signaling pathways and their homeostatic control (31, 32).

The Transcriptomics Approach and RNA-Seq Technology

Among the omics techniques, transcriptomics is the most widely used for profiling animal gene expression at a defined nutritional state, because it can provide a whole picture of intracellular RNA transcript changes in response to dietary interventions (33). The foundation of transcriptomics is based on the central dogma theory (34), describing the sequential flow of genetic information expression: from DNA to DNA (called replication), from DNA to RNA (called transcription), and from RNA to protein (called translation). In this sequential expression flow, the transcription process generates a complete set of RNA (35) that consists of ribosomal RNA (rRNA; ∼80%), tRNA (∼15%), mRNA (∼4–5%), and other noncoding RNA (ncRNA; <1%). Importantly, only mRNA carries the identities of individual genes and acts as bridges to convey the genotypic message to the phenotypic expression of an organism. The transcription process occurs with the activation of transcription factors by external or internal molecules which include nutrients, metabolites, hormones, chemical drugs, etc. An activated transcription factor binds to a specific region within a DNA promoter of a target gene to initiate or inhibit the transcription of a single-strand mRNA (31). For this special reason, the transcriptomic patterns of diverse cells or tissues within an animal body vary with environmental conditions (including nutritional interventions) that affect the expression of genetic messages or the sequential flow of gene expression (36). Fluctuation in the intracellular RNA pool is the fundamental scientific basis of transcriptomic profiling of an animal whole transcriptome, and this profiling allows scientists to understand the steady-state level of gene expression under specific physiological conditions (37). 
The comprehensive transcriptomic analysis integrates all types of RNA (coding- and ncRNA) to determine the transcriptional structure of genes, in terms of mRNA splicing and other posttranscriptional modifications during growth and development or under different physiological or nutritional conditions (38). In short, profiling mRNA transcripts can give nutrition scientists a clear insight into the holistic gene expression status in response to external dietary components. Presently, various platforms of transcriptomics analysis can be used to acquire valuable nutrigenomics information. DNA microarray technology is a hybridization-based high-throughput method for transcriptomics analysis, which is widely used for profiling animal transcriptomes in certain physiological or pathological conditions (39, 40). The basic principle of microarray analysis is hybridization of cDNA samples with spotted specific oligonucleotide probes, such as short-oligonucleotide, long-oligonucleotide, or cDNA (41, 42). DNA microarray technology can be used for analysis of gene expression and genotyping for point mutations, SNPs, and short tandem repeats. Although it can generate high-resolution data on a large scale within a short period of time, the microarray method has some technical limitations, including its dependence on the existing knowledge of the genomic sequences. In addition, the high abundance of certain transcripts may create high background noise or signal saturation due to nonspecific hybridization. Furthermore, microarray analysis does not allow the detection of mRNA transcripts from repeated sequences that present a dynamic range. Therefore, microarray analysis cannot detect the very subtle changes at the gene expression level (38, 43, 44). RNA-Seq, on the other hand, has demonstrated extraordinary analytical potential relative to microarray technology for gene expression investigations. 
RNA-Seq has been used in the context of nutrient–gene interactions, with an incomparable power allowing for simultaneous identification of numerous gene expressions in response to specific nutrients, diets, or physiological conditions, such as energy restriction, vitamin and mineral deficiencies, and diseases (33, 45, 46). This massive data generation approach has a great advantage to speed up our acquisition of knowledge, which will assist animal nutritionists to harness the molecular mechanisms of nutrition for improving animal production efficiency.

Principle, History, and Procedure of RNA-Seq Technology

Principle and history of DNA sequencing technologies

The revolutionary development of DNA sequencing technologies from Sanger's capillary-based sequencing (a.k.a. first-generation sequencing) to high-throughput next-generation sequencing (NGS) is currently the hottest topic in the field of genomics and has a great impact on various other fields of science (30, 47, 48). In this history, Sanger's enzymatic method (49, 50) and Gilbert's chemical degradation method (51) are the 2 landmarks of innovation. Sanger's method has been most widely used as a dominant gold standard for DNA sequencing in the past 30–40 y (52) because of its lesser complexity when compared with Gilbert's method. The basic principle of Sanger's method is sequencing-after-synthesis, which is based on separation of different sizes of DNA fragments generated by chain-termination with dideoxynucleotide analogs (49), whereas Gilbert's method needs to terminally label DNA fragments and to cleave them at specific bases before separation by gel electrophoresis (51). The Roche/454 pyrosequencing, the Illumina sequencing-by-synthesis, and the ABI SOLiD sequencing-by-ligation were considered the 3 leading second-generation NGS platforms, which, based on emulsion PCR amplification, rely on parallel, cyclic interrogation of sequences from spatially separated clonal amplicons (30). The 454 system was the first NGS platform based on the sequencing-by-synthesis technique (47, 53). It differs from Sanger's method because it depends on the real-time detection of pyrophosphate release upon nucleotide incorporation rather than chain termination with dideoxynucleotides (30). In the SOLiD sequencing-by-ligation system, both forward and reverse PCR primers are tethered to a solid substrate by a flexible linker (termed bridge PCR), such that all amplicons arising from any single template molecule during the amplification (driven by a DNA ligase) remain immobilized and clustered to a single physical location on an array (47, 53). 
On the Illumina platform, the bridge PCR, nevertheless, is somewhat unconventional in relying on alternating cycles of extension with Bst polymerase and denaturation with formamide (47). The concept of sequencing-by-synthesis from a single DNA molecule (i.e., without a prior amplification step) is currently pursued by a number of biotechnology companies, and this approach is now called the third-generation NGS technology (30). The third-generation NGS platforms include the Heliscope sequencer, SMRT (single molecule real time) sequencer, RNAP (RNA polymerase) sequencer, Nanopore sequencer, VisiGen sequencer, multiplex polony technology, and Ion Torrent technology (30). Unlike the second-generation NGS technology, the third-generation NGS technology interrogates single DNA molecules in such a way that no synchronization (a limitation of second-generation NGS technology) is required, thereby overcoming the issues related to the biases introduced by PCR amplification and dephasing (30). For the detailed principles and application advantages of those aforementioned third-generation NGS platforms, readers are encouraged to read some excellent review articles authored by Pareek et al. (30), Voelkerding et al. (52), Ansorge (53), and van Dijk et al. (54). NGS technology has numerous applications in human and animal genomics research, particularly de novo genome sequencing; whole-genome resequencing or more targeted sequencing; genomic variation and mutation detection; genome-wide profiling of epigenetic marks and chromatin structure using methyl-seq, DNase-seq, and ChIP-seq (chromatin immunoprecipitation coupled to sequencing); and personal genomics (30, 55). Besides these applications, NGS technology is also finding an application in profiling and cataloguing the complete transcriptomes of cells, tissues, or organisms using the RNA-Seq approach (44, 46).
Over time, the application of RNA-Seq has become much more convenient and less expensive, and it is expected to lead to unbiased investigation of complex transcriptomes (38, 56).

The procedure of RNA-Seq technology

As shown in Figure 2, the procedure or workflow of RNA-Seq can be described in 3 principal steps, namely, laboratory analysis of tissues, bioinformatics analysis of sequence data, and biological interpretation of bioinformatics data (57).

FIGURE 2

Schematic presentation of the RNA-Seq workflow. This diagram shows the 3 principal steps of RNA-Seq procedure for mRNA profiling, which are laboratory analysis of animal tissue samples, bioinformatics analysis of the sequence data, and biological interpretation of the bioinformatics-analyzed gene expression data. Refer to the main text of this article for details. Poly-A, polyadenylated; RNA-Seq, RNA sequencing; SNP, single nucleotide polymorphism.

Laboratory analysis of animal tissues

Firstly, the fresh or frozen-thawed animal tissue samples should be processed for total RNA extraction, which will generate a heterogeneous RNA population that includes rRNA, tRNA, mRNA, and ncRNA. This RNA population is used for cDNA library preparation. Technically, only the high-quality RNA samples with an RNA integrity number >7 (out of 10) are used for further analyses. The mRNA fraction is either harvested directly by targeting polyadenylated (poly-A) RNA with polythymidine oligos that are covalently attached to a given substrate (e.g., magnetic beads), or obtained indirectly through selective rRNA depletion with exonucleases capable of specific degradation (e.g., using the mRNA ONLY kit, Epicentre). The selective ribo-depletion method has the advantage of retaining all other types of RNA, including mRNA, tRNA, and small ncRNA such as microRNA (miRNA) and short-interfering RNA (siRNA), while allowing for the discovery of new RNA transcripts that are not yet known (56). Based on the available literature, the basic features and specifications of some current sequencing platforms are summarized in Table 1 (52, 53, 56, 58–61). Because the high-throughput sequencing methods usually generate short reads of a specific length, the long mRNA transcripts are usually fragmented to generate the length required by the specific sequencing platform. The fragmented mRNA transcripts are then reverse-transcribed to construct a cDNA library. The cDNA library will be sequenced to generate raw RNA-Seq data containing millions of short reads by using one of the sequencing platforms, such as those listed in Table 1.

TABLE 1

Currently available sequencing platforms used for RNA-Seq technology

RNA-Seq platforms (supplier company)	Short read length	Sequencing chemistry	Sequencing principle	Library type	Year²
454 GS FLX (Roche)	700 bp	Pyrosequencing, chemiluminescence	Incorporation of normal nucleotides	SE, PE, Mx	2005 (52)
Illumina Genome Analyzer (Illumina)	50–300 bp	Polymerase-based sequence-by-synthesis	Incorporation of fluorescent nucleotides	SE, PE, MP, Mx	2006 (53, 60)
ABI/SOLiD System (Thermo Fisher Scientific)	50 bp	Sequencing by ligation	Fluorescent short linkers	SE, PE, Mx	2007 (53, 56)
Ion Torrent (Thermo Fisher Scientific)	400 bp	Ion semiconductor	Measuring pH change	SE, PE, Mx	2010 (58, 60)
PacBio RS (Pacific Biosciences)	5000 bp	Single molecule real-time	Incorporation of fluorescent nucleotides	SE	2010 (59, 61)

bp, base pair; MP, mate pair read library; Mx, multiplexed sample; PE, paired end read; RNA-Seq, RNA sequencing; SE, single end read.

²The year in which the platform was first introduced and the literature (in parentheses) that referred to it.

Bioinformatics analysis of sequence data

At the end of sequencing and image processing, the raw sequence data may be of poor quality or contain errors introduced in previous procedure steps, including library preparation, sequencing reactions, PCR artifacts, untrimmed adapter sequences, sequence-specific bias, and other contaminants, which can affect the downstream data analysis and biological interpretation. Therefore, at the first step of data analysis, several bioinformatics tools, such as HTQC (62), FastQC (63), or NGS QC (64), are usually used to check the raw data quality. Among these bioinformatics tools, FastQC is one of the most widely used quality control software packages and provides a modular set of analyses, such as sequence quality scores, sequence guanine and cytosine (GC) content, sequence length distribution, overrepresented sequences, and adapter content. The data of low quality should be processed with trimming tools, such as Cutadapt (65) and Trimmomatic (66), to remove the reads with low-quality bases, adapter sequences, or other contaminating sequences. The high-quality short reads can then be mapped against the available reference genome to discover their true locations using a special algorithm-based bioinformatics software program, such as Bowtie (67), TopHat2 (68, 69), or MapSplice (70). The algorithms of these tools are based on the Burrows-Wheeler Transformation (71), the Smith–Waterman algorithm (72), or a combination of both. These algorithms allow the software to find the optimal alignment match within an acceptable computational time while tolerating a few (e.g., 2) mismatches for each short read. If no reference genome is available, the Assembly by Short Sequences (73) or Velvet (74) tools can be used for de novo assembly of the transcriptome (75). Some other tools, such as Short Oligonucleotide Analysis Package, Novoalign, and SHRiMP, are also available for read mapping (57, 76).
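The trimming step described above can be illustrated with a minimal Python sketch of 3′-end quality trimming. It assumes Phred+33-encoded quality strings (the common FASTQ convention) and a hypothetical threshold of Q20; the function name and parameters are illustrative only and do not reproduce the implementation of Cutadapt or Trimmomatic.

```python
from typing import Tuple


def trim_3prime(seq: str, qual: str, min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Remove trailing bases whose Phred quality is below min_q.

    seq    : read sequence
    qual   : quality string of the same length (assumed Phred+33 encoded)
    min_q  : hypothetical quality threshold chosen for illustration
    offset : ASCII offset of the quality encoding (33 for Phred+33)
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")
    end = len(seq)
    # Walk backwards from the 3' end while the base quality is too low.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Toy read: 'I' encodes Q40, '#' encodes Q2, '!' encodes Q0.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIIII#!")
```

In this toy read, the last two bases fall below the threshold and are removed, whereas the six high-quality bases are kept. Real trimming tools additionally handle adapter removal, sliding-window quality checks, and minimum read length filters.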
The expression levels of genes or transcripts can be inferred from the total count of reads by using software tools such as Cufflinks, featureCounts, and HTSeq-count (77), which facilitate the quantification of RNA species, including protein-coding RNA, long ncRNA, and small RNA such as miRNA and siRNA (78). After the actual counts of short reads have been calculated, the counts need to be normalized to minimize the influence of sequencing depth or library sizes (the total number of mapped reads), gene length dependence, and count distribution biases and differences. Several methods, including RPKM (Reads Per Kilobase of transcript per Million reads mapped), FPKM (Fragments Per Kilobase of exon model per Million reads mapped), or TPM (Transcripts Per Million), can be used for data normalization and reporting expression values. After quantification and normalization, identified genes or transcripts are subjected to differential expression analysis. Several statistical packages (e.g., DESeq, Cufflinks, edgeR, and baySeq) are available for identification of the differentially expressed genes (DEGs) (59).
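As a worked illustration of the normalization methods named above, the following minimal Python sketch computes RPKM and TPM from raw read counts. The gene names, counts, and lengths are hypothetical toy values, and the function names are illustrative rather than taken from any of the cited packages. FPKM follows the same formula as RPKM, with fragments (read pairs) counted instead of single reads.

```python
from typing import Dict


def rpkm(counts: Dict[str, int], lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Reads Per Kilobase of transcript per Million reads mapped.

    RPKM = count * 1e9 / (length in bp * total mapped reads)
    """
    total_reads = sum(counts.values())  # library size
    return {g: counts[g] * 1e9 / (lengths_bp[g] * total_reads) for g in counts}


def tpm(counts: Dict[str, int], lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Transcripts Per Million.

    Counts are first divided by transcript length, then the length-normalized
    rates are rescaled so that they sum to one million within the sample.
    """
    rates = {g: counts[g] / lengths_bp[g] for g in counts}
    rate_sum = sum(rates.values())
    return {g: rates[g] / rate_sum * 1e6 for g in counts}


# Hypothetical toy data: three genes in one sample.
toy_counts = {"geneA": 100, "geneB": 300, "geneC": 600}
toy_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

toy_rpkm = rpkm(toy_counts, toy_lengths)
toy_tpm = tpm(toy_counts, toy_lengths)
```

In this toy sample, geneA and geneB receive the same normalized value although geneB has three times as many reads, because geneB is also three times as long. TPM values always sum to one million within a sample, which makes relative abundances easier to compare across samples than RPKM or FPKM values.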

Biological interpretation of bioinformatics data

Obtaining a list of DEGs is the initial step for investigation of the biological insight of an experimental system, a developmental stage of an organism, or a particular molecular mechanism (79). Gene ontology (GO) analysis can be performed for each DEG against the GO database to find out the biological processes associated with the given DEG (80). Similarly, to understand more details about the biological context of a DEG, pathway and function enrichment analyses and network prediction can be conducted using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (81), DAVID (82), or other commercial knowledge systems such as Ingenuity Pathway Analysis (39, 40). Results obtained from these pathway enrichment analyses can help to identify genetic biomarkers, to annotate novel transcripts, or to provide interpretable information about the DEG associated with complex molecular interactions and biological functions.
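Enrichment analyses of this kind are commonly framed as over-representation tests. As an illustrative assumption (this article does not specify the statistic used by any particular tool), the Python sketch below applies a one-sided hypergeometric test to ask whether a GO term or pathway is over-represented among the DEGs. All numbers are hypothetical toy values, and the function name is illustrative.

```python
from math import comb


def enrichment_p_value(n_background: int, n_term: int, n_deg: int, n_overlap: int) -> float:
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background : number of genes tested in the experiment
    n_term       : background genes annotated to the term or pathway
    n_deg        : number of differentially expressed genes (DEGs)
    n_overlap    : DEGs that are annotated to the term or pathway
    """
    upper = min(n_term, n_deg)
    total_ways = comb(n_background, n_deg)
    # Sum the probabilities of observing n_overlap or more annotated DEGs.
    favorable = sum(
        comb(n_term, k) * comb(n_background - n_term, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return favorable / total_ways


# Hypothetical toy example: 20 background genes, 5 annotated to a term,
# 4 DEGs, of which 2 carry the annotation.
toy_p = enrichment_p_value(n_background=20, n_term=5, n_deg=4, n_overlap=2)
```

A small p-value suggests that the term is over-represented among the DEGs relative to the background. In practice, many terms are tested at once, so the resulting p-values would normally be adjusted for multiple testing before interpretation.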

Advantages and limitations of RNA-Seq technology

In the current postgenome era, scientists have numerous opportunities to investigate the complex interrelations between nutrients and genes in agricultural animals. Research objectives, experimental design, and organism of interest are 3 key determinants in the selection of technology for quantitative analysis of gene expression. For a transcriptomic analysis of differential gene expression with a known reference genome of a given animal, DNA microarray is a robust and inexpensive technique as compared with other methods (40, 83). In contrast to the microarray approach, the sequencing-based GWA approaches, such as the RNA-Seq approach, can directly determine all the cDNA sequences. In addition, the RNA-Seq approach can generate high-quality short-read sequences compared with other sequencing methods. Most importantly, a reference genome is helpful but not a prerequisite. A genome without reference sequence information can be sequenced using de novo assembly of the short reads generated. This methodology removes the possibility of experimental bias or cross-hybridization, and thus largely decreases the background noise. A sample from a single cell, or even in nanograms, is sufficient for the laboratory process (38, 47, 84). Nevertheless, RNA-Seq also has some limitations, mainly in the aspects of library construction and bioinformatics analysis. During library construction and laboratory sequencing, errors and contamination can arise from cDNA fragmentation, sequencing reactions, PCR artifacts, untrimmed adapter sequences, sequence-specific biases, and other contaminants, all of which may negatively influence data quality.
In the aspect of bioinformatics analysis, RNA-Seq also faces several challenges that include 1) development of a simple and efficient method for data analysis, 2) inability to directly analyze the short transcript reads containing exon junctions or poly-A ends, and 3) difficulty of mapping the reads that span splice junctions in a complex transcriptome owing to the presence of extensive alternative splicing and trans-splicing (38). Altogether, at the present time, RNA-Seq is still a preferred method for analysis of transcriptomes to disentangle the complex relations between nutrients and genomes in life sciences.

Application of RNA-Seq Technology in Animal Nutrition Studies

Application in general animal science studies

High-throughput RNA-Seq has become an approach of choice for enabling inexpensive and routine comprehensive analysis of animal transcriptomes or genomes (47, 79). This technology is primarily applied for quantitative determination of the expression patterns of transcripts or genes (known and unknown), the mRNA splice variants, and the analysis of SNPs which can be used as potential biomarkers for particular production traits (46, 57, 85). Studies have been conducted using RNA-Seq in agricultural animals to explore global gene expression patterns in tissues that are related to economically important traits such as feed efficiency (86–90). Indeed, recent progress in genome sequencing of agricultural animals including swine (91), cattle (92), and sheep (93) has provided important reference genomes that can be used to align RNA-Seq short reads to determine the changes in gene expression in response to dietary nutrients. Therefore, animal nutritionists may use these genomics data to establish a framework for developing novel feeds or feed additives, which will be genotype-specific to promote animal growth, health, and production. The objective of the following sections is to specifically review the relevant examples of RNA-Seq application in studying nutrient–gene interactions in agricultural animals.

Application in maternal nutrition studies

Maternal methionine plays a vital role in the regulation of in utero fetal development through epigenetic modifications in the fetal genome such as DNA methylation, which is dependent on the availability of methyl donor nutrients (e.g., methionine). Peñagaricano et al. (94) employed RNA-Seq to evaluate the effect of maternal methionine supplementation (2.43% compared with 1.89% of the dietary metabolizable protein in the experimental and the control diets, respectively) on the transcriptome of the preimplantation embryos of Holstein cows. Their results indicated that 276 out of 10,662 genes were differentially expressed between the treatments; 200 genes showed higher expression in the control treatment, while 76 genes showed higher expression in the methionine-rich treatment. Some of these DEGs [e.g., vimentin (VIM), interferon, α-inducible protein 6 (IFI6), BCL2-related protein A1 (BCL2A1), and T-box 15 (TBX15)] were associated with the regulation of embryonic development and the others [e.g., natural killer cell group 7 (NKG7) and TYRO protein tyrosine kinase-binding protein (TYROBP)] were associated with animal immune response. More precisely, the authors demonstrated the possible effects of maternal methionine supplementation from GO analysis of the biological process of embryonic tube development, where 8 of the 11 significant genes were decreased in the methionine-rich treatment. Although morphological evaluation showed similar ratings of embryos in both treatments for developmental stage, significant transcriptomic differences were detected. Increased levels of methionine (i.e., the methyl donor) in the one-carbon pathway would increase the DNA methylation levels of many fetal genes, which in turn could suppress gene expression. 
Another study assessed the impacts of different isoenergetic maternal diets, including alfalfa haylage (HY; for fiber), corn (CN; for starch), and dried corn distillers grains (DG; for fiber, protein, and fat), fed to sheep, on the transcriptomes of fetal muscle and subcutaneous and perirenal adipose tissues (95). In longissimus dorsi, a total of 224, 823, and 29 genes showed differential expression between CN and DG, CN and HY, and DG and HY, respectively. Specifically, the maternal CN-fed group showed decreased gene expression (168 out of 224 genes, and 600 out of 823) compared with the DG and HY groups, respectively. Interestingly, 166 genes differed simultaneously between the CN and the other 2 (HY and DG) groups. Many of these genes are directly involved in embryonic and fetal development [e.g., ankyrin repeat domain 11 (ANKRD11), axin 1 (AXIN1), epsin 1 (EPN1), and epsin 2 (EPN2)], skeletal muscle cell and tissue differentiation [e.g., ankyrin repeat domain 1 (ANKRD1), B cell CLL/lymphoma 9-like (BCL9L), histone cell cycle regulator (HIRA), and myogenic differentiation 1 (MYOD1)], and muscle myosin complex and sarcomere organization [e.g., myosin heavy chain 13 (MYH13)]. In the subcutaneous adipose tissue, the DEGs are associated with the embryonic and fetal development [e.g., angiomotin (AMOT) and integrin, β 6 (ITGB6)], the adipose tissue development [e.g., acetoacetyl-CoA synthetase (AACS)], and the fatty acid biosynthetic process [e.g., MLX interacting protein-like (MLXIPL) and protein kinase, AMP-activated, α 2 catalytic subunit (PRKAA2)]. Therefore, the authors concluded that alteration of maternal nutrition during the mid-to-late gestation stages may change the fetal programming of fetal muscle and adipose tissues.

Application in feeding strategy studies

Feed restriction and re-alimentation is a common feeding strategy used in animal agriculture to reduce production cost (96). The difference in nutrient supply during the 2 feeding regimes has profound effects on genome expression that could lead animals to develop an accelerated growth phenomenon known as compensatory gain (CG). The candidate genes or signatures related to CG can be identified from gene and transcript profiling. Studies of the effects of feed restriction and re-alimentation on skeletal muscle (97) and hepatic tissue (98) have been conducted in postweaned beef cattle. Evaluation of the DEGs in skeletal muscle and hepatic tissues showed that several genes were differentially expressed in both tissues and changed in the same direction in the 2 studies. The genes dehydrogenase/reductase 3 (DHRS3), collagen type I α1 chain (COL1A1), solute carrier family 27 member 6 (SLC27A6), osteonectin, transcription elongation factor A3 (TCEA3), and VIM are of particular interest, and hold potential for further investigation with regard to their use as biomarkers for CG selection.
Diets, such as grass and grain rations, have a significant role in determining the fatty acid profile, antioxidant content, lipid deposition, and metabolism of ruminal proteins, carbohydrates, and lipids. A study was conducted on 2 grass-fed compared with 2 grain-fed Angus beef cattle using RNA-Seq to explore and compare the transcriptomic profiles of ruminal walls (99). A total of 342 DEGs were found between the 2 feeding regimens. Among the top 10 DEGs, desmoglein-1 (DSG1) is related to embryonic, organ, and organismal development, whereas R-spondin 3 (RSPO3) is related to abnormal morphology and organismal death. In the absence of DSG1, the phosphorylation of the RNA polymerase II carboxy-terminal domain may be altered, which would affect the recruitment of RNA processing machinery as well as protein synthesis. The RSPO3 gene product is a novel protein in the Wnt signaling pathway, one of the key pathways controlling cell differentiation, cell proliferation, and morphogenesis. The distinct feeding regimens (grass-fed and grain-fed) had differential effects on the cattle transcriptome and affected cattle growth rate and carcass characteristics.
Baldwin et al. (100) investigated the effects of dietary propionate supplementation on the longissimus lumborum (LL) transcriptome in Black Angus beef steers using RNA-Seq. A total of 110 genes (74 upregulated and 36 downregulated) were differentially expressed in response to propionate in LL tissue. The network analysis revealed that the top 4 gene networks were associated with lipid metabolism, small-molecule biochemistry, carbohydrate metabolism, and molecular transport.
Using microarray technology with quantitative PCR verification, Piantoni et al. (101) reported some changes in the transcript profile of the mammary gland in preweaned Holstein heifers in response to dietary nutrient amount, and concluded that these changes might underlie the observed differences in tissue mass and development. The results on the bovine milk transcript profile obtained by Wickramasinghe et al. (33) using RNA-Seq revealed some key information for understanding lactation biology in terms of gene expression. To analyze gene expression in response to diets rich in unsaturated fatty acids, Ibeagha-Awemu et al. (102) used RNA-Seq to profile the transcriptome of the bovine mammary gland. Their findings suggested that a diet rich in α-linolenic acid has a significant impact on the transcriptome of the mammary gland, altering the expression of genes associated with lipid metabolism, which helps to decrease the content of SFAs and increase the content of PUFAs in the milk. Recently, Vailati-Riboni et al. (103) identified DEGs in mammary parenchyma (PAR) and mammary fat pad (MFP) tissues of Holstein heifer calves, which were fed either a restricted milk replacer or an enhanced milk replacer. A total of 1561 genes (895 upregulated, 666 downregulated) and 970 genes (506 upregulated, 464 downregulated) were differentially expressed in PAR and MFP tissues, respectively. Direct impact analysis of the DEGs revealed possible molecular mechanisms by which an enhanced diet, or enhanced plane of nutrition, affects early mammary gland development before first lactation (103).
Similarly in dairy cattle, Baldwin et al. (104) demonstrated the effects of butyrate on the rumen epithelium transcriptome during a dry period. Butyrate-induced genes expressed in the rumen epithelium were involved with the mitotic cell cycle process, cell cycle process, and regulation of the cell cycle. For example, the genes histone H3 and histone H4 were centered in one of the gene networks, which suggests that butyrate is not only a nutrient but also a regulator of histone modification and gene expression. Dai et al. (105) studied the effects of rice straw (low-quality forage) on milk protein production. Functional analysis of the DEGs suggested that the enhanced capacity for energy and fatty acid metabolism, increased protein degradation, decreased protein synthesis, decreased amino acid metabolism, and depressed cell growth were all related to rice straw consumption. In addition, Dai et al. (106) also found multiple DEGs related to reduced energy metabolism, attenuated protein synthesis, enhanced protein degradation, and lower mammary cell growth in the mammary glands of lactating cows fed corn stover. With more studies building on the aforementioned nutrigenomics data, it should be possible to design improved nutritional strategies to yield high-quality cow milk.
Besides beef and dairy cattle, RNA-Seq has also been used in studying nutrient–gene interactions in other ruminant animals such as goats and sheep. For example, Qu et al. (107) used RNA-Seq to investigate the effect of dietary vitamin E supplementation in sheep on spermatogenesis and the associated regulatory mechanisms. This investigation detected a number of DEGs, such as N-myc downstream regulated 1 (NDRG1), fascin actin-bundling protein 3 (FSCN3), and cytochrome P450, family 26, subfamily B, polypeptide 1 (CYP26B1). These genes are known to have important roles in the regulation of spermatogenesis (107).

Application in gut microbiota studies

Gut microbiota, diet, and gut health are interconnected through a complex interplay (108). It is well known that macronutrients (proteins, carbohydrates, and lipids) and other environmental factors play major roles in altering the living environment of enteric cells (109, 110). It is also known that the fermentation of dietary components by the gut microbiota contributes a great deal to generating nutrients and energy for animals, especially ruminants (111). Despite a body of knowledge about the effects of nutrients on microbial gene expression in the gut, many questions pertaining to early microbial colonization in the mammalian intestine remain unanswered. To address this topic, RNA-Seq was applied to evaluate the effects of sow milk (mother fed) compared with artificial formula (formula fed) on the community-wide gut microbiota in 21-d-old neonatal piglets (112). The gut microbial cDNA profiling showed that the microbial communities were similar at the phylum level but dissimilar at the genus level. Prevotella was the dominant genus within the mother-fed group, whereas Bacteroides was the most abundant genus within the formula-fed group. Pitta et al. (113) used pyrosequencing to identify changes in ruminal bacterial populations in response to induction of and recovery from diet-induced milk fat depression. Their analysis revealed that the induction diet reduced the relative sequence abundance of Bacteroidetes and increased the relative sequence abundance of Firmicutes and Actinobacteria. On the other hand, the recovery diet resulted in a sharp increase in the Bacteroidetes lineages and a decrease in Firmicutes members. The authors concluded that alterations in milk fatty acid profiles at induction are preceded by microbial alterations in the rumen driven by dietary changes.
The results obtained from dietary nutrition and gut microbiota interactions will provide a reference frame for similar experiments with different diets for different agricultural animals.
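The observation that piglet gut communities looked similar at the phylum level yet differed at the genus level follows naturally from how taxa are aggregated, because Prevotella and Bacteroides both belong to the phylum Bacteroidetes. The sketch below illustrates this with relative abundances. All read counts are hypothetical and chosen only for illustration, Lactobacillus is added here simply as a Firmicutes example, and the helper names are assumptions rather than anything used in the cited studies.

```python
from collections import defaultdict

# Genus-to-phylum lookup. Prevotella and Bacteroides are both Bacteroidetes;
# Lactobacillus (Firmicutes) is included only to give a second phylum.
PHYLUM_OF = {
    "Prevotella": "Bacteroidetes",
    "Bacteroides": "Bacteroidetes",
    "Lactobacillus": "Firmicutes",
}

# Hypothetical read counts per genus (NOT data from the cited piglet study).
mother_fed = {"Prevotella": 600, "Bacteroides": 100, "Lactobacillus": 300}
formula_fed = {"Prevotella": 100, "Bacteroides": 600, "Lactobacillus": 300}

def relative_abundance(counts):
    """Convert raw counts to fractions of the sample total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

def collapse_to_phylum(counts):
    """Sum genus-level counts into their parent phyla."""
    phylum_counts = defaultdict(int)
    for genus, n in counts.items():
        phylum_counts[PHYLUM_OF[genus]] += n
    return dict(phylum_counts)

for label, sample in (("mother fed", mother_fed), ("formula fed", formula_fed)):
    genus_level = relative_abundance(sample)
    phylum_level = relative_abundance(collapse_to_phylum(sample))
    dominant = max(genus_level, key=genus_level.get)
    print(label, "| dominant genus:", dominant, "| phyla:", phylum_level)
```

With these made-up counts, both samples have identical phylum-level profiles, while the dominant genus switches from Prevotella to Bacteroides.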

Application in swine science and nutrition

In recent years, transcriptomic technologies have been successfully applied to study quantitative genetics and nutrient–gene interactions, and to explore more innovative genomic information in agricultural animals. However, only a limited number of studies have employed RNA-Seq to study nutrient–gene interactions in pigs. The absence of a complete pig genome sequence was the major limitation for quantitative genetics and nutrigenomics research (91). The first version of the annotated pig genome (Sscrofa 9) was released with Ensembl 56 in September 2009 (91, 114), and the updated version (Sscrofa 11.1), a high-quality draft, has been available at the Ensembl database since July 2017. The updated sequence of the porcine genome has opened a new avenue for swine genetics and nutrition research. Genetic potential and nutrient supply are 2 essential factors that determine pork production profitability and sustainability. Modern swine production combines both genetics (e.g., genetic selection and quantitative trait loci mapping) and nutritional management (e.g., optimal nutrient utilization by the animal) to improve economically important traits such as growth rate, feed intake, leanness, meat quality, litter size, and disease resistance. Nutritional factors, such as methyl donor nutrients (methionine, folate, choline, and vitamins B-6 and B-12), may modulate or reshape the pig epigenome by methylation during fetal development (115, 116). The effects of gestation diets supplemented with methyl donor nutrients on the liver transcriptome of pig offspring have been investigated with RNA-Seq, and the results showed that the methyl donor nutrients influence the expression of genes related to nucleic acid metabolism during fetal development and lipid metabolism at the adult stage (117).
Biomedical studies have shown that dietary fat, a vital class of nutrients and a source of energy, can regulate gene expression through well-characterized transcription factors such as peroxisome proliferator-activated receptors (PPARs), hepatocyte nuclear factor 4α (HNF4A), NF-κB, and sterol regulatory element binding protein 1c (SREBP1c) (118). In pigs, Oczkowicz et al. (119) reported the impact of dietary fats (rapeseed oil, beef tallow, and coconut oil) fed along with dried distillers grains with solubles (DDGS) on the liver transcriptomic profile. The authors found 39 DEGs among the treatment groups fed the diets with different added fats and different concentrations of DDGS. The majority of these genes were involved in the regulation of lipid metabolism. Specifically, the genes encoding the cytochrome P450 (CYP450) family of proteins (e.g., CYP2B22, CYP2C49) responsible for lipid homeostasis were the most affected. Szostak et al. (120) used RNA-Seq to investigate the effect of a diet enriched in ω-6 and ω-3 fatty acids on the pig liver transcriptome. Dietary ω-3 fatty acids increased the expression of genes related to fatty acid β-oxidation. The expression of cytochrome P450 family 7 subfamily A member 1 (CYP7A1), a key member of the PPAR signaling pathway, was significantly decreased. CYP7A1 is regulated by liver X receptor (LXR), a key nuclear receptor that regulates the expression of genes involved in hepatic bile and fatty acid synthesis, glucose metabolism, and sterol efflux (121). Moreover, the expression of lipogenic genes, such as acetyl-CoA carboxylase-1 (ACACA) and fatty acid synthase (FASN), was decreased, indicating a decreased production of SCFAs. These results suggested that a decreased ratio of ω-6 to ω-3 fatty acids could alter the PUFA metabolic pathway.
Integrating the knowledge obtained from these previous studies will not only improve our understanding of nutrient–gene interactions but can also be applied to develop feeding strategies that improve the production efficiency of the swine industry. In addition, the International Swine Methylome Consortium was recently established to generate a porcine reference methylome map, which can help to explain the methylation pattern in the swine genome (122). Altogether, the combination of genomic information, growth performance data, and marker-assisted selection technology offers great potential to significantly improve swine production efficiency.

Conclusions

In short, RNA-Seq technology for transcriptomic analysis of living organisms is considered a powerful approach to better understand molecular mechanisms in terms of nutrient–gene interactions, although its application still faces some technical challenges in both its experimental and computational aspects. For agricultural animals, scientists can use this technology to study some economically important production traits, such as feed efficiency and litter size, and to support work on genetic markers and breed selection, although the magnitude of this use is still limited. Because RNA-Seq holds great potential to provide new insights into the nutrient regulation of animal growth, development, health, and reproduction, it is expected that the application of RNA-Seq in animal nutrition studies to elucidate molecular mechanisms, especially in terms of nutrient–gene or environment–gene interactions, will greatly enhance our research capacity in the foreseeable future.
