Sandrine Lagarrigue, Matthias Lorthiois, Fabien Degalez, David Gilot, Thomas Derrien.
Abstract
Animal genomes are pervasively transcribed into multiple RNA molecules, of which many will not be translated into proteins. One major component of this transcribed non-coding genome is the long non-coding RNAs (lncRNAs), which are defined as transcripts longer than 200 nucleotides with low coding-potential capabilities. Domestic animals constitute a unique resource for studying the genetic and epigenetic basis of phenotypic variations involving protein-coding and non-coding RNAs, such as lncRNAs. This review presents the current knowledge regarding transcriptome-based catalogues of lncRNAs in major domesticated animals (pets and livestock species), covering a broad phylogenetic scale (from dogs to chicken), and in comparison with human and mouse lncRNA catalogues. Furthermore, we describe different methods to extract known or discover novel lncRNAs and explore comparative genomics approaches to strengthen the annotation of lncRNAs. We then detail different strategies contributing to a better understanding of lncRNA functions, from genetic studies such as GWAS to molecular biology experiments, and give some case examples in domestic animals. Finally, we discuss the limitations of current lncRNA annotations and suggest research directions to improve them and their functional characterisation.
Year: 2021 PMID: 34773482 PMCID: PMC9114084 DOI: 10.1007/s00335-021-09928-7
Source DB: PubMed Journal: Mamm Genome ISSN: 0938-8990 Impact factor: 3.224
Bioinformatic tools for annotating and classifying lncRNAs from multi-species databases
| Database name | Read mapping | Gene modelling | Coding-potential assessment | Human | Mouse | Cow | Pig | Chicken | Dog | Horse |
|---|---|---|---|---|---|---|---|---|---|---|
| Ensembl (v104) | BWA | Exonerate | ORF and PFAM alignment (a) | 16 896/46 960 (GRCh38.p13) | 9972/12 601 (GRCm39) | 1488/2199 (ARS_UCD1.2) | 6979/9367 (Sscrofa11.1) | 5506/8870 (GRCg6a) | 7083/12 283 (CanFam3.1) | 7244/11 978 (EquCab3.0) |
| NCBI (v105) | Minimap2 (long-read), Splign (short-read) | Gnomon | Gnomon | 16 375/27 838 (GRCh38.p13) | 13 317/23 542 (GRCm39) | 5183/7254 (ARS_UCD1.2) | 5605/9292 (Sscrofa11.1) | 5147/8233 (GRCg6a) | 10 823/19 248 (CanFam3.1) | 6789/10 850 (EquCab3.0) |
| NONCODE (v6.0) | Literature parsing with RNAseq keywords + "CuffCompare" to deal with overlapping features | | Comparison with RefSeq + "CNIT" | 96 411/173 112 (GRCh38) | 87 890/131 974 (GRCm39) | 22 127/23 515 (UMD3.1) | 17 811/29 858 (Sscrofa10.2) | 9527/12 850 (galgal4) | NA | NA |
The species columns give the number of lncRNA genes/transcripts found for each species, with the corresponding assembly in parentheses
(a) Gene models containing a substantial open reading frame (ORF) and protein domains (e.g. from Pfam) are classified as coding. For human and mouse annotations, additional manual curations from Gencode
Fig. 1 Characterization of lncRNA and mRNA gene structures in 5 domesticated animals (dog, horse, cow, pig, and chicken, respectively in dark green, orange, purple, pink, and light green) in comparison with mouse and human annotations (light and dark grey respectively) extracted from Ensembl (v103). A Comparison of the number of lncRNA and mRNA genes, transcripts, and exons (the numbers of lncRNA and mRNA features are indicated on top of each bar). B Boxplot distributions of the length of lncRNA and mRNA transcripts and exons. C ORF coverage of Ensembl-based lncRNAs annotated as protein-coding by the FEELnc program
Fig. 2 Distribution of reads supporting lncRNAs and mRNAs (A) and gene overlap between NCBI and Ensembl resources according to both biotypes (B). A For each gene biotype (lncRNAs in blue and mRNAs in red), the dark, intermediate and light shades correspond to the percentage of reads supporting all expressed genes, 25% of the most expressed genes and the 10 most expressed genes, respectively. RNAseq data correspond to the chicken PRJEB28745 project and 4 tissues (adip: adipose tissue, livr: liver, blod: blood, hypt: hypothalamus) of the same population (Rhode Island Red). B Percentages of chicken lncRNA gene overlap, using 1 bp or more, between the GRCg6a v104 Ensembl and NCBI gene catalogues. Note that these overlaps have been computed at the gene level given the uncertainty of isoform modelling with short reads, as explained in the main text
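The gene-level overlap in Fig. 2B (two genes count as overlapping when they share 1 bp or more) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the tuple layout, the function names, the toy coordinates and the choice to ignore strand are all assumptions made for illustration.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end, min_bp=1):
    """True if two closed, 1-based intervals share at least min_bp bases."""
    shared = min(a_end, b_end) - max(a_start, b_start) + 1
    return shared >= min_bp


def percent_overlapping(genes_a, genes_b, min_bp=1):
    """Percentage of genes in genes_a overlapped by any gene of genes_b.

    Each gene is a hypothetical (gene_id, chrom, start, end) tuple.
    Strand is ignored here, which is an assumption of this sketch.
    """
    if not genes_a:
        return 0.0
    # Index catalogue B by chromosome so only same-chromosome genes are compared.
    by_chrom = defaultdict(list)
    for _, chrom, start, end in genes_b:
        by_chrom[chrom].append((start, end))
    hits = sum(
        any(overlaps(s, e, bs, be, min_bp) for bs, be in by_chrom[chrom])
        for _, chrom, s, e in genes_a
    )
    return 100.0 * hits / len(genes_a)


# Toy catalogues (made-up coordinates, not real annotations).
catalogue_a = [("A1", "1", 100, 500), ("A2", "1", 1000, 1500), ("A3", "2", 50, 80)]
catalogue_b = [("B1", "1", 500, 700), ("B2", "2", 200, 300)]

pct = percent_overlapping(catalogue_a, catalogue_b)
print(f"{pct:.1f}% of catalogue A genes overlap catalogue B")
```

In the toy data only `A1` touches a gene of the other catalogue (a single shared base at position 500), so one gene out of three is counted. For genome-scale catalogues a sorted sweep or an interval tree would replace the quadratic scan.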
LncRNA studies associated with trait-related tissues in dog and livestock species
| Tissues | Related traits/disease | Species | References |
|---|---|---|---|
| A. Dog | | | |
| Retina | X-linked progressive retinal atrophy | Dog | (Appelbaum et al. |
| Various | Breed morphology ( | Dog | (Plassais et al. |
| Mucosal and skin tissues | Mucosal melanoma | Dog | (Hitte et al. |
| Lymph node | Lymphoma | Dog | (Verma et al. |
| B. Three major species: pig, chicken, and cow* | | | |
| Muscle | Growth performance and meat quality | Pig | (J. Sun et al. |
| | | Chicken | (Li et al. |
| | | Cow | (Choi et al. |
| Mammary gland | Milk production and quality | Cow | (Tong et al. |
| Immunity tissues | Disease or resistance against pathogenic infections | Pig | (Fang et al. |
| | | Chicken | (Qiu et al. |
| | | Cow | (Özdemir and Altun |
| Male sexual organs | Male reproduction traits | Pig | (Esteve-Codina et al. |
| | | Chicken | (Liu et al. |
| | | Cow | (Wang et al. |
| Female sexual organs | Female reproduction traits | Pig | (Wang et al. |
| | | Chicken | (Liu et al. |
| Liver and adipose tissues | Body lipid reserves and metabolic efficiency | Pig | (Wang et al. |
| | | Chicken | (Muret et al. |
| | | Cow | (Nolte et al. |
| Intestine | NA | Cow | (Weikard et al. |
| Spleen | NA | Pig | (Che et al. |
| | | Chicken | (You et al. |
| C. Other livestock species* | | | |
| Liver and cerebral parietal lobe; Placenta; Eight tissues | | Horse | (Dahlgren et al. |
| Skin; Endometrium; Ovary and follicle | | Goat | (Ren et al. |
| Multiple tissues; Wool; Pituitary; Oocyte development; Consensus set of ruminant lncRNAs | | Sheep | (Bakhtiarizadeh et al. consensus set of ruminant lncRNAs provided by Bush et al. |
| Muscle; Adipose tissue; Skin; Embryos | | Rabbit | (Kuang et al. |
| Ovary; Brain, lung and spleen; Embryo fibroblast cells | | Duck | (Ren et al. |
| Testes; Ovary | | Geese | (Ran et al. |
*Updates from two previous reviews (Weikard et al. 2017 and Kosinska-Selbi et al. 2020)
Fig. 3 Phylogenetic divergence between domesticated species, mouse, and human. A Red numbers correspond to the common ancestor of different species. This tree was generated using the TimeTree database (Kumar et al. 2017). Distances were calculated from estimated molecular time. B Genomic conservation of 2 lncRNAs (in green) in divergent position, extracted from Foissac et al. (2019)
Fig. 4 Syntenic conservation of lncRNAs across 7 species. A Schema of the "1–1" and "n–1" principles of positionally conserved lncRNAs. The "1–1" case corresponds to a strict and unique syntenic equivalent in both species, located between two adjacent "1–1" protein-coding genes. The "n–1" case corresponds to multiple lncRNA loci in the analysed species that correspond to a unique lncRNA in human located between the two "1–1" protein-coding genes. B Number of lncRNAs for each homology category across species, with numbers of lncRNA loci (in italic) extracted from Ensembl (v104). The "*" indicates the chicken lncRNA-enriched annotation anchored on the v101 (equivalent to v104) Ensembl resource (Jehl et al. 2020)
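The "1–1" and "n–1" categories of Fig. 4A can be made concrete with a small Python sketch. It assumes each chromosome is given as an ordered list of hypothetical `(gene_id, biotype, anchor)` tuples, where `anchor` is a shared label for protein-coding genes with a "1–1" orthologue and `None` otherwise. The data layout, function names and toy genes are illustrative assumptions, not the procedure used to produce the figure.

```python
def lncrnas_between_anchors(genes):
    """Group lncRNAs lying between two adjacent "1-1" protein-coding genes.

    genes: (gene_id, biotype, anchor) tuples ordered along one chromosome.
    Returns {frozenset({left_anchor, right_anchor}): [lncRNA ids]}.
    The key is unordered so that an inverted block still matches.
    """
    blocks = {}
    left, pending = None, []
    for gene_id, biotype, anchor in genes:
        if biotype == "protein_coding":
            # Record only intervals flanked by two anchored genes.
            if anchor is not None and left is not None and pending:
                blocks[frozenset((left, anchor))] = pending
            # Any protein-coding gene closes the current interval;
            # a non-anchored one resets the left flank to None.
            left, pending = anchor, []
        elif biotype == "lncRNA":
            pending.append(gene_id)
    return blocks


def classify_positional_homology(species_genes, human_genes):
    """Label species lncRNAs "1-1" or "n-1" relative to human."""
    species_blocks = lncrnas_between_anchors(species_genes)
    human_blocks = lncrnas_between_anchors(human_genes)
    labels = {}
    for pair, species_ids in species_blocks.items():
        # Both categories require exactly one human lncRNA in the block.
        if len(human_blocks.get(pair, [])) == 1:
            label = "1-1" if len(species_ids) == 1 else "n-1"
            for gene_id in species_ids:
                labels[gene_id] = label
    return labels


# Toy example with made-up gene identifiers.
human = [
    ("hA", "protein_coding", "g1"), ("hLnc1", "lncRNA", None),
    ("hB", "protein_coding", "g2"), ("hLnc2", "lncRNA", None),
    ("hC", "protein_coding", "g3"),
]
dog = [
    ("dA", "protein_coding", "g1"), ("dLnc1", "lncRNA", None),
    ("dB", "protein_coding", "g2"), ("dLnc2", "lncRNA", None),
    ("dLnc3", "lncRNA", None), ("dC", "protein_coding", "g3"),
]
labels = classify_positional_homology(dog, human)
print(labels)
```

In this toy example `dLnc1` is labelled "1-1", while `dLnc2` and `dLnc3` share a block whose human counterpart holds a single lncRNA and are labelled "n-1". Blocks with zero or several human lncRNAs are left unlabelled.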
Fig. 5 Association between transposable elements (TEs) annotated by RepeatMasker and long non-coding RNAs annotated by Ensembl (v103) in 5 genome assemblies (canFam3, equCab3, bosTau9, susScr11 and galGal6). A Proportion of the genome covered by four TE classes: LINEs, SINEs, LTRs, and DNA transposons, in green, blue, orange, and grey, respectively. B Proportion of Ensembl-based lncRNA transcripts overlapped by TEs for three overlap thresholds (≥ 1 nucleotide, ≥ 5%, and ≥ 10% of the lncRNA sequence) in five domesticated species (dog, horse, cow, pig, and chicken, respectively, in dark green, orange, purple, pink, and light green)
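The three overlap thresholds of Fig. 5B (≥ 1 nucleotide, ≥ 5%, and ≥ 10% of the lncRNA sequence) can be sketched in Python as follows. The sketch assumes half-open, 0-based intervals on a single chromosome (BED-like) and measures the fraction over the exonic sequence of one transcript. These conventions, the function names and the toy coordinates are assumptions for illustration and may differ from how the figure was actually computed.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def te_covered_bases(exons, tes):
    """Number of exonic bases covered by at least one TE."""
    total = 0
    merged_tes = merge_intervals(tes)  # merging avoids double counting
    for exon_start, exon_end in merge_intervals(exons):
        for te_start, te_end in merged_tes:
            total += max(0, min(exon_end, te_end) - max(exon_start, te_start))
    return total


def te_overlap_classes(exons, tes):
    """Flag which of the three thresholds a transcript passes."""
    length = sum(end - start for start, end in merge_intervals(exons))
    covered = te_covered_bases(exons, tes)
    fraction = covered / length if length else 0.0
    return {
        ">=1nt": covered >= 1,
        ">=5%": fraction >= 0.05,
        ">=10%": fraction >= 0.10,
    }


# Toy transcript: two exons (200 nt in total) and three TE hits,
# two of which overlap each other.
exons = [(0, 100), (200, 300)]
tes = [(90, 110), (95, 120), (250, 256)]
classes = te_overlap_classes(exons, tes)
print(te_covered_bases(exons, tes), classes)
```

In the toy transcript 16 of the 200 exonic bases are covered by TEs (8%), so it passes the first two thresholds but not the third. Merging TE intervals first matters because overlapping hits on the same bases would otherwise inflate the covered fraction.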
LncRNA studies associated with in vitro functional analyses for livestock species
| lncRNA name | lncRNA impact | Cellular model | Strategy | Year (Refs) |
|---|---|---|---|---|
| A. Chicken | | | | |
| | Embryonic development; sex determination | Egg (0-day blastoderms) | OverEx | 2012 (Roeszler et al. |
| B. Cow | | | | |
| | Impact on SIRT1 by competing with miR-204 as a ceRNA to regulate adipogenesis | HEK293T, HEK293A & ADSC cells | OverEx; KD by siRNA | 2016 (Li et al. |
| | Embryonic developmental rates | Cattle matured oocytes | KD by siRNA | 2015 (Caballero et al. |
| | Differentiation of satellite cells; blocking of the Sirt1/FoxO1 pathway during myogenesis | C2C12 cells & satellite cells (from adult cattle muscle) | OverEx; KD by pLenti-NTC interference vector | 2017 (Xu et al. |
| | Inhibits myogenic differentiation of bovine skeletal muscle satellite cells; Myf6 gene negatively regulated and KRAS protein positively regulated | Satellite cells (from foetal bovine muscle) | OverEx; KD by siRNA | 2020 (Zhang et al. |
| | Promotes proliferation and differentiation of bovine myoblasts through various pathways | Myoblasts (from foetal bovine muscle) | OverEx; KD by siRNA | 2020 (Song et al. |
| C. Pig | | | | |
| | Associated with adipogenesis; effect on intramuscular preadipocyte proliferation and differentiation | Intramuscular preadipocytes (from 2 pig breeds) | KD by siRNA | 2020 (Sun et al. |
| | Decrease of MyoD, MyoG and MyHC, as well as of glycolysis and pyruvate metabolism, which are related to skeletal muscle satellite cell differentiation | Skeletal muscle satellite cells | KD by ASO | 2019 (Huang et al. |
| | Regulates AKR1C1 and progesterone metabolism | Endometrial cells | OverEx; KD by ASO | 2020 (Su et al. |
KD: knock-down; OverEx: overexpression