Literature DB >> 24452801

Transcriptome-wide investigation of genomic imprinting in chicken.

Laure Frésard1, Sophie Leroux, Bertrand Servin, David Gourichon, Patrice Dehais, Magali San Cristobal, Nathalie Marsaud, Florence Vignoles, Bertrand Bed'hom, Jean-Luc Coville, Farhad Hormozdiari, Catherine Beaumont, Tatiana Zerjal, Alain Vignal, Mireille Morisson, Sandrine Lagarrigue, Frédérique Pitel.   

Abstract

Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken and did not demonstrate any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results come together, without a priori, with previous statements and phylogenetic considerations assessing the absence of genomic imprinting in chicken.

Entities:  

Mesh:

Year:  2014        PMID: 24452801      PMCID: PMC3973300          DOI: 10.1093/nar/gkt1390

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Parental genomic imprinting is a process that leads to the differential expression of alleles depending on their parental origin. This phenomenon has been described in eutherian mammals, marsupials, plants and insects, but has never been observed in birds (1,2). Even if not the only one (3), the main theory proposed to explain this phenomenon is the parental conflict hypothesis, also known as the kinship theory. Briefly, imprinting is leading to an imbalance of parental allele contribution, and the parental conflict hypothesis argues that the genes controlling the allocation of resources to the embryo should be particularly affected, with the maternal genome restricting the use of resources, allowing to preserve the mother and future progeny, and the paternal genome favoring growth of his offspring (4–8). Readers are referred to (9) for an in-depth discussion of the kinship theory. According to this theory, imprinting should be restricted to organisms in which maternal resources directly affect the embryo. It would therefore be unlikely to observe this phenomenon in oviparous animals (10), where the amount of resources is limited by the egg size, controlled by the mother. Genomic imprinting evolved both in angiosperm plants and placental mammals (11) but is also seen in insects, and it has even been shown that it arose several times during mammals' evolution due to different selective pressure at several loci (12). Regardless of the kinship theory, the appearance of genomic imprinting in nonmammalian animals is thus not implausible. Genomic imprinting was studied in birds, but until now, only the expression of genes known to be imprinted in mammals has been analyzed in chickens, most of them showing biallelic expression and thus, leading to the conclusion of the lack of imprinting in this species: several studies have been conducted on IGF2, known to be paternally expressed in mice and humans (13), all of them except one (14) leading to the conclusion that it was biallelically expressed (15–19). In this genomic region, the H19 imprinting center, controlling an imprinted cluster including IGF2, is shown to be absent from the chicken genome (19). Other genes known to be imprinted in mammals have been shown to be biallelically expressed in chicken, such as ASCL2/MASH2, M6PR/IGF2R, DLK1 and UBE3A (19–21). However, parent-of-origin QTLs have been detected in chicken and quail (22–24). Whereas some of them may finally not be considered as relevant to genomic imprinting due to linkage disequilibrium or bias generated by the experimental design (25,26), others appeared to be consistent, when using appropriate experimental animal population designs and methodology (25,27). Moreover, reciprocal effects, of great importance in poultry production, may be partly explained by imprinting. Together with several molecular features, such as the absence of DNMT3L (key factor of the establishment of DNA methylation) in the chicken genome (28) and the possible existence of imprinting mechanisms other than DNA methylation (29,30), these contradictory observations encourage to definitely answer the question (31). Advances in sequencing technologies now allow to study genomic imprinting through whole transcriptome sequencing (32–34), avoiding any a priori choice of target genomic regions, such as the chicken homologous of imprinted regions in mammals. We therefore chose Next Generation Sequencing (NGS) technology to analyze imprinting in chicken, without any preconception on whether it exists. Chicken is a model species for numerous studies in birds, and the availability of inbred lines is a great advantage for our purpose: we set up an experimental design consisting in two families from two reciprocal crosses between chicken lines, chosen so as to allow the identification of the parental origin of each allele in embryonic offspring. We sequenced whole embryos transcriptomes, and the relative expression level of each parental allele was compared, for the informative loci, between both cross ways.

MATERIALS AND METHODS

Ethics statement

Chickens were bred at INRA, UE1295 Pôle d’Expérimentation Avicole de Tours, F-37380 Nouzilly, in accordance with European Union Guidelines for animal care. The farm is registered by the Ministry of Agriculture with the license number C37–175–1 for animal experimentation. The experiment was realized under authorization 37–002 delivered to D. Gourichon.

Chicken lines and crosses

To maximize the probability of identifying the parental origin of an allele in F1 individuals from two reciprocal crosses, we selected 2 chicken lines among 10 (Supplementary Table S1), available at INRA experimental farm (UE1295 PEAT, F-37380 Nouzilly, France). Genotyping was performed on 3–12 individuals from each population with an Illumina 57K Single Nucleotide Polymorphism (SNP) array [57 636 SNPs (35)]. SNPs were filtered (call freq > 0.9, GenTrainScore > 0.5, Minor Allele Frequency (MAF) > 0.01) and allele frequencies within each line calculated with plink, option—freq (36). Genetic similarities between lines (homozygozities for a given line) were then calculated (Figure 1) from the 52 189 filtered SNPs. The most genetically distant pair of lines together with the highest homozygosity each were selected, i.e. the Line 6 (37) and the Line R− (38). A male and a female from each line were reciprocally crossed, and 24 F1 embryos, 12 from each cross (6 males and 6 females per cross), were harvested from the same batch at embryonic day 4.5 (stage 26). Genomic DNA and total RNA were concurrently extracted from the same samples of crushed whole embryos without extraembryonic membranes (AllPrep DNA/RNA Mini Kit, Qiagen). RNA quality was measured by a BioAnalyzer (Agilent); all samples had a RNA Integrity Number ≥ 9.9.
Figure 1.

Experimental design. (A) Heatmap of genetic similarities between lines. Colors correspond to the inbred level of tested individuals (red is inbred). Lines 6 and R− (red arrows) were chosen to be as inbred and as genetically distant as possible. (B) Reciprocal cross between the two selected lines. In case of genomic imprinting, embryos are preferentially expressing one of their parents’ allele (this figure shows a case of paternally expressed gene). Detection of such event is possible with polymorphisms differentiating the lines.

Experimental design. (A) Heatmap of genetic similarities between lines. Colors correspond to the inbred level of tested individuals (red is inbred). Lines 6 and R− (red arrows) were chosen to be as inbred and as genetically distant as possible. (B) Reciprocal cross between the two selected lines. In case of genomic imprinting, embryos are preferentially expressing one of their parents’ allele (this figure shows a case of paternally expressed gene). Detection of such event is possible with polymorphisms differentiating the lines. Parental genomic DNA was extracted from blood samples through a high-salt extraction method (39).

Sequencing

Chicken data

RNA sequencing

Libraries with a mean insert size of 200 bp were prepared following Illumina instruction for RNA-Seq analysis, by selecting polyA+ fragments (TruSeq RNA Sample Prep Kit) from each sample. Samples were tagged to allow subsequent identification, amplified by polymerase chain reaction (PCR) and quantified by quantitative PCR (Agilent QPCR Library Quantification Kit) and then sequenced (paired ends, 100 bp) in triplicate on an Illumina HiSeq 2000 sequencer (Illumina, TruSeq PE Cluster Kit v3, cBot and TruSeq SBS Kit v3) by randomizing their position in six different sequencing lanes.

DNA sequencing

Parental DNA was sequenced on four lanes of Illumina HiSeq 2000 to allow the detection of SNPs discriminating the two lines. Library preparation (mean insert size of 300 bp), DNA quantification and sequencing (paired ends, 100 bp) were performed according to the manufacturer instructions (TruSeq DNA Sample Prep Kit Illumina, Agilent QPCR Library Quantification Kit, TruSeq PE Cluster Kit v3 cBot TruSeq SBS Kit v3).

Mouse data

RNA-seq sequences were collected from a previous study (40). They correspond to liver mRNA from 18 F1 males and females DBA/2J and C57BL/6J reciprocally crossed mice (two pools of nine mice). These crosses are termed as DXB and BXD in the article.

Computational analysis

When not specified, analyses were made with homemade Perl, Python and R scripts.

Genomic sequences analyses

Reads from DNA sequencing of parents were aligned on the last assembly of chicken genome (Galgal4) using the bwa program version 0.6.1, option aln (41).

Identification of SNP discriminating parents from both lines

Allelic counts for each base position on the genome were determined. These counts allowed selecting only loci discriminating parents, i.e. positions where both parents from each line were homozygous for distinct alleles. The functional consequence of each SNP in each transcript was predicted using the Ensembl Variant Effect Predictor version 71 (42).

Transcriptome characterization

Chicken data

Tophat 2.0.5 software (43) was used to align transcript sequences from the F1 embryos on Galgal4 chicken assembly. Reads were first mapped allowing multiple alignments, with the intron length parameter set between 3 and 25 000bp to limit the number of false positive while discovering most junctions, the deletion length was limited to 1 instead of 3, and the mate inner distance in our case corresponded to 200. The option read-realign-edit-dist was set to 0 to map every read in all the Tophat mapping steps to get the best possible alignment. Finally, a micro exon search was applied. Other alignment options were set as default. To remove potential PCR reads duplicates from the analysis, the program samtools rmdup (41) was run on each individual after alignment. Afterward, reads with potential multiple locations were discarded. Samtools flagstat command (41) was used to generate statistics on aligned reads, notably the total number of reads mapped after applying our criteria. To characterize the general level of expression in all the embryos, cufflinks software 2.0.0 was used on Galgal4.72 GTF file to quantify expression on known genes (44).

Mouse data

Sequencing reads were aligned with Tophat 2.0.5 on an artificial reference genome where all SNPs known between DBA/2J and C57BL/6J (45) were converted to ‘N’ to prevent the preferential mapping of the reference allele. In order not to affect to a large extend the alignment efficiency, no penalty was set for reads mapping on N in this new reference. A maximum of three mismatches was allowed for a read to be kept. Finally, a filter was applied to keep only reads uniquely mapped to the reference genome.

Identification of SNP with biased expression

Transcripts alignments were merged by parental cross for further analysis. Allelic counts allowed to select SNPs represented with at least 10 reads in each cross, with an inverted allelic ratio between both crosses (Fold Change ≥2.5). To test the expression bias, a Fisher Exact Test was made on every transcribed loci and the significance threshold was fixed at False Discovery Rate (FDR) < 0.05 using the R package ‘qvalue’ (46).

Validation assays

Pyrosequencing

Twenty-three markers were tested on a Qiagen PyroMark Q24 sequencer. Primers were designed using PyroMark Assay Design software (Supplementary Table S2). PCR samples were produced using PyroMark PCR Kit (Qiagen). Runs were analyzed by PyroMark Q24 1.0.10 software with default analysis parameters.

Droplet Digital PCR

In total, three markers were tested on Bio-Rad QX100™ Droplet Digital PCR (ddPCR) System machine. Primers and probes were ordered at Sigma (Supplementary Table S3).

RESULTS

Constitution of an informative device

To detect genomic imprinting for a given gene, it is required that we are able to discriminate the parental origin of its expressed allele in the F1 embryos. The principal purpose of the lines' choice was thus to maximize the chances to identify from which parent the expressed allele had been transmitted to the embryos. To select parents as inbred (to reduce intra-population polymorphism) and as genetically distant (to maximize inter-population polymorphism) as possible, 3–12 animals from 10 chicken populations were genotyped on an Illumina 57K SNP array (35). Line R− and Line 6 were the two that fitted the best these criteria (Figure 1A) and were reciprocally crossed: one male from Line 6 was mated with a female from Line R− and one R− male was crossed with one female from Line 6 (Figure 1B). We selected 12 embryos from each reciprocal cross, with balanced ratio of six males and six females from each cross. To detect SNPs discriminating the parents from both lines, genomic DNA from all four parents was sequenced. A total of 897 648 484 paired-ends reads (100 bp) were produced from an Illumina HiSeq 2000 sequencer and 813 407 461 (91%) aligned to the Galgal4 reference genome, which was thus covered at 98%, with a mean depth of 20×. We found 2 298 622 SNPs discriminating the parents from both lines on genomic DNA, amounting to 2.1 ± 0.7 SNP/kb of sequence. When considering the number of annotated genes with at least one discriminating SNP, we found that 12 303 genes were represented. It corresponds to 72% of chicken annotated genes (Supplementary Table S4). Among the observed discriminating SNPs, 54.9% were localized in intergenic regions (Supplementary Figure S1), and 27.4% of the SNP detected in the coding region were nonsynonymous, which is similar to results previously obtained in chicken (47,48). To have a better idea of their density in the potentially expressed regions, SNPs within exons were considered, that is 55 016 SNPs (i.e. 2% of detected SNPs). It corresponds to a density of 1.3 SNP/kb of coding regions, which is sufficient to have a reasonable screening of the exome. Nevertheless, we used the total set of available discriminating SNPs in the study, to include nonannotated regions in the analysis. The fraction of these SNPs that will be observed in the RNA-Seq data represents the number of opportunities to link the allelic imbalance of transcripts detected in embryos with their parental origin.

Transcriptome characterization

To link the parental origin of the alleles to their expression pattern in embryos, we performed RNA-Seq on 12 embryos from each cross. We obtained reliable sequencing results from 20 embryos (SRA accession number SRP033603), 11 from Cross 1 (5 females and 6 males) and 9 from Cross 2 (4 females and 5 males); the other 4 were discarded because of sequencing tags problems. From those 20 embryos, 1.33 × 109 and 6.07 × 108 sequences were produced from Cross 1 and Cross 2 individuals, respectively (200 Gb of sequence data altogether), corresponding on average to 121.1 million reads and 67.4 million reads for each individual in Cross 1 and Cross 2, respectively. After alignment to the chicken assembly, removing of PCR duplicates and allowing only one possible hit, 74% of reads were mapped to the reference genome, which corresponds to 42% coverage of the genome (39% for Cross 1 and 35% for Cross 2, Supplementary Figure S2). We set a minimal depth of 10 reads to consider a region as covered, to avoid background of low expressed regions. Regarding the general expression in chicken embryo, a minimal FPKM of 1 was expected (one fragment per kilobase per million reads) to call a gene as ‘expressed’. Among 17 108 genes already described in chicken (available from galgal4.72 GTF file), 11 416 genes, i.e. 67% of them, had a FPKM superior or equal to 1 in our data set.

Filters—bias in expression

RNA sequences were filtered to detect allelic imbalance expression at loci discriminating parents from both lines. Around 300 million bases were counted as expressed in both crosses, meaning that at least one read was observed at this position in each cross. To avoid bias in expression linked to a lack of depth, a threshold of 10 reads had to be reached in each cross (which corresponds to a minimal depth of 20 in total), leading to 123 532 053 positions, i.e. 40% of base positions expressed in both crosses. Among these loci, 5 388 148 were polymorphic, and therefore potentially informative to test allelic imbalance. When crossing these expressed positions to discriminating SNPs (201 209 positions), we found that 7731 of 11 416 expressed genes (68%) were informative, i.e. containing at least one SNP discriminating parents of embryos in both crosses (Supplementary Table S4). A Fisher Exact Test was applied on these 5 million SNPs to test if the expression profile was different between Cross 1 and Cross 2. At a 5% FDR threshold, we observed 117 740 loci for which expression profiles were significantly different between Cross 1 and Cross 2, including positions with an allelic bias in only one of the crosses. Only positions where the allelic ratio was inverted between both crosses were potentially candidates, i.e. the minor allele in one cross is overrepresented in the reciprocal cross. Because the study was made on the whole embryo (a mixture of different tissues) a complete extinction of one allele was not necessarily expected, and we set a minimal fold change of 2.5 for this filter. A total of 4904 SNPs met these criteria. They could either result from polymorphisms in parental sequences or from an effective bias in expression. A further filtering was done removing the SNPs that, even if significant, were heterozygous in the parents and therefore not informative to trace back the allele parental origin. For that reason, linking these candidate SNPs to the parental origin of the allele was only possible at loci discriminating parents from both crosses. To do so, we crossed filtered data with the 2 298 622 discriminating SNPs detected in parallel on parents' genomic DNA. We checked that the detected allelic imbalance was similar for most of the pooled embryos in each cross, to avoid any bias due to a high monoallelic expression in few embryos (never observed in our candidates). At the end, we ended up with 79 potentially imprinted SNPs. These SNPs were mostly located in introns (51% of them); the others were upstream or downstream of annotated genes (31%), exons (6%), 3′-UTR (4%) or intergenic regions (8%) (Figure 2, Table 1). This distribution is similar to that of discriminating SNPs in sequences declared as expressed in both crosses, where 11% of these SNPs are located in known annotated transcripts (exons and UTR regions). This attribution was based on the known chicken annotation (Ensembl release 72, http://www.ensembl.org/Gallus_gallus/Info/Index). We considered expressed intergenic regions as unannotated regions.
Figure 2.

Genomic contexts of the candidate loci on the chicken genome.

Table 1.

Candidate positions, allelic counts and statistical results

In bold, positions tested by pyrosequencing.

Allelic ratios = allele 1 counts/allele 2 counts, for each cross.

Genomic contexts of the candidate loci on the chicken genome. Candidate positions, allelic counts and statistical results In bold, positions tested by pyrosequencing. Allelic ratios = allele 1 counts/allele 2 counts, for each cross. We tested 23 candidate SNPs to confirm the Illumina HiSeq results with pyrosequencing as an independent validation method (49,50). A total of 16 embryos (8 per cross) were tested. Tested candidates were chosen to get a panel of different situations highlighted after sequences analysis: intronic, exonic, regulatory regions and unannotated regions were looked at. Furthermore, candidates with different depth profiles were tested. To select the positions to be tested, we preferentially chose loci where SNPs surrounding the candidate position had an allelic profile similarly unbalanced: taking into account these additional positions—even if not passing our filters due to a lack of depth or an allelic ratio <2.5—ensure to avoid artifactual SNPs. First, to avoid any bias due to PCR amplification of an allele or to genomic duplication in one cross, we checked that the pattern obtained by pyrosequencing on genomic DNA from the embryos truly corresponds to a heterozygous profile. Then we quantified the allelic ratio observed on cDNA. Unfortunately, while confirming the heterozygous status of embryos in each case, we could not observe on cDNA any allelic imbalance corresponding to an imprinting profile (Figure 3).
Figure 3.

Comparison of results on a candidate gene obtained through HiSeq, droplet digital PCR and pyrosequencing. Colors are differentiating both alleles in sequencing results and ddPCR. (A) Hiseq counts of both alleles at the candidate locus (A allele in green, G allele in blue). (B) Pyrosequencing results from one sample of each cross (chosen as representative of average results). Analyses were made on the reverse strand. Allele A is overexpressed in both crosses (left: Cross 1 results from one embryo's cDNA, right: Cross 2 results from one embryo's cDNA).The highlighted peaks correspond to the SNP where the relative proportion is quantified. (C) Droplet digital PCR on the candidate locus from eight samples of each cross (A allele in green, G allele in blue).

Comparison of results on a candidate gene obtained through HiSeq, droplet digital PCR and pyrosequencing. Colors are differentiating both alleles in sequencing results and ddPCR. (A) Hiseq counts of both alleles at the candidate locus (A allele in green, G allele in blue). (B) Pyrosequencing results from one sample of each cross (chosen as representative of average results). Analyses were made on the reverse strand. Allele A is overexpressed in both crosses (left: Cross 1 results from one embryo's cDNA, right: Cross 2 results from one embryo's cDNA).The highlighted peaks correspond to the SNP where the relative proportion is quantified. (C) Droplet digital PCR on the candidate locus from eight samples of each cross (A allele in green, G allele in blue). Three markers were also tested by droplet digital PCR as another complementary quantitative approach. This confirmed the pyrosequencing results, further invalidating our HiSeq results (Figure 3).

Method validation with RNA sequencing data from mouse

As parental genomic imprinting is thought to be absent in chicken, we used as ‘positive control’ the mouse data set produced by Lagarrigue and collaborators (40) to detect imprinted genes in mouse liver, to validate the approach we proposed on chicken. In the mouse study, a reciprocal cross was made between DBA/2J (further called D) and C57BL/6J (further called B) mouse strains and RNA was sequenced from liver samples of 18 F1 mice. Out of 285 166 513 reads generated by RNA-seq in both mouse crosses, 145 412 592 reads (51%) were mapped uniquely on an artificial reference genome where SNPs between both mouse strains were replaced by ‘N’. It corresponds to 74 733 541 reads for one cross (DXB) and 70 679 051 reads for the reciprocal cross (BXD). These sequences were then filtered for bias in expression following the same pipeline as for chicken embryos. At a 5% FDR, we found 20 candidate SNPs with an inverted ratio between both crosses (minimum Fold Change of 2.5, Table 2). Of these, five are located in three genes known to be imprinted in mouse: H13 (33), Peg13 (51), both detected in the original study, and Snurf (52), not revealed by the first analysis. A fourth gene, Sgce, also detected by Lagarrigue et al., was identified when less stringent conditions on sequencing depth threshold were applied (i.e. when accepting nine reads in each cross).
Table 2.

Candidate position obtained from the mouse data set

ChromosomePositionGene short nameLocalizationImprinting statusaP-value Fisher exact testAllelic ratio Cross 1Allelic ratio Cross 2
124619956Not annotated1.55e-1178.640.004
2152530393H13ExonMaternal1.25e-623.910.30
460672634Mup2 or Mup11ExonUnknown1.06e-267920.31
4142807839Not annotated2.70e-053.830.34
4144508797Dhrs3 or AK147071IntronUnknown4.44e-072.820.07
461746350Mup3IntronUnknown1.22e-140.034
587427181Not annotated3.94e-260.31103
731513450Gapdhs or Tmem147IntronUnknown6.15e-15190.13
767133548SnurfExonPaternal5.64e-32850.03
767131688SnurfExonPaternal3.36e-400.0147
1183318343Taf15ExonUnknown4.04e-0810.50.37
1198911391Igfbp4IntronUnknown1.75e-555.110.01
1198911393Igfbp4IntronUnknown5.69e-730.1290
12105186516Serpina1eIntronUnknown1.86e-130.0514
1422266157AdkIntronUnknown2.17e-050.072.8
1430847794ChdhintronUnknown1.33e-130.0210
1572637471Peg13ExonPaternal1.77e-090.0618
1572640049Peg13ExonPaternal1.11e-210.0218
1735534271Bat1aIntronUnknown4.63e-120.374.1
1946726575D19Wsu16 or Opal1IntronUnknown3.33e-050.093.4

aKnown imprinting status in mouse in the literature.

In bold, genes known to be imprinted in the literature.

Allelic ratios = allele 1 counts/allele 2 counts, for each cross.

Candidate position obtained from the mouse data set aKnown imprinting status in mouse in the literature. In bold, genes known to be imprinted in the literature. Allelic ratios = allele 1 counts/allele 2 counts, for each cross. These results confirm the efficiency of our pipeline to detect genomic imprinting from RNA-Seq data that allowed not only confirming three genes identified by Lagarrigue et al. but also demonstrating the presence of another imprinted gene, Snurf, which had not been detected in their study. The newly discovered loci have to be further analyzed.

DISCUSSION

This work represents, to our knowledge, the first transcriptome-wide investigation of genomic imprinting in birds. Previous studies in chicken were made on targeted genes, known to be imprinted in mammals, and they concluded to the nonexistence of this phenomenon in birds (15–18,20,53). To get free of any a priori, investigations should not be restricted to targeted loci but consider every single chicken gene. RNA sequencing technology is an approach meeting these expectations (54,55). Transcriptome sequencing has already been used with this intention in other species, especially in mouse (32,33,54). It not only allowed the confirmation of most mammalian imprinted genes but also the discovery of putative new ones. We thus set up an experimental cross designed for the search of parental genomic imprinting in chicken, by using RNA-Seq on whole embryos from reciprocal crosses between chicken lines, as inbred and genetically distant as possible. Our results identified 79 potentially imprinted SNPs, mainly located in introns from distinct genes, not shown to be imprinted when analyzing their coding portion. It has been demonstrated that imprinted retrogenes could be hosted in introns of biallelically expressed genes in mouse (56), leading us to consider these loci as possibly interesting. But in the presence of genes truly imprinted, one could expect a higher number of candidate SNPs detected in known coding regions, and these results, with a large proportion of SNPs outside the coding regions, strengthened the need of experimental validations of these results. These positions out of the coding sequence raise the question of the functional significance of the transcribed portion of the genome. There is a strong debate in the literature about the proportion of the genome that is ‘functional’ (57–60), following the ENCODE results, which assigned biochemical functions to 80.4% of the human genome, for which the cumulative coverage is 62.1% for processed transcripts (61). In our analysis, only ∼40% of the genome is considered to be transcribed, with a minimal coverage of 10 reads in the pool of all 20 embryos data at a single-base scale. This discrepancy may be partly explained by our selection of only polyadenylated mRNA. Nevertheless, the transcribed portion of the genome far exceeds its exonic subset, a more and more admitted characteristic of the genome (62,63). Before validation, we ensured that candidate SNPs were not declared biased because of only one highly expressed embryo (Supplementary Figure S3). None of the candidate loci tested through validation experiments was confirmed as imprinted. These candidates being probably false positives, we thus validated our methodology by reanalyzing a mouse data set, as a ‘positive control’, where imprinting should be detected. By confirming the three previously identified imprinted genes [H13, Peg13 and Sgce (40)], and by discovering one more in these data, Snurf, previously known to be imprinted (52), we confirmed the reliability of the method. These four genes found imprinted in mouse are a severe underestimate of the total set of genes known to be imprinted in this species. This low number of genes is discussed in the original study by Lagarrigue and coworkers (40), notably highlighting the fact that only a fraction of imprinted genes are expressed in a given tissue at a precise developmental stage. Their study was the first, to our knowledge, to examine the parent-of-origin specific gene expression in adult liver in mouse, and there may be few imprinted genes in this tissue. The amount of data is probably the main reason for this low number of detected genes. Our chicken data set contains 10 times more sequences aligned and analyzed than the mouse data set. Considering that the mouse has a slightly higher number of annotated genes, expression can be better estimated in our chicken RNA-Seq sequences. With a higher sequencing depth, more genes could potentially have been detected. Nevertheless, two analysis methods from the same data set lead to the detection of imprinted genes in mouse. The differences between the results obtained here (four genes) and in the previous analysis (three genes) from the same data have two main origins: first, the minimal depth was originally set to 10 for each allele, which discarded Snurf, almost totally imprinted, as a candidate locus. Moreover, the original study did not take into account the noncoding region (explaining the discovery of only exonic imprinted SNPs in the first analysis). The biological reality of the newly discovered loci has to be determined, but this analysis confirmed that imprinting, when existing, is detected with our method. This validation is reinforced by the fact that we analyzed 200 Gb of chicken sequence (versus 3 Gb for the mouse study), leading to the assumption that we should have observed significant results if imprinting did exist in chicken. On the other hand, the use of whole embryos in our study (versus only liver samples in mouse) might have affected the final results. But our choice was, for the same amount of data, to reach a sufficient depth to reliably discover multitissue parental imprinting, without having to a priori select one single tissue. The choice of the embryo was determined by the high number of imprinted genes already known to be imprinted in the embryo, compared with single adult tissues (64). Moreover, others studies conducted on the whole embryo allowed the detection of imprinting in mouse (65). If imprinting does exist in chicken at this developmental stage, but is observed only in one tissue, or if a locus is paternally expressed in some tissues but maternally expressed in others (66), our device might have not allowed to detect such cases. While such profiles were already observed, most genes are imprinted in multiple tissue types (67). Nevertheless, it would be interesting to confirm the absence of imprinting in chicken by studying other tissues, extraembryonic membranes in particular: most of the known imprinted genes in mammals are expressed in the placenta (68,69), notably regulating nutritional resources to the embryo during development (70,71). Another question concerning our model's choice was the embryonic stage of development. Stage 26 is a developmental stage from which a sufficient quantity of RNA can be extracted for RNA-Seq and validation experiments, and the expression of imprinted genes has already been described at the equivalent stage in the mouse embryo (64). It would also have been interesting to check the imprinting status of genes expressed in adult brain: by applying the kinship theory to a social context, the existence of imprinting would be possible in a species showing maternal care, as previously proposed (72). Incidentally, several genes have been shown to be imprinted in the mouse brain, notably related to social behavior (73). But the best bird lines to study would have been in this case from species with strong parental care behavior, and inbred chicken lines, despite their suitability for the RNA-Seq strategy used, are probably not the best model in this area. Besides practical reasons as availability of inbred lines and of a reliable genome assembly, the choice of the chicken for this study has been dictated by several arguments: parent-of-origin QTLs have been detected in this species (27), and chicken is an example of farm animals used in crossbreeding, where genomic imprinting could be important for the utilization of reciprocal effects. It is also an important animal model in developmental biology and phylogenetics, and is used as a model for other avian species, especially in genetics and genomics (47). A last comment about the device can be made about the sequenced transcripts: we limited the screening to polyA RNA, and thus we could not conclude on the potential imprinted status of noncoding RNAs, some of which are also known to be imprinted in mammals (74). With a reliable analysis method but no evidence of imprinting in our chicken data set, our results seem to confirm the absence of imprinting in chicken, at least in the whole embryo at embryonic day 4.5. This analysis is a first step in the genome-wide search of genomic imprinting in birds. The absence of this phenomenon will be definitely confirmed—taking into account the difficulty of claiming the absence of such a multifarious mechanism—only when other studies will complete this first one: other transcribed parts of the genome (such as non-coding RNA), other tissues at other developmental stages must be analyzed, as well as other chicken lines and, most importantly, other bird species. But given previous work on several candidate genes (15–21), and absence of differential methylated regions according to the parental allele in the chicken genome (75), it gives a strong argument, through a genome-wide study, in agreement with previous studies. This absence would be in accordance with the most shared hypothesis, the kinship theory, which leads to the assumption that imprinting does not exist in oviparous species (10). Another theory concerning the evolution of genomic imprinting mechanisms is the ‘host defence’ hypothesis, assuming that imprinting appeared consequently to mechanisms aimed at silencing foreign DNA elements inserted into the genome, and evolved in parallel with accumulation of repeats in certain genomic regions (12,70,76,77). This is again not in favor of the existence of this phenomenon in chicken, a species with a low level of repeats and active transposable elements, compared with mammals (78). The absence of this phenomenon would lead to alternative explanations of reciprocal effects. These effects represent ∼20% of the phenotypic variability of body weight in broiler and of egg viability in layers (79), and one cause could have been genomic imprinting. But other mechanisms can lead to such effects, as direct maternal effects, sex-linked genes or mitochondrial DNA transmission. Regarding QTLs detected as showing parent-of-origin effects, most of them can result from biases due to the animal device, as previously pointed out (25,26,80). Rowe and coworkers, in a paper describing a maternally expressed QTL obtained in a pedigree avoiding such biases, comment their results by stating they ‘have insufficient evidence to confirm that these are truly imprinted effects’. Taking into account the historical discussion about the absence of genomic imprinting in chicken and our results, it is possible that no QTL truly showing parent-of-origin effect does exist in chicken. To understand why these false imprinted candidates were detected, we tried to investigate the possible biases and issues encountered during the analysis. This is not a unique situation in RNA sequencing data analysis, as some of the results from imprinting analysis through transcriptome sequencing had also been contested in mouse (81,82), highlighting many possible biases inherent to RNA sequencing data, from experiments to analyses. In particular, some alleles can be underrepresented through library preparation, several SNPs can appear to be artifactual and read alignment can be biased due to a better mapping of the reference allele (83). Because we were looking for inverted allelic ratios between reciprocal crosses in our study, this last issue could not explain our results because regions where the reference allele was preferentially mapped were avoided. We considered that the possible problem of artifactual SNPs was also absent from our data because we crossed the SNPs observed from the RNA-Seq data with the polymorphisms detected between the parents by genomic sequencing. Furthermore, all the candidate SNPs analyzed through other methods, as Sanger sequencing (data not shown) or pyrosequencing, confirmed the heterozygous status of the embryo's DNA. The possibility of underrepresentation of an allele due to the library preparation remains, even if less likely due to the use of reciprocal crosses: to generate such a bias, an allele should be ‘randomly’ retained in one cross, while the alternative allele should be ‘randomly’ kept in the other cross during the library preparations. Likewise, there is a possible dependence between nucleotide frequency at a SNP and its position on sequence reads (84). This bias can originate from the sequencing platform or from a positional bias of the reads across the transcripts. In any case, correcting this bias can improve expression estimates (85). All 79 positions were studied for this putative bias, but only four of them do have a median position among reads in the 20% extremities (Figure 4). The sequencing depth may be another cause of the median deviation from the theoretical value of 50 (average position on the read). While location of a SNP on a read may randomly vary, lower expression level increases the probability of the median position of the SNP differing from this medium position. By removing candidate SNPs being in extremities of reads, we either remove low expressed loci or effectively biased loci. Excluding these four candidates is thus a conservative choice. But the unconfirmed allelic imbalance of the 75 left ones cannot be explained this way. Moreover, if we consider the proportion of forward and reverse reads for each candidate, it appears that no overrepresentation of one direction of sequencing was detected (no candidate SNP with a proportion of forward read <20% or >80%). This potential bias was thus not the source of false positive in our case.
Figure 4.

Candidate SNP positions among sequenced reads. Boxplot of SNP localization among reads for each candidate ordered by median. Orange color represents candidates with a median position in the 20% extremities of reads.

Candidate SNP positions among sequenced reads. Boxplot of SNP localization among reads for each candidate ordered by median. Orange color represents candidates with a median position in the 20% extremities of reads. If the discovery of false-positive SNPs does not directly come from alignment or position bias, it can also be linked with analysis criteria. One of the main issues in such studies is the sequencing depth threshold to declare allelic imbalance. Studies made until now used different threshold in their analyses (32,33,86). This issue was underlined in many allelic imbalance studies using RNA sequences data, but until now, no consensus was made on a reliable threshold (32,86,87). We set ours to a minimum read depth of 10 in each cross, which corresponds to a minimal coverage of 20 reads for each locus considered in the analysis. This threshold was appropriate to highlight imprinted loci in the mouse data set. Thus, taking into account the lower sequencing depth in the mouse study compared with the chicken one, we should have detected imprinted loci in chicken if present: even a higher threshold would have brought to light such loci from our data set. We thus looked at the candidates with regard to the general distribution of sequencing depth. Most of them have a low expression profile (Figure 5), which could lead to false-positive candidates. We can raise the hypothesis that at some low sequencing depth, i.e. at low expression levels, genes might express one allele preferentially, or that it is not technically possible to properly amplify both alleles when the gene is expressed under a certain limit. Under these conditions, the principal null hypothesis used in this study could be modified: instead of a 50:50 ratio between both alleles in transcripts of heterozygous embryos, the null hypothesis would depend on the expression rate of the genes. This hypothesis is in accordance with previous studies, highlighting the technical issues leading to allelic imbalance (86,88,89).
Figure 5.

Sequencing depth distribution for all SNPs (top), with classes containing candidate SNPs in gray, and for candidate SNPs (bottom), with tested SNPs in black.

Sequencing depth distribution for all SNPs (top), with classes containing candidate SNPs in gray, and for candidate SNPs (bottom), with tested SNPs in black. By looking to possible biases in our analysis, we tried to understand why no genomic imprinting could be observed on the candidate SNPs that had been highlighted. This leads to several considerations: firstly, results' validation is required through a reliable quantitative method, as we did mainly by pyrosequencing; secondly, possible experimental and analytical biases should be considered in the final results interpretation and care should be taken regarding general sequencing depth, possible alignment biases and threshold choices, as previously underlined (82,90). This study was designed to detect parental genomic imprinting in birds, with chicken as a model species. Our results show that, at least at this developmental stage, there is no genomic region imprinted at the whole embryo level. This does not exclude specific events occurring in a small subset of tissues or at different developmental stages that would most probably not have been detectable in the combination of tissues used in this work. Nevertheless, this study proposes a performing method to investigate the existence of genomic imprinting at a genome scale, without a priori choice of targets genes known to be imprinted in mammals. Our results support the fact that, while the evolution of imprinting has occurred in a convergent manner for several clades, from plants to mammals, it is probably missing in birds.

ACCESSION NUMBERS

NCBI_SRA accessions number: SRP033603.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

Région Midi-Pyrénées (http://www.midipyrenees.fr/) and Animal Genetics Division (INRA, http://www.ga.inra.fr/en/) (to L.F.); French ‘Agence Nationale de la Recherche’ EpiBird grant [ANR-009-GENM-004, http://www.agence-nationale-recherche.fr/]. Funding for open access charge: INRA, UMR444 Laboratoire de Génétique Cellulaire, Castanet-Tolosan, F-31326, France. Conflict of interest statement. None declared.
  86 in total

1.  Allelic expression of IGF2 in marsupials and birds.

Authors:  M J O'Neill; R S Ingram; P B Vrana; S M Tilghman
Journal:  Dev Genes Evol       Date:  2000-01       Impact factor: 0.900

2.  On the detection of imprinted quantitative trait loci in experimental crosses of outbred species.

Authors:  Dirk-Jan de Koning; Henk Bovenhuis; Johan A M van Arendonk
Journal:  Genetics       Date:  2002-06       Impact factor: 4.562

3.  Avian genomics in the 21st century.

Authors:  D W Burt; S J White
Journal:  Cytogenet Genome Res       Date:  2007       Impact factor: 1.636

Review 4.  The evolution of genomic imprinting: costs, benefits and long-term consequences.

Authors:  Luke Holman; Hanna Kokko
Journal:  Biol Rev Camb Philos Soc       Date:  2013-10-28

Review 5.  The conflict theory of genomic imprinting: how much can be explained?

Authors:  Y Iwasa
Journal:  Curr Top Dev Biol       Date:  1998       Impact factor: 4.897

6.  Biases in Illumina transcriptome sequencing caused by random hexamer priming.

Authors:  Kasper D Hansen; Steven E Brenner; Sandrine Dudoit
Journal:  Nucleic Acids Res       Date:  2010-04-14       Impact factor: 16.971

7.  Parental genomic imprinting of the human IGF2 gene.

Authors:  N Giannoukakis; C Deal; J Paquette; C G Goodyer; C Polychronakos
Journal:  Nat Genet       Date:  1993-05       Impact factor: 38.330

8.  The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.

Authors:  Verena Heinrich; Jens Stange; Thorsten Dickhaus; Peter Imkeller; Ulrike Krüger; Sebastian Bauer; Stefan Mundlos; Peter N Robinson; Jochen Hecht; Peter M Krawitz
Journal:  Nucleic Acids Res       Date:  2011-11-29       Impact factor: 16.971

9.  The reality of pervasive transcription.

Authors:  Michael B Clark; Paulo P Amaral; Felix J Schlesinger; Marcel E Dinger; Ryan J Taft; John L Rinn; Chris P Ponting; Peter F Stadler; Kevin V Morris; Antonin Morillon; Joel S Rozowsky; Mark B Gerstein; Claes Wahlestedt; Yoshihide Hayashizaki; Piero Carninci; Thomas R Gingeras; John S Mattick
Journal:  PLoS Biol       Date:  2011-07-12       Impact factor: 8.029

10.  Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data.

Authors:  Jacob F Degner; John C Marioni; Athma A Pai; Joseph K Pickrell; Everlyne Nkadori; Yoav Gilad; Jonathan K Pritchard
Journal:  Bioinformatics       Date:  2009-10-06       Impact factor: 6.937

View more
  33 in total

Review 1.  Revisiting mechanisms and functions of prenatal hormone-mediated maternal effects using avian species as a model.

Authors:  Ton G G Groothuis; Bin-Yan Hsu; Neeraj Kumar; Barbara Tschirren
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2019-04-15       Impact factor: 6.237

2.  Whole-genome resequencing reveals loci with allelic transmission ratio distortion in F1 chicken population.

Authors:  Peng Ren; Feilong Deng; Shiyi Chen; Jinshan Ran; Jingjing Li; Lingqian Yin; Yan Wang; Huadong Yin; Qing Zhu; Yiping Liu
Journal:  Mol Genet Genomics       Date:  2021-01-06       Impact factor: 3.291

3.  Allelic expression of mammalian imprinted genes in a matrotrophic lizard, Pseudemoia entrecasteauxii.

Authors:  Oliver W Griffith; Matthew C Brandley; Katherine Belov; Michael B Thompson
Journal:  Dev Genes Evol       Date:  2016-03-04       Impact factor: 0.900

4.  Genome-wide identification of allele-specific expression in response to Streptococcus suis 2 infection in two differentially susceptible pig breeds.

Authors:  Huayu Wu; Uma Gaur; Supamit Mekchay; Xianwen Peng; Lianghua Li; Hua Sun; Zhongxu Song; Binke Dong; Mingbo Li; Klaus Wimmers; Siriluck Ponsuksili; Kui Li; Shuqi Mei; Guisheng Liu
Journal:  J Appl Genet       Date:  2015-03-04       Impact factor: 3.240

5.  Next-Generation Sequencing Techniques Reveal that Genomic Imprinting Is Absent in Day-Old Gallus gallus domesticus Brains.

Authors:  Qiong Wang; Kaiyang Li; Daixi Zhang; Junying Li; Guiyun Xu; Jiangxia Zheng; Ning Yang; Lujiang Qu
Journal:  PLoS One       Date:  2015-07-10       Impact factor: 3.240

6.  Matrilineal inheritance of a key mediator of prenatal maternal effects.

Authors:  Barbara Tschirren; Ann-Kathrin Ziegler; Joel L Pick; Monika Okuliarová; Michal Zeman; Mathieu Giraudeau
Journal:  Proc Biol Sci       Date:  2016-09-14       Impact factor: 5.349

7.  Imprinting analysis by droplet digital PCR coupled with locked nucleic acid TaqMan probes.

Authors:  Maiko Mitake; Shiori Hirano; Tatsuya Kishino
Journal:  Epigenetics       Date:  2020-09-24       Impact factor: 4.528

8.  Genome-Wide Characterization of RNA Editing in Chicken Embryos Reveals Common Features among Vertebrates.

Authors:  Laure Frésard; Sophie Leroux; Pierre-François Roux; Christophe Klopp; Stéphane Fabre; Diane Esquerré; Patrice Dehais; Anis Djari; David Gourichon; Sandrine Lagarrigue; Frédérique Pitel
Journal:  PLoS One       Date:  2015-05-29       Impact factor: 3.240

9.  Expanding Duplication of Free Fatty Acid Receptor-2 (GPR43) Genes in the Chicken Genome.

Authors:  Camille Meslin; Colette Desert; Isabelle Callebaut; Anis Djari; Christophe Klopp; Frédérique Pitel; Sophie Leroux; Pascal Martin; Pascal Froment; Edith Guilbert; Florence Gondret; Sandrine Lagarrigue; Philippe Monget
Journal:  Genome Biol Evol       Date:  2015-04-24       Impact factor: 3.416

Review 10.  Different yet similar: evolution of imprinting in flowering plants and mammals.

Authors:  Nuno D Pires; Ueli Grossniklaus
Journal:  F1000Prime Rep       Date:  2014-08-01
View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.