SNP discovery in the bovine milk transcriptome using RNA-Seq technology.

Angela Cánovas, Gonzalo Rincon, Alma Islas-Trejo, Saumya Wickramasinghe, Juan F Medrano.

Abstract

High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. However, it also is an efficient way to discover coding SNPs. The objective of this study was to perform a SNP discovery analysis in the milk transcriptome using RNA-Seq. Seven milk samples from Holstein cows were analyzed by sequencing cDNAs using the Illumina Genome Analyzer system. We detected 19,175 genes expressed in milk samples corresponding to approximately 70% of the total number of genes analyzed. The SNP detection analysis revealed 100,734 SNPs in Holstein samples, and a large number of those corresponded to differences between the Holstein breed and the Hereford bovine genome assembly Btau4.0. The number of polymorphic SNPs within Holstein cows was 33,045. The accuracy of RNA-Seq SNP discovery was tested by comparing SNPs detected in a set of 42 candidate genes expressed in milk that had been resequenced earlier using Sanger sequencing technology. Seventy of 86 SNPs were detected using both RNA-Seq and Sanger sequencing technologies. The KASPar Genotyping System was used to validate unique SNPs found by RNA-Seq but not observed by Sanger technology. Our results confirm that analyzing the transcriptome using RNA-Seq technology is an efficient and cost-effective method to identify SNPs in transcribed regions. This study creates guidelines to maximize the accuracy of SNP discovery and prevention of false-positive SNP detection, and provides more than 33,000 SNPs located in coding regions of genes expressed during lactation that can be used to develop genotyping platforms to perform marker-trait association studies in Holstein cattle.

Substances:
DNA, Complementary

Year: 2010 PMID: 21057797 PMCID: PMC3002166 DOI: 10.1007/s00335-010-9297-z

Source DB: PubMed Journal: Mamm Genome ISSN: 0938-8990 Impact factor: 2.957

Introduction

Next-generation sequencing technologies have provided unprecedented opportunities for high-throughput functional genomic research, including gene expression profiling, genome annotation, small ncRNA discovery, and profiling and detection of aberrant transcription (Bentley 2006; Morozova and Marra 2008). Among these approaches, RNA sequencing (RNA-Seq) is a powerful new method for mapping and quantifying transcriptomes developed to analyze global gene expression in different tissues. Recently, this technique has also been used as an efficient and cost-effective method to systematically identify SNPs in transcribed regions in different species (Chepelev et al. 2009; Cirulli et al. 2010; Cloonan et al. 2008; Morin et al. 2008). RNA-Seq generates sequences on a very large scale at a fraction of the cost required for traditional Sanger sequencing, allowing the application of sequencing approaches to biological questions that would not have been economically or logistically practical before (Marguerat et al. 2008). Taking this into account, we applied this novel approach to identify SNPs in the expressed coding regions of the bovine milk transcriptome. The majority of gene expression analyses in the bovine mammary gland have been developed using a biopsy sample (Boutinaud and Jammes 2002; Finucane et al. 2008). An alternative sampling procedure has been proposed by isolating mRNA directly from somatic cells that are naturally released into milk during lactation (Boutinaud et al. 2002). Recently, Medrano et al. (2010), using the RNA-Seq technique, compared the milk and mammary gland transcriptomes and showed extensive similarities of gene expression in both tissues. In the present study we performed a SNP discovery analysis in milk transcriptome using RNA-Seq technology. For this purpose, seven milk samples from Holstein cows at different stages of lactation were analyzed by sequencing cDNA libraries using an Illumina GAII analyzer (Illumina, San Diego, CA) system. 
To evaluate the accuracy of SNPs detected with RNA-Seq, a comparison was made with SNPs detected in a set of 42 candidate genes expressed in milk that had been resequenced previously using Sanger sequencing technology. SNPs that were observed with only one technique were validated by the KASPar SNP Genotyping System.

Materials and methods

RNA-Seq library preparation

Seven milk samples were obtained from Holstein cows at two stages of lactation (day 15 and day 250). Milk samples were collected in 50-ml tubes 3 h after milking, kept on ice, and processed immediately for RNA extraction. Samples were centrifuged at 2000×g for 10 min to obtain a pellet of cells. Total RNA was purified following a Trizol protocol (Invitrogen, Carlsbad, CA), and mRNA was isolated and purified using an RNA-Seq sample preparation kit (Illumina). mRNA was fragmented and first- and second-strand cDNA were synthesized. After adapters were ligated to the ends of double-stranded cDNA, a 300-bp fragment size was selected by gel excision and each sample was individually sequenced on an Illumina GAII analyzer.

RNA-Seq analysis and SNP detection

Short sequence reads (36-40 bp) were assembled and mapped to the annotated bovine reference genome Btau4.0 (http://www.ncbi.nlm.nih.gov/genome/guide/cow/index.html) using CLC Genomics Workbench software (CLC Bio, Aarhus, Denmark). Sequencing reads for each of the seven samples were pooled to perform the RNA-Seq and SNP discovery analyses. We applied stringent criteria in order to reduce the rate of detection of false-positive SNPs. For the assembly procedure, the sequences were mapped to the consensus genome accounting for a maximum of two gaps or mismatches in each sequence. Reads were then classified as uniquely mapped reads and nonspecifically mapped reads as shown in Table 1. SNP detection was performed using the following quality and significance filters: (1) the minimum average quality of surrounding bases and minimum quality of the central base were set as 15 and 20 quality score units, respectively; (2) minimum coverage was set at ten reads; (3) minimum variant frequency or count was set at 20% or two read counts per SNP; and (4) SNPs located in read ends (last three bases) were not considered in the analysis due to possible sequencing errors.
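The four filters listed above can be illustrated with a minimal Python sketch. This is not the CLC Genomics Workbench implementation: the `Observation` structure, the function name, and the choice to compute coverage after the per-read quality and read-end filters are all assumptions made for illustration. The sketch also assumes that both the 20% frequency threshold and the two-read count threshold must hold, which is one possible reading of the text.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """One read's base call at a candidate position (hypothetical structure)."""
    base: str                # base called by this read
    base_quality: int        # quality score of the called (central) base
    flank_quality: float     # average quality of the surrounding bases
    dist_from_read_end: int  # 0 means the last base of the read

# Thresholds taken from the text.
MIN_CENTRAL_QUALITY = 20
MIN_FLANK_QUALITY = 15
MIN_COVERAGE = 10
MIN_VARIANT_FRACTION = 0.20
MIN_VARIANT_COUNT = 2
READ_END_MARGIN = 3  # calls in the last three bases of a read are ignored

def passes_filters(observations, ref_base):
    """Return True if a candidate position passes all four filters."""
    # Filters (1) and (4): keep only well-supported calls away from read ends.
    usable = [
        o for o in observations
        if o.base_quality >= MIN_CENTRAL_QUALITY
        and o.flank_quality >= MIN_FLANK_QUALITY
        and o.dist_from_read_end >= READ_END_MARGIN
    ]
    # Filter (2): minimum coverage (assumed here to be counted after filtering).
    coverage = len(usable)
    if coverage < MIN_COVERAGE:
        return False
    # Filter (3): minimum number and fraction of non-reference reads.
    variant_count = sum(1 for o in usable if o.base != ref_base)
    return (variant_count >= MIN_VARIANT_COUNT
            and variant_count / coverage >= MIN_VARIANT_FRACTION)
```

Under these assumptions, a position covered by twelve usable reads needs at least three non-reference reads to be called, because two of twelve falls below the 20% threshold.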

Table 1

Summary of mapping all the RNA-Seq reads to the reference genome (Btau4.0) obtained from seven pooled milk samples

	Uniquely mapped reads		Nonspecifically mapped reads		Mapped reads
	No. of reads	%	No. of reads	%	No. of reads	%
Total exon reads	59888107	83	12480476	17	72368583	87.5
Exon-exon reads^a	10961868	89	1313183	11	12275051	87.5
Total intron reads	9100877	88	1271041	12	10371918	12.5
Exon-intron reads^b	1475160	90	166238	10	1641398	12.5
Total gene reads	68988984	83	13751517	17	82740501	100

a Exon-exon reads: reads mapping to two contiguous exons. Number is included in total exon reads

b Exon-intron reads: reads mapping to an exon and a contiguous intron. Number is included in total intron reads

Sanger resequencing of target genes and SNP detection

Resequencing was performed in a DNA resource population specifically developed for SNP discovery as described by Rincon et al. (2007). This population consisted of eight Holstein animals that were unrelated at least three generations back in their pedigrees. Genomic sequences for 42 candidate genes that were expressed in milk samples were obtained from the Btau4.0 assembly and resequenced using Sanger sequencing technology. Exons and conserved noncoding regions were identified using multiple-species genome alignments with Genome VISTA (Couronne et al. 2003). Coding regions and the conserved noncoding regions of each gene were resequenced at SeqWright DNA Technology Services (Houston, TX) using Sanger sequencing technology. SNPs were analyzed using CodonCode aligner software (http://www.codoncode.com); gene sequences and SNPs were assembled and annotated in Vector NTI advance 10.1.1 software (Invitrogen, Carlsbad, CA).

SNP validation by the KASPar SNP genotyping system

The KASPar SNP Genotyping System (KBiosciences, Herts, UK) was used to validate SNPs detected by RNA-Seq and not detected by Sanger sequencing. For this purpose, 15 bovine DNA samples (8 cows used for Sanger resequencing and 7 cows used for RNA-Seq) were selected. Genomic DNA was extracted from 5 ml of cow’s blood following the protocol of the Gentra Puregene blood kit (Qiagen, Valencia, CA). KASPar assay primers (Table 2) were designed using the Primer Picker software available at http://www.kbioscience.co.uk/primer-picker.htm (KBiosciences). Genotyping assays were carried out with a 7500 Fast Real Time instrument (Applied Biosystems, Foster City, CA) in a final volume of 8 μl containing 4× Reaction Mix (KBiosciences), 120 nM of each allele-specific primer and 300 nM of common primer, 2.2 mM of MgCl2, and 2 mM KTaq polymerase (KBiosciences). The following thermal profile was used for all reactions: 15 min at 94°C; 20 cycles of 10 s at 94°C, 5 s at 57°C, and 10 s at 72°C; and 18 cycles of 10 s at 94°C, 20 s at 57°C, and 40 s at 72°C.

Table 2

KASPar primers used to validate SNP detected by RNA-Seq

Primer name	Sequence 5′ → 3′	Position^a
DDIT3_60417924_A	GAAGGTCGGAGTCAACGGATTGGACTTCAGCCTTTAATATTGGAGAAA	I2
DDIT3_60417924_T	GAAGGTGACCAAGTTCATGCTGGACTTCAGCCTTTAATATTGGAGAAT	I2
DDIT3_60417924	CCATGGGATTTTCCAGGCAAGAGTA	I2
INSIG2_73132468_C	GAAGGTCGGAGTCAACGGATTAAGCACTCTTATAGTCTGCATGACG	I8
INSIG2_73132468_T	GAAGGTGACCAAGTTCATGCTAAAGCACTCTTATAGTCTGCATGACA	I8
INSIG2_73132468	ATATCGTATCACAGTGTTGATGTGCCAAA	I8
STAT5A_43749704_G	GAAGGTGACCAAGTTCATGCTCGAGCACCGGGTCAGGGC	I20
STAT5A_43749704_A	GAAGGTCGGAGTCAACGGATTCGAGCACCGGGTCAGGGT	I20
STAT5A_43749704	GCAGGCCAGCTCCCTCTGATA	E20
STAT5A_43746587_C	GAAGGTCGGAGTCAACGGATTCGCCTGGAAGTTTGACTCTCC	E15
STAT5A_43746587_G	GAAGGTGACCAAGTTCATGCTCGCCTGGAAGTTTGACTCTCG	E15
STAT5A_43746587	GGAGTGTGGGCAATGCAGGGAA	I16
STAT5A_43741732_T	GAAGGTGACCAAGTTCATGCTCTCCGCCAACTTCTCACACCT	I18
STAT5A_43741732_A	GAAGGTCGGAGTCAACGGATTCTCCGCCAACTTCTCACACCA	I18
STAT5A_43741732	GGCCCTGGGGCTCGGGTT	E18

a Position according to exon/intron distribution in bovine gene

Results and discussion

Detecting genetic variation in pooled milk transcriptome reads by RNA-Seq

RNA-Seq analysis included 118 million reads, ranging from 36 to 40 bp in size, that were assembled and mapped to the annotated NCBI bovine whole-genome assembly (27,368 genes). An average of 17 million short-sequence reads was obtained for each individual sample. The median coverage for the exons was 38×. The analysis revealed that 82.7 million reads (~70%) were categorized as mapped reads (68.9 million were uniquely mapped reads and 13.7 million were nonspecifically mapped reads), while 35 million were unmapped reads (Table 1). Most of the uniquely mapped reads corresponded to total exon reads (87.5%), whereas a small fraction corresponded to total intron reads (12.5%; Table 1). Intron reads are expressed regions that are not annotated as exons in Btau4.0. RPKM (reads per kilobase per million mapped reads) (Mortazavi et al. 2008) values were used to identify the total number of genes that were expressed in the milk transcriptome. An RPKM threshold value of 0.3 was established in order to balance the number of false positives and false negatives as described in Bentley et al. (2008) and Ramskold et al. (2009). A total of 13,807 genes were selected with RPKM values greater than 0.3. For those genes with RPKM < 0.3, a detailed analysis was performed to determine the number of unique reads falling outside exon regions, which may represent either annotation errors or new exons not included in the current Btau4.0 genome. Using this strategy, 5,368 expressed genes/regions were found with more than ten unique reads. We detected 19,175 expressed genes in milk samples (70.06%) of the 27,368 total bovine annotated genes in the Btau4.0 genome assembly. The SNP detection analysis revealed 100,734 SNPs in the seven Holstein samples. Of these SNPs, 67,689 (67.2%) were homozygous, corresponding to differences between Holsteins and the Hereford bovine whole-genome assembly Btau4.0. 
This large number of SNPs that are fixed in Holstein for an allele different from that found in the Hereford reference genome requires further investigation. In some cases these SNPs may represent artifacts due to errors in the reference sequence or to misalignment of the short reads to the reference (see subsection “Validation of unique SNPs detected by RNA-Seq” below). It is also possible that some of the Holstein fixed SNPs in fact correspond to variants with a very low frequency, so that a large number of cows would be needed to detect the common Hereford variant. A total of 33,045 (32.8%) SNPs were polymorphic within Holsteins. Allele frequencies for these heterozygous SNPs were obtained for the pooled samples by counting the number of reads representing each allele. In summary, 1,849 SNPs had an allele frequency of 80/20, 5,511 SNPs had an allele frequency of 70/30, 15,411 SNPs had an allele frequency of 60/40, and 10,274 SNPs had an allele frequency of 50/50. Figure 1 represents the total number of SNPs per gene mapped to the bovine Btau4.0 genome assembly. SNPs that are different between the Hereford consensus sequence and that of Holstein are shown in red, and SNPs that are polymorphic in Holstein samples are in blue. We observed that SNPs in expressed regions are distributed along the entire genome, but there is an increased number of polymorphisms located at the extremes of the chromosomes, in the centromeric and telomeric regions. The pattern of SNP distribution in each chromosome is very similar between those that differentiate Holstein and Hereford and those SNPs that are polymorphic in Holsteins, suggesting that there are genomic regions that tend to accumulate a large number of SNPs. Interestingly, the collagen family genes in BTA1, BTA4, BTA12, and BTA19 showed the highest SNP count difference between the Holstein and the Hereford consensus sequences. 
A large amount of data was generated in this study; a detailed description of the SNPs is available from the authors upon request.
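The RPKM threshold used above to call a gene expressed can be sketched as follows. The formula is the standard definition from Mortazavi et al. (2008); the function names and the example gene (50 reads over 2,000 bp of exon sequence) are hypothetical, and using the total of 82,740,501 mapped reads from Table 1 as the denominator is an assumption, since the text does not state which read total was used for normalization.

```python
def rpkm(gene_reads, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    return gene_reads * 1e9 / (exon_length_bp * total_mapped_reads)

def is_expressed(gene_reads, exon_length_bp, total_mapped_reads, threshold=0.3):
    """Apply the RPKM > 0.3 cut-off described in the text."""
    return rpkm(gene_reads, exon_length_bp, total_mapped_reads) > threshold

# Hypothetical gene: 50 reads mapped to 2,000 bp of exon sequence.
TOTAL_MAPPED_READS = 82_740_501  # "Total gene reads" row of Table 1 (assumed denominator)
value = rpkm(50, 2_000, TOTAL_MAPPED_READS)
print(f"RPKM = {value:.3f}, expressed: {is_expressed(50, 2_000, TOTAL_MAPPED_READS)}")
```

Because RPKM divides by both exon length and sequencing depth, a fixed threshold such as 0.3 corresponds to different raw read counts for genes of different lengths.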

Fig. 1

Total number of SNPs per gene expressed in milk cells mapped to the Btau4.0 genome assembly. Red dots represent the number of SNPs per gene in coding regions that are different between Hereford and Holstein. Blue dots represent the SNPs per gene that are polymorphic in Holsteins

Accuracy of RNA-Seq technology for SNP detection

To analyze the accuracy of RNA-Seq technology for SNP detection, 42 genes highly expressed in milk and related to fatty acid synthesis and the growth hormone GH/IGF axis were resequenced using Sanger methodology. Nine genes did not show polymorphisms in exons by Sanger resequencing and were excluded from the SNP discovery and validation analyses: IGF1, IGFBP3, IGFBP4, MBTPS1, MBTPS2, NR3C1, PIAS1, STAT2, and STAT4. Eighty-six SNPs were detected in the remaining 33 candidate genes that exhibited variation in Holsteins. Seventy of 86 SNPs were also detected by RNA-Seq in 18 genes (Table 3). Of the 16 SNPs that were not detected by RNA-Seq, 6 were located in exons that were not expressed in milk samples and therefore no sequencing reads were found.
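The counts in the preceding paragraph can be turned into concordance fractions with a short calculation. The percentages below are derived here from the reported counts and are not figures stated by the authors; the function name and its `not_assayable` parameter are illustrative.

```python
def concordance(shared, total, not_assayable=0):
    """Fraction of reference SNPs recovered, optionally excluding
    SNPs that could not be observed (e.g. in unexpressed exons)."""
    return shared / (total - not_assayable)

SANGER_SNPS = 86           # SNPs found by Sanger resequencing
DETECTED_BY_BOTH = 70      # of those, also found by RNA-Seq
IN_UNEXPRESSED_EXONS = 6   # missed SNPs lying in exons with no reads

print(f"{concordance(DETECTED_BY_BOTH, SANGER_SNPS):.1%}")                        # 81.4%
print(f"{concordance(DETECTED_BY_BOTH, SANGER_SNPS, IN_UNEXPRESSED_EXONS):.1%}")  # 87.5%
```

The second figure restricts the comparison to the 80 Sanger SNPs located in exons for which RNA-Seq reads were available.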

Table 3

List of 70 SNPs in 18 genes validated in coding regions using RNA-Seq and Sanger sequencing

Gene	BTA	SNP location^a	Allele variation	Frequency
ADCY4 (Adenylate cyclase 4) ENSBTAG00000018419	10	21091039	A/T	50/50
		21093914	A/T	60/40
		21095171	A/T	60/40
		21096245	C/G	60/40
		21096645	T/A	60/40
CISH (Cytokine-inducible SH2-containing protein) ENSBTAG00000022622	22	50657856	G/T	50/50
		50659783	T/C	60/40
		50659810	C/T	60/40
DDIT3 (DNA damage-inducible transcript 3) ENSBTAG00000031544	5	60415461	C/G	60/40
		60416148	G/T	60/40
		60418030	G/A	60/40
FURIN (Trans Golgi network protease furin) ENSBTAG00000002939	21	21527200	C/G	50/50
		21528000	A/T	60/40
		21528001	T/G	60/40
		21528082	C/T	50/50
		21532215	C/A	60/40
IGF1R (Insulin-like growth factor 1 receptor) ENSBTAG00000021527	21	6862322	T/C	60/40
		6866036	T/C	60/40
		6868728	C/A	60/40
		6869964	A/C	50/50
IGFBP6 (Insulin-like growth factor-binding protein 6) ENSBTAG00000021467	5	29836177	A/T	50/50
INSIG1 (Insulin-induced gene 1) ENSBTAG00000001592	4	121362562	T/A	60/40
		121363743	C/A	60/40
		121363744	T/G	60/40
		121364402	A/C	50/50
		121364403	T/C	70/30
		121364574	T/G	60/40
		121364685	G/A	50/50
		121364911	A/G	60/40
		121365407	G/A	60/40
		121365607	T/C	60/40
		121365773	G/A	60/40
		121365941	G/A	60/40
		121366597	A/G	50/50
		121369136	G/A	60/40
		121369282	A/G	60/40
		121369843	G/A	60/40
		121370020	T/A	60/40
		121370021	T/A	50/50
		121371117	C/G	50/50
INSIG2 (Insulin-induced gene 2) ENSBTAG00000002112	2	73130467	C/T	50/50
NMI (N-myc (and STAT) interactor) ENSBTAG00000016219	2	47179315	C/G	60/40
PAPPA (Pregnancy-associated plasma protein-A) ENSBTAG00000004010	8	110792768	G/A	60/40
	8	110968532	C/T	60/40
SCAP (Sterol regulatory element-binding protein cleavage- activating protein) ENSBTAG00000015782	22	53546833	C/T	50/50
		53549091	C/G	50/50
		53552713	A/T	60/40
SOCS5 (Suppressor of cytokine signaling 5) ENSBTAG00000008987	11	30359073	T/A	60/40
	11	30360874	T/C	60/40
SREBF1 (Sterol regulatory element binding protein-1) ENSBTAG00000007884	19	35680267	G/A	60/40
		35680574	A/G	60/40
		35682842	A/G	60/40
		35683588	G/C	60/40
		35685082	A/T	60/40
SRPR (Signal recognition particle receptor subunit alpha) ENSBTAG00000014105	29	31200612	A/C	50/50
STAT1 (Signal transducer and activator of transcription 1) ENSBTAG00000007867	2	83382392	A/T	60/40
STAT3 (Signal transducer and activator of transcription 3) ENSBTAG00000021523	19	43780445	A/G	70/30
	19	43780740	C/A	60/40
STAT5A (Signal transducer and activator of transcription 5A) ENSBTAG00000009496	19	43729581	C/G	50/50
		43730210	G/A	60/40
		43730211	T/A	60/40
		43741509	G/A	60/40
		43743914	C/G	50/50
		43745596	C/G	60/40
		43748702	C/G	70/30
STAT5B (Signal transducer and activator of transcription 5B) ENSBTAG00000010125	19	43655236	A/C	50/50
STAT6 (Signal transducer and activator of transcription 6) ENSBTAG00000006335	5	60837392	A/G	60/40
		60837393	C/T	50/50
		60845709	G/T	60/40
		60845948	G/T	60/40

a SNP location is based on the bovine genome assembly Btau4.0

It is important to note that the samples used for RNA-Seq were different from those sequenced by the Sanger method, so we were not expecting a 100% concordance of the results. However, it is noteworthy that despite the difference in sample composition in the analysis, only ten SNPs observed in Sanger were not detected in RNA-Seq. On the other hand, five SNPs were observed in three genes, DDIT3 (DNA-damage-inducible transcript 3), INSIG2 (insulin-induced gene 2), and STAT5A (signal transducer and activator of transcription 5A), in RNA-Seq that were not detected by Sanger (Table 4). These SNPs were further validated using the KASPar SNP Genotyping System.

Table 4

Unique SNPs detected by RNA-Seq that were validated using the KASPar Genotyping System

Gene	Chromosome	SNP	Position	Frequency (%)	Confirmed^a
DDIT3	5	T/A	60417924	50.0/50.0	Yes
INSIG2	2	T/C	73132468	50.0/50.0	Yes
STAT5A	19	G/A	43749704	54.5/45.5	Yes
STAT5A	19	G/C	43746587	70.6/29.4	No
STAT5A	19	T/A	43741732	66.7/33.3	No

DDIT3 DNA damage inducible transcript 3, INSIG2 insulin-induced gene 2, STAT5A signal transducer and activator of transcription 5A

a SNP confirmed by KASPar SNP Genotyping System and Sanger resequencing

Validation of unique SNPs detected by RNA-Seq

In order to confirm the presence of the five SNPs uniquely found by RNA-Seq, they were genotyped using the KASPar SNP Genotyping System. Three of the five SNPs were validated, as shown in Table 4. Two SNPs in the STAT5A gene that failed with the KASPar assay were further examined with a detailed analysis of the sequence reads containing the putative SNPs. We observed that the corresponding 40-bp sequence that mapped to a STAT5A region had a 99% homology with the STAT5B gene. In the SNP discovery analysis we set the maximum number of mismatches to two. With this mismatch tolerance, reads that originate from one gene, such as STAT5B, can be assigned to STAT5A. This was not a common situation for most of the genes studied in this analysis, but it could represent a problem in gene families with highly conserved domains when using short sequence reads. In a similar study, Cirulli et al. (2010) observed that some false-positive SNPs identified in cDNA arose from alignment of a read to the wrong gene and that in these cases the correct gene and the gene chosen for the alignment always had very similar sequences. This situation has also been observed in regions associated with sequence repeats (Morozova and Marra 2008). Although the short-read structure of next-generation sequencers has some potential problems with respect to sequence assembly, the result is a system that generates accurate data and large coverage of consensus sequence and SNP calling at very high throughput and low cost (Thomas et al. 2006).
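The mis-assignment described above can be reproduced with a toy example. The two 40-bp sequences below are invented for illustration and are not the STAT5A or STAT5B sequences; the function names are hypothetical, and the ungapped scan is a simplification of what a real read mapper does.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def candidate_loci(read, references, max_mismatches=2):
    """Return {reference name: fewest mismatches} for every reference in
    which the read aligns (ungapped) within the mismatch tolerance."""
    hits = {}
    for name, seq in references.items():
        if len(seq) < len(read):
            continue  # read cannot be placed on a shorter reference
        best = min(
            hamming(read, seq[i:i + len(read)])
            for i in range(len(seq) - len(read) + 1)
        )
        if best <= max_mismatches:
            hits[name] = best
    return hits

# Two invented paralogous segments that differ at a single position.
references = {
    "paralog_a": "ACGTTGCAAGCTTAGGCTAACGGATCCGTTAGCATGCAAT",
    "paralog_b": "ACGTTGCAAGCTTAGGCTAATGGATCCGTTAGCATGCAAT",
}
read = references["paralog_b"]  # a read that truly comes from paralog_b

print(candidate_loci(read, references))                    # both paralogs accept the read
print(candidate_loci(read, references, max_mismatches=0))  # only paralog_b remains
```

With a tolerance of two mismatches the read is compatible with both paralogs. If it is assigned to `paralog_a`, the single paralog-specific difference appears as a putative SNP, which is the kind of false positive observed for STAT5A.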

Conclusion

We have demonstrated that analyzing the transcriptome using RNA-Seq technology is an efficient and cost-effective method to identify SNPs in transcribed regions. Stringent criteria have to be applied to maximize the accuracy and prevent false-positive SNP detection. This study provides a valuable resource of more than 33,000 SNPs located in coding regions of genes expressed during lactation that can be used for further gene variation analysis and association studies in Holstein cattle.

References (10 of 15 listed)

Review 1. Whole-genome re-sequencing.

Authors: David R Bentley
Journal: Curr Opin Genet Dev Date: 2006-10-18 Impact factor: 5.578

2. Sensitive mutation detection in heterogeneous cancer specimens by massively parallel picoliter reactor sequencing.

Authors: Roman K Thomas; Elizabeth Nickerson; Jan F Simons; Pasi A Jänne; Torstein Tengs; Yuki Yuza; Levi A Garraway; Thomas LaFramboise; Jeffrey C Lee; Kinjal Shah; Keith O'Neill; Hidefumi Sasaki; Neal Lindeman; Kwok-Kin Wong; Ana M Borras; Edward J Gutmann; Konstantin H Dragnev; Ralph DeBiasi; Tzu-Hsiu Chen; Karen A Glatt; Heidi Greulich; Brian Desany; Christine K Lubeski; William Brockman; Pablo Alvarez; Stephen K Hutchison; J H Leamon; Michael T Ronan; Gregory S Turenchalk; Michael Egholm; William R Sellers; Jonathan M Rothberg; Matthew Meyerson
Journal: Nat Med Date: 2006-06-25 Impact factor: 53.440

3. Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells.

Authors: Ryan D Morin; Michael D O'Connor; Malachi Griffith; Florian Kuchenbauer; Allen Delaney; Anna-Liisa Prabhu; Yongjun Zhao; Helen McDonald; Thomas Zeng; Martin Hirst; Connie J Eaves; Marco A Marra
Journal: Genome Res Date: 2008-02-19 Impact factor: 9.043

4. Onset of lactation in the bovine mammary gland: gene expression profiling indicates a strong inhibition of gene expression in cell proliferation.

Authors: Kiera A Finucane; Thomas B McFadden; Jeffrey P Bond; John J Kennelly; Feng-Qi Zhao
Journal: Funct Integr Genomics Date: 2008-02-08 Impact factor: 3.410

Review 5. Applications of next-generation sequencing technologies in functional genomics.

Authors: Olena Morozova; Marco A Marra
Journal: Genomics Date: 2008-08-24 Impact factor: 5.736

6. Mapping and quantifying mammalian transcriptomes by RNA-Seq.

Authors: Ali Mortazavi; Brian A Williams; Kenneth McCue; Lorian Schaeffer; Barbara Wold
Journal: Nat Methods Date: 2008-05-30 Impact factor: 28.547

7. Stem cell transcriptome profiling via massive-scale mRNA sequencing.

Authors: Nicole Cloonan; Alistair R R Forrest; Gabriel Kolle; Brooke B A Gardiner; Geoffrey J Faulkner; Mellissa K Brown; Darrin F Taylor; Anita L Steptoe; Shivangi Wani; Graeme Bethel; Alan J Robertson; Andrew C Perkins; Stephen J Bruce; Clarence C Lee; Swati S Ranade; Heather E Peckham; Jonathan M Manning; Kevin J McKernan; Sean M Grimmond
Journal: Nat Methods Date: 2008-05-30 Impact factor: 28.547

8. Strategies and tools for whole-genome alignments.

Authors: Olivier Couronne; Alexander Poliakov; Nicolas Bray; Tigran Ishkhanov; Dmitriy Ryaboy; Edward Rubin; Lior Pachter; Inna Dubchak
Journal: Genome Res Date: 2003-01 Impact factor: 9.043

9. Accurate whole human genome sequencing using reversible terminator chemistry.

Authors: David R Bentley; Shankar Balasubramanian; Harold P Swerdlow; Geoffrey P Smith; John Milton; Clive G Brown; Kevin P Hall; Dirk J Evers; Colin L Barnes; Helen R Bignell; Jonathan M Boutell; Jason Bryant; Richard J Carter; R Keira Cheetham; Anthony J Cox; Darren J Ellis; Michael R Flatbush; Niall A Gormley; Sean J Humphray; Leslie J Irving; Mirian S Karbelashvili; Scott M Kirk; Heng Li; Xiaohai Liu; Klaus S Maisinger; Lisa J Murray; Bojan Obradovic; Tobias Ost; Michael L Parkinson; Mark R Pratt; Isabelle M J Rasolonjatovo; Mark T Reed; Roberto Rigatti; Chiara Rodighiero; Mark T Ross; Andrea Sabot; Subramanian V Sankar; Aylwyn Scally; Gary P Schroth; Mark E Smith; Vincent P Smith; Anastassia Spiridou; Peta E Torrance; Svilen S Tzonev; Eric H Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D Alam; Carole Anastasi; Ify C Aniebo; David M D Bailey; Iain R Bancarz; Saibal Banerjee; Selena G Barbour; Primo A Baybayan; Vincent A Benoit; Kevin F Benson; Claire Bevis; Phillip J Black; Asha Boodhun; Joe S Brennan; John A Bridgham; Rob C Brown; Andrew A Brown; Dale H Buermann; Abass A Bundu; James C Burrows; Nigel P Carter; Nestor Castillo; Maria Chiara E Catenazzi; Simon Chang; R Neil Cooley; Natasha R Crake; Olubunmi O Dada; Konstantinos D Diakoumakos; Belen Dominguez-Fernandez; David J Earnshaw; Ugonna C Egbujor; David W Elmore; Sergey S Etchin; Mark R Ewan; Milan Fedurco; Louise J Fraser; Karin V Fuentes Fajardo; W Scott Furey; David George; Kimberley J Gietzen; Colin P Goddard; George S Golda; Philip A Granieri; David E Green; David L Gustafson; Nancy F Hansen; Kevin Harnish; Christian D Haudenschild; Narinder I Heyer; Matthew M Hims; Johnny T Ho; Adrian M Horgan; Katya Hoschler; Steve Hurwitz; Denis V Ivanov; Maria Q Johnson; Terena James; T A Huw Jones; Gyoung-Dong Kang; Tzvetana H Kerelska; Alan D Kersey; Irina Khrebtukova; Alex P Kindwall; Zoya Kingsbury; Paula I Kokko-Gonzales; Anil Kumar; Marc A Laurent; Cynthia T Lawley; Sarah E Lee; Xavier Lee; 
Arnold K Liao; Jennifer A Loch; Mitch Lok; Shujun Luo; Radhika M Mammen; John W Martin; Patrick G McCauley; Paul McNitt; Parul Mehta; Keith W Moon; Joe W Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M Novo; Michael J O'Neill; Mark A Osborne; Andrew Osnowski; Omead Ostadan; Lambros L Paraschos; Lea Pickering; Andrew C Pike; Alger C Pike; D Chris Pinkard; Daniel P Pliskin; Joe Podhasky; Victor J Quijano; Come Raczy; Vicki H Rae; Stephen R Rawlings; Ana Chiva Rodriguez; Phyllida M Roe; John Rogers; Maria C Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K Roth; Natalie J Rourke; Silke T Ruediger; Eli Rusman; Raquel M Sanches-Kuiper; Martin R Schenker; Josefina M Seoane; Richard J Shaw; Mitch K Shiver; Steven W Short; Ning L Sizto; Johannes P Sluis; Melanie A Smith; Jean Ernest Sohna Sohna; Eric J Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L Tregidgo; Gerardo Turcatti; Stephanie Vandevondele; Yuli Verhovsky; Selene M Virk; Suzanne Wakelin; Gregory C Walcott; Jingwen Wang; Graham J Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C Mullikin; Matthew E Hurles; Nick J McCooke; John S West; Frank L Oaks; Peter L Lundberg; David Klenerman; Richard Durbin; Anthony J Smith
Journal: Nature Date: 2008-11-06 Impact factor: 49.962

10. Next-generation sequencing: applications beyond genomes.

Authors: Samuel Marguerat; Brian T Wilhelm; Jürg Bähler
Journal: Biochem Soc Trans Date: 2008-10 Impact factor: 5.407
