
Genomic structural differences between cattle and River Buffalo identified through comparative genomic and transcriptomic analysis.

Wenli Li, Derek M Bickhart, Luigi Ramunno, Daniela Iamartino, John L Williams, George E Liu.

Abstract

Water buffalo (Bubalus bubalis L.) is an important livestock species worldwide. Like many other livestock species, water buffalo lacks a high-quality, continuous reference genome assembly, which is required for fine-scale comparative genomics studies. In this work, we present a dataset that characterizes genomic differences between the water buffalo genome and the extensively studied cattle (Bos taurus taurus) reference genome. The dataset was obtained by aligning 14 river buffalo whole genome sequencing datasets to the cattle reference. It consists of 13,444 deletion CNV regions and 11,050 merged mobile element insertion (MEI) events, 568 of which lie within the upstream regions of annotated cattle genes. Gene expression data from cattle and buffalo are also presented for genes impacted by these regions. Public assessment of this dataset will allow for further analyses and functional annotation of genes that are potentially associated with phenotypic differences between cattle and water buffalo.


Year:  2018        PMID: 29892639      PMCID: PMC5993156          DOI: 10.1016/j.dib.2018.05.015

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

This data set presents the major genomic differences between cattle and river buffalo: copy number variation deletions (CNV-deletions) and mobile element insertions (MEIs).

Genes identified in this analysis provide the basis for further development of functional assays aimed at identifying genomic factors underlying phenotypic differences between cattle and buffalo.

Structural variants and genes identified in this study will facilitate the development of resources suitable for water buffalo genomic selection.

Data

Water buffalo (Bubalus bubalis L.) is a livestock species of high economic importance worldwide [1]. This study sought to characterize differences in gene content, regulation and structure between taurine cattle (2n = 60) and river buffalo (2n = 50), one extant type of water buffalo, using the extensively annotated UMD3.1 cattle reference genome as a basis for comparisons. Using 14 WGS datasets from river buffalo, we identified 13,444 deletion CNV regions (Supplemental Table 1) that were present in river buffalo but not identified in cattle. We also present 11,050 merged mobile element insertion (MEI) events (Supplemental Table 2) in river buffalo, of which 568 are within the upstream regions of annotated cattle genes. Furthermore, our tissue transcriptomics analysis provides expression profiles of genes impacted by the MEI (Supplemental Tables 3–6) and CNV (Supplemental Table 7) events identified in this study. The data include the genomic coordinates of the identified CNV-deletions and MEI events. Additionally, normalized read counts of impacted genes, along with the adjusted p-values from the statistical analysis, are presented (Supplemental Tables 3–6).

Experimental design, materials and methods

Data used and experimental design

Genomic DNA samples from river buffalo were provided by the International Water Buffalo Genome Consortium. Sequence data were generated at the USDA Agricultural Research Service (Beltsville) on an Illumina Genome Analyzer II. All sequencing data were submitted to NCBI (accession #PRJNA350833). Genomic sequencing reads from cattle were deposited to NCBI (accession #PRJNA277147). For whole transcriptome sequencing data, raw reads of river buffalo tissue transcriptomics were deposited to NCBI (accession #PRJEB4351). For cattle, we used RNA-seq data from the Angus breed (accession #PRJNA311009). This study used the extensively annotated UMD3.1 cattle reference genome as a basis for comparisons between river buffalo and cattle, by aligning whole genome shotgun sequencing reads from river buffalo to the cattle reference genome. To identify river buffalo-specific genomic variants, the CNV, SNP and MEI calls resulting from the cattle WGS reads were used as a background filter: variant sites previously identified in cattle were removed from the river buffalo dataset.
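The background filter described above can be illustrated with a minimal sketch. It assumes that a buffalo call is discarded whenever it overlaps a cattle call by at least one base; the source does not state the overlap criterion that was actually applied, and the coordinates, the `Call` tuple layout and the function names are hypothetical.

```python
from typing import List, Tuple

# A variant call as (chromosome, start, end). Coordinates are hypothetical
# 0-based, half-open intervals on the cattle reference.
Call = Tuple[str, int, int]


def overlaps(a: Call, b: Call) -> bool:
    """Return True when two calls share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_background(buffalo: List[Call], cattle: List[Call]) -> List[Call]:
    """Keep only buffalo calls that overlap no cattle (background) call.

    Assumption: any overlap is enough to discard a call; a reciprocal
    overlap threshold could be substituted here.
    """
    return [
        call for call in buffalo
        if not any(overlaps(call, background) for background in cattle)
    ]


# Hypothetical example calls, not taken from the supplemental tables.
buffalo_calls = [("chr1", 100, 500), ("chr1", 2000, 2600), ("chr2", 100, 500)]
cattle_calls = [("chr1", 400, 900)]
buffalo_specific = remove_background(buffalo_calls, cattle_calls)
print(buffalo_specific)
```

The nested scan is quadratic in the number of calls, which is adequate for illustration; for genome-scale call sets a sorted sweep or an interval tree would be the usual choice.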

Structural variant calling

To detect mobile element insertions (MEIs), RAPTR-SV [2] version 0.0.14 (run with default parameters) and RepeatMasker (http://www.repeatmasker.org/) were used. We focused on trans-chromosomal read pair alignments from RAPTR-SV's preprocess divet file format. RepeatMasker-generated tabular output from the cattle reference genome was used to determine candidate repetitive origins of trans-chromosomal reads. Using a custom Java program that selectively clusters trans-chromosomal read pairs and intersects them with repetitive elements (https://github.com/njdbickhart/MEIDivetID), we considered only discordant reads unlikely to consist of misaligned repetitive elements. To ensure that trans-chromosomal repetitive reads were not simply misalignments of local repeats to the wrong chromosome, the program searched for the nearest repetitive element of the same class (as determined by RepeatMasker) within 1 kb of the anchor read fragment. If none was found, the event was output as a putative MEI near the anchor read position, with the true event assumed to be downstream of the forward orientation of the anchor read and within a distance close to the average insert size of the sequence library. The Bedtools suite [6] was used to identify genes impacted by MEI events; both genes and their promoter regions were included when identifying intersections.

To identify copy number variations, cn.mops [3] version 3.5 and JaRMS [4], a Java language port of the CNVnator software package [5], were used. The Bedtools suite [6] and custom Perl scripts (https://github.com/njdbickhart/perl_toolchain) were used to find consensus calls between the JaRMS and cn.mops CNV sets. CNV deletions shared by both JaRMS and cn.mops were further intersected with cattle gene coordinates.
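Two of the filtering steps described above lend themselves to a short illustration: the 1 kb same-class repeat check applied to MEI candidates, and the consensus between two CNV callers. The sketch below is not the MEIDivetID or perl_toolchain code. The `Repeat` record, the function names, the example coordinates and the choice to report the overlapping portion of two deletion calls as their consensus are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Repeat(NamedTuple):
    """One annotated repeat; fields are a hypothetical subset of RepeatMasker output."""
    chrom: str
    start: int
    end: int
    rep_class: str  # e.g. "LINE" or "SINE"


def distance(pos: int, start: int, end: int) -> int:
    """Distance from a position to an interval; 0 if the position lies inside it."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


def is_putative_mei(anchor_chrom: str, anchor_pos: int, mate_class: str,
                    repeats: List[Repeat], window: int = 1000) -> bool:
    """True when no repeat of the mate's class lies within `window` bp of the anchor.

    A nearby same-class repeat suggests a misaligned local repeat rather
    than a new insertion, so such candidates are rejected.
    """
    for rep in repeats:
        if (rep.chrom == anchor_chrom and rep.rep_class == mate_class
                and distance(anchor_pos, rep.start, rep.end) <= window):
            return False
    return True


Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def consensus_deletions(calls_a: List[Interval],
                        calls_b: List[Interval]) -> List[Interval]:
    """Return the overlapping portions of deletions reported by both callers."""
    shared = []
    for chrom_a, start_a, end_a in calls_a:
        for chrom_b, start_b, end_b in calls_b:
            if chrom_a == chrom_b:
                start, end = max(start_a, start_b), min(end_a, end_b)
                if start < end:
                    shared.append((chrom_a, start, end))
    return shared


# Hypothetical inputs for illustration only.
repeats = [Repeat("chr1", 5000, 5300, "LINE"), Repeat("chr1", 20000, 20300, "SINE")]
near_same_class = is_putative_mei("chr1", 5800, "LINE", repeats)
near_other_class = is_putative_mei("chr1", 5800, "SINE", repeats)
shared_calls = consensus_deletions(
    [("chr3", 1000, 5000)],
    [("chr3", 3000, 8000), ("chr4", 1000, 2000)],
)
print(near_same_class, near_other_class, shared_calls)
```

In the first query a LINE element ends 500 bp from the anchor, so the candidate is rejected; in the second, the only SINE element is far outside the window, so the candidate is kept.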

Comparative gene expression analysis between cattle and river buffalo

RNA-sequencing reads from river buffalo (NCBI, PRJEB4351) and the Angus breed of cattle (NCBI, PRJNA311009) were used to compare the expression differences of genes impacted by MEI and CNV-deletions. For MEI-impacted genes, RNA-seq data from liver and muscle were used. For CNV-deletion impacted genes, analyses were performed for all the tissues for which we had RNA sequencing data. To avoid potential quantification bias introduced by sequencing depth, gene-level raw read counts obtained from STAR [7] were normalized by dividing them by a “per million reads” factor (the total number of raw read counts divided by 1,000,000). The resulting normalized read counts were then used for gene expression comparisons between cattle and river buffalo. SAM (significance analysis of microarrays) [8], [9], [10], [11] was used to calculate the statistical significance of gene expression differences in river buffalo compared to cattle (q-value cutoff < 0.05).
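The per-million normalization described above amounts to a single division per gene. A minimal sketch follows, assuming the raw counts for one library are held in a dictionary keyed by gene; the gene names and counts are hypothetical, and the function is not part of STAR.

```python
from typing import Dict


def per_million_normalize(raw_counts: Dict[str, int]) -> Dict[str, float]:
    """Divide each gene's raw count by (total raw counts / 1,000,000)."""
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("library has no counted reads")
    scaling_factor = total / 1_000_000  # the "per million reads" factor
    return {gene: count / scaling_factor for gene, count in raw_counts.items()}


# Hypothetical gene-level counts for one library.
sample = {"GENE_A": 150, "GENE_B": 350, "GENE_C": 500}
normalized = per_million_normalize(sample)
print(normalized)
```

After normalization the values of a library sum to one million (up to floating-point rounding), so libraries sequenced to different depths can be compared on the same scale.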

Specifications Table

Subject area: Biology
More specific subject area: Comparative genomics
Type of data: Tables
How data was acquired: Whole genome sequencing and whole transcriptome sequencing
Data format: Filtered and analyzed
Experimental factors: none
Experimental features: Comparative genomics between water buffalo and cattle
Data source location: none
Data accessibility: Tables are with this article. Raw read data of whole genome and transcriptome sequencing were deposited to NCBI Bioprojects as follows: PRJNA350833 (https://www.ncbi.nlm.nih.gov/bioproject/?term=350833), PRJNA277147 (https://www.ncbi.nlm.nih.gov/bioproject/?term=277147) and PRJEB4351 (https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB4351)
Related research article: Comparative sequence alignment reveals River Buffalo genomic structural differences compared with cattle

References (10 in total)

1.  Significance analysis of microarrays applied to the ionizing radiation response.

Authors:  V G Tusher; R Tibshirani; G Chu
Journal:  Proc Natl Acad Sci U S A       Date:  2001-04-17       Impact factor: 11.205

2.  A Survey and Comparative Study of Statistical Tests for Identifying Differential Expression from Microarray Data.

Authors:  Sanghamitra Bandyopadhyay; Saurav Mallik; Anirban Mukhopadhyay
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2014 Jan-Feb       Impact factor: 3.710

3.  CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.

Authors:  Alexej Abyzov; Alexander E Urban; Michael Snyder; Mark Gerstein
Journal:  Genome Res       Date:  2011-02-07       Impact factor: 9.043

4.  RAPTR-SV: a hybrid method for the detection of structural variants.

Authors:  Derek M Bickhart; Jana L Hutchison; Lingyang Xu; Robert D Schnabel; Jeremy F Taylor; James M Reecy; Steven Schroeder; Curt P Van Tassell; Tad S Sonstegard; George E Liu
Journal:  Bioinformatics       Date:  2015-02-16       Impact factor: 6.937

5.  STAR: ultrafast universal RNA-seq aligner.

Authors:  Alexander Dobin; Carrie A Davis; Felix Schlesinger; Jorg Drenkow; Chris Zaleski; Sonali Jha; Philippe Batut; Mark Chaisson; Thomas R Gingeras
Journal:  Bioinformatics       Date:  2012-10-25       Impact factor: 6.937

6.  Water buffalo genome science comes of age. (Review)

Authors:  Vanessa N Michelizzi; Michael V Dodson; Zengxiang Pan; M Elisabete J Amaral; Jennifer J Michal; Derek J McLean; James E Womack; Zhihua Jiang
Journal:  Int J Biol Sci       Date:  2010-06-17       Impact factor: 6.580

7.  BEDTools: a flexible suite of utilities for comparing genomic features.

Authors:  Aaron R Quinlan; Ira M Hall
Journal:  Bioinformatics       Date:  2010-01-28       Impact factor: 6.937

8.  cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate.

Authors:  Günter Klambauer; Karin Schwarzbauer; Andreas Mayr; Djork-Arné Clevert; Andreas Mitterecker; Ulrich Bodenhofer; Sepp Hochreiter
Journal:  Nucleic Acids Res       Date:  2012-02-01       Impact factor: 16.971

9.  Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

Authors:  Ujjwal Maulik; Saurav Mallik; Anirban Mukhopadhyay; Sanghamitra Bandyopadhyay
Journal:  PLoS One       Date:  2015-04-01       Impact factor: 3.240

10.  Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size.

Authors:  David L Oldeschulte; Yvette A Halley; Miranda L Wilson; Eric K Bhattarai; Wesley Brashear; Joshua Hill; Richard P Metz; Charles D Johnson; Dale Rollins; Markus J Peterson; Derek M Bickhart; Jared E Decker; John F Sewell; Christopher M Seabury
Journal:  G3 (Bethesda)       Date:  2017-09-07       Impact factor: 3.154

