Literature DB >> 30395234

Genome assembly and gene expression in the American black bear provides new insights into the renal response to hibernation.

Anuj Srivastava¹, Vishal Kumar Sarsani¹, Ian Fiddes², Susan M Sheehan¹, Rita L Seger³, Mary E Barter¹, Selena Neptune-Bear¹, Charlotte Lindqvist⁴, Ron Korstanje¹.

Abstract

The prevalence of chronic kidney disease (CKD) is rising worldwide and 10-15% of the global population currently suffers from CKD and its complications. Given the increasing prevalence of CKD there is an urgent need to find novel treatment options. The American black bear (Ursus americanus) copes with months of lowered kidney function and metabolism during hibernation without the devastating effects on metabolism and other consequences observed in humans. In a biomimetic approach to better understand kidney adaptations and physiology in hibernating black bears, we established a high-quality genome assembly. Subsequent RNA-Seq analysis of kidneys comparing gene expression profiles in black bears entering (late fall) and emerging (early spring) from hibernation identified 169 protein-coding genes that were differentially expressed. Of these, 101 genes were downregulated and 68 genes were upregulated after hibernation. Fold changes ranged from 1.8-fold downregulation (RTN4RL2) to 2.4-fold upregulation (CISH). Most notable was the upregulation of cytokine suppression genes (SOCS2, CISH, and SERPINC1) and the lack of increased expression of cytokines and genes involved in inflammation. The identification of these differences in gene expression in the black bear kidney may provide new insights in the prevention and treatment of CKD.

Entities: CellLine Chemical Disease Gene Species

Keywords: RNA-Seq; black bear; genome assembly; hibernation; kidney

Mesh：

Substances：

Year: 2019 PMID： 30395234 PMCID： PMC6379037 DOI： 10.1093/dnares/dsy036

Source DB: PubMed Journal: DNA Res ISSN： 1340-2838 Impact factor: 4.458

1. Introduction

The prevalence of chronic kidney disease (CKD) is rising worldwide and currently 10–15% of the global population suffer from CKD and its devastating complications. Although the adult human kidney has some ability to recover from acute kidney injury through cellular proliferation of the damaged intrarenal tissues, regenerating nephrons through de novo nephron development is considered unlikely, as the formation of new nephrons in humans is terminated at the embryonic stages. However, renal regeneration through nephron neogenesis in the event of renal injury has been described in some fish species and the possibility of this happening in mammalian species has not been excluded. Members of the bear family (Ursidae) might be such species and studying how they deal with periods of decreased kidney function during hibernation (biomimicry) could be a new approach to understand kidney disease and develop new treatments. The American black bear hibernates for up to 7 months annually. During this period, they do not eat, drink, urinate, or defecate. Bear hibernation is a state similar to prolonged sleep during which body temperature is reduced by 1–8°C, there is a 20–50% reduction in metabolic rate with a depressed heart rate, and the volume of urine produced is reduced by 95%. The small volumes of urine and urea that enter the bladder during hibernation are reabsorbed across the bladder epithelium, and the urea is recycled for production of new protein. Throughout hibernation the kidney continues to concentrate urine and produces renin, erythropoietin, and vitamin D 1-α-hydroxylase. Hibernating bears have the ability to prevent azotemia (high levels of nitrogen-containing compounds in the blood, common in human patients with renal function), but the mechanism is unknown. Understanding these processes could lead to creating novel therapies for treating human conditions related to resistance to the complications of CKD and recovery from acute kidney injury. The studies aforementioned suggest unique kidney features in the American black bear allowing them to endure lower functioning during hibernation and recovery soon after hibernation. They are likely in part encoded in the genome sequence and gene expression patterns unique to the bear. To address this, we have performed high-throughput sequencing of genomic DNA and RNA isolated from kidneys of wild black bears. We generated a de novo assembly and annotation of the complete genome and compared transcription profiles of kidneys collected in the spring (within weeks after hibernation) and in the fall (before hibernation).

2. Materials and methods

2.1. Sample collection, library preparation and sequencing

Bear samples were obtained by hunters during the hunting seasons in Maine. Hunters were asked to participate on a voluntary base and no bears were killed for the specific purpose of this study. All methods were carried out in accordance with relevant guidelines and regulations. DNA was isolated using the DNeasy Blood & Tissue Kit (Qiagen). The whole-genome library was prepared using the KAPA Hyper Prep Kit (Kapa Biosystems, Inc., Wilmington, MA) with a bead-based size selection to select for inserts with an average size of 400 bp and 6 cycles of PCR. Sequencing was done on two 2 × 125 bp Illumina 2500 lanes. The mate pair library was prepared using the Illumina Nexera Mate Pair Kit (Illumina, San Diego, CA, USA) with a gel-based size selection to select for inserts with an average size of 10 kb and 14 cycles of PCR. Sequencing was done on two 2 × 100 bp Illumina HiSeq 2500 lanes. The PacBio library was prepared using the Pacific Biosciences SMRTbell Template Prep Kit 1.0 (Pacific Biosciences, Menlo Park, CA, USA) using the ‘20-kb Template Preparation Using BluePippin Size-Selection System (15-kb Size cutoff)’ protocol obtained from PacBio SampleNet. The BluePippin was set to collect from 7 to 50 kb. Sequencing was done on 23 PacBio SMRT cells. For RNA-Seq, hunters collected kidneys and stored them in liquid nitrogen. Kidneys were first homogenized in Trizol (Invitrogen) and an aliquot was used for RNA isolation using the miRNeasy Mini kit (Qiagen), according to manufacturers’ protocols, including the optional DNase digest step. Quality was assessed using an Agilent 2100 Bioanalyzer instrument and RNA 6000 Nano LabChip assay (Agilent). We prepared total RNA for sequencing using the Illumina TruSeq methodology (TruSeq Stranded Total RNA LT Sample Prep Kit with Ribo-Zero Gold). The first step involves the removal of ribosomal RNA (rRNA) using biotinylated, target-specific oligos, combined with Ribo-Zero rRNA removal beads. After individual samples were bar-coded they were pooled and sequencing was done on two 2 × 75 bp Illumina NextSeq 2500 lanes. All raw data have been submitted to the NCBI’s Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra) under accession number SRP075217.

2.2. Sequence assembly

All data were subjected to quality control check and samples with base qualities greater ≥30 over 70% of read length were used in the downstream analysis. KmerFreq_HA v2.0 and Corrector_HA v2.01 tools of SOAPec_v2.01 were further used to perform K-mer frequency generation and error correction of paired-end and mate-pair data, respectively. SOAPdenovo-127mer (v2.04) was used to perform the contig and scaffold assembly from paired end and mate-pair libraries (avg_ins = 350 and 10 K for paired-end and mate-pair, respectively). After the assembly, gaps in scaffolds were closed with the GapCloser tool (v1.12) (with −l 125) of the soap module. Smaltmap, perfectfrombam and pipeline tools of the reapr module (v 1.0.17) was used to recognize the errors in the assembly by re-mapping of paired-end and mate-pair data to the de novo assembled genome. The assembly was broken at potential misassembled points. The broken assembly was further used as an input for the jelly tool of the PBSuite (v15.2.20) with blasr (v1.3.1) at parameters [—minMatch 8—minPctIdentity 80—bestn 1—nCandidates 20—maxScore—500]. This tool was used to upgrade the existing Illumina assembly with low-pass PacBio data. Repeat masking was performed by downloading the Repbase TE library from the repbase server (http://www.girinst.org/server/RepBase/). To identify known TE elements, we used repeat masker and repeat protein mask software in the Repeat Masker package (www.repeatmasker.org), which identifies TEs by aligning the genome sequence to a defined TE database. Tandem repeats were predicted using TRF by using the default parameters ‘Match = 2, Mismatch = 7, Delta = 7, PM = 80, PI = 10, Minscore = 50, and MaxPeriod = 12’. The completeness of the assembly was estimated by using CEGMA. The assembly was screened against a collection of 248 universal eukaryotic single-copy genes. Core eukaryotic gene datasets were downloaded and a blast database was made from the assembly before running CEGMA. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession LZNR00000000. Version LZNR01000000 is described in this work. It is also available at ftp://ftp.jax.org/maine_blackbear_project/.

2.3. Annotation

The black bear genome assembly was annotated using Comparative Annotation Toolkit (CAT—https://github.com/ComparativeGenomicsToolkit/Comparative-Annotation-Toolkit). CAT uses whole genome alignments generated by progressiveCactus to project gene annotations from a high quality reference genome on to one or more target genomes. This process leverages previously curated annotation sets to rapidly construct a set of orthologues in the target genomes. After transcript projection, the protein coding projections are provided to the ab-initio gene finding tool AUGUSTUS with additional extrinsic information derived from RNA-Seq. AUGUSTUS then enforces a coding gene model that allows for shifts in splice sites to maintain frame. This process can also rescue exons that did not align in the whole genome alignment. CAT also performs true ab-initio gene prediction by using a new method of running AUGUSTUS called Comparative Augustus, or AugustusCGP. This parameterization simultaneously predicts genes in all species in a progressiveCactus alignment, using RNA-Seq data in one or more species to help guide the annotation process. After projection, clean-up and ab-initio prediction, CAT combines these separate transcript sets into one final gene set through a consensus finding process. For orthologous transcripts, the transcript with the best fidelity to the reference with the best extrinsic support is chosen. In the case where multiple paralogous projections are found, CAT selects the most likely orthologue through a combination of splice junction fidelity, synteny and alignment identity. Finally, ab-initio predictions from AugustusCGP are evaluated for providing new information. If significant overlap with an orthologous projection is found, and the transcript provides a new splice junction or exon, then it is included as a new isoform of the orthologue. Otherwise, the locus is considered a candidate for a novel gene, often a gene family expansion. In addition to an annotation set on target genomes, CAT produces a UCSC Assembly Hub. This assembly hub has tracks for the raw transMap projections, the post-filtering projections, the various modes of AUGUSTUS employed, the final consensus annotation set, as well as the input RNA-Seq information including a filtered splice junction track. All annotations are stored in a modified bigBed format with a wide variety of various additional features annotated on each entry, which can be accessed from the details page for that entry. Additional features include RNA-Seq support for specific splice junctions and binary classifiers such as having a frame-shifting indel relative to the source transcript. In addition to the genome-specific tracks, an alignment track (snake track) shows the cactus alignment between the current species and other species. Browsing this assembly hub provides the opportunity to visualize the relationship between all aligned species and the various ways the transcript projections were filtered and combined with ab-initio predictions. Assembly hub can be visualized by visiting UCSC ‘Track Data Hubs’ (https://genome.ucsc.edu/index.html) and then adding URL ftp://ftp.jax.org/maine_blackbear_project/assemblyHub/hub.txt under the ‘My Hubs’ tab. For the black bear project, the cactus alignment generated contained black bear, horse (equCab2), dog (canFam3), polar bear (GCA_000687225.1), elephant (loxAfr3), human (hg38), and mouse (mm10) using the following guide phylogenetic tree: ((Human: 0.145908, Mouse: 0.356483)human_mouse_anc: 0.020593,(Horse: 0.109397,(Dog: 0.052458,(Polar_bear: 0.01, Maine_black_bear: 0.01)bear_anc: 0.08)dog_bear_anc: 0.069845)horse_dog_bear_anc: 0.043625)root; CAT was run using the Dog Ensembl V87 annotation set as the reference. RNA-Seq was obtained from SRA for dog (SRR2960309, SRR2960315, SRR2960317, SRR2960319, SRR2960320, SRR2960321, SRR2960328, SRR3727716, SRR3727717, SRR3727718, SRR3727719, SRR3727720, SRR3727722, SRR3727723, SRR3727724, SRR3727725, SRR5018836) and for polar bear (SRR950076, SRR950074) and aligned to their respective genomes. In addition, RNA-Seq was generated for Maine black bear as part of this project.

2.4. Gene expression analysis and RNA editing

Samples were aligned to the soft masked Bear Assembly using STAR aligner (v2.5.3) (—sjdbOverhang 75—quantMode GeneCounts—twopassMode Basic) with known annotation. STAR provided the expression count matrix was used for differential expression analysis with DESeq2. The genes with FDR <= 0.05 were considered as differentially expressed between spring and fall Samples. STAR generated bam were further processed by picard MarkDuplicates and GATK SplitNCigarReads to remove the duplicates and split reads into exonic segments and hard clip any reads hanging into the intronic region, respectively. The final bam of all 12 samples were used to perform the join variant calling by HaplotypeCaller (v3.4-0). The output of HaplotypeCaller was further processed to identify the sites showed alternate allele in all 12 samples or that were genotype as reference in all six samples of one season and as alternate allele in the other season. We further filtered variants to set of canonical editing sites (and their reverse complementary). RNA editing prediction tool-REDItools was used to confirm the filtered sites, strand identification and its count was used as final count to estimate the editing frequency. These sites were subjected to confirmation in DNA-seq data generated for six samples. The whole genome sequence data of these bears were subjected to quality trimming and aligned by bwa-mem (v0.7.9a). The alignments were converted to bam and processed by Picard-MarkDuplicates to remove the duplicates from the data. Afterwards, the Genome Analysis tool kit module IndelRealigner was used to pre-process the alignments. The realigned bam file was processed to identify the coverage at potential editing sites. The sites with at least one DNA-seq sample support were included in final set for experimental validation. Gene expression profiles from embryonic mouse kidneys were downloaded from NCBI GEO (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi? acc=GSE3808). Six samples representing three time points (two replicates each) were used in the analysis. For the analysis, E12.5, E13.5, and E16.5 were considered early & late age samples, respectively. GEO samples accessions of array data are GSM87387, GSM87388, GSM87389, GSM87390, GSM87391, and GSM87392. Samples were normalized using the affy package and GenBank accession for the probs (Affymetrix Mouse Expression 430A Array) were downloaded from GEO (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi? acc=GPL339). We extracted the MGI Gene IDs corresponding to GenBank accession using mouse genome informatics database (http://www.informatics.jax.org), performed the principal component analysis (PCA) on the mouse array data and created a PCA biplot (Supplementary Table S10). To map the differentially expressed (DE) bear genes on the PCA biplot, we first extracted the corresponding dog gene Ensembl id from the bear generic feature format file (gff) and extracted the Ensembl id of mouse gene orthologues of dog from biomart. Only genes with homology type: ‘ortholog_one2one’, orthology confidence: high and Query: target & target: query identity (≥50%) between dog and mouse were selected (153 genes remaining at the end) for mapping. Finally, we extracted the MGI:ID for the mouse ensembl id and mapped it on the mouse PCA biplot to create an eggogram (Fig. 5) and performed an analysis for significance (Supplementary Fig. S4).

Figure 5

The mouse kidney developmental expression profile at E12.5–E16.5 in their first two temporal principal components (PC1-2) with the mouse orthologues of genes that are downregulated after hibernation in blue and genes that are upregulated in red. Color figure is available at online version.

3. Results

3.1. Sequencing and assembly of the black bear genome

The ability to explore the black bear genome for unique features and to facilitate gene expression analysis depends on an assembled and well-annotated genome. High-throughput sequencing of the black bear genome has been previously reported, but only using Illumina short-read sequencing to a ∼30× average depth of coverage. We improved this coverage by extracted DNA from a single male Maine black bear (JAX001) and sequencing it using three methods: paired-end sequencing and mate-pair sequencing using the Illumina HiSeq 2500 platform and single molecule sequencing using the PacBio RSII system (Supplementary Tables S2 and S3). Raw reads were assembled into 113,759 scaffolds and contigs ≥1,000 bases with an N50 length of 190 kb, totalling 2.59 Gb in length (Supplementary Table S4). This is slightly larger than the estimated size of the panda bear assembly (2.46 Gb) and the polar bear assembly (2.53 Gb). We estimated the quality of the assembled sequence by mapping all paired-end reads back to the assembled genome with Burrows–Wheeler Aligner to determine the mappability and median coverage of the assembled genome. Approximately 90% of reads mapped back to the assembly with a mapping quality of ≥30 and with a median coverage of 43×. The peak sequencing depth was 50×, and more than 20 reads covered over 85% of the assembled sequences (Fig. 1a). The completeness of our assembly was estimated using CEGMA by screening against 248 highly conserved core eukaryotic genes. Our black bear assembly covers 212 out of 248 genes completely and 241 out of 248 genes partially. Figure 1b shows this 85.5% completeness compared with 88.7% for the panda bear assembly and 90.7% for the polar bear assembly, which are all comparable. GC content in mammals is correlated with a number of genomic features that are functionally relevant, for example, gene density, transposable element distribution, and methylation rate. The GC-content distribution in the black bear covers a narrower range compared with the panda bear and the polar bear (Fig. 1c). This might be a consequence of the lower amount of repetitive sequences that is present in the black bear genome (Fig. 2). Thus, we were able to establish a genome assembly of good coverage and quality for the black bear, comparable with polar bear and panda bear.

Figure 1

Sequencing depth and comparison of several characteristics of the black bear, panda, and polar bear genome assemblies. (A) Distribution of sequencing depth of the assembled genome (B) Completeness and contiguity of the assembly was estimated using CEGMA by screening against 248 universal eukaryotic single-copy genes. Results for the black bear (85.5% completeness) were comparable with the panda (88.7%) and polar bear (90.7%). (C) Density plot of the %GC content in the bear genomes comparing the black bear (solid line) with the panda (dashed line) and polar bear (dotted line). The narrower distribution in the black bear might indicate fewer repetitive sequences.

Figure 2

Comparison of different types of repeats between the black bear and several mammalian species shows a lower number of repeat sequences in the black bear.

3.2. Annotation of the black bear genome

Similar to the panda genome, our black bear genome assembly was annotated using the Comparative Annotation Toolkit (CAT). CAT identified 29,624 genes (18,091 coding) in our black bear assembly, representing 88% of the genes present in the dog annotation. In addition, 2,995 transcripts with at least 2 splices were predicted by AugustusCGP, which also had at least one intron junction supported by RNA-Seq and not supported by transMap. Of these, 1,730 were associated with a transMapped gene and assigned as a novel isoform of that gene. The remainder were evaluated for being putatively novel loci. Six hundred and fifty-three loci were identified as being possibly paralogous, which is defined as overlapping a paralogous transMap projection, which was dropped in paralogue resolution. Twenty-eight loci were identified as possible false fusions, defined as having >80% overlap with more than one transMap projection. Four hundred and fifty-eight loci were identified as poor alignments, which are predictions in regions, which have alignment to a known gene but did not pass transMap filtering. This left 126 putatively novel loci. Of the 22,081 orthologous protein coding transcripts identified, pairwise codon-aware alignments of coding sequences identified frame-shifting indels in 7,848 transcripts. A slight enrichment of frame-shifting deletions was seen relative to insertions, suggesting a systematic bias in the assembly process.

3.3. High-throughput sequencing of renal RNA and differential expression between fall and spring

Kidneys were collected in the fall (animals JAX118, JAX134, JAX141, JAX146, JAX160, and JAX165), before hibernation, and in the spring (animals JAX101, JAX 103, JAX106, JAX113, JAX136, and JAX140), shortly after the bears emerged from hibernation (three males and three females for each time point). The timing of sample collection was (in part) driven by practicality. It was only possible to collect samples during a few specific periods, which were the late fall, before hibernation, and early spring, approximately 1 or 2 weeks after the animals come out of hibernation. Approximately 60 million RNA-Seq reads were obtained from each sample before quality control (Supplementary Table S5). After appropriate quality control and correction for batch effect, we performed a PCA (Supplementary Fig. S1a) and determined that sample 101 (from a male in the spring) was an outlier (see Supplementary Fig. S3 and Table S9). After removing this sample and repeating the analysis (Supplementary Fig. S1b) we compared the spring samples to the fall samples. We identified 169 differentially expressed protein-coding genes with an adjusted P-value below 0.05, of which CISH, SLCO1C1, and NDST3 showed a >2-fold change (Fig. 3 and Supplementary Table S6). Of these, 101 genes were downregulated and 68 genes were upregulated after hibernation. Fold changes ranged only from 1.8-fold downregulation (RTN4RL2) to 2.4-fold upregulation (CISH). Figure 4 shows a heatmap for all 11 bears with the 50 most significant genes (Supplementary Fig. S2 shows all 169 genes). Pathway enrichment analysis (Ingenuity Pathway Analysis) of the 169 genes did not identify any specific pathways. However, most notable was the upregulation of three cytokine suppression genes (SOCS2, CISH, and SERPINC1) and the lack of increased cytokine expression (e.g. IL6, CCL2, CCL6) and damage markers (LCN2 and HAVCR1) normally seen in lower functioning or recovering kidneys of other species.

Figure 3

Volcano plot of the differential gene expression between spring and fall with CISH, SLC01C1, and NDST3 being upregulated >2-fold in the spring compared with the fall.

Figure 4

Cluster analysis of the gene expression profiles showing the 50 genes with the lowest P-value.

Volcano plot of the differential gene expression between spring and fall with CISH, SLC01C1, and NDST3 being upregulated >2-fold in the spring compared with the fall. Cluster analysis of the gene expression profiles showing the 50 genes with the lowest P-value. Because of the lack in inflammatory response, we asked whether the differentially expressed genes matched a particular development stage of the kidney. We used a method previously employed to place human cancers on a developmental landscape. First, we obtained gene expression profiles from mouse kidneys at different developmental stages (from E12.5 through E16.5) that were deposited in NCBI GEO and plotted the developmental expression profile in their first two temporal principal components (PC1-2). The high-dimensional expression profiles are simplified into a developmental timeline, ordering the genes in a linear array based on their temporal pattern of expression. Early genes are localized on the left end, genes with no bias towards early or late expression centre in the middle and late genes localize the right end. Thus, the unique order of genes on the timeline represents a summary of early and late states for each developmental process. Next, we matched the differentially expressed bear genes with their mouse orthologues (153 out of 169 genes) and plotted their location on the developmental timeline for the genes that were expressed in the developing kidney (109 genes). We observed clustering of downregulated genes with the early mouse genes (E12.5) and clustering of upregulated genes with the late mouse genes (16.5) (Fig. 5). The processes that take place in the E12.5 kidney are focused on progenitor cell dynamics, FGF, and Wnt signaling, which we know are pro-fibrotic in the adult. In contrast, E16.5 is very different as it involves glomerulogenesis, tubulogenesis and maturation. The mouse kidney developmental expression profile at E12.5–E16.5 in their first two temporal principal components (PC1-2) with the mouse orthologues of genes that are downregulated after hibernation in blue and genes that are upregulated in red. Color figure is available at online version. Taken together, despite the many months of reduced renal function and downregulated metabolism during hibernation, the gene expression differences between spring and fall bears is limited both in number of genes that are differentially expressed, fold-change, and the lack of upregulation of genes that are typically seen in other species after a period of low function or damage. In fact, the expression profile of bears coming out of hibernation resembles a developmental stage.

3.4. Identification of RNA-editing in the black bear kidney

Our RNA-Seq data identified possible RNA-editing in a small set of genes when aligning the reads to our assembled genome sequence. To rule out the possibility of variation in the transcripts due to single nucleotide polymorphisms (SNPs) in the genomes of the bears for which we performed the RNA-Seq, we sequenced the genomes of bears 101, 103, 113, 118, 134, and 160 at a low coverage (Supplementary Table S7). After comparing with the genome sequences and filtering for protein-coding transcripts we identified 38 transcripts containing a different variant (between 10% and 95% of the total reads) from the genome sequence (Supplementary Table S8). Of these 38 transcripts, 30 (79%) show A-to-I editing, while the other sites have C-to-U editing. Almost half of the edited sites are in the coding region and five are predicted to lead to an amino acid change. According to the REDIportal (http://srv00.recas.ba.infn.it/atlas/) RNA-editing has been observed in four of the genes in the human kidney (although not at the same base pair) and for one gene, FLNB, RNA-editing also occurs both in the human and the mouse kidney. Interestingly, RNA-editing in 37 of the transcripts is observed in most or all 12 bears, but for ZNF688 we find editing in all 6 spring bears and no editing in the 6 fall bears. To confirm our observations in the high-throughput sequencing data, we randomly selected eight editing sites and designed primers flanking these sites that could discriminate between RNA and possible genomic DNA contamination. Subsequent Sanger sequencing confirmed all eight editing sites (Supplementary Table S8).

4. Discussion

Previous studies in bears suggest they have unique features in the kidney that allow them to endure lower renal functioning during hibernation and recovery soon after hibernation. As these features are likely in part encoded in the genome sequence and gene expression patterns unique to the bear, we set out to explore the genome sequence and renal gene expression of the American black bear (Ursus americanus). A first important step was to establish the complete annotated genome sequence. To accomplish this, we used three approaches for sequencing a single male bear with 100× coverage before filtering. This resulted in a final, high quality, assembly. This current genome is comparable in quality and size to the published panda genome and polar bear genome. A comparison between the three genomes shows a narrower range in GC-content in the black bear (Fig. 1c), likely the result of fewer repetitive sequences caused by transposon integration (Fig. 2). The new Comparative Annotation Toolkit (CAT) pipeline allowed us to establish a high-quality annotation of the genome with 22,081 protein coding transcripts and identification of 126 putative novel loci that warrant further investigation as to their uniqueness and possible function. Based on the reported changes in the bear kidney function during hibernation, we expected we would find a signature of these changes in the gene expression of kidneys collected soon after hibernation when compared with kidneys collected before hibernation. We therefore collected samples from 3 male and 3 female bears (samples were collected within 10 days from each other) soon after hibernation and from 3 male and 3 female bears before hibernation. One limitation is that the time between waking up from hibernation and sample collection is unknown and we assumed it was similar for all animals. Deviation from this assumption is expected to lead to more variation in the gene expression profile. RNA-Seq analysis identified 169 protein coding genes that were differentially expressed between the two time points. Of these, 101 genes were downregulated and 68 genes were upregulated after hibernation. Fold changes only ranged from 1.8-fold downregulation (RTN4RL2) to 2.4-fold upregulation (CISH) and only three genes showed a >2-fold change (CISH, SLCO1C1, and NDST3). Little is known about the function in the kidney of these genes: CISH is a member of the SOCS family and involved in suppression of cytokine signaling through the JAK-STAT5 pathway. SLCO1C1 is a member of the organic anion transport family and best known for mediating sodium-independent uptake of thyroid hormones in brain tissue. NDST3 is a member of the heparan sulphate/heparin GlcNAc N-deacetylase/N-sulphotransferase family and increased expression indicates an increase in N-sulphation of heparin sulphate in the kidney. Since kidney glucosaminoglycans have been shown to be related to the ability of turtles to conserve water, it can be speculated that the increased expression of NDST3 in hibernating black bear kidneys protect the animals during the long period of dehydration during winter sleep. Moreover, since the endothelial glycocalyx on the vascular endothelial cells plays a major role in albuminuria and kidney disease, it could be speculated that upregulation of NDST3 may protect the endothelial glycocalyx in bear kidneys during the anuric hibernation. Most notable among the differentially expressed genes was the upregulation of cytokine suppression genes (SOCS2, CISH, and SERPINC1) in the post-hibernating bears and the lack of increased cytokine expression (such as IL6 and CCL2) and damage markers such as HAVCR1 (KIM-1) and LCAN (NGAL). Pathway enrichment analysis (IPA) did not identify any pathways that were significantly overrepresented among the differentially expressed genes. These findings support previous observation that despite several pro-inflammatory risk factors during the hibernation period (such as kidney dysfunction and immobilization), there are no signs of systemic inflammation. This is further supported by our PCA in which the genes that are downregulated after hibernation cluster with a developmental stage in which FGF and Wnt signaling, which we know are pro-fibrotic in the adult kidney, play an important role and genes that are upregulated after hibernation cluster with a developmental stage in which glomerulogenesis, tubulogenesis and maturation are important. In patients with CKD, systemic inflammation is a common finding that promotes premature aging and predicts poor outcome. Although this is the first study in which gene expression is analysed in kidney tissue, it is not the first in which gene expression in bears has been examined. Fedorov et al. analysed samples from captive bears that were either kept in hibernating or non-hibernating conditions with a custom microarray. They analysed heart and liver samples and found 245 differentially expressed genes in the heart and 319 genes in the liver. We were able to map the majority (85.2%) of their ESTs onto our genome assembly and match them to our transcripts. However, none of the genes that were found to be differentially expressed in heart and liver overlapped with differentially expressed genes in the kidney. The lack of overlap could be due to difference in experimental conditions such as the timing of tissue harvest. Alternatively, it may indicate that while nearly every tissue adapts to hibernation, the ensemble of genes involved is highly tissue-specific. Overall, both the number of differentially expressed genes and the fold-changes are low, but are in line with the findings of Federov et al. Although it has been shown that American black bears maintain a reduced metabolic rate for up to 3 weeks after emerging from their dens, a possible explanation for our results is that the kidneys already completely recovered from hibernation within the short time between emerging from hibernation and sample collection. Another explanation is that the reduction in metabolic rate and urine production has a much smaller impact on the kidney than would be predicted. Our RNA-Seq data also identified the presence of RNA editing, which was confirmed for all sites that were tested with Sanger sequencing. Several of the genes that are edited also show RNA-editing in mouse and human, but at different positions (Supplementary Table S8). The most interesting is the RNA editing in the 3′ UTR of the transcriptional regulator ZNF688, which only seems to happen in the spring. This suggests season-dependent RNA-editing and a possible mechanism through which gene expression is altered. In conclusion, we established a high-quality and well-annotated genome sequence of the black bear (U. americanus). RNA-Seq of kidney samples before and after hibernation shows RNA-editing and the modest differential expression of a set of 169 genes that might be involved in the unique stress response due to hibernation in the bear. Our results suggest that during hibernation, changes in gene expression favours anti-inflammatory pathways. Further studies are needed to understand the effects of these expression differences in the kidney. Click here for additional data file.

10 in total

Review 1. Using Genetic and Species Diversity to Tackle Kidney Disease.

Authors: Michael R Garrett; Ron Korstanje
Journal: Trends Genet Date: 2020-04-30 Impact factor: 11.639

2. Polar bear evolution is marked by rapid changes in gene copy number in response to dietary shift.

Authors: David C Rinker; Natalya K Specian; Shu Zhao; John G Gibbons
Journal: Proc Natl Acad Sci U S A Date: 2019-06-17 Impact factor: 11.205

3. Uncovering the enigmatic evolution of bears in greater depth: The hybrid origin of the Asiatic black bear.

Authors: Tiantian Zou; Weimin Kuang; Tingting Yin; Laurent Frantz; Chang Zhang; Jianquan Liu; Hong Wu; Li Yu
Journal: Proc Natl Acad Sci U S A Date: 2022-07-18 Impact factor: 12.779

4. An Annotated Draft Genome for the Andean Bear, Tremarctos ornatus.

Authors: Nedda F Saremi; Jonas Oppenheimer; Christopher Vollmers; Brendan O'Connell; Shard A Milne; Ashley Byrne; Li Yu; Oliver A Ryder; Richard E Green; Beth Shapiro
Journal: J Hered Date: 2021-07-15 Impact factor: 2.645

5. Insights in the regulation of trimetylamine N-oxide production using a comparative biomimetic approach suggest a metabolic switch in hibernating bears.

Authors: Thomas Ebert; Johanna Painer; Peter Bergman; Abdul Rashid Qureshi; Sylvain Giroud; Gabrielle Stalder; Karolina Kublickiene; Frank Göritz; Sebastian Vetter; Claudia Bieber; Ole Fröbert; Jon M Arnemo; Andreas Zedrosser; Irene Redtenbacher; Paul G Shiels; Richard J Johnson; Peter Stenvinkel
Journal: Sci Rep Date: 2020-11-23 Impact factor: 4.379

6. Conservation of chromatin conformation in carnivores.

Authors: Marco Corbo; Joana Damas; Madeline G Bursell; Harris A Lewin
Journal: Proc Natl Acad Sci U S A Date: 2022-03-01 Impact factor: 11.205

7. A Beary Good Genome: Haplotype-Resolved, Chromosome-Level Assembly of the Brown Bear (Ursus arctos).

Authors: Ellie E Armstrong; Blair W Perry; Yongqing Huang; Kiran V Garimella; Heiko T Jansen; Charles T Robbins; Nathan R Tucker; Joanna L Kelley
Journal: Genome Biol Evol Date: 2022-09-06 Impact factor: 4.065

8. Draft Genome Assembly for the Tibetan Black Bear (Ursus thibetanus thibetanus).

Authors: Chenglong Zhu; Wenjie Xu; Jianchuan Li; Chang Liu; Mingliang Hu; Yuan Yuan; Ke Yuan; Yijiuling Zhang; Xingzhi Song; Jin Han; Xinxin Cui
Journal: Front Genet Date: 2020-04-02 Impact factor: 4.599

9. Analyses of key genes involved in Arctic adaptation in polar bears suggest selection on both standing variation and de novo mutations played an important role.

Authors: Jose Alfredo Samaniego Castruita; Michael V Westbury; Eline D Lorenzen
Journal: BMC Genomics Date: 2020-08-06 Impact factor: 3.969

10. Long-read isoform sequencing reveals tissue-specific isoform expression between active and hibernating brown bears (Ursus arctos).

Authors: Elizabeth Tseng; Jason G Underwood; Brandon D Evans Hutzenbiler; Shawn Trojahn; Brewster Kingham; Olga Shevchenko; Erin Bernberg; Michelle Vierra; Charles T Robbins; Heiko T Jansen; Joanna L Kelley
Journal: G3 (Bethesda) Date: 2022-03-04 Impact factor: 3.542

10 in total