Qionghua Gao, Zijun Xiong, Rasmus Stenbak Larsen, Long Zhou, Jie Zhao, Guo Ding, Ruoping Zhao, Chengyuan Liu, Hao Ran, Guojie Zhang.
Abstract
BACKGROUND: Ants with complex societies have fascinated scientists for centuries. Comparative genomic and transcriptomic analyses across ant species and castes have revealed important insights into the molecular mechanisms underlying ant caste differentiation. However, most current ant genomes and transcriptomes are highly fragmented and incomplete, which hinders our understanding of the molecular basis for complex ant societies.
Keywords: Monomorium pharaonis; alternative splicing; long non-coding RNA; long-read sequencing; social insects
Year: 2020 PMID: 33319913 PMCID: PMC7736795 DOI: 10.1093/gigascience/giaa143
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: Characterization of M. pharaonis genome assembly. (A) Photo of pharaoh ant (Monomorium pharaonis) colony with 4 ant castes (queens, gynes, males, and workers). (B) Heat map of Hi-C interactions among all chromosomes of pharaoh ant. (C) Comparison of scaffold N50s and contig N50s of 27 short-read–assembled (pink filled circles) and 4 long-read–assembled (blue filled triangles) ant genomes. The previous short-read assembly for M. pharaonis is marked on the plot. (D) Genome collinearity of short-read and PacBio long-read assemblies shows that the PacBio assembly exhibits better coverage of high GC content regions and repeat sequences. Blue indicates genes assembled by both sequencing methods; red indicates genes that are incomplete in the short-read assembly but complete in the PacBio assembly.
Summary of M. pharaonis genome features
| Feature | PacBio assembly | Hi-C assembly |
|---|---|---|
| Genome assembly size (bp) | 312,903,204 | 313,026,204 |
| No. of scaffolds | 193 | 274 |
| Scaffold N50 (bp) | 3,854,274 | 27,237,342 |
| Scaffold N90 (bp) | 800,084 | 20,211,500 |
| Maximum scaffold length (bp) | 18,497,097 | 48,563,521 |
| No. of contigs | 301 | 628 |
| Contig N50 (bp) | 2,769,621 | 2,456,926 |
| Contig N90 (bp) | 573,845 | 430,526 |
| Maximum contig length (bp) | 9,733,832 | 9,249,838 |
| GC content (%) | 36.39 | 36.39 |
| BUSCO assessment (n = 4,415) | C: 98.4%, D: 2.1%, F: 1.1% |
C: complete BUSCOs; D: duplicated BUSCOs; F: fragmented BUSCOs.
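The N50 and N90 values in the table above follow the usual assembly convention: sort sequences from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of the assembly size. A minimal sketch of that calculation is given here for orientation; the function name `nx_length` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def nx_length(lengths, percent=50):
    """Return the Nx length for a list of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the result is the
    length of the sequence at which the cumulative sum first reaches
    `percent` percent of the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")

    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Hypothetical toy assembly (lengths in bp), not taken from the paper.
toy_lengths = [50, 40, 30, 20, 10]
n50 = nx_length(toy_lengths, 50)
n90 = nx_length(toy_lengths, 90)
```

For the toy lengths, half of the 150 bp total is reached within the second-longest sequence, so the N50 is 40; the N90 is reached within the fourth, giving 20.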
Statistics of functional annotation of protein-coding genes in pharaoh ant
| Statistic | Number | Percent (%) |
|---|---|---|
| Total | 15,327 | |
| InterPro | 13,831 | 90.24 |
| COG | 4,739 | 30.92 |
| GO | 8,562 | 55.86 |
| KEGG | 12,817 | 83.62 |
| SwissProt | 10,659 | 69.54 |
| TrEMBL | 15,229 | 99.36 |
| Annotated | 15,242 | 99.45 |
| Unannotated | 85 | 0.55 |
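The Percent column above is each count divided by the total of 15,327 protein-coding genes. The short check below recomputes those percentages from the counts in the table; the variable names are illustrative, and two-decimal rounding is an assumption about how the reported values were produced.

```python
# Counts copied from the functional annotation table above.
TOTAL_GENES = 15_327
annotated_counts = {
    "InterPro": 13_831,
    "COG": 4_739,
    "GO": 8_562,
    "KEGG": 12_817,
    "SwissProt": 10_659,
    "TrEMBL": 15_229,
    "Annotated": 15_242,
    "Unannotated": 85,
}

# Percentage of all protein-coding genes, rounded to two decimals (assumed).
percentages = {
    name: round(100 * count / TOTAL_GENES, 2)
    for name, count in annotated_counts.items()
}

for name, pct in percentages.items():
    print(f"{name:<12} {annotated_counts[name]:>7,} {pct:>6.2f}")
```

The annotated and unannotated counts (15,242 and 85) sum to the total, so the two bottom rows partition the gene set.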
Figure 2: Genome collinearity and gene synteny of M. pharaonis (Mpha). (A) Genome collinearity of chromosome-level–assembled pharaoh ant and clonal raider ant (Ooceraea biroi [Obir]), showing marked genome rearrangements during genome evolution of the 2 species. (B) Synteny of the flanking regions of fem and csd across 11 ant species, using recently produced reference genomes from the Global Ant Genomics Alliance (GAGA), and across 2 wasp species downloaded from the NCBI, showing the rearrangements of fem and csd during ant genome evolution. csd and fem are marked in red, and other colors represent their neighbor genes in PacBio-assembled ant and ancestor wasp species.
Figure 3: Comparison of RNA-seq and ISO-seq gene annotations. (A) UTRs newly annotated in ISO-seq annotation. (B) Genes annotated incompletely by missing exons in RNA-seq annotation. (C) One gene was misannotated as multiple genes in RNA-seq annotation. (D) Two genes were misannotated as a combined gene in the RNA-seq version but were correctly annotated in ISO-seq data. CDS, coding sequences; UTR, untranslated region.
Figure 4: Characterization of M. pharaonis isoforms from PacBio ISO-seq in 4 castes. (A) Saturation analysis of PacBio ISO-seq data on consensus transcripts, genome coverage, total number of isoforms, detectable genes, alternative splicing (AS) events, and detectable genes with AS. Consensus transcripts were generated from multiple full-length non-chimeric reads in a single zero-mode waveguide by transcript clustering analysis. Because many isoforms could not be mapped to the reference genome owing to either sequencing errors or artificial transcripts, the total number of isoforms, which represents isoforms finally confirmed by mapping to the reference genome, was lower than the number of consensus transcripts. (B) Distribution of AS events in 4 ant castes. AltA, alternative acceptor site; AltD, alternative donor site; AltP, alternative position; ES, exon skipping; IR, intron retention.
Figure 5: Characterization of lncRNAs. (A) Comparisons of lncRNA length distribution among 4 species and 2 sequencing methods. (B) Classification of lncRNAs in pharaoh ant. (C) Heat map shows differentially expressed lncRNAs between worker and gyne brains. Each row represents 1 lncRNA, and each column represents 1 replicate of the corresponding caste. Relative lncRNA expression is depicted according to color scale. (D) Example of a highly conserved differentially expressed lncRNA between worker and gyne brains. The black line in the box indicates the median. TPM: transcripts per million.
Statistics of predicted lncRNAs in 4 castes
| Sample | No. lncRNAs | Minimum length (bp) | Maximum length (bp) | Average length (bp) |
|---|---|---|---|---|
| Worker | 531 | 942 | 25,018 | 4,438 |
| Gyne | 543 | 982 | 19,746 | 4,648 |
| Queen | 360 | 1,344 | 30,849 | 5,675 |
| Male | 149 | 923 | 12,182 | 3,456 |