Mingyue Wang1,2,3, Yang Liu1,2,3, Tinggang Wen2,3, Weiwei Liu4,5, Qionghua Gao4,5, Jie Zhao4,5, Zijun Xiong2,3, Zhifeng Wang2,3, Wei Jiang2,3, Yeya Yu2,3,6, Liang Wu1,2,3, Yue Yuan1,2,3, Xiaoyu Wei1,2,3, Jiangshan Xu1,2,3, Mengnan Cheng1,2,3, Pei Zhang2,3, Panyi Li2,3, Yong Hou2,3, Huanming Yang1,2,7, Guojie Zhang3,4,5,8, Qiye Li2,3, Chuanyu Liu9,10, Longqi Liu11,12,13.
Abstract
The emergence of social organization (eusociality) is a major event in insect evolution. Although previous studies have investigated the mechanisms underlying caste differentiation and social behavior of eusocial insects, including ants and honeybees, the molecular circuits governing sociality in these insects remain obscure. In this study, we profiled the transcriptome and chromatin accessibility of brain tissues in three Monomorium pharaonis ant castes: queens (including mature and unmated queens), males and workers. We provide a comprehensive dataset including 16 RNA-sequencing and 16 assay for transposase-accessible chromatin (ATAC)-sequencing profiles. We also demonstrate strong reproducibility of the datasets and have identified specific genes and open chromatin regions in the genome that may be associated with the social function of these castes. Our data will be a valuable resource for further studies of insect behaviour, particularly the role of the brain in the control of eusociality.
Year: 2020 PMID: 32641764 PMCID: PMC7343836 DOI: 10.1038/s41597-020-0556-x
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 Overview of the experimental and data analysis workflow. (a) Four different adult groups from a Monomorium pharaonis colony were collected for RNA-sequencing and ATAC-sequencing profiling. (b) Analysis workflow for RNA-sequencing and ATAC-sequencing profiles.
RNA-seq metadata and mapping statistics.
| Sample ID | Number of raw reads | Number of clean reads | Percentage of clean reads | GC% (clean reads) | Q20 of clean reads (%) | Number of mapped reads | Percentage of mapped reads |
|---|---|---|---|---|---|---|---|
| Gyne_RNA_1 | 119,235,814 | 102,841,766 | 86% | 41% | 95.80 | 67,659,598 | 65.79% |
| Gyne_RNA_2 | 172,470,928 | 146,346,992 | 85% | 40% | 95.89 | 99,164,722 | 67.76% |
| Gyne_RNA_3 | 203,319,094 | 171,549,152 | 84% | 40% | 95.98 | 116,893,592 | 68.14% |
| Gyne_RNA_4 | 181,881,936 | 154,635,550 | 85% | 40% | 95.54 | 103,327,476 | 66.82% |
| Male_RNA_1 | 222,617,504 | 166,697,332 | 75% | 41% | 96.41 | 107,553,118 | 64.52% |
| Male_RNA_2 | 171,733,948 | 134,612,370 | 78% | 40% | 95.57 | 85,721,158 | 63.68% |
| Male_RNA_3 | 163,818,754 | 130,940,972 | 80% | 40% | 95.33 | 84,941,408 | 64.87% |
| Male_RNA_4 | 194,226,758 | 152,877,424 | 79% | 40% | 95.65 | 100,715,648 | 65.88% |
| Queen_RNA_1 | 132,821,280 | 111,244,700 | 84% | 41% | 95.63 | 71,886,325 | 64.62% |
| Queen_RNA_2 | 172,918,396 | 141,405,076 | 82% | 41% | 95.75 | 93,129,383 | 65.86% |
| Queen_RNA_3 | 205,540,136 | 168,019,436 | 82% | 40% | 96.07 | 111,816,936 | 66.55% |
| Queen_RNA_4 | 211,761,094 | 173,207,908 | 82% | 40% | 96.16 | 116,828,734 | 67.45% |
| Worker_RNA_1 | 197,792,502 | 161,019,502 | 81% | 41% | 96.53 | 110,733,112 | 68.77% |
| Worker_RNA_2 | 182,845,616 | 152,743,314 | 84% | 41% | 96.04 | 102,536,588 | 67.13% |
| Worker_RNA_3 | 200,398,174 | 167,348,656 | 84% | 40% | 95.93 | 112,659,116 | 67.32% |
| Worker_RNA_4 | 187,128,572 | 158,590,210 | 85% | 40% | 95.39 | 106,556,762 | 67.19% |
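The percentage columns in the RNA-seq table can be reproduced from the read counts. The sketch below is a minimal check on two rows copied from the table. It assumes that the percentage of clean reads is taken relative to raw reads and that the percentage of mapped reads is taken relative to clean reads; these denominators are inferred from the numbers, not stated in the record. The names `rna_rows`, `percent` and `rna_rates` are illustrative only.

```python
# Read counts copied from the Gyne_RNA_1 and Worker_RNA_1 rows of the table.
rna_rows = {
    "Gyne_RNA_1": {"raw": 119_235_814, "clean": 102_841_766, "mapped": 67_659_598},
    "Worker_RNA_1": {"raw": 197_792_502, "clean": 161_019_502, "mapped": 110_733_112},
}


def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


def rna_rates(row):
    """Recompute the two percentage columns for one RNA-seq sample.

    Assumption (inferred from the table): clean reads are reported
    relative to raw reads, mapped reads relative to clean reads.
    """
    return {
        "clean_pct": percent(row["clean"], row["raw"]),
        "mapped_pct": percent(row["mapped"], row["clean"]),
    }


for sample, row in rna_rows.items():
    rates = rna_rates(row)
    print(f"{sample}: clean {rates['clean_pct']:.0f}%, mapped {rates['mapped_pct']:.2f}%")
```

Under these assumptions the recomputed values agree with the table for both rows, which suggests the mapping rate is reported against clean reads and not against raw reads.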
Fig. 2 RNA-sequencing data quality metrics. (a) Mean quality values across each base position in the reads of the RNA-sequencing datasets. (b) The GC content across the whole length of each sequence in the read files of the RNA-sequencing datasets. (c) PCA plot of all 16 RNA-sequencing profiles. (d) Heatmap clustering of correlation coefficients across all 16 RNA-sequencing profiles. (e) Scatter plots showing the Pearson correlations between biological replicates. (f) Scatter plots showing the Pearson correlations between the datasets published by Qiu, B. et al. and our RNA-sequencing profiles.
ATAC-seq metadata and mapping statistics.
| Sample ID | Number of total reads | Number of mapped reads | Percentage of mapped reads | Number of usable reads | Percentage of usable reads | IDR peaks |
|---|---|---|---|---|---|---|
| Gyne_ATAC_1 | 172,172,820 | 163,448,845 | 94.93% | 101,679,214 | 62.21% | 38,585 |
| Gyne_ATAC_2 | 137,815,754 | 130,970,332 | 95.03% | 80,777,684 | 61.68% | 38,585 |
| Gyne_ATAC_3 | 198,553,750 | 189,962,529 | 95.67% | 121,923,602 | 64.18% | 38,585 |
| Gyne_ATAC_4 | 124,501,796 | 115,458,488 | 92.74% | 67,259,356 | 58.25% | 38,585 |
| Male_ATAC_1 | 48,790,218 | 42,106,287 | 86.30% | 11,758,208 | 27.93% | 16,685 |
| Male_ATAC_2 | 53,764,102 | 47,327,040 | 88.03% | 12,623,986 | 26.67% | 16,685 |
| Male_ATAC_3 | 45,813,866 | 40,610,146 | 88.64% | 11,407,130 | 28.09% | 16,685 |
| Male_ATAC_4 | 42,554,344 | 38,399,918 | 90.24% | 13,183,374 | 34.33% | 16,685 |
| Queen_ATAC_1 | 164,009,260 | 150,776,896 | 91.93% | 59,420,000 | 39.41% | 21,511 |
| Queen_ATAC_2 | 91,740,402 | 82,496,005 | 89.92% | 28,134,824 | 34.10% | 21,511 |
| Queen_ATAC_3 | 83,372,050 | 74,384,447 | 89.22% | 20,508,800 | 27.57% | 21,511 |
| Queen_ATAC_4 | 175,570,098 | 163,084,283 | 92.89% | 61,121,772 | 37.48% | 21,511 |
| Worker_ATAC_1 | 21,994,036 | 19,617,446 | 89.19% | 8,703,090 | 44.36% | 17,557 |
| Worker_ATAC_2 | 83,557,276 | 77,937,726 | 93.27% | 39,371,144 | 50.52% | 17,557 |
| Worker_ATAC_3 | 202,419,104 | 191,023,775 | 94.37% | 122,065,556 | 63.90% | 17,557 |
| Worker_ATAC_4 | 57,754,220 | 53,530,040 | 92.69% | 25,656,598 | 47.93% | 17,557 |
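The ATAC-seq percentages can be checked the same way. The sketch below uses the Gyne_ATAC_1 row and assumes that the percentage of mapped reads is relative to total reads and that the percentage of usable reads is relative to mapped reads; this is inferred from the numbers, not stated in the record. The names `gyne_atac_1`, `percent` and `atac_rates` are illustrative only.

```python
# Read counts copied from the Gyne_ATAC_1 row of the table.
gyne_atac_1 = {"total": 172_172_820, "mapped": 163_448_845, "usable": 101_679_214}


def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


def atac_rates(row):
    """Recompute the percentage columns for one ATAC-seq sample.

    Assumption (inferred from the table): mapped reads are reported
    relative to total reads, usable reads relative to mapped reads.
    """
    return {
        "mapped_pct": percent(row["mapped"], row["total"]),
        "usable_pct": percent(row["usable"], row["mapped"]),
        # Alternative denominator, shown only for comparison.
        "usable_of_total_pct": percent(row["usable"], row["total"]),
    }


rates = atac_rates(gyne_atac_1)
print(f"mapped {rates['mapped_pct']:.2f}%, usable {rates['usable_pct']:.2f}%")
print(f"usable relative to total reads: {rates['usable_of_total_pct']:.2f}%")
```

For this row only the mapped-read denominator reproduces the tabulated usable-read percentage, so the usable fraction of all sequenced reads is lower than the column suggests at first glance.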
Fig. 3 ATAC-sequencing data quality metrics. (a) The ATAC-sequencing signal enrichment around the TSS (2 kb) for four representative samples (Gyne, Male, Queen, Worker). (b) Insert size distribution of ATAC-sequencing profiles for the same samples shown in 3a. (c) Scatter plots showing the Pearson correlations between biological replicates. (d) PCA plot of all 16 ATAC-sequencing profiles.
Fig. 4 Identification of differentially expressed genes (DEGs) and accessible chromatin elements. (a) Histogram showing the number of DEGs (the queen is the control). (b) Histogram showing the number of differentially accessible regions (DARs) for the same groups shown in 4a. (c) Genome browser views of RNA-sequencing and ATAC-sequencing signals for the indicated genes and accessible chromatin elements.
| Measurement(s) | mRNA • open_chromatin_region • brain |
| Technology Type(s) | RNA sequencing • ATAC-seq |
| Factor Type(s) | caste |
| Sample Characteristic - Organism | Monomorium pharaonis |