Annabel Perry, Suzanne E McGaugh, Alex C Keene, Heath Blackmon.
Abstract
The growing use of genomics in diverse organisms provides the basis for identifying genomic and transcriptional differences across species and experimental conditions. Databases containing genomic and functional data have played critical roles in the development of numerous genetic models, but most emerging models lack such databases. The Mexican tetra, Astyanax mexicanus, exists as 2 morphs: surface-dwelling and cave-dwelling. There are at least 30 cave populations, providing a system to study convergent evolution. We have generated a web-based analysis suite that integrates datasets from different studies to identify how gene transcription and genetic markers of selection differ between populations and across experimental contexts. Results of diverse studies can be analyzed in conjunction with other genetic data (e.g. Gene Ontology information) to enable biological inference from cross-study patterns and identify future avenues of research. Furthermore, the framework that we have built for A. mexicanus can be adapted for other emerging model systems.
Keywords: Astyanax mexicanus; arpin; database; genomics; model organism
Year: 2022 PMID: 35708643 PMCID: PMC9339328 DOI: 10.1093/g3journal/jkac132
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.542
Fig. 1. Design and web interface for CaveCrawler. a) The repository and module framework for the CaveCrawler model organism genomics database. Lines show the connections between different types of data stored in the repository and the user modules that draw on each data type. b) Example of the Transcription module with the results of searching for the top 10% of genes that are downregulated in Pachón relative to Río Choy surface fish. c) Example of the Population Genetics module with the results of searching for the Pachón-Rascon surface fish and Rascon-Tinaja dXY values of genes associated with brain development GO IDs and visualizing these values on Scaffold 24.
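The query shown in panel b can be sketched in a few lines of Python. This is a minimal illustration and not CaveCrawler's own code: the function name `top_downregulated`, the `percent` parameter, and the toy logFC values are all hypothetical, and it assumes that "top 10%" means the most negative 10% of genes with a negative cave-vs-surface logFC.

```python
def top_downregulated(log_fc_by_gene, percent=10):
    """Return the most strongly downregulated genes, most negative logFC first.

    Assumption: 'top N%' is taken among genes with logFC < 0 only,
    and the cutoff is rounded up to a whole number of genes.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Keep only genes that are downregulated (negative log fold change).
    down = [(gene, lfc) for gene, lfc in log_fc_by_gene.items() if lfc < 0]
    # Most negative logFC first.
    down.sort(key=lambda pair: pair[1])
    # Integer ceiling division avoids floating-point rounding surprises.
    keep = (len(down) * percent + 99) // 100
    return down[:keep]


# Toy values invented for illustration; they are not from any dataset.
TOY_LOG_FC = {
    f"gene{i}": lfc
    for i, lfc in enumerate(
        [-2.0, -1.5, 0.3, -0.1, 1.2, -0.7, -0.9, 0.0, -3.1, -0.2, -0.4, -1.1]
    )
}

if __name__ == "__main__":
    for gene, lfc in top_downregulated(TOY_LOG_FC, percent=50):
        print(gene, lfc)
```

Ranking within the downregulated subset, rather than across all genes, is one reasonable reading of the query; a different convention would only change the filtering step.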
Locations of specific datasets used by CaveCrawler modules.
| Publication | Specific dataset(s) | Modules |
|---|---|---|
| | | Gene Search, Transcription |
| | Extended Data | Gene Search, Population Genetics |
| | McGaugh.et.al.2019.Sleep.Dep.Sup.Mat.xlsx | Gene Search, Transcription |
| | | Gene Search, Population Genetics |
The first column describes the publication from which the data came, the middle column lists the file(s) from that publication which were used in CaveCrawler, and the rightmost column lists the CaveCrawler modules which integrate the indicated data. These publications are also cited in the References of this paper and on the Data Sources module of the CaveCrawler GUI.
Genes identified as outliers for FST and transcriptional regulation over the circadian cycle between surface fish and 3 different cavefish populations.
| Gene name | Gene description | Comparison | Double outlier | FST | logFC | P-value |
|---|---|---|---|---|---|---|
| | NA | Pachón vs Río Choy | No | 0.2780 | 0.1390 | 0.0558 |
| | | Molino vs Río Choy | Yes | 0.8828 | 0.2253 | 0.0035 |
| | | Tinaja vs Río Choy | No | 0.3003 | −0.0121 | 0.8625 |
| | Suppressor of cytokine signaling 6b (Source: ZFIN;Acc:ZDB-GENE-030131-1670) | Pachón vs Río Choy | No | 0.8266 | −0.0433 | 0.8329 |
| | | Molino vs Río Choy | No | 0.8363 | 0.1074 | 0.5477 |
| | | Tinaja vs Río Choy | Yes | 0.7565 | −0.4082 | 0.0156 |
| | Cytochrome P450, family 26, subfamily A, polypeptide 1 (Source: ZFIN;Acc:ZDB-GENE-990415-44) | Pachón vs Río Choy | Yes | 0.5323 | −0.6150 | 0.0074 |
| | | Molino vs Río Choy | No | 0.7924 | −0.1403 | 0.5548 |
| | | Tinaja vs Río Choy | Yes | 0.5365 | −0.5950 | 0.0097 |
| arpin | Actin-related protein 2/3 complex inhibitor (Source: HGNC Symbol;Acc:HGNC:28782) | Pachón vs Río Choy | Yes | 0.8612 | −0.5847 | 0.0004 |
| | | Molino vs Río Choy | Yes | 0.7343 | −0.4623 | 0.0001 |
| | | Tinaja vs Río Choy | Yes | 0.5762 | −0.8816 | 1.33E−10 |
FST and logFC values for all genes which were found to be outliers for both FST and logFC in at least 1 cave-Río Choy comparison.
Fig. 2. Overlap between FST values and logFC values across multiple studies. Plots of cave-specific FST vs logFC values for all 83 genes for which CaveCrawler found FST values, logFC values, and FST outlier (lowest 5% of divergence) designations. Double outliers for the cave-Río Choy comparison indicated by the axes are colored red, while the gene (arpin) that was a double outlier in all 3 cave-Río Choy comparisons is circled in red. Transcription data come from a study describing differences in regulation of circadian-related genes between morphs (Mack ). a) Pachón vs Río Choy; b) Molino vs Río Choy; and c) Tinaja vs Río Choy.
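The cross-comparison step behind the circled gene can be illustrated with a short Python sketch. It does not reproduce how CaveCrawler assigns outlier status; it simply takes the "Double outlier" flags, FST, and logFC values as transcribed from the table above and asks which gene is flagged in every cave-Río Choy comparison. The function names are hypothetical, and genes are keyed by their description because gene names are not available for every row.

```python
# (gene description, comparison, double outlier?, FST, logFC),
# transcribed from the double-outlier table in this document.
ROWS = [
    ("NA", "Pachón vs Río Choy", False, 0.2780, 0.1390),
    ("NA", "Molino vs Río Choy", True, 0.8828, 0.2253),
    ("NA", "Tinaja vs Río Choy", False, 0.3003, -0.0121),
    ("Suppressor of cytokine signaling 6b", "Pachón vs Río Choy", False, 0.8266, -0.0433),
    ("Suppressor of cytokine signaling 6b", "Molino vs Río Choy", False, 0.8363, 0.1074),
    ("Suppressor of cytokine signaling 6b", "Tinaja vs Río Choy", True, 0.7565, -0.4082),
    ("Cytochrome P450, family 26, subfamily A, polypeptide 1", "Pachón vs Río Choy", True, 0.5323, -0.6150),
    ("Cytochrome P450, family 26, subfamily A, polypeptide 1", "Molino vs Río Choy", False, 0.7924, -0.1403),
    ("Cytochrome P450, family 26, subfamily A, polypeptide 1", "Tinaja vs Río Choy", True, 0.5365, -0.5950),
    ("Actin-related protein 2/3 complex inhibitor", "Pachón vs Río Choy", True, 0.8612, -0.5847),
    ("Actin-related protein 2/3 complex inhibitor", "Molino vs Río Choy", True, 0.7343, -0.4623),
    ("Actin-related protein 2/3 complex inhibitor", "Tinaja vs Río Choy", True, 0.5762, -0.8816),
]


def double_outlier_comparisons(rows):
    """Map each gene description to the comparisons where it is a double outlier."""
    hits = {}
    for description, comparison, is_double, _fst, _log_fc in rows:
        hits.setdefault(description, [])  # keep genes with zero hits visible
        if is_double:
            hits[description].append(comparison)
    return hits


def shared_across_all(rows):
    """Return genes flagged as double outliers in every comparison present."""
    comparisons = {row[1] for row in rows}
    hits = double_outlier_comparisons(rows)
    return [gene for gene, found in hits.items() if set(found) == comparisons]


if __name__ == "__main__":
    for gene, found in double_outlier_comparisons(ROWS).items():
        print(f"{gene}: {len(found)} of 3 comparisons")
    print("Shared across all comparisons:", shared_across_all(ROWS))
```

On these rows the only gene returned by `shared_across_all` is the actin-related protein 2/3 complex inhibitor, which is the gene (arpin) highlighted in the figure.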