Matthew A Conte1, Rajesh Joshi2, Emily C Moore3, Sri Pratima Nandamuri1, William J Gammerdinger1, Reade B Roberts3, Karen L Carleton1, Sigbjørn Lien2, Thomas D Kocher1.
Abstract
BACKGROUND: African cichlid fishes are well known for their rapid radiations and are a model system for studying evolutionary processes. Here we compare multiple, high-quality, chromosome-scale genome assemblies to elucidate the genetic mechanisms underlying cichlid diversification and study how genome structure evolves in rapidly radiating lineages.
Keywords: African cichlids; chromosome evolution; comparative genomics; genetic maps; genome assembly; genome rearrangements; inversion; karyotype; recombination; transposable elements
Year: 2019 PMID: 30942871 PMCID: PMC6447674 DOI: 10.1093/gigascience/giz030
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Anchoring comparison of O_niloticus_UMD1 and O_niloticus_UMD_NMBU
| LG | O_niloticus_UMD1 LG (bp) | O_niloticus_UMD_NMBU LG (bp) | Change (bp) |
|---|---|---|---|
| LG1 | 38,372,991 | 40,673,430 | 2,300,439 |
| LG2 | 35,256,741 | 36,523,203 | 1,266,462 |
| LG3 | 68,550,753 | 87,567,345 | 19,016,592 |
| LG4 | 38,038,224 | 35,549,522 | −2,488,702 |
| LG5 | 34,628,617 | 39,714,817 | 5,086,200 |
| LG6 | 44,571,662 | 42,433,576 | −2,138,086 |
| LG7 | 62,059,223 | 64,772,279 | 2,713,056 |
| LG8 | 30,802,437 | 30,527,416 | −275,021 |
| LG9 | 27,519,051 | 35,850,837 | 8,331,786 |
| LG10 | 32,426,571 | 34,704,454 | 2,277,883 |
| LG11 | 36,466,354 | 39,275,952 | 2,809,598 |
| LG12 | 41,232,431 | 38,600,464 | −2,631,967 |
| LG13 | 32,337,344 | 34,734,273 | 2,396,929 |
| LG14 | 39,264,731 | 40,509,636 | 1,244,905 |
| LG15 | 36,154,882 | 39,688,505 | 3,533,623 |
| LG16 | 43,860,769 | 36,041,493 | −7,819,276 |
| LG17 | 40,919,683 | 38,839,487 | −2,080,196 |
| LG18 | 37,007,722 | 38,636,442 | 1,628,720 |
| LG19 | 31,245,232 | 30,963,196 | −282,036 |
| LG20 | 36,767,035 | 37,140,374 | 373,339 |
| LG22 | 37,011,614 | 39,199,643 | 2,188,029 |
| LG23 | 44,097,196 | 45,655,644 | 1,558,448 |
| Total anchored (%) | 868,591,263 (86.0%) | 907,601,988 (90.2%) | 39,010,725 (4.2%) |
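
The Change column above is the UMD_NMBU length minus the UMD1 length for each linkage group. A minimal sketch of that arithmetic follows, using three rows and the totals copied from the table; the dictionary layout and the function name `change_bp` are illustrative assumptions, not part of the original analysis.

```python
# Anchored lengths in bp as (O_niloticus_UMD1, O_niloticus_UMD_NMBU),
# copied from three rows of the table above.
anchored = {
    "LG1": (38_372_991, 40_673_430),
    "LG3": (68_550_753, 87_567_345),
    "LG16": (43_860_769, 36_041_493),
}


def change_bp(old_bp: int, new_bp: int) -> int:
    """Signed change in anchored length (hypothetical helper)."""
    return new_bp - old_bp


changes = {lg: change_bp(old, new) for lg, (old, new) in anchored.items()}

# Totals from the last row of the table.
total_change = change_bp(868_591_263, 907_601_988)

for lg, delta in changes.items():
    print(f"{lg}: {delta:+,} bp")
print(f"Total: {total_change:+,} bp")
```

A negative value, as for LG16, means that less sequence is anchored to that linkage group in the newer assembly.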
FALCON assembly results for M. zebra. NG50 and LG50 are based on an estimated genome size of 1 Gbp [45]. N50 and L50 sizes are provided for a-contigs and haplotigs because the size of the alternate haplotype is not known.
Anchoring of the M. zebra assembly with four different genetic linkage maps. The FALCON assembly was anchored to each map separately, and the total bases anchored are shown for each LG and map. The anchored map LGs that were used for the M_zebra_UMD2 anchoring are indicated in boldface. The L. fuelleborni×T. “red cheek” map had four LGs that were combined into two (LG10a/LG10b and LG13a/LG13b). Selection of particular LGs for the final anchoring is based on accuracy and not necessarily overall length. The total lengths including unanchored contigs differ slightly because the number of 100-bp gaps inserted differed for each anchoring.
| LG | | | | | M_zebra_UMD2 |
|---|---|---|---|---|---|
| LG1 | 31,191,433 | 32,150,205 | | 36,192,366 | 38,662,702 |
| LG2 | 25,783,542 | 28,952,651 | | 33,362,328 | 32,647,892 |
| LG3 | 18,498,838 | 14,707,016 | | 24,847,713 | 37,309,556 |
| LG4 | 28,418,370 | 24,424,243 | | 23,743,562 | 30,507,480 |
| LG5 | 29,725,229 | 34,008,850 | | 30,984,548 | 36,154,892 |
| LG6 | 15,868,181 | 32,717,361 | | 32,438,073 | 39,760,669 |
| LG7 | 29,333,014 | 57,016,972 | | 50,973,986 | 64,889,811 |
| LG8 | 19,307,854 | 16,999,744 | | 18,082,738 | 23,959,896 |
| LG9 | | 22,620,859 | 18,771,712 | 24,011,483 | 21,018,370 |
| LG10 | 25,942,318 | 26,176,893 | | 25,149,136 | 32,346,187 |
| LG11 | | 30,903,800 | 34,404,464 | 31,577,152 | 32,434,411 |
| LG12 | 23,231,402 | 31,401,442 | | 31,595,605 | 34,077,077 |
| LG13 | 25,893,161 | 24,034,634 | | 28,831,406 | 32,061,881 |
| LG14 | 32,750,971 | 32,025,991 | | 30,978,148 | 37,855,742 |
| LG15 | 28,015,059 | 28,462,857 | | 28,405,563 | 34,537,245 |
| LG16 | 24,665,172 | 26,935,058 | | 29,158,962 | 34,727,877 |
| LG17 | 28,473,329 | 31,631,813 | | 31,607,415 | 35,766,785 |
| LG18 | 19,927,984 | 23,757,304 | | 30,047,761 | 29,494,144 |
| LG19 | 24,076,222 | 19,992,035 | | 22,726,673 | 25,955,740 |
| LG20 | 28,281,247 | 30,800,769 | 24,975,175 | | 29,774,176 |
| LG22 | 27,460,019 | 31,372,369 | | 30,512,954 | 34,717,234 |
| LG23 | 27,069,552 | 27,967,022 | | 37,848,175 | 42,076,657 |
| Total anchored (%) | 567,185,154 (59.3%) | 629,059,888 (65.7%) | 755,869,861 (79.0%) | 662,849,923 (69.3%) | 760,736,424 (79.5%) |
| Total including unanchored | 957,158,042 | 957,163,242 | 957,185,442 | 957,167,042 | 957,200,631 |
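
The percentages in the "Total anchored (%)" row are consistent with dividing each anchored total by the corresponding total including unanchored contigs. The sketch below recomputes them from the two total rows. The keys `map 1` to `map 4` are placeholder labels for the four map columns in table order, and `percent_anchored` is a hypothetical helper name; neither comes from the original analysis.

```python
# (anchored bp, total bp including unanchored contigs), per column of the table.
# "map 1".."map 4" are placeholder labels for the four genetic maps.
totals = {
    "map 1": (567_185_154, 957_158_042),
    "map 2": (629_059_888, 957_163_242),
    "map 3": (755_869_861, 957_185_442),
    "map 4": (662_849_923, 957_167_042),
    "M_zebra_UMD2": (760_736_424, 957_200_631),
}


def percent_anchored(anchored_bp: int, total_bp: int) -> float:
    """Share of the assembly placed on linkage groups, to one decimal place."""
    return round(100 * anchored_bp / total_bp, 1)


percentages = {name: percent_anchored(a, t) for name, (a, t) in totals.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Because the denominator includes the inserted gaps, it differs slightly between anchorings, which is why each column carries its own total.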
Putative inter-chromosomal differences as identified by map anchoring comparison. The number of markers aligned to each contig for each LG is indicated in parentheses. “NA” indicates that a particular map had no markers aligned to that contig.
| Contig name | Contig size | | | | | Notes |
|---|---|---|---|---|---|---|
| 000084F_pilon|quiver | 2,383,905 | LG1 (1) | LG3 (3) | LG3 (6) | LG3 (3) | |
| 000105F_pilon|quiver_1_1312536 | 1,312,536 | NA | LG10a (1) | LG2 (1) | LG2 (3) | |
| 000201F_pilon|quiver | 1,489,552 | LG3 (1) | LG1 (3) | LG3 (3) | LG3 (1) | |
| 000223F_pilon|quiver | 1,452,516 | LG8 (4) | LG8 (8) | LG3 (2) | LG8 (4) | Repetitive markers on LG3 |
| 000256F_pilon|quiver | 1,241,607 | LG20 (1) | LG20 (1) | NA | LG9 (1) | |
| 000414F_pilon|quiver | 805,874 | LG5 (1) | LG5 (1) | NA | LG3 (1) | |
| 000521F_pilon|quiver | 566,343 | LG15 (2) | NA | LG17 (1) | NA | Repetitive marker on LG17 |
| 000541F_pilon|quiver | 515,490 | NA | LG2 (1) | LG3 (1) | NA | |
| 000671F_pilon|quiver | 374,096 | LG23 (1) | NA | LG23 (1) | LG22 (1) | |
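
One way to read the table above is to tally, for each contig, how many markers support each LG across the four maps. The sketch below does this for three rows copied from the table. It is only an illustrative marker-count tally and is not the procedure the authors used to resolve conflicts; the regular expression, the `tally` function, and the `rows` layout are all assumptions.

```python
import re
from collections import Counter

# Matches cells such as "LG3 (6)" or "LG10a (1)"; "NA" cells do not match.
CELL = re.compile(r"^(LG\w+) \((\d+)\)$")


def tally(cells):
    """Sum the marker counts per LG over all maps for one contig."""
    votes = Counter()
    for cell in cells:
        match = CELL.match(cell.strip())
        if match:
            votes[match.group(1)] += int(match.group(2))
    return votes


# Per-map assignments for three contigs, in the table's column order.
rows = {
    "000084F_pilon|quiver": ["LG1 (1)", "LG3 (3)", "LG3 (6)", "LG3 (3)"],
    "000256F_pilon|quiver": ["LG20 (1)", "LG20 (1)", "NA", "LG9 (1)"],
    "000541F_pilon|quiver": ["NA", "LG2 (1)", "LG3 (1)", "NA"],
}

summary = {name: tally(cells) for name, cells in rows.items()}

for name, votes in summary.items():
    print(name, dict(votes))
```

A contig such as 000541F, where two LGs are each supported by a single marker, cannot be resolved by counting alone, and rows flagged in the Notes column as repetitive markers would need to be discounted before any such tally is trusted.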
Figure 1: A) Chromosome mapping of ONSATA DNA reproduced and modified with permission from Ferreira et al. [16]. The SATA sequences are labeled in yellow against the background staining with propidium iodide. B) Giemsa-stained karyograms of the Lake Malawi M. lombardoi reproduced and modified with permission from Clark et al. [34]. LG3 in O. niloticus (A) and LG7 in Metriaclima (B) are labeled based on Mazzuchelli et al. [37].
Figure 2: Comparative alignment of LG23 in M. zebra and O. niloticus. Centromere repeats in each assembly are indicated by large black triangles. Anchored contigs in each assembly are shown as red arrows indicating the orientation of each contig.
Figure 3: Comparison of the four genetic maps relative to M_zebra_UMD2 for LG7, LG11, LG20, and LG23. Maps for all LGs are provided in Additional File G.
Correspondence between O. niloticus and O. latipes (medaka) chromosomes. Alignment lengths are provided for chromosomes with large fusion/translocation events.
| O_niloticus_UMD_NMBU chromosome | Primary medaka HSOK chromosome (alignment length) | Secondary medaka HSOK chromosome (alignment length) |
|---|---|---|
| LG1 | 3 | |
| LG2 | 10 | |
| LG3 | 18 | |
| LG4 | 8 | |
| LG5 | 5 | |
| LG6 | 1 | |
| LG7 | 6 (32 Mbp) | 12 (31 Mbp) |
| LG8 | 19 | |
| LG9 | 20 | |
| LG10 | 14 | |
| LG11 | 16 | |
| LG12 | 9 | |
| LG13 | 15 | |
| LG14 | 13 | |
| LG15 | 24 (31 Mbp) | 4 (5 Mbp) |
| LG16 | 21 | |
| LG17 | 23 (23 Mbp) | 4 (12 Mbp) |
| LG18 | 17 | |
| LG19 | 22 | |
| LG20 | 7 | |
| LG22 | 11 | |
| LG23 | 2 (23 Mbp) | 4 (17 Mbp) |
Figure 4: O_niloticus_UMD_NMBU LG7 is an ancient cichlid-specific fusion corresponding to medaka HSOK 12 and 6. Female (red) and male O. niloticus recombination curves are shown along with LD (r² > 0.97) in black. Alignment of LG7 to medaka HSOK 12 and 6 is shown on the bottom.
Figure 5: O_niloticus_UMD_NMBU LG23 is an ancient cichlid-specific fusion corresponding to medaka HSOK 2 and part of medaka HSOK 4. Female (red) and male O. niloticus recombination curves are shown along with LD (r² > 0.97) in black. Alignment of LG23 to medaka HSOK 2 and 4 is shown on the bottom.
Figure 6: Summary of large structural changes in African cichlid genomes. (a) Chromosome fusion events on LG7 and LG23. (b) Expansion of repetitive LG3 in the Oreochromis lineage likely in conjunction with its role as ZW sex chromosome. (c) Putative inversions in Aulonocara on LG11 and LG20. Chromosomes that have undergone a large (>6 Mbp) structural change are displayed. Other chromosomes that have not undergone a large change in the seven cichlid species studied are not shown. Likely changes in meta-/sub-metacentric (“m/sm”) and subtelomeric/acrocentric (“st/a”) chromosomes from Malawi and O. niloticus are labeled. Recombination rates are shown as LOESS smoothed curves. Male and female recombination rate curves are shown for O. niloticus. Typical recombination rate curves for Lake Malawi cichlids are usually represented by the M. mbenjii × A. koningsi map. Recombination curves in crosses involving Aulonocara are shown for LG11 and LG20 to highlight large differences in recombination on those particular chromosomes. Several rearrangements, such as LG2, are more complex than depicted in this figure. Refer to Additional File D for detailed whole-genome alignments and Additional Files F and G for detailed recombination plots. Divergence times were obtained from Kumar et al. [52].
Figure 7: Comparison of the repeat landscape in the M. zebra and O. niloticus genome assemblies.
Annotation improvement of the M_zebra_UMD2 assembly gathered from RefSeq annotation reports [61, 62]
| Feature | M_zebra_UMD1 | M_zebra_UMD2 | Difference (%) |
|---|---|---|---|
| | 27,328 | 32,471 | 5,143 (18.8) |
| Protein coding | 24,290 | 25,898 | 1,608 (6.6) |
| Non-coding | 2,468 | 5,149 | 2,681 (108.6) |
| Pseudogenes | 443 | 1,238 | 795 (179.5) |
| | 44,123 | 46,160 | 2,037 (4.6) |
| Fully supported | 41,957 | 43,159 | 1,202 (2.9) |
| Partial | 1,184 | 655 | −529 (−44.7) |
| With filled gaps | 796 | 246 | −550 (−69.1) |
| Known RefSeq (NM_) | 9 | 12 | 3 (33.3) |
| Model RefSeq (XM_) | 44,114 | 46,148 | 2,034 (4.6) |
| | 3,192 | 6,209 | 3,017 (94.5) |
| Fully supported | 2,228 | 4,047 | 1,819 (81.6) |
| Model RefSeq (XR_) | 2,518 | 4,851 | 2,333 (92.7) |
| | 44,263 | 46,358 | 2,095 (4.7) |
| Fully supported | 41,957 | 43,159 | 1,202 (2.9) |
| Partial | 1,055 | 654 | −401 (−38.0) |
| With major corrections | 358 | 478 | 120 (33.5) |
| Known RefSeq (NP_) | 9 | 12 | 3 (33.3) |
| Model RefSeq (XP_) | 44,127 | 46,161 | 2,034 (4.6) |
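
The "Difference (%)" column is consistent with the absolute change from M_zebra_UMD1 to M_zebra_UMD2, followed in parentheses by that change as a percentage of the M_zebra_UMD1 count. A minimal sketch that reproduces a few cells from the table is shown below; the function name `difference` and the choice of rows are illustrative assumptions.

```python
def difference(umd1: int, umd2: int) -> tuple[int, float]:
    """Return (absolute change, percent change relative to UMD1)."""
    delta = umd2 - umd1
    return delta, round(100 * delta / umd1, 1)


# (M_zebra_UMD1, M_zebra_UMD2) counts copied from the table above.
examples = {
    "Protein coding": (24_290, 25_898),
    "Non-coding": (2_468, 5_149),
    "Pseudogenes": (443, 1_238),
    "Partial (mRNA rows)": (1_184, 655),
}

results = {name: difference(a, b) for name, (a, b) in examples.items()}

for name, (delta, pct) in results.items():
    print(f"{name}: {delta:,} ({pct})")
```

Counts that fall between the two annotations, such as partial models, yield negative values, which in this table indicates an improvement.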